Dedicated ML/heavyweight container for remote processing #1489

Open · stnokott opened this issue Nov 18, 2024 · 6 comments

@stnokott commented Nov 18, 2024

TL;DR

Suggestion: move machine learning and other heavyweight processing into a dedicated container. This will decouple those processes from the backend, allowing users to run them on a more powerful machine in cases where the media is stored on low-end hardware.

Similar to Immich's Remote Machine Learning.

Explanation

Let me preface this by saying that facial recognition and similar features are what I look for most in self-hosted photo management software.
In that regard, I think LibrePhotos shines brightest among other popular choices such as Immich or Photoprism.

There is one feature missing from LibrePhotos, though, that prevents me from using it for my large photo collection: Remote Machine Learning.
All my photos are stored on my NAS, which has lots of storage space but very little processing power. Besides that, I have a desktop PC with little storage but very powerful hardware.

Since my photo/video collection exceeds the storage space on my desktop PC by a large margin, I need to run LibrePhotos on the NAS. This means that indexing photos is very slow, because everything has to run on the low-end NAS hardware.

Immich has a "feature" called "Remote Machine Learning" which is essentially just the machine learning container being decoupled from the backend.

This allows me to index my Immich photo collection on my desktop PC, finishing in about 3 hours, whereas the same collection takes about 12 hours in LibrePhotos running on the NAS.

Yes, this has the downside that media files need to be transferred away from the storage location for processing, but the greater the disparity between the storage machine's and the processing machine's hardware, the more effective it becomes.

The complexity of implementing this change depends on how tightly the backend and the heavyweight processing code are coupled in LibrePhotos, so hopefully it won't cause too much trouble...

@derneuere (Member)

Do you have more information about which tasks are slow? They should be listed in the worker logs in the admin dashboard.

The current system works the following way: all heavyweight processing happens in separate Flask processes that communicate with the backend over HTTP. The only tasks that do not run in a separate process are the clustering and classification of faces and the processing of regular images and videos.
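
For illustration, here is a minimal sketch of that pattern: a standalone Flask process does the heavy lifting, and a backend task calls it over HTTP. The endpoint name, port, and payload shape are invented for this example and are not LibrePhotos' actual API.

```python
# --- ml_service.py: runs as its own process, separate from the backend ---
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/caption", methods=["POST"])
def caption():
    # The backend sends the path of the image to process.
    image_path = request.json["image_path"]
    # ...load the model and run inference on the image here...
    return jsonify({"caption": f"placeholder caption for {image_path}"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8765)


# --- backend side: a background task calls the service over HTTP ---
import requests

def generate_caption(image_path: str) -> str:
    resp = requests.post(
        "http://127.0.0.1:8765/caption",
        json={"image_path": image_path},
        timeout=300,
    )
    resp.raise_for_status()
    return resp.json()["caption"]
```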

How I would implement this:

Add an environment variable that makes the LibrePhotos container run as a machine learning container instead. This variable would block the usual endpoints and expose direct endpoints for the processing tasks.

On your main system there would be a setting in the admin dashboard where you could configure where your processing server is located.

Tasks that should be externalized are those whose computation takes longer than the network transfer. This would currently be LLM generation, calculating clip embeddings, calculating face embeddings and finding face locations, generating thumbnails for RAW images, and image captioning and tagging.

I do not think that generating thumbnails for regular images would benefit from this approach, as transferring the image probably takes longer than computing the thumbnail on the slower machine. The same is true for extracting EXIF data and reverse geocoding. Video processing could perhaps benefit, but it would first have to be externalized as a separate service, which is currently not the case.
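
As a rough sketch of the environment-variable switch, the container could pick its URLconf at startup. The variable name LIBREPHOTOS_ML_ONLY and the URL module paths below are hypothetical, not existing settings:

```python
# urls.py sketch: choose which endpoints the container exposes based on an
# environment variable. Names are illustrative only.
import os

from django.urls import include, path

ML_ONLY = os.environ.get("LIBREPHOTOS_ML_ONLY", "false").lower() == "true"

if ML_ONLY:
    # Machine-learning mode: expose only direct processing endpoints.
    urlpatterns = [
        path("ml/", include("api.ml_urls")),   # hypothetical module
    ]
else:
    # Normal mode: full application API, no direct processing endpoints.
    urlpatterns = [
        path("api/", include("api.urls")),     # hypothetical module
    ]
```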

derneuere transferred this issue from LibrePhotos/librephotos-docker on Nov 18, 2024
@stnokott (Author)

Thanks for the swift reply!

Running a full rescan now so I can give you accurate metrics. Will take some time though, I'll keep you posted.

@stnokott (Author) commented Nov 20, 2024

In the meantime, I have a question:

You said:

Tasks that should be externalized are those whose computation takes longer than the network transfer. This would currently be LLM generation, calculating clip embeddings, calculating face embeddings and finding face locations, generating thumbnails for RAW images, and image captioning and tagging.

I do not think that generating thumbnails for regular images would benefit from this approach, as transferring the image probably takes longer than computing the thumbnail on the slower machine.

In my eyes, this isn't a question of which tasks should be externalized; it's a yes/no question.

If I configure my LibrePhotos installation so that it uses an external processor on a powerful machine, why would I only want to run the long-running tasks on that machine? The images have to be transferred over the network for the heavyweight tasks, sure, but once they're there, we can run all the tasks on that machine.

This seems like the most logical approach to me, and it should also be easier to implement and maintain than having to decide which tasks to run where.

@stnokott (Author)

Unfortunately, I can't report proper runtimes due to execution errors in the jobs, an unrelated issue.

Can you support the feature request without those details?

@derneuere (Member)

If your goal is to run all processing tasks on one machine while saving the database and thumbnails on another, you might consider running the database and storing thumbnails on your NAS, while hosting LibrePhotos and the frontend on the desktop PC.

The Immich feature only externalizes a couple of tasks, e.g. smart search and face detection, but all other processing of the image is done on the original server.

Our backend has the following architecture: Django with a background task system and an API that communicates with the frontend, plus Flask services that handle individual features, e.g. LLM generation, clip embeddings, etc.

Currently, the Flask services already communicate over a REST API and are decoupled, which makes it easy to put them on another machine and then call them for certain image processing tasks.
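
Assuming the services keep their current HTTP interface, pointing the backend at a remote machine could be as small as making the base URL configurable. A hedged sketch, where the PROCESSING_SERVER_URL variable and the /clip-embedding endpoint are invented for illustration:

```python
# Backend-side sketch: the same call works locally or remotely, only the base
# URL changes. Names and the payload format are illustrative only.
import os

import requests

# Defaults to the local service; an admin setting or env var could override it
# with e.g. "http://desktop-pc:8765" to offload the work to another machine.
PROCESSING_SERVER_URL = os.environ.get(
    "PROCESSING_SERVER_URL", "http://127.0.0.1:8765"
)

def compute_clip_embedding(image_path: str) -> list[float]:
    # In the remote case the image bytes themselves have to travel over the
    # network, which is the trade-off discussed earlier in this thread.
    with open(image_path, "rb") as f:
        resp = requests.post(
            f"{PROCESSING_SERVER_URL}/clip-embedding",
            files={"image": f},
            timeout=300,
        )
    resp.raise_for_status()
    return resp.json()["embedding"]
```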

Processing the whole image and saving it to the database and the NAS would mean that we have to support scaling horizontally across multiple machines, and I do not know whether our background task system supports that.

@stnokott (Author) commented Dec 1, 2024

Running the front- and backend on my desktop PC doesn't work well for me unfortunately.

I want the frontend to be available 24/7, without having to boot up my PC anytime I want to access LibrePhotos.
Since I only run the indexing jobs manually when I add new photos, I could boot up my PC for those, let the indexing run, and shut it down again.

As far as I understand, all other components need to run permanently for LibrePhotos to work. The heavyweight processing jobs are the only component I could decouple from the rest, which is where this feature request originated.

If your backend doesn't support that, then I'm ok with that. Feel free to close this issue in that case.
