Parallelize some of the import work #64

Open
Montellese opened this issue Jan 28, 2020 · 2 comments

@Montellese
Owner

Right now all import jobs / tasks are performed sequentially, the rationale being that each of them eventually talks to the same media provider and writes to the same database, so these interactions must be exclusive. But if we look at the steps involved in an import job, some of them could be parallelized with steps from the next job:

  • local items retrieval
  • remote items retrieval (uses the (add-on) importer to talk to the media provider)
  • determine changeset
  • synchronization (writes to database)
  • cleanup

We must prevent the following:

  • multiple remote item retrieval steps using the same importer running simultaneously
  • multiple synchronization steps writing to the same database simultaneously

But theoretically it would be possible to e.g. start the next media import job as soon as the remote item retrieval step of the previous import job has finished.
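
The two constraints above can be sketched with one lock per importer and one lock for the database. This is only an illustration in Python, not Kodi's C++ code: `run_jobs`, the lock names and the step log are hypothetical, and treating the cleanup step as not needing exclusive access is an assumption.

```python
import threading

STEPS = [
    "local items retrieval",
    "remote items retrieval",
    "determine changeset",
    "synchronization",
    "cleanup",
]

def run_jobs(jobs):
    """Run (job_name, importer_id) pairs concurrently and return the step log."""
    # Hypothetical locks: one per importer, one for the shared database.
    importer_locks = {importer_id: threading.Lock() for _, importer_id in jobs}
    database_lock = threading.Lock()
    log, log_lock = [], threading.Lock()

    def record(job, step):
        with log_lock:
            log.append((job, step))

    def run(job, importer_id):
        record(job, STEPS[0])                # may overlap with other jobs
        with importer_locks[importer_id]:    # one retrieval per importer at a time
            record(job, STEPS[1])
        record(job, STEPS[2])                # may overlap with other jobs
        with database_lock:                  # one database writer at a time
            record(job, STEPS[3])
        record(job, STEPS[4])                # assumed not to need exclusivity

    threads = [threading.Thread(target=run, args=job) for job in jobs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return log

if __name__ == "__main__":
    for entry in run_jobs([("job-1", "importer-a"), ("job-2", "importer-a")]):
        print(entry)
```

With this shape the second job can begin its remote retrieval as soon as the first job releases the importer lock, while the first job is still busy with its changeset and synchronization steps.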

@Montellese
Owner Author

We could even parallelize part of the determine changeset and synchronization steps with the remote items retrieval step (at least adding new and updating changed items).

@Montellese
Owner Author

media_import_19.1-Matrix_performance_async contains a CMediaImportThreadedItemsSynchronizationJob with the necessary adjustments to the following tasks to support parallel / asynchronous processing:

  • CMediaImportItemsRetrievalTask now notifies an observer if new items have been retrieved from a media provider
  • CMediaImportChangesetAsyncTask listens for newly retrieved items, then gets and processes them. It then notifies an observer about the processed items.
  • CMediaImportSynchronisationAsyncTask listens for newly processed items, then gets them and writes them to the database.

Furthermore the media importers have to be adjusted to pass retrieved media items back to Kodi in chunks instead of passing all of them at the end. This is already possible outside of this work.

Parallelizing these three steps can provide a performance improvement because most of the changeset and synchronization task work is done while the media importer is still retrieving media items from the media provider. Only very little work has to be done at the end to finalize the changeset and synchronization tasks and clean up everything.
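
To illustrate the idea, here is a minimal sketch of such a three-stage pipeline in Python. It is not the actual Kodi implementation: queues stand in for the observer notifications, and the function names, the `(id, value)` item format and the handling of removed items at the very end are all assumptions made for the example.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a stream

def retrieve(chunks, out):
    # Stand-in for an importer passing items back in chunks.
    for chunk in chunks:
        out.put(chunk)
    out.put(DONE)

def changeset(local, inp, out):
    seen = set()
    while (chunk := inp.get()) is not DONE:
        changes = []
        for item_id, value in chunk:
            seen.add(item_id)
            if item_id not in local:
                changes.append(("added", item_id, value))
            elif local[item_id] != value:
                changes.append(("changed", item_id, value))
        out.put(changes)
    # Finalization (assumed): removals are only known once retrieval is complete.
    out.put([("removed", item_id, None) for item_id in local if item_id not in seen])
    out.put(DONE)

def synchronise(database, inp):
    while (changes := inp.get()) is not DONE:
        for kind, item_id, value in changes:
            if kind == "removed":
                database.pop(item_id, None)
            else:
                database[item_id] = value

def run_pipeline(local, chunks):
    """Run the three stages concurrently and return the resulting 'database'."""
    database = dict(local)
    retrieved, processed = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=retrieve, args=(chunks, retrieved)),
        threading.Thread(target=changeset, args=(local, retrieved, processed)),
        threading.Thread(target=synchronise, args=(database, processed)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return database
```

Added and changed items flow through to the database while later chunks are still being retrieved; only the removal pass has to wait for the end of the stream.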

But it also comes with a downside: progress reporting is very tricky because the tasks run in parallel, so it's hard to say how much work is left. The only party that knows the total number of media items early on is the media importer, but it cannot pass this information on to Kodi and doesn't know how long the changeset and synchronization tasks will take. From experience, the changeset and synchronization tasks are very fast, so maybe the best approach would be to limit progress reporting to the media importer task (CMediaImportItemsRetrievalTask).
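
As a rough sketch of what retrieval-only progress reporting could look like, the following Python class tracks progress purely from the number of items retrieved so far. `RetrievalProgress` and its methods are hypothetical names, and it assumes the tracking happens where the total is known (or reports an indeterminate state when it is not).

```python
class RetrievalProgress:
    """Tracks progress of the retrieval step only (hypothetical helper)."""

    def __init__(self, total=None):
        self.total = total      # total number of items, if known
        self.retrieved = 0

    def add_chunk(self, chunk_size):
        # Called whenever a chunk of items has been retrieved.
        self.retrieved += chunk_size

    def percentage(self):
        # None signals an indeterminate progress indicator.
        if not self.total:
            return None
        return min(100, 100 * self.retrieved // self.total)
```

The changeset and synchronization tasks are deliberately left out of the calculation, on the assumption stated above that they finish shortly after the last chunk arrives.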
