
Provide tools for parallel processing #5414

Open
RemiLehe opened this issue Oct 24, 2024 · 2 comments

@RemiLehe (Member)

There are several steps.

Parallel processing on one node with openPMD-api

Provide examples/instructions in the WarpX documentation on how to process fields and particles.

  • For fields: provide an example of how to load the fields, coarsen them, and plot them; also compute the total field energy (a proxy for the laser energy). See the sketch after this list.
  • For particles: manual filtering and histogramming (also covered in the sketch below).
  • Provide an example of which dependencies to install.
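
A minimal sketch of what such an example could look like; the path, iteration number, and species name below are hypothetical placeholders, and the field data is assumed to be 2D:

```python
import numpy as np
import matplotlib.pyplot as plt
import openpmd_api as io
from scipy.constants import epsilon_0

# Hypothetical path and iteration number; adapt to the actual WarpX diagnostics
series = io.Series("diags/diag1/openpmd_%T.h5", io.Access.read_only)
it = series.iterations[100]

# --- Fields: load, coarsen, plot, and compute the total field energy ---
E = it.meshes["E"]
chunks = {c: E[c].load_chunk() for c in ("x", "y", "z")}
series.flush()  # data is only filled in after the flush
Ex, Ey, Ez = (chunks[c] * E[c].unit_SI for c in ("x", "y", "z"))  # convert to SI

# Coarsen by simple strided subsampling (here: factor 4, assuming 2D data)
Ex_coarse = Ex[::4, ::4]
plt.imshow(Ex_coarse.T, origin="lower")
plt.colorbar(label="Ex (V/m)")
plt.savefig("Ex_coarse.png")

# Total electric field energy: (epsilon_0 / 2) * sum(E^2) * cell volume
dV = np.prod(np.asarray(E.grid_spacing) * E.grid_unit_SI)
energy = 0.5 * epsilon_0 * dV * (np.sum(Ex**2) + np.sum(Ey**2) + np.sum(Ez**2))
print(f"Total electric field energy: {energy:.3e} J")

# --- Particles: manual filtering and histogramming ---
electrons = it.particles["electrons"]  # hypothetical species name
uz = electrons["momentum"]["z"].load_chunk()
w = electrons["weighting"][io.Record_Component.SCALAR].load_chunk()
series.flush()
uz = uz * electrons["momentum"]["z"].unit_SI
select = uz > 0  # keep only forward-moving particles
hist, edges = np.histogram(uz[select], bins=100, weights=w[select])
```

Regarding the dependencies bullet: this sketch only needs openpmd-api, numpy, scipy, and matplotlib.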

(We should decide exactly where in the documentation this should go.)

Parallel processing on several nodes with openPMD-api

Provide examples/instructions on how to:

  • allocate a Dask cluster (e.g. through JupyterHub)
  • make sure that Dask is actually using the cluster (see the sketch after this list)
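
One possible sketch of this, using dask_jobqueue; all SLURM parameters below (queue, account, resources) are placeholder assumptions and need to be adapted to the actual machine:

```python
from dask.distributed import Client
from dask_jobqueue import SLURMCluster

# Allocate Dask workers through the batch system
cluster = SLURMCluster(
    queue="regular",
    account="my_account",
    cores=32,
    memory="64GB",
    walltime="00:30:00",
)
cluster.scale(jobs=4)  # request 4 worker jobs, i.e. spread the work over several nodes

# Attaching a Client makes Dask collections use this cluster by default
client = Client(cluster)
print(client)                 # number of attached workers/threads
print(client.dashboard_link)  # dashboard URL, to verify that work runs on the cluster
```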

Parallel processing on several nodes with openPMD-viewer

Provide an interface to use a Dask cluster in openPMD-viewer:

```python
ts = OpenPMDTimeSeries(path_to_data, dask_cluster)
```
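
For instance, combined with the multi-node setup above, usage might look like the following; note that the dask_cluster argument is exactly the proposal of this issue (not an existing openPMD-viewer parameter), and the path is a placeholder:

```python
from dask.distributed import Client
from openpmd_viewer import OpenPMDTimeSeries

client = Client(cluster)  # `cluster` as allocated in the previous section

# Proposed (not yet existing) interface: the time series would distribute
# its reads and reductions over the attached Dask cluster
ts = OpenPMDTimeSeries("diags/diag1/", cluster)  # second argument is the proposed dask_cluster
Ez, info = ts.get_field(field="E", coord="z", iteration=100)  # existing openPMD-viewer call
```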

Parallel processing with GPUs

Provide instructions for all of the above workflows, but with RAPIDS objects that run the Dask workflows on GPU (e.g. using cuDF). See the sketch below.
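
As a rough sketch of the GPU variant of the particle step, assuming cuDF is installed and reusing the uz and w arrays from the single-node example above:

```python
import cudf

# Move the already-loaded particle arrays (see the openPMD-api example above)
# from host memory into a GPU-resident cuDF DataFrame
df = cudf.DataFrame({"uz": uz, "w": w})

# Filtering and weighted reductions now execute on the GPU
forward = df[df["uz"] > 0]
mean_uz = float((forward["uz"] * forward["w"]).sum() / forward["w"].sum())
print(f"Weighted mean uz of forward-moving particles: {mean_uz:.3e}")
```

For the multi-node case, the same DataFrame could be partitioned with dask_cudf, so that the Dask workflows above run across several GPUs.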

Other tools (e.g. multiprocessing)
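
For completeness, a sketch of how a per-iteration reduction could be parallelized with the standard library's multiprocessing; the path, iteration list, and analysis function are placeholders, and each worker opens its own Series (the handles are generally not meant to be shared across processes):

```python
import multiprocessing

import numpy as np
import openpmd_api as io


def field_energy_proxy(iteration):
    """Placeholder per-iteration analysis: sum of Ex^2 for one iteration."""
    # Each worker opens its own Series (hypothetical path)
    series = io.Series("diags/diag1/openpmd_%T.h5", io.Access.read_only)
    Ex = series.iterations[iteration].meshes["E"]["x"].load_chunk()
    series.flush()
    return iteration, float(np.sum(Ex.astype(np.float64) ** 2))


if __name__ == "__main__":
    iterations = [0, 100, 200, 300]  # placeholder iteration numbers
    with multiprocessing.Pool(processes=4) as pool:
        results = dict(pool.map(field_energy_proxy, iterations))
    print(results)
```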

@EZoni EZoni added the component: post-processing post-processing tools label Oct 24, 2024
@EZoni EZoni self-assigned this Oct 24, 2024
@RemiLehe (Member, Author)

Note that NERSC's JupyterHub has a Dask kernel, which I think is meant for this type of interactive, parallel work:
[Screenshot: NERSC JupyterHub kernel list, including a Dask kernel]

@ax3l (Member) commented Jan 22, 2025

That is a good point to document.

We need to close the gap between what the infrastructure provides and what we want to run:

I think we need something like an overhaul of the Perlmutter Jupyter environment (also, let's add a script to set it up) and likely a small mini-notebook that starts up Dask nodes.

@ax3l ax3l added component: documentation Docs, readme and manual machine / system Machine or system-specific issue labels Jan 22, 2025
@ax3l ax3l self-assigned this Jan 22, 2025