[FEA]: Multiplex leiden clustering #4828

Open
2 tasks done
niklasmueboe opened this issue Dec 11, 2024 · 2 comments
Labels
? - Needs Triage (Need team to review and classify), feature request (New feature or request)

Comments

@niklasmueboe

Is this a new feature, an improvement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Critical (currently preventing usage)

Please provide a clear description of problem this feature solves

leidenalg, the most widely used Leiden implementation, supports multiplex clustering, in which multiple graphs that share the same vertices are clustered jointly. In single-cell transcriptomics and spatially resolved transcriptomics, this can be used to cluster multi-modality data (as done in muon) or to jointly cluster cells based on their features and spatial neighborhoods (as done in SpatialLeiden).
As datasets grow (hundreds of thousands to millions of cells/vertices), the runtime of Leiden clustering on the CPU becomes a limiting factor for exploring different parameter combinations.

Describe your ideal solution

The leiden function should accept a list (or similar collection) of graphs as input. The resolution parameter would then also need to be extended to accept one resolution per graph (layer). In addition, a new parameter would be needed that assigns each layer a weight corresponding to its "importance".
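To make the requested parameters concrete, here is a minimal pure-Python sketch of the objective such an interface could optimize. It assumes the joint quality is a weighted sum of per-layer, modularity-style qualities, each with its own resolution, which is my understanding of how leidenalg combines layers. The names `layer_quality` and `multiplex_quality` and their arguments are hypothetical; they are not part of cuGraph or leidenalg, and the sketch only scores a given membership rather than running Leiden.

```python
from collections import defaultdict


def layer_quality(edges, membership, resolution=1.0):
    """Modularity with a resolution parameter for one undirected layer.

    edges: iterable of (u, v, weight), each undirected edge listed once.
    membership: list mapping vertex id -> community id.
    """
    total_weight = 0.0              # m: sum of all edge weights
    internal = defaultdict(float)   # edge weight inside each community
    strength = defaultdict(float)   # summed weighted degree per community
    for u, v, w in edges:
        total_weight += w
        strength[membership[u]] += w
        strength[membership[v]] += w
        if membership[u] == membership[v]:
            internal[membership[u]] += w
    if total_weight == 0:
        return 0.0
    # Q = sum_c [ e_c / m - resolution * (K_c / 2m)^2 ]
    return sum(
        internal[c] / total_weight - resolution * (k / (2 * total_weight)) ** 2
        for c, k in strength.items()
    )


def multiplex_quality(layers, membership, resolutions, layer_weights):
    """Weighted sum of per-layer qualities over a shared vertex set (assumed form)."""
    if not (len(layers) == len(resolutions) == len(layer_weights)):
        raise ValueError("need one resolution and one weight per layer")
    return sum(
        weight * layer_quality(edges, membership, resolution)
        for edges, resolution, weight in zip(layers, resolutions, layer_weights)
    )


# Two layers over the same six vertices: a feature graph and a spatial graph.
feature_layer = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
                 (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 1.0)]
spatial_layer = [(0, 3, 1.0), (1, 4, 1.0), (2, 5, 1.0)]
membership = [0, 0, 0, 1, 1, 1]

score = multiplex_quality(
    [feature_layer, spatial_layer],
    membership,
    resolutions=[1.0, 1.0],
    layer_weights=[1.0, 0.5],
)
```

In this toy case the membership follows the feature layer and cuts every edge of the spatial layer, so raising the spatial layer's weight lowers the joint score. That trade-off is what the per-layer weight is meant to control.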

Describe any alternatives you have considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow cuGraph's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request
niklasmueboe added the ? - Needs Triage and feature request labels Dec 11, 2024
@abs51295

abs51295 commented Dec 11, 2024

I would also consider adding support for directed weighted graphs, since scanpy.tl.leiden uses a directed weighted graph with the leidenalg package. (Never mind, since they are moving to igraph.) Also, support for fixing the membership labels of part of the graph is useful when merging two different datasets: https://www.nature.com/articles/s41598-020-71805-1.

@ChuckHastings
Collaborator

This is something we can explore. Within our current cugraph framework, we could potentially support this as follows:

  • Define edge types for each layer (number the layers from 0 to n)
  • Create a variation of the Leiden algorithm that considers the layers
  • Allow for different resolution values for each layer
  • Allow for certain layers to be ignored (so you don't have to recreate the graph in different scenarios)
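As a rough illustration of the edge-type idea in the list above, the sketch below stores all layers in one edge list, tags each edge with its layer number, and selects the active layers at call time instead of rebuilding the graph. Everything here is a hypothetical stand-in in plain Python: `split_by_edge_type`, `active_layers`, and the tuple layout are assumptions for illustration, not the cuGraph API.

```python
def split_by_edge_type(typed_edges, num_layers, active_layers=None):
    """Group (u, v, weight, edge_type) tuples into per-layer edge lists.

    Layers are numbered 0..num_layers-1. Layers missing from
    active_layers are ignored; the stored edge list is left untouched.
    """
    active = set(range(num_layers)) if active_layers is None else set(active_layers)
    layers = {layer: [] for layer in sorted(active)}
    for u, v, w, edge_type in typed_edges:
        if not 0 <= edge_type < num_layers:
            raise ValueError(f"edge type {edge_type} is outside 0..{num_layers - 1}")
        if edge_type in active:
            layers[edge_type].append((u, v, w))
    return layers


# One edge list for the whole multiplex graph; the last field is the layer.
typed_edges = [
    (0, 1, 1.0, 0),
    (1, 2, 1.0, 0),
    (0, 2, 0.5, 1),
    (2, 3, 2.0, 1),
]

# Hypothetical per-layer resolutions, keyed by layer number.
resolutions = {0: 1.0, 1: 0.5}

all_layers = split_by_edge_type(typed_edges, num_layers=2)
features_only = split_by_edge_type(typed_edges, num_layers=2, active_layers=[0])
```

Keeping the layer as an edge attribute means that trying a different layer combination only changes the `active_layers` argument, while the per-layer resolutions can stay in a mapping keyed by the same layer numbers.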

Does this seem like a reasonable approach? We would need to determine where this fits in our roadmap.
