Fix cagra_hnsw serialization when dataset is not part of index #591

Merged 9 commits on Jan 30, 2025


tfeher (Contributor) commented Jan 20, 2025

After calling build(), the CAGRA index ideally contains both the dataset and the graph. When there is not enough device memory, however, only the graph is returned. In that case, the dataset must be passed explicitly to the serialization routines.

When serializing to HNSW format with a flat hierarchy, the dataset was not passed. This PR fixes the problem by adding an optional dataset argument to cagra::serialize_to_hnswlib.
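The fallback logic described above can be sketched as follows. This is only an illustration under stated assumptions: `ToyIndex`, `MatrixView`, and `resolve_dataset` are hypothetical names made up for this example and are not part of the cuVS API or the actual signature of cagra::serialize_to_hnswlib. The idea is that an explicitly passed dataset takes precedence, and the index's own copy is used only when no dataset is passed.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical non-owning view of a row-major host matrix.
struct MatrixView {
  const float* data;
  std::size_t n_rows;
  std::size_t dim;
};

// Hypothetical stand-in for an index that may or may not hold its dataset.
struct ToyIndex {
  std::size_t n_rows = 0;
  std::size_t dim    = 0;
  std::vector<float> dataset;  // empty when only the graph was kept
};

// Pick the dataset to serialize: the explicit argument if given,
// otherwise the copy stored in the index. Throws if neither is available.
MatrixView resolve_dataset(const ToyIndex& index,
                           std::optional<MatrixView> dataset = std::nullopt)
{
  if (dataset.has_value()) {
    if (dataset->n_rows != index.n_rows || dataset->dim != index.dim) {
      throw std::invalid_argument("dataset shape does not match the index");
    }
    return *dataset;
  }
  if (index.dataset.empty()) {
    throw std::invalid_argument("index holds no dataset; pass one explicitly");
  }
  assert(index.dataset.size() == index.n_rows * index.dim);
  return MatrixView{index.dataset.data(), index.n_rows, index.dim};
}
```

Failing loudly when no dataset is available is a design choice of this sketch; the alternative of silently writing a file without vectors would be harder to diagnose.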

Furthermore, to improve execution time, we now write a full row of the graph and dataset at a time instead of a single element.
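To illustrate the difference, here is a minimal sketch of element-wise versus row-wise writes to a `std::ostream`. The helper names `write_elementwise` and `write_rowwise` are hypothetical and are not taken from the PR; the sketch only assumes a row-major matrix held in host memory. Both functions produce identical bytes, but the row-wise version issues one `write()` call per row instead of one per element.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical helper: one write() call per matrix element.
template <typename T>
void write_elementwise(std::ostream& os, const std::vector<T>& data,
                       std::size_t n_rows, std::size_t n_cols)
{
  assert(data.size() == n_rows * n_cols);
  for (std::size_t i = 0; i < n_rows; ++i) {
    for (std::size_t j = 0; j < n_cols; ++j) {
      os.write(reinterpret_cast<const char*>(&data[i * n_cols + j]), sizeof(T));
    }
  }
}

// Hypothetical helper: one write() call per matrix row.
template <typename T>
void write_rowwise(std::ostream& os, const std::vector<T>& data,
                   std::size_t n_rows, std::size_t n_cols)
{
  assert(data.size() == n_rows * n_cols);
  for (std::size_t i = 0; i < n_rows; ++i) {
    // A row is contiguous in row-major storage, so it can go out in one call.
    os.write(reinterpret_cast<const char*>(data.data() + i * n_cols),
             static_cast<std::streamsize>(n_cols * sizeof(T)));
  }
}
```

With a row of `n_cols` values, the number of stream calls drops by a factor of `n_cols`, which is where a speedup would be expected to come from.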

Additionally, debug messages for tracking data saving time are added.

@tfeher tfeher requested a review from a team as a code owner January 20, 2025 11:48
@tfeher tfeher self-assigned this Jan 20, 2025
@github-actions github-actions bot added the cpp label Jan 20, 2025
@tfeher tfeher added the bug (Something isn't working) and non-breaking (Introduces a non-breaking change) labels and removed the cpp label Jan 20, 2025
@tfeher tfeher requested a review from divyegala January 20, 2025 11:48
@github-actions github-actions bot added the cpp label Jan 20, 2025
cjnolet (Member) commented Jan 24, 2025

@tfeher changes look good but there's a docs build failure.

cjnolet (Member) commented Jan 30, 2025

/merge

@rapids-bot rapids-bot bot merged commit 0dd7bde into rapidsai:branch-25.02 Jan 30, 2025
61 checks passed
Labels: bug (Something isn't working), cpp, non-breaking (Introduces a non-breaking change)
3 participants