[WIP] Add disk2disk serialization for ACE Algorithm #1410
Draft
mfoerste4 wants to merge 34 commits into rapidsai:branch-25.12 from mfoerste4:ace_serialize
- Adds the out-of-tree ACE method of @anaruse. This assumes graphs smaller than host memory.
- Adds `disk_enabled` and `graph_build_dir` parameters to select the ACE method.
- Use partitions instead of clusters in ACE to distinguish between ACE clusters and regular KNN graph building clusters.
- Introduced dynamic configuration of nprobes and nlists for IVF-PQ based on partition size to improve KNN graph construction.
- Added logging for both IVF-PQ and NN-Descent parameters to provide better insights during the graph building process.
- Ensured default parameters are set when no specific graph build parameters are provided.
- Added logic to identify and merge small partitions that do not meet the minimum size requirement for stable KNN graph construction.
- Replaced `disk_enabled` and `graph_build_dir` with `ace_npartitions` and `ace_build_dir` in the parameter parsing logic.
- Updated function signatures and documentation to clarify the new partitioning approach for ACE builds.
- Introduced new functions for reordering and storing datasets on disk, optimizing for NVMe performance.
- Clarified naming.
…gement
- Added a `file_descriptor` class to manage file descriptors with RAII, ensuring proper resource cleanup.
- Updated file handling in `ace_write_large_file` and `ace_reorder_and_store_dataset` to use the new wrapper.
- Improved error handling and logging for file operations.
- Enhanced input validation in `build_ace` function for better robustness.
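The RAII idea behind a `file_descriptor` wrapper can be sketched as follows. This is a minimal illustration assuming POSIX `open`/`close`; the member names (`get`, `is_valid`, `reset`) and the error handling are assumptions and may not match the class added in this PR.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical sketch of an RAII file descriptor owner (not the PR's actual class).
class file_descriptor {
 public:
  file_descriptor() = default;

  // Opens `path` with POSIX open(); throws if the call fails.
  file_descriptor(const std::string& path, int flags, mode_t mode = 0644)
    : fd_(::open(path.c_str(), flags, mode))
  {
    if (fd_ == -1) { throw std::runtime_error("failed to open " + path); }
  }

  // Non-copyable: exactly one owner may close the descriptor.
  file_descriptor(const file_descriptor&)            = delete;
  file_descriptor& operator=(const file_descriptor&) = delete;

  // Movable: ownership transfers and the source becomes invalid.
  file_descriptor(file_descriptor&& other) noexcept : fd_(std::exchange(other.fd_, -1)) {}
  file_descriptor& operator=(file_descriptor&& other) noexcept
  {
    if (this != &other) {
      reset();
      fd_ = std::exchange(other.fd_, -1);
    }
    return *this;
  }

  // The destructor closes the descriptor on every exit path, including exceptions.
  ~file_descriptor() { reset(); }

  int get() const noexcept { return fd_; }
  bool is_valid() const noexcept { return fd_ != -1; }

 private:
  void reset() noexcept
  {
    if (fd_ != -1) {
      ::close(fd_);
      fd_ = -1;
    }
  }

  int fd_ = -1;
};
```

With such a wrapper, a function that opens several files and throws midway does not leak descriptors, because each already constructed `file_descriptor` is closed during stack unwinding.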
…tion handling
- Introduced a minimum partition size parameter to improve partition stability.
- Replaced standard K-means with balanced K-means for more even partition sizes.
- Implemented logic to reassign vectors from small partitions to the nearest larger ones.
- Added detailed logging for partition statistics and warnings for imbalances.
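The reassignment of vectors from undersized partitions can be illustrated with a small host-side sketch. Everything here is an assumption made for illustration: the function name, the row-major layout, and the use of squared L2 distance to partition centroids as the "nearest" criterion. The PR's actual logic may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical sketch: move every vector that sits in a partition smaller than
// `min_partition_size` to the nearest partition that meets the minimum size.
inline void reassign_small_partitions(const std::vector<float>& data,       // n x dim, row-major
                                      const std::vector<float>& centroids,  // k x dim, row-major
                                      std::size_t dim,
                                      std::size_t min_partition_size,
                                      std::vector<std::uint32_t>& labels)   // n entries in [0, k)
{
  const std::size_t n = labels.size();
  const std::size_t k = centroids.size() / dim;

  // Count partition sizes once, based on the original assignment.
  std::vector<std::size_t> sizes(k, 0);
  for (auto label : labels) { ++sizes[label]; }

  // A partition is kept if it already meets the minimum size.
  std::vector<bool> keep(k, false);
  bool any_kept = false;
  for (std::size_t c = 0; c < k; ++c) {
    keep[c]  = sizes[c] >= min_partition_size;
    any_kept = any_kept || keep[c];
  }
  if (!any_kept) { return; }  // nothing large enough to merge into

  for (std::size_t i = 0; i < n; ++i) {
    if (keep[labels[i]]) { continue; }
    float best_dist      = std::numeric_limits<float>::max();
    std::uint32_t best_c = labels[i];
    for (std::size_t c = 0; c < k; ++c) {
      if (!keep[c]) { continue; }
      float dist = 0.0f;
      for (std::size_t j = 0; j < dim; ++j) {
        const float diff = data[i * dim + j] - centroids[c * dim + j];
        dist += diff * diff;
      }
      if (dist < best_dist) {
        best_dist = dist;
        best_c    = static_cast<std::uint32_t>(c);
      }
    }
    labels[i] = best_c;
  }
}
```

Deciding which partitions are kept from the original sizes, before any vector moves, keeps the result independent of the order in which vectors are visited.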
- Introduced `ace_read_large_file` function for efficient reading of large files in chunks.
- Improved error handling and logging in file operations.
- Refactored existing file handling in `ace_write_large_file` and `ace_reorder_and_store_dataset` to utilize the new reading function.
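Chunked reading of a large file can be sketched with POSIX `pread`. One reason to loop is that a single `read`/`pread` call may return fewer bytes than requested, and on Linux it transfers at most about 2 GiB per call. The helper below is a hypothetical illustration, not the PR's `ace_read_large_file`; its name, signature, and default chunk size are assumptions.

```cpp
#include <fcntl.h>
#include <unistd.h>

#include <algorithm>
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: read `total_bytes` starting at file offset `offset`
// from `fd` into `dst`, issuing pread() calls of at most `chunk_bytes` each.
inline void read_in_chunks(int fd,
                           void* dst,
                           std::size_t total_bytes,
                           off_t offset,
                           std::size_t chunk_bytes = std::size_t{1} << 30)
{
  if (chunk_bytes == 0) { throw std::invalid_argument("chunk_bytes must be positive"); }
  auto* out        = static_cast<char*>(dst);
  std::size_t done = 0;
  while (done < total_bytes) {
    const std::size_t want = std::min(chunk_bytes, total_bytes - done);
    const ssize_t got      = ::pread(fd, out + done, want, offset + static_cast<off_t>(done));
    if (got < 0) {
      if (errno == EINTR) { continue; }  // interrupted by a signal: retry
      throw std::runtime_error("pread failed");
    }
    if (got == 0) { throw std::runtime_error("unexpected end of file"); }
    done += static_cast<std::size_t>(got);  // short reads simply advance the cursor
  }
}
```

Because `pread` takes an explicit offset, the helper does not disturb the descriptor's file position, so several readers can share one descriptor.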
- Introduced methods to check if the index is stored on disk and to retrieve the file directory.
- Added functionality to set disk storage parameters within the index structure.
- Updated the `build_ace` function to set the disk-based index when `use_disk = true`.
Resolves rapidsai#1344

Authors:
- Anupam (https://github.com/aamijar)

Approvers:
- Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#1402
Currently only IVF-PQ can be used as the graph building algorithm (NN Descent does not support Cosine). As a result, we are limited by IVF-PQ's restriction of data to be of float / half type for the Cosine metric. This PR also fixes an in-place data modification that was being done by IVF-PQ.

Opportunities for optimization: NN Descent to support Cosine and compute dataset norms only once, during NN Descent. Re-use those for CAGRA.

[UPDATE 08/21/2025]: NN Descent now supports Cosine. This PR allows the initial CAGRA graph to be built by both methods: IVF_PQ and NN_DESCENT. The IVF_PQ restriction on data types holds, but uint8 and int8 can be supported with NN Descent as the graph building algorithm. ITERATIVE CAGRA SEARCH is currently disabled for Cosine.

[UPDATE 09/23/2025]: This PR also adds Cosine support for IVF_PQ with uint8 / int8 inputs. The above-mentioned restriction with IVF_PQ has been removed. So with this PR CAGRA supports Cosine wholly, for float, uint8 and int8 inputs. ITERATIVE_SEARCH, however, still has some issues as the graph building method with the Cosine metric and has been disabled.

[UPDATE 09/25/2025]: Binary size comparison for libcuvs.so (CUDA 12.9, x86):
branch-25.10: 1154.42 MB
This PR: 1160.73 MB

Total CAGRA testing time:

branch-25.10:
```
      Start 10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST
19/37 Test #10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST ... Passed 825.43 sec
      Start 11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST
20/37 Test #11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST ........ Passed 0.58 sec
      Start 12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST
21/37 Test #12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST .... Passed 663.97 sec
      Start 13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST
22/37 Test #13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST .... Passed 397.57 sec
      Start 14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST
23/37 Test #14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST ... Passed 408.16 sec
```

This PR:
```
      Start 10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST
19/37 Test #10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST ... Passed 1830.34 sec
      Start 11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST
20/37 Test #11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST ........ Passed 0.45 sec
      Start 12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST
21/37 Test #12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST .... Passed 1444.14 sec
      Start 13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST
22/37 Test #13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST .... Passed 973.64 sec
      Start 14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST
23/37 Test #14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST ... Passed 1010.46 sec
```

[UPDATE 09/30/2025]: Updates to CAGRA C++ tests according to the latest PR reviews. New total CAGRA testing time:

branch-25.10:
```
      Start 9: NEIGHBORS_ANN_CAGRA_TEST_BUGS
18/37 Test #9: NEIGHBORS_ANN_CAGRA_TEST_BUGS ........... Passed 16.99 sec
      Start 10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST
19/37 Test #10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST ... Passed 803.64 sec
      Start 11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST
20/37 Test #11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST ........ Passed 0.49 sec
      Start 12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST
21/37 Test #12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST .... Passed 667.89 sec
      Start 13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST
22/37 Test #13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST .... Passed 420.49 sec
      Start 14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST
23/37 Test #14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST ... Passed 429.57 sec
```

This PR:
```
      Start 9: NEIGHBORS_ANN_CAGRA_TEST_BUGS
18/37 Test #9: NEIGHBORS_ANN_CAGRA_TEST_BUGS ........... Passed 26.62 sec
      Start 10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST
19/37 Test #10: NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST ... Passed 973.23 sec
      Start 11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST
20/37 Test #11: NEIGHBORS_ANN_CAGRA_HELPERS_TEST ........ Passed 0.43 sec
      Start 12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST
21/37 Test #12: NEIGHBORS_ANN_CAGRA_HALF_UINT32_TEST .... Passed 702.02 sec
      Start 13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST
22/37 Test #13: NEIGHBORS_ANN_CAGRA_INT8_UINT32_TEST .... Passed 491.65 sec
      Start 14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST
23/37 Test #14: NEIGHBORS_ANN_CAGRA_UINT8_UINT32_TEST ... Passed 541.43 sec
```

Fixes rapidsai#1288
Fixes rapidsai#389

Authors:
- Tarang Jain (https://github.com/tarang-jain)
- Corey J. Nolet (https://github.com/cjnolet)

Approvers:
- Tamas Bela Feher (https://github.com/tfeher)
- Corey J. Nolet (https://github.com/cjnolet)

URL: rapidsai#197
9db4847 to f5ed9ac
Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.
This PR adds a serialization routine that combines the dataset, graph, and mapping as per step (3) of #1404. The data is combined on the fly as it is streamed from disk to disk, keeping the required host memory to a minimum.
It is built on top of #1404. More details to follow.
CC @tfeher, @julianmi
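The general shape of a disk-to-disk combine with bounded host memory can be sketched as below. This is purely illustrative: the function name, the interleaved output layout (each record is a vector followed by its neighbor list), and the use of the mapping to translate neighbor ids are all assumptions, since the PR has not yet specified the on-disk format. The point is only that host memory scales with `batch_rows` (plus the id mapping), not with the dataset size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: stream a row-major float dataset and a row-major uint32
// graph from two files and write one interleaved file, batch by batch.
// Output record i = [dim floats of row i][degree neighbor ids, remapped].
inline void combine_disk_to_disk(const std::string& dataset_path,
                                 const std::string& graph_path,
                                 const std::string& out_path,
                                 const std::vector<std::uint32_t>& mapping,
                                 std::size_t n_rows,
                                 std::size_t dim,
                                 std::size_t degree,
                                 std::size_t batch_rows)
{
  if (batch_rows == 0) { throw std::invalid_argument("batch_rows must be positive"); }
  std::ifstream dataset(dataset_path, std::ios::binary);
  std::ifstream graph(graph_path, std::ios::binary);
  std::ofstream out(out_path, std::ios::binary | std::ios::trunc);
  if (!dataset || !graph || !out) { throw std::runtime_error("failed to open file"); }

  // Only one batch of each input is resident in host memory at a time.
  std::vector<float> vecs(batch_rows * dim);
  std::vector<std::uint32_t> nbrs(batch_rows * degree);

  for (std::size_t row = 0; row < n_rows; row += batch_rows) {
    const std::size_t rows = std::min(batch_rows, n_rows - row);
    dataset.read(reinterpret_cast<char*>(vecs.data()),
                 static_cast<std::streamsize>(rows * dim * sizeof(float)));
    graph.read(reinterpret_cast<char*>(nbrs.data()),
               static_cast<std::streamsize>(rows * degree * sizeof(std::uint32_t)));
    if (!dataset || !graph) { throw std::runtime_error("short read"); }

    for (std::size_t r = 0; r < rows; ++r) {
      // Assumption: neighbor ids are translated through the mapping on the fly.
      for (std::size_t j = 0; j < degree; ++j) {
        std::uint32_t& id = nbrs[r * degree + j];
        id                = mapping.at(id);
      }
      out.write(reinterpret_cast<const char*>(&vecs[r * dim]),
                static_cast<std::streamsize>(dim * sizeof(float)));
      out.write(reinterpret_cast<const char*>(&nbrs[r * degree]),
                static_cast<std::streamsize>(degree * sizeof(std::uint32_t)));
    }
  }
  out.flush();
  if (!out) { throw std::runtime_error("write failed"); }
}
```

A call such as `combine_disk_to_disk(dataset_file, graph_file, out_file, mapping, n_rows, dim, degree, 1 << 16)` would keep roughly `65536 * (dim * 4 + degree * 4)` bytes of row data in host memory regardless of `n_rows`.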