Serialize Vamana index with SSD sector alignment per MSFT DiskANN format, generate quantized dataset for integration with DiskANN #846

jamxia155
Contributor

(This supersedes PR #703.)

Adds an optional input flag to `cuvs::neighbors::vamana::serialize` that writes a cuVS Vamana index to file with SSD sector alignment. The file format follows Microsoft DiskANN.

When the sector-aligned option is used, the quantized dataset is written out as well. It is computed from a user-supplied PQ codebook file and a user-supplied rotation matrix file.
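
For readers unfamiliar with sector-aligned layouts, the offset arithmetic can be sketched roughly as below. This is an illustration, not code from this PR: the function names are hypothetical, and the 4096-byte sector, the reserved metadata sector, and the per-node record (vector, then a `uint32` neighbor count, then `max_degree` `uint32` neighbor ids) are assumptions based on my reading of DiskANN's disk layout.

```cpp
#include <cassert>
#include <cstdint>

// Assumed sector size in bytes (DiskANN's default, to my understanding).
constexpr std::uint64_t kSectorLen = 4096;

// Bytes one node occupies on disk: the vector itself, one uint32 neighbor
// count, and max_degree uint32 neighbor ids (padded to the maximum degree).
inline std::uint64_t max_node_len(std::uint64_t dim, std::uint64_t elem_size, std::uint64_t max_degree)
{
  return dim * elem_size + (max_degree + 1) * sizeof(std::uint32_t);
}

// Whole nodes that fit in one sector; a node never straddles a boundary.
inline std::uint64_t nodes_per_sector(std::uint64_t node_len)
{
  return kSectorLen / node_len;
}

// Byte offset of node `id`, assuming sector 0 is reserved for metadata.
inline std::uint64_t node_offset(std::uint64_t id, std::uint64_t node_len)
{
  const std::uint64_t per_sector = nodes_per_sector(node_len);
  assert(per_sector > 0);  // this sketch ignores nodes larger than a sector
  const std::uint64_t sector = 1 + id / per_sector;
  return sector * kSectorLen + (id % per_sector) * node_len;
}

// Total file size: one metadata sector plus enough sectors for all nodes.
inline std::uint64_t file_size(std::uint64_t n_nodes, std::uint64_t node_len)
{
  const std::uint64_t per_sector = nodes_per_sector(node_len);
  assert(per_sector > 0);
  const std::uint64_t data_sectors = (n_nodes + per_sector - 1) / per_sector;
  return (1 + data_sectors) * kSectorLen;
}
```

Under these assumptions, a 128-dimensional `float` dataset with maximum degree 64 gives 772-byte nodes, so five nodes fit in each sector and the remaining 236 bytes of every sector are padding. That padding is the cost of letting a search read any node with a single aligned sector read.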

copy-pr-bot bot commented Apr 25, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@cjnolet
Member

cjnolet commented Apr 25, 2025

/ok to test d0aabc6

@jamxia155 jamxia155 marked this pull request as ready for review April 25, 2025 20:08
@jamxia155 jamxia155 requested a review from a team as a code owner April 25, 2025 20:08
@divyegala
Member

Approving binary size addition to libcuvs.so of ~8.3 MB given the feature importance. @bkarsin @jamxia155 to file a follow-on issue to work on identifying any potential for size reduction.

@jamxia155 jamxia155 requested a review from tarang-jain June 18, 2025 18:15
@cjnolet cjnolet removed the request for review from jameslamb July 2, 2025 22:38
@cjnolet
Member

cjnolet commented Jul 2, 2025

/merge

@cjnolet
Member

cjnolet commented Jul 9, 2025

/merge

@rapids-bot rapids-bot bot merged commit 0719080 into rapidsai:branch-25.08 Jul 15, 2025
52 of 53 checks passed
punAhuja pushed a commit to SearchScale/cuvs that referenced this pull request Jul 16, 2025
…mat, generate quantized dataset for integration with DiskANN (rapidsai#846)

(This supersedes [PR #703](rapidsai#703).)

Adds an optional input flag to `cuvs::neighbors::vamana::serialize` that writes a cuVS Vamana index to file with SSD sector alignment. The file format follows [Microsoft DiskANN](https://github.com/microsoft/DiskANN/blob/main/src/disk_utils.cpp).

When the sector-aligned option is used, the quantized dataset is written out as well. It is computed from a user-supplied PQ codebook file and a user-supplied rotation matrix file.

Authors:
  - James Xia (https://github.com/jamxia155)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Tarang Jain (https://github.com/tarang-jain)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Gil Forsyth (https://github.com/gforsyth)

URL: rapidsai#846
Labels
ci, CMake, cpp, improvement (Improves an existing functionality), non-breaking (Introduces a non-breaking change), Python
7 participants