@julianmi
Contributor

This PR adds a direct hnsw::build API that uses the ACE (Augmented Core Extraction) algorithm to build HNSW indexes on the GPU. ACE enables building HNSW indexes for datasets too large to fit in GPU memory by partitioning the data and building sub-indexes.

CC @tfeher

C++ API

  • Added hnsw::build() function with ACE parameters for direct HNSW index construction. When use_disk is true, the built HNSW index is serialized to disk.
  • Added hnsw::graph_build_params::ace_params struct with configurable options:
    • npartitions - number of partitions for parallel build
    • ef_construction - HNSW construction-time candidate list size (higher values trade build time for graph quality)
    • build_dir - directory for disk-based build artifacts
    • use_disk - force disk-based storage mode
  • Implemented serialization/deserialization for disk-backed HNSW indexes
  • Added C++ tests in ann_hnsw_ace.cuh
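
To make the four ACE options listed above concrete, here is a minimal Python sketch of a configuration object carrying the same field names. It is a hypothetical stand-in, not the cuVS type added by this PR: the class name `AceParamsSketch`, the default values, the validation rules, and the `rows_per_partition` helper are all assumptions made for illustration. The helper assumes an even, non-overlapping split, which may not match how ACE actually sizes its partitions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AceParamsSketch:
    """Hypothetical mirror of the ACE options described in this PR (not the cuVS API)."""

    npartitions: int = 1                    # number of partitions (default is an assumption)
    ef_construction: int = 120              # HNSW construction parameter (default is an assumption)
    build_dir: str = "/tmp/hnsw_ace_build"  # directory for build artifacts (assumed path)
    use_disk: bool = False                  # force disk-based storage mode (assumed default)

    def __post_init__(self) -> None:
        # Assumed sanity checks; the real library may validate differently.
        if self.npartitions < 1:
            raise ValueError("npartitions must be >= 1")
        if self.ef_construction < 1:
            raise ValueError("ef_construction must be >= 1")


def rows_per_partition(n_rows: int, params: AceParamsSketch) -> int:
    """Rough estimate of rows per partition, assuming an even, non-overlapping split."""
    return math.ceil(n_rows / params.npartitions)


if __name__ == "__main__":
    params = AceParamsSketch(npartitions=4, use_disk=True)
    print(rows_per_partition(1_000_000, params))
```

An estimate like this can help when picking npartitions so that a single partition fits in GPU memory. Treat it as a starting point only, since the real per-partition footprint depends on details of the ACE build that this sketch does not model.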

C API

  • Added cuvsHnswBuild function with ACE parameters
  • Added C tests in ann_hnsw_ace.cu

Python

  • Added hnsw.AceParams class for configuring ACE builds
  • Added Python tests in test_hnsw_ace.py

Java

  • Added HnswAceParams class
  • Added Java tests in HnswAceBuildAndSearchIT.java

Documentation

  • Added cuvs_hnsw section to the parameter tuning guide with ACE parameters

Example

  • Added hnsw_ace_example.cu demonstrating the build → deserialize → search workflow

- Added `cuvsHnswAceParams` structure for ACE configuration.
- Implemented `cuvsHnswBuild` function to facilitate index construction using ACE.
- Updated HNSW index parameters to include ACE settings.
- Created new tests for HNSW index building and searching using ACE.
- Updated documentation to reflect the new ACE parameters and usage.