[diskann] Add note on the selection of PruneKind in graph::config::Builder. #734

Merged
hildebrandmw merged 2 commits into main from mhildebr/builder-docs
Feb 6, 2026

Conversation

@hildebrandmw
Contributor

Add a note to diskann::graph::config::Builder::new() highlighting the conversion from diskann_vector::distance::Metric to diskann::graph::config::PruneKind.

Incorrectly selecting PruneKind can have a profoundly negative impact on recall.

@hildebrandmw hildebrandmw requested review from a team and Copilot February 6, 2026 17:59
@hildebrandmw hildebrandmw added the documentation Improvements or additions to documentation label Feb 6, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds developer-facing documentation to diskann::graph::config::Builder::new() to warn that selecting the wrong PruneKind for a given distance metric can significantly harm graph quality/recall, and shows an example of deriving PruneKind from diskann_vector::distance::Metric.

Changes:

  • Add a rustdoc “Note” section to Builder::new() explaining that PruneKind selection depends on the distance function.
  • Add a rustdoc example showing how to convert Metric into PruneKind when constructing a config::Builder.
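
The conversion described above can be sketched roughly as follows. This is a self-contained illustration, not the code from the PR: the `Metric` and `PruneKind` enums below are local stand-ins that only borrow the names mentioned in this conversation, and the variant names, the metric-to-prune mapping, and the `prune_kind_for` helper are all assumptions made for the example. The real definitions live in `diskann_vector::distance` and `diskann::graph::config` and may differ.

```rust
// Hypothetical stand-in for `diskann_vector::distance::Metric`.
// The variant list is assumed for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Metric {
    L2,
    Cosine,
    InnerProduct,
}

// Hypothetical stand-in for `diskann::graph::config::PruneKind`.
// The variant names are assumed for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PruneKind {
    TriangleInequality,
    Occluding,
}

// Derive the pruning strategy from the metric in one place, so callers
// cannot pair a metric with a pruning rule that does not suit it.
// The mapping below is an assumption, not the library's documented rule.
impl From<Metric> for PruneKind {
    fn from(metric: Metric) -> Self {
        match metric {
            Metric::L2 | Metric::Cosine => PruneKind::TriangleInequality,
            Metric::InnerProduct => PruneKind::Occluding,
        }
    }
}

// Hypothetical helper showing the call site: convert first, then hand the
// result to whatever builder constructor expects a `PruneKind`.
pub fn prune_kind_for(metric: Metric) -> PruneKind {
    metric.into()
}
```

Keeping the conversion in a single `From` implementation means the match is exhaustive: adding a new metric variant forces a decision about its pruning rule at compile time.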

@codecov-commenter

codecov-commenter commented Feb 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.31%. Comparing base (f3fcaae) to head (1a6ae7b).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #734      +/-   ##
==========================================
+ Coverage   89.01%   90.31%   +1.30%     
==========================================
  Files         428      428              
  Lines       78294    78294              
==========================================
+ Hits        69692    70710    +1018     
+ Misses       8602     7584    -1018     
Files with missing lines Coverage Δ
diskann/src/graph/config/mod.rs 98.07% <ø> (ø)

... and 39 files with indirect coverage changes

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@hildebrandmw hildebrandmw merged commit 3055358 into main Feb 6, 2026
20 checks passed
@hildebrandmw hildebrandmw deleted the mhildebr/builder-docs branch February 6, 2026 20:20
hildebrandmw added a commit that referenced this pull request Feb 13, 2026
## What's Changed

### API Breaking Changes
* Remove the `experimental_avx512` feature. by @hildebrandmw in
#732
* Use VirtualStorageProvider::new_overlay(test_data_root()) in tests by
@Copilot in #726
* save and load max_record_size and leaf_page_size for bftrees by
@backurs in #724
* [multi-vector] Verify `Standard` won't overflow in its constructor. by
@hildebrandmw in #757
* VirtualStorageProvider: Make new() private, add new_physical by
@Copilot in #764
* [minmax] Refactor full query by @arkrishn94 in
#770
* Bump diskann-quantization to edition 2024. by @hildebrandmw in
#772

### Additions
* [multi-vector] Enable cloning of `Mat` and friends. by @hildebrandmw
in #759
* adding bftreepaths in mod.rs by @backurs in
#775
* [quantization] Add `as_raw_ptr`. by @hildebrandmw in
#774

### Bug Fixes
* Fix `diskann` compilation without default-features and add CI tests.
by @hildebrandmw in #722

### Docs and Comments
* Updating the benchmark README to use diskann-benchmark by @bryantower
in #709
* Fix doc comment: Windows line endings are \r\n not \n\r by @Copilot in
#717
* Fix spelling errors in streaming API documentation by @Copilot in
#715
* Add performance diagnostic to `diskann-benchmark` by @hildebrandmw in
#744
* Add agents.md onboarding guide for coding agents by @Copilot in
#765
* [doc] Fix lots of little typos in `diskann-wide` by @hildebrandmw in
#771

### Performance
* [diskann-wide] Optimize `load_simd_first` for 8-bit and 16-bit element
types. by @hildebrandmw in #747

### Dependencies
* Bump bytes from 1.11.0 to 1.11.1 by @dependabot[bot] in
#723
* [diskann] Add note on the selection of `PruneKind` in
`graph::config::Builder`. by @hildebrandmw in
#734
* [diskann-providers] Remove the LRU dependency and make `vfs` and
`serde_json` optional. by @hildebrandmw in
#733

### Infrastructure
* Add initial QEMU tests for `diskann-wide`. by @hildebrandmw in
#719
* [CI] Skip coverage for Dependabot. by @hildebrandmw in
#725
* Add miri test coverage to CI workflow by @Copilot in
#729
* [CI] Add minimal ARM checks by @hildebrandmw in
#745
* Enable CodeQL security analysis by @Copilot in
#754

## New Contributors
* @backurs made their first contribution in
#724
* @arkrishn94 made their first contribution in
#770

**Full Changelog**:
0.45.0...0.46.0

Labels

documentation Improvements or additions to documentation

4 participants