Adding GraphParams to be able to save graph parameters of index to SavedParams #786
… vector type to SavedParams
Pull request overview
This PR adds a `GraphParams` struct that persists graph configuration parameters (`l_build`, `alpha`, `backedge_ratio`, `vector_dtype`) alongside the `BfTreeProvider` index. The parameters are saved and loaded as part of the index's `SavedParams`, so the `DiskANNIndex` configuration can be reconstructed on load.
Changes:
- Introduced `GraphParams` struct with fields for `l_build`, `alpha`, `backedge_ratio`, and `vector_dtype`
- Added optional `graph_params` field to `BfTreeProvider`, `BfTreeProviderParameters`, and `SavedParams`
- Updated save/load implementations to persist and restore `graph_params`
- Updated all tests and documentation examples to set `graph_params: None`
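To make the shape of this change concrete, here is a minimal sketch of how an optional `GraphParams` could travel inside `SavedParams`. The field types (`usize`, `f32`, `String`), the `dim` field, and the semicolon-separated text encoding are assumptions for illustration only; the PR's actual types and serialization format are not shown in this conversation.

```rust
/// Hypothetical mirror of the PR's `GraphParams`; the field types are assumed.
#[derive(Debug, Clone, PartialEq)]
pub struct GraphParams {
    pub l_build: usize,
    pub alpha: f32,
    pub backedge_ratio: f32,
    pub vector_dtype: String,
}

/// Hypothetical, trimmed-down `SavedParams`; `dim` stands in for the existing fields.
#[derive(Debug, Clone, PartialEq)]
pub struct SavedParams {
    pub dim: usize,
    pub graph_params: Option<GraphParams>,
}

impl SavedParams {
    /// Encode as one line; the graph section is omitted when `graph_params` is `None`.
    pub fn encode(&self) -> String {
        match &self.graph_params {
            None => format!("{}", self.dim),
            Some(g) => format!(
                "{};{};{};{};{}",
                self.dim, g.l_build, g.alpha, g.backedge_ratio, g.vector_dtype
            ),
        }
    }

    /// Decode a line written by `encode`; returns `None` on malformed input.
    pub fn decode(line: &str) -> Option<Self> {
        let parts: Vec<&str> = line.split(';').collect();
        let dim = parts.first()?.parse().ok()?;
        let graph_params = match parts.len() {
            // A record without graph parameters still loads.
            1 => None,
            5 => Some(GraphParams {
                l_build: parts[1].parse().ok()?,
                alpha: parts[2].parse().ok()?,
                backedge_ratio: parts[3].parse().ok()?,
                vector_dtype: parts[4].to_string(),
            }),
            _ => return None,
        };
        Some(SavedParams { dim, graph_params })
    }
}
```

Keeping the field an `Option` means a record written without graph parameters decodes to `graph_params: None`, which matches how the updated tests and doc examples set the field.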
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| diskann-providers/src/model/graph/provider/async_/bf_tree/provider.rs | Added GraphParams struct, integrated graph_params field into BfTreeProvider/BfTreeProviderParameters/SavedParams, updated save/load logic, updated all doc examples and tests |
| diskann-providers/src/model/graph/provider/async_/bf_tree/mod.rs | Exported GraphParams in public API |
Codecov Report
✅ All modified and coverable lines are covered by tests.

| Coverage Diff | main | #786 | +/- |
|---|---|---|---|
| Coverage | 89.00% | 89.00% | -0.01% |
| Files | 428 | 431 | +3 |
| Lines | 78417 | 78455 | +38 |
| Hits | 69795 | 69828 | +33 |
| Misses | 8622 | 8627 | +5 |
# DiskANN v0.47.0
## Summary
* This version contains a major breaking change to the search interface
of `DiskANNIndex`. Please read the upgrade instructions below.
* Aarch64 Neon support has been added to `diskann-wide`.
* Various bug-fixes and code-quality improvements.
## Changes to Search
The search interface has been unified around a single `index.search()`
entry point using the `Search` trait.
The old per-search-type methods on `DiskANNIndex` (`search`,
`search_recorded`, `range_search`, `multihop_search`) have been removed
and replaced by typed parameter structs that carry their own search
logic.
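The dispatch pattern described above can be sketched in a few lines. This is a simplified, synchronous toy and not the library's real `Search` trait: `ToyIndex`, `ToyKnn`, `ToyRange`, and the `run` method are hypothetical names, and the real interface also takes a strategy, a context, and an output buffer.

```rust
/// Toy index over 1-D points; a hypothetical stand-in for `DiskANNIndex`.
pub struct ToyIndex {
    pub points: Vec<f32>,
}

/// Each search kind implements this trait and carries its own logic.
pub trait Search {
    type Output;
    fn run(self, index: &ToyIndex, query: f32) -> Self::Output;
}

impl ToyIndex {
    /// Single entry point: the parameter struct comes first and selects the algorithm.
    pub fn search<S: Search>(&self, params: S, query: f32) -> S::Output {
        params.run(self, query)
    }
}

/// k-nearest-neighbour parameters (hypothetical, simplified).
pub struct ToyKnn {
    pub k: usize,
}

impl Search for ToyKnn {
    type Output = Vec<usize>;
    fn run(self, index: &ToyIndex, query: f32) -> Vec<usize> {
        // Brute force: sort all ids by distance to the query, keep the first k.
        let mut ids: Vec<usize> = (0..index.points.len()).collect();
        ids.sort_by(|&a, &b| {
            let da = (index.points[a] - query).abs();
            let db = (index.points[b] - query).abs();
            da.total_cmp(&db)
        });
        ids.truncate(self.k);
        ids
    }
}

/// Range parameters: every point within `radius` of the query.
pub struct ToyRange {
    pub radius: f32,
}

impl Search for ToyRange {
    type Output = Vec<usize>;
    fn run(self, index: &ToyIndex, query: f32) -> Vec<usize> {
        (0..index.points.len())
            .filter(|&i| (index.points[i] - query).abs() <= self.radius)
            .collect()
    }
}
```

With this shape, `index.search(ToyKnn { k: 2 }, q)` and `index.search(ToyRange { radius: 1.5 }, q)` share one method, and a new search kind only needs a new `Search` implementation rather than a new method on the index.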
### What Changed
| Removed | Replacement |
|---|---|
| `SearchParams` struct | `diskann::graph::search::Knn` |
| `RangeSearchParams` struct | `diskann::graph::search::Range` |
| `SearchParamsError` | `diskann::graph::KnnSearchError` |
| `RangeSearchParamsError` | `diskann::graph::RangeSearchError` |
| `index.search(&strategy, &ctx, &query, &params, &mut out)` | `index.search(knn, &strategy, &ctx, &query, &mut out)` |
| `index.search_recorded(..., &mut recorder)` | `index.search(RecordedKnn::new(knn, &mut recorder), ...)` |
| `index.range_search(&strategy, &ctx, &query, &params)` | `index.search(range, &strategy, &ctx, &query, &mut ())` |
| `index.multihop_search(..., &label_eval)` | `index.search(MultihopSearch::new(knn, &label_eval), ...)` |
| `index.diverse_search(...)` | `index.search(Diverse::new(knn, diverse_params), ...)` |
**`flat_search`** remains an inherent method on `DiskANNIndex`.
Its `search_params` argument changed from `&SearchParams` to `&Knn`.
### Upgrade Instructions
#### 1. k-NN Search (`search`)
**Before:**
```rust
use diskann::graph::SearchParams;
let params = SearchParams::new(10, 100, None)?;
let stats = index.search(&strategy, &ctx, &query, &params, &mut output).await?;
```
**After:**
```rust
use diskann::graph::{Search, search::Knn};
let params = Knn::new(10, 100, None)?;
// Note: params is now the FIRST argument (moved before strategy)
let stats = index.search(params, &strategy, &ctx, &query, &mut output).await?;
```
Key differences:
- `SearchParams` -> `Knn` (import from `diskann::graph::search::Knn`)
- `SearchParamsError` -> `KnnSearchError` (import from
`diskann::graph::KnnSearchError`)
- Search params moved to the **first** argument of `index.search()`
- `k_value`, `l_value` fields are now private; use `.k_value()`,
`.l_value()` accessors (return `NonZeroUsize`)
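For call sites that used to read the public fields directly, the following sketch shows the private-field-plus-accessor shape. `KnnLike` and `KnnLikeError` are hypothetical stand-ins; the validation rules (non-zero values, `k <= l`) are assumptions, and the third constructor argument of the real `Knn::new` is left out because these notes do not describe it.

```rust
use std::num::NonZeroUsize;

/// Hypothetical error type; the real `KnnSearchError` variants are not listed here.
#[derive(Debug, PartialEq)]
pub enum KnnLikeError {
    Zero,
    KExceedsL { k: usize, l: usize },
}

/// Hypothetical stand-in for `Knn`: fields are private and validated once.
#[derive(Debug, Clone, Copy)]
pub struct KnnLike {
    k_value: NonZeroUsize,
    l_value: NonZeroUsize,
}

impl KnnLike {
    /// Assumed rules: both values must be non-zero and `k` must not exceed `l`.
    pub fn new(k: usize, l: usize) -> Result<Self, KnnLikeError> {
        let k_value = NonZeroUsize::new(k).ok_or(KnnLikeError::Zero)?;
        let l_value = NonZeroUsize::new(l).ok_or(KnnLikeError::Zero)?;
        if k > l {
            return Err(KnnLikeError::KExceedsL { k, l });
        }
        Ok(Self { k_value, l_value })
    }

    /// Accessor replacing the former public `k_value` field.
    pub fn k_value(&self) -> NonZeroUsize {
        self.k_value
    }

    /// Accessor replacing the former public `l_value` field.
    pub fn l_value(&self) -> NonZeroUsize {
        self.l_value
    }
}
```

Code that previously needed a plain `usize` from `params.k_value` would now call `params.k_value().get()`.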
#### 2. Recorded/Debug Search (`search_recorded`)
**Before:**
```rust
use diskann::graph::SearchParams;
let params = SearchParams::new(10, 100, None)?;
let stats = index
    .search_recorded(&strategy, &ctx, &query, &params, &mut output, &mut recorder)
    .await?;
```
**After:**
```rust
use diskann::graph::{Search, search::{Knn, RecordedKnn}};
let params = Knn::new(10, 100, None)?;
let recorded = RecordedKnn::new(params, &mut recorder);
let stats = index.search(recorded, &strategy, &ctx, &query, &mut output).await?;
```
#### 3. Range Search (`range_search`)
**Before:**
```rust
use diskann::graph::RangeSearchParams;
let params = RangeSearchParams::new(None, 100, None, 0.5, None, 1.0, 1.0)?;
let (stats, ids, distances) = index
    .range_search(&strategy, &ctx, &query, &params)
    .await?;
```
**After:**
```rust
use diskann::graph::{
    Search,
    search::Range,
    RangeSearchOutput,
};
// Simple form:
let params = Range::new(100, 0.5)?;
// Or full options form:
let params = Range::with_options(None, 100, None, 0.5, None, 1.0, 1.0)?;
// Note: output buffer is `&mut ()` — results come back in the return type
let result: RangeSearchOutput<_> = index
    .search(params, &strategy, &ctx, &query, &mut ())
    .await?;
// Access results:
let stats = result.stats;
let ids = result.ids; // Vec<O>
let distances = result.distances; // Vec<f32>
```
Key differences:
- `RangeSearchParams` -> `Range` (import from
`diskann::graph::search::Range`)
- `RangeSearchParamsError` -> `RangeSearchError` (import from
`diskann::graph::RangeSearchError`)
- Return type changed from `(SearchStats, Vec<O>, Vec<f32>)` to
`RangeSearchOutput<O>` (a struct with `.stats`, `.ids`, `.distances`
fields)
- Pass `&mut ()` as the output buffer
- Field `starting_l_value` -> constructor arg `starting_l` (accessor:
`.starting_l()`)
- Field `initial_search_slack` -> constructor arg `initial_slack`
(accessor: `.initial_slack()`)
- Field `range_search_slack` -> constructor arg `range_slack` (accessor:
`.range_slack()`)
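If many call sites still destructure the old tuple, a small adapter can ease the migration. The sketch below mirrors the struct shape described above (`.stats`, `.ids`, `.distances`) with locally defined types; `SearchStats` here is a placeholder with an invented `cmps` field, and `into_legacy_tuple` is a hypothetical helper, not part of the library.

```rust
/// Placeholder for the stats type; the `cmps` field is invented for this sketch.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct SearchStats {
    pub cmps: u32,
}

/// Local mirror of the described output struct, generic over the id type `O`.
#[derive(Debug, Clone, PartialEq)]
pub struct RangeSearchOutput<O> {
    pub stats: SearchStats,
    pub ids: Vec<O>,
    pub distances: Vec<f32>,
}

/// Hypothetical adapter for call sites that still expect the old tuple.
pub fn into_legacy_tuple<O>(out: RangeSearchOutput<O>) -> (SearchStats, Vec<O>, Vec<f32>) {
    // Struct destructuring moves each field out without cloning.
    let RangeSearchOutput { stats, ids, distances } = out;
    (stats, ids, distances)
}
```

New code can skip the adapter and destructure the struct directly with `let RangeSearchOutput { stats, ids, distances } = result;`.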
#### 4. Multihop / Label-Filtered Search (`multihop_search`)
**Before:**
```rust
use diskann::graph::SearchParams;
let params = SearchParams::new(10, 100, None)?;
let stats = index
    .multihop_search(&strategy, &ctx, &query, &params, &mut output, &label_eval)
    .await?;
```
**After:**
```rust
use diskann::graph::{Search, search::{Knn, MultihopSearch}};
let knn = Knn::new(10, 100, None)?;
let params = MultihopSearch::new(knn, &label_eval);
let stats = index.search(params, &strategy, &ctx, &query, &mut output).await?;
```
Key differences:
- `MultihopSearch` wraps a `Knn` and a label evaluator into a single params
object
- The label evaluator is part of the params, not a separate argument
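The wrapping idea can be illustrated with a toy parameter object that borrows its label predicate. `BaseKnn` and `FilteredKnn` are hypothetical names, and the `run` method is a brute-force filter-then-sort stand-in, not the multihop graph traversal the library performs.

```rust
/// Hypothetical base parameters.
#[derive(Clone, Copy)]
pub struct BaseKnn {
    pub k: usize,
}

/// Wrapper that bundles base parameters with a borrowed label predicate.
pub struct FilteredKnn<'a, F: Fn(usize) -> bool> {
    knn: BaseKnn,
    label_eval: &'a F,
}

impl<'a, F: Fn(usize) -> bool> FilteredKnn<'a, F> {
    /// Same argument order as the `MultihopSearch::new(knn, &label_eval)` call above.
    pub fn new(knn: BaseKnn, label_eval: &'a F) -> Self {
        Self { knn, label_eval }
    }

    /// Brute-force stand-in: the nearest `k` ids whose label predicate passes.
    pub fn run(&self, points: &[f32], query: f32) -> Vec<usize> {
        let mut ids: Vec<usize> = (0..points.len())
            .filter(|&i| (self.label_eval)(i))
            .collect();
        ids.sort_by(|&a, &b| {
            (points[a] - query).abs().total_cmp(&(points[b] - query).abs())
        });
        ids.truncate(self.knn.k);
        ids
    }
}
```

Because the predicate lives inside the params object, the search call itself keeps the same argument list as plain k-NN search.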
#### 5. Flat Search (unchanged method, new param type)
**Before:**
```rust
use diskann::graph::SearchParams;
let params = SearchParams::new(10, 100, None)?;
index.flat_search(&strategy, &ctx, &query, &filter, &params, &mut output).await?;
```
**After:**
```rust
use diskann::graph::search::Knn;
let params = Knn::new(10, 100, None)?;
index.flat_search(&strategy, &ctx, &query, &filter, &params, &mut output).await?;
```
Only the parameter type changed (`SearchParams` -> `Knn`).
### Import Path Changes
| Old | New |
|---|---|
| `diskann::graph::SearchParams` | `diskann::graph::search::Knn` |
| `diskann::graph::RangeSearchParams` | `diskann::graph::search::Range` |
| `diskann::graph::SearchParamsError` | `diskann::graph::KnnSearchError` |
| `diskann::graph::RangeSearchParamsError` | `diskann::graph::RangeSearchError` |
| (none) | `diskann::graph::search::MultihopSearch` (new) |
| (none) | `diskann::graph::search::RecordedKnn` (new) |
| (none) | `diskann::graph::search::Diverse` (new, feature-gated) |
| (none) | `diskann::graph::Search` (trait, re-exported) |
| (none) | `diskann::graph::RangeSearchOutput` (re-exported) |
## Change List
* copy bftrees from the snapshot location to the save location by @backurs in #783
* (RFC) Refactor search interface with unified SearchDispatch trait by @narendatha in #773
* Make queue.closest_notvisited() safe and update call sites by @arrayka in #787
* git ignore: Ignore local settings for claude code AI agent by @arrayka in #789
* Enabling flag support in codecov by @arrayka in #790
* Increase unit test coverage for diskann-tools crate by @Copilot in #763
* Neon MVP by @hildebrandmw in #777
* Adding GraphParams to be able to save graph parameters of index to SavedParams by @backurs in #786
## New Contributors
* @narendatha made their first contribution in #773

**Full Changelog**: 0.46.0...v0.47.0
This PR addresses the following issue:
We want to save `alpha`, `l_build`, `backedge_ratio` and `vector_dtype` somewhere, and the best place to do it (in my opinion) is `SavedParams`.
For that we need to store `GraphParams` in `BfTreeProviderParameters` and in `BfTreeProvider`, which is what this PR does.