Use VirtualStorageProvider::new_overlay(test_data_root()) in tests #726

Merged
arrayka merged 3 commits into main from copilot/update-unit-tests-to-use-virtualstorageprovider
Feb 10, 2026

Copilot AI commented Feb 6, 2026

Test files were using hardcoded workspace root paths (env!("CARGO_MANIFEST_DIR").parent().unwrap()) instead of the centralized test_data_root() helper from diskann-utils.

Changes

  • Standardized test data resolution: Replaced all hardcoded workspace_root patterns with test_data_root() across 17 test functions in 7 files
  • Fixed path prefixes: Removed /test_data/ prefix from test file paths since test_data_root() already resolves to the test_data directory
  • Updated dependencies: Added testing feature for diskann-utils in dev-dependencies

Before

let workspace_root = std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR"))
    .parent()
    .unwrap()
    .to_path_buf();
let storage_provider = VirtualStorageProvider::new_overlay(workspace_root);
let file = "/test_data/sift/siftsmall_learn.bin";

After

let storage_provider = VirtualStorageProvider::new_overlay(test_data_root());
let file = "/sift/siftsmall_learn.bin";
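
To see why dropping the /test_data/ prefix leaves the resolved file unchanged, here is a minimal sketch. It assumes test_data_root() resolves to <workspace_root>/test_data, as the description above states. The helpers resolve and assumed_test_data_root are hypothetical, written only for illustration, and are not part of diskann-utils or diskann-providers.

```rust
use std::path::{Path, PathBuf};

/// Resolve a virtual path (one starting with `/`) against an overlay root.
/// Hypothetical helper for illustration; not part of diskann-providers.
fn resolve(root: &Path, virtual_path: &str) -> PathBuf {
    // Strip the leading `/` so `join` appends to the root instead of replacing it.
    root.join(virtual_path.trim_start_matches('/'))
}

/// Hypothetical stand-in for `test_data_root()`, under the assumption that it
/// points at `<workspace_root>/test_data`.
fn assumed_test_data_root(workspace_root: &Path) -> PathBuf {
    workspace_root.join("test_data")
}
```

Under that assumption, resolving "/test_data/sift/siftsmall_learn.bin" against the workspace root and "/sift/siftsmall_learn.bin" against the test data root should name the same file, which is why the prefix had to be removed together with the root change.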

Files affected:

  • diskann-providers/src/index/diskann_async.rs
  • diskann-providers/src/storage/{pq_storage.rs, index_storage.rs}
  • diskann-providers/src/utils/{normalizing_util.rs, kmeans.rs}
  • diskann-providers/src/model/pq/{fixed_chunk_pq_table.rs, pq_construction.rs}
Original prompt

This section details the original issue you should resolve

<issue_title>Use VirtualStorageProvider::new_overlay(test_data()) in tests</issue_title>
<issue_description>Update the unit tests so that any hardcoded test_data in string paths and workspace_root usages are replaced with VirtualStorageProvider::new_overlay(test_data_root()).

Here is an example of expected pattern:

        let storage_provider = VirtualStorageProvider::new_overlay(test_data_root());
        let dataset_file = "/sift/siftsmall_learn.bin";
        let mut file = storage_provider.open_reader(dataset_file).unwrap();

Ensure that all unit tests still pass.

Original comment:

Something we should also do (maybe in another PR) is to replace all uses of test_data with the file system resolution in diskann-utils. I think a few fell through the cracks when test data was consolidated.

Originally posted by @hildebrandmw in #700</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 2 commits February 6, 2026 06:53
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Update unit tests to use VirtualStorageProvider for test data" to "Use VirtualStorageProvider::new_overlay(test_data_root()) in tests" Feb 6, 2026
Copilot AI requested a review from arrayka February 6, 2026 07:14
arrayka marked this pull request as ready for review February 6, 2026 07:29
arrayka requested review from a team and Copilot February 6, 2026 07:29
Copilot AI left a comment

Pull request overview

This PR standardizes test data resolution across the diskann-providers crate by replacing hardcoded workspace root path patterns with the centralized test_data_root() helper function from diskann-utils. The changes are part of a broader effort (following PR #700) to ensure tests use the virtual filesystem overlay pattern consistently and avoid direct filesystem writes during testing.

Changes:

  • Replaced std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR")).parent().unwrap() patterns with test_data_root() across 17 test functions
  • Updated test file path constants and variables to remove the /test_data/ prefix (since test_data_root() already points to the test_data directory)
  • Added diskann-utils with "testing" feature to dev-dependencies to access the test_data_root() function
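
As a rough illustration of the overlay idea mentioned in the overview, the sketch below assumes an overlay serves reads from a base directory on disk and keeps writes in memory, so tests never modify the checked-in test data. SketchOverlay and its methods are hypothetical and written only for this example; they are not the VirtualStorageProvider API, whose internals this PR does not show.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical overlay for illustration; NOT the VirtualStorageProvider implementation.
struct SketchOverlay {
    /// Directory on disk that is only ever read from.
    base: PathBuf,
    /// In-memory layer that receives every write.
    memory: HashMap<String, Vec<u8>>,
}

impl SketchOverlay {
    fn new(base: PathBuf) -> Self {
        Self { base, memory: HashMap::new() }
    }

    /// Writes land only in the in-memory layer; the base directory is untouched.
    fn write(&mut self, virtual_path: &str, bytes: &[u8]) {
        self.memory.insert(virtual_path.to_string(), bytes.to_vec());
    }

    /// Reads prefer the in-memory layer and fall back to the base directory.
    fn read(&self, virtual_path: &str) -> io::Result<Vec<u8>> {
        if let Some(bytes) = self.memory.get(virtual_path) {
            return Ok(bytes.clone());
        }
        // Strip the leading `/` so the virtual path is joined under the base.
        fs::read(self.base.join(virtual_path.trim_start_matches('/')))
    }
}
```

With this shape, a test can read a dataset such as "/sift/siftsmall_learn.bin" from the base directory and write intermediate outputs without leaving files behind on disk.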

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| diskann-providers/src/utils/normalizing_util.rs | Updated 1 test function to use test_data_root() and removed /test_data/ prefix from paths |
| diskann-providers/src/utils/kmeans.rs | Updated 1 test function to use test_data_root(), removed unused PathBuf import, and fixed path prefix |
| diskann-providers/src/storage/pq_storage.rs | Updated 4 test functions and 3 const path declarations to use test_data_root() with corrected path prefixes |
| diskann-providers/src/storage/index_storage.rs | Updated 1 test function to use test_data_root() and corrected file path |
| diskann-providers/src/model/pq/pq_construction.rs | Updated 4 test functions and const path declarations to use test_data_root() with corrected prefixes |
| diskann-providers/src/model/pq/fixed_chunk_pq_table.rs | Updated 3 test functions to use test_data_root() and removed /test_data/ prefix from paths |
| diskann-providers/src/index/diskann_async.rs | Updated 3 test functions and 1 const path declaration to use test_data_root() with corrected paths |
| diskann-providers/Cargo.toml | Added diskann-utils with "testing" feature to dev-dependencies |

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 89.00%. Comparing base (f3fcaae) to head (ddd0e45).

@@            Coverage Diff             @@
##             main     #726      +/-   ##
==========================================
- Coverage   89.01%   89.00%   -0.01%     
==========================================
  Files         428      428              
  Lines       78294    78234      -60     
==========================================
- Hits        69692    69632      -60     
  Misses       8602     8602              
| Files with missing lines | Coverage Δ |
| --- | --- |
| diskann-providers/src/index/diskann_async.rs | 96.35% <100.00%> (-0.02%) ⬇️ |
| ...ann-providers/src/model/pq/fixed_chunk_pq_table.rs | 95.15% <100.00%> (-0.07%) ⬇️ |
| diskann-providers/src/model/pq/pq_construction.rs | 92.53% <100.00%> (-0.07%) ⬇️ |
| diskann-providers/src/storage/index_storage.rs | 99.71% <100.00%> (-0.01%) ⬇️ |
| diskann-providers/src/storage/pq_storage.rs | 88.23% <100.00%> (-0.47%) ⬇️ |
| diskann-providers/src/utils/kmeans.rs | 96.16% <100.00%> (-0.03%) ⬇️ |
| diskann-providers/src/utils/normalizing_util.rs | 72.17% <100.00%> (-0.94%) ⬇️ |

... and 2 files with indirect coverage changes

arkrishn94 left a comment

LGTM

arrayka merged commit 3674a49 into main Feb 10, 2026
26 checks passed
arrayka deleted the copilot/update-unit-tests-to-use-virtualstorageprovider branch February 10, 2026 16:19
hildebrandmw added a commit that referenced this pull request Feb 13, 2026
## What's Changed

### API Breaking Changes
* Remove the `experimental_avx512` feature. by @hildebrandmw in
#732
* Use VirtualStorageProvider::new_overlay(test_data_root()) in tests by
@Copilot in #726
* save and load max_record_size and leaf_page_size for bftrees by
@backurs in #724
* [multi-vector] Verify `Standard` won't overflow in its constructor. by
@hildebrandmw in #757
* VirtualStorageProvider: Make new() private, add new_physical by
@Copilot in #764
* [minmax] Refactor full query by @arkrishn94 in
#770
* Bump diskann-quantization to edition 2024. by @hildebrandmw in
#772

### Additions
* [multi-vector] Enable cloning of `Mat` and friends. by @hildebrandmw
in #759
* adding bftreepaths in mod.rs by @backurs in
#775
* [quantization] Add `as_raw_ptr`. by @hildebrandmw in
#774

### Bug Fixes
* Fix `diskann` compilation without default-features and add CI tests.
by @hildebrandmw in #722

### Docs and Comments
* Updating the benchmark README to use diskann-benchmark by @bryantower
in #709
* Fix doc comment: Windows line endings are \r\n not \n\r by @Copilot in
#717
* Fix spelling errors in streaming API documentation by @Copilot in
#715
* Add performance diagnostic to `diskann-benchmark` by @hildebrandmw in
#744
* Add agents.md onboarding guide for coding agents by @Copilot in
#765
* [doc] Fix lots of little typos in `diskann-wide` by @hildebrandmw in
#771

### Performance
* [diskann-wide] Optimize `load_simd_first` for 8-bit and 16-bit element
types. by @hildebrandmw in #747

### Dependencies
* Bump bytes from 1.11.0 to 1.11.1 by @dependabot[bot] in
#723
* [diskann] Add note on the selection of `PruneKind` in
`graph::config::Builder`. by @hildebrandmw in
#734
* [diskann-providers] Remove the LRU dependency and make `vfs` and
`serde_json` optional. by @hildebrandmw in
#733

### Infrastructure
* Add initial QEMU tests for `diskann-wide`. by @hildebrandmw in
#719
* [CI] Skip coverage for Dependabot. by @hildebrandmw in
#725
* Add miri test coverage to CI workflow by @Copilot in
#729
* [CI] Add minimal ARM checks by @hildebrandmw in
#745
* Enable CodeQL security analysis by @Copilot in
#754

## New Contributors
* @backurs made their first contribution in
#724
* @arkrishn94 made their first contribution in
#770

**Full Changelog**:
0.45.0...0.46.0

Development

Successfully merging this pull request may close these issues.

  • Move test_data_root() to diskann-providers\src\storage\virtual_storage_provider.rs
  • Use VirtualStorageProvider::new_overlay(test_data()) in tests

5 participants