Skip to content

Comments

Enforce no real filesystem writes during unit tests#700

Merged
arrayka merged 6 commits intomainfrom
u/arrayka/remove_physicalfs_from_tests
Jan 23, 2026
Merged

Enforce no real filesystem writes during unit tests#700
arrayka merged 6 commits intomainfrom
u/arrayka/remove_physicalfs_from_tests

Conversation

@arrayka
Copy link
Contributor

@arrayka arrayka commented Jan 23, 2026

Why

  • To avoid writes to the real filesystem in unit tests.

What

  • Removed PhysicalFS references from tests and replaced with VirtualStorageProvider::new_overlay()
  • Added vfs::PhysicalFS::new to the list of disallowed methods in clippy.toml

@arrayka arrayka requested a review from hildebrandmw January 23, 2026 08:42
@arrayka arrayka changed the title Remove PhysicalFS references from tests and replace with VirtualStorageProvider::new_overlay() Enforce no real filesystem writes during unit tests Jan 23, 2026
Copy link
Contributor

Copilot AI commented Jan 23, 2026

@arrayka I've opened a new pull request, #701, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 3 commits January 23, 2026 01:49
* Initial plan

* Address PR feedback: revert comments and refactor test helper functions

Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>

* Revert test_index_path() to return String and remove .to_string_lossy()

Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>

* Remove test_index_os_path() and use test_index_path() instead

Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>

* Revert commit 90934ad

Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
pub fn benchmark_copy_aligned_data(c: &mut Criterion) {
let tmp_dir = TempDir::with_prefix(BENCHMARK_ID).expect("Failed to create temporary directory");
// Use physical file system rather than memory for testing the actual disk read/write
#[allow(clippy::disallowed_methods)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use

#[expect(clippy::disallowed_methods, "reason")]

to do two things:

  1. expect will complain if disallowed method is not actually used (helping with maintanence)
  2. The following reason helps document why an exception is made.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

pub fn benchmark_copy_aligned_data_iai() {
let tmp_dir = TempDir::with_prefix(BENCHMARK_ID).expect("Failed to create temporary directory");
// Use physical file system rather than memory for testing the actual disk read/write
#[allow(clippy::disallowed_methods)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here with expect instead of allow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@arrayka arrayka merged commit b7d9be9 into main Jan 23, 2026
8 checks passed
@arrayka arrayka deleted the u/arrayka/remove_physicalfs_from_tests branch January 23, 2026 22:48
arrayka added a commit that referenced this pull request Feb 10, 2026
)

Test files were using hardcoded workspace root paths
(`env!("CARGO_MANIFEST_DIR").parent().unwrap()`) instead of the
centralized `test_data_root()` helper from diskann-utils.

## Changes

- **Standardized test data resolution**: Replaced all hardcoded
workspace_root patterns with `test_data_root()` across 17 test functions
in 7 files
- **Fixed path prefixes**: Removed `/test_data/` prefix from test file
paths since `test_data_root()` already resolves to the test_data
directory
- **Updated dependencies**: Added `testing` feature for diskann-utils in
dev-dependencies

### Before
```rust
let workspace_root = std::path::PathBuf::from(env!("CARGO_MANIFEST_DIR"))
    .parent()
    .unwrap()
    .to_path_buf();
let storage_provider = VirtualStorageProvider::new_overlay(workspace_root);
let file = "/test_data/sift/siftsmall_learn.bin";
```

### After
```rust
let storage_provider = VirtualStorageProvider::new_overlay(test_data_root());
let file = "/sift/siftsmall_learn.bin";
```

Files affected:
- `diskann-providers/src/index/diskann_async.rs`
- `diskann-providers/src/storage/{pq_storage.rs, index_storage.rs}`
- `diskann-providers/src/utils/{normalizing_util.rs, kmeans.rs}`
- `diskann-providers/src/model/pq/{fixed_chunk_pq_table.rs,
pq_construction.rs}`

<!-- START COPILOT ORIGINAL PROMPT -->



<details>

<summary>Original prompt</summary>

> 
> ----
> 
> *This section details on the original issue you should resolve*
> 
> <issue_title>Use VirtualStorageProvider::new_overlay(test_data()) in
tests</issue_title>
> <issue_description>Update the unit tests so that any hardcoded
test_data in string paths and workspace_root usages are replaced with
`VirtualStorageProvider::new_overlay(test_data_root())`.
> 
> Here is an example of expected pattern:
> ```
> let storage_provider =
VirtualStorageProvider::new_overlay(test_data_root());
>         let dataset_file = "/sift/siftsmall_learn.bin";
> let mut file = storage_provider.open_reader(dataset_file).unwrap();
> ```
> Ensure that all unit tests still pass.
> 
> Original comment:
> > Something we should also do (maybe in another PR) is to replace all
uses of `test_data` with the file system resolution in `diskann-utils`.
I think a few fell through the cracks when test data was consolidated.
> 
> _Originally posted by @hildebrandmw in
[#700](https://github.com/microsoft/DiskANN/pull/700/changes#r2722426616)_</issue_description>
> 
> ## Comments on the Issue (you are @copilot in this section)
> 
> <comments>
> </comments>
> 


</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes #704

<!-- START COPILOT CODING AGENT TIPS -->
---

💡 You can make Copilot smarter by setting up custom instructions,
customizing its development environment and configuring Model Context
Protocol (MCP) servers. Learn more [Copilot coding agent
tips](https://gh.io/copilot-coding-agent-tips) in the docs.

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
arrayka added a commit that referenced this pull request Feb 11, 2026
## Completed Changes

- [x] Made `VirtualStorageProvider::new()` fully private (not just
doc-hidden)
- [x] Updated all unit tests in virtual_storage_provider.rs to use
`new_memory()`
- [x] Updated all usages across diskann-providers, diskann-disk, and
diskann-tools to use specific constructors
- [x] Added `VirtualStorageProvider::new_physical` to clippy.toml
disallowed-methods list
- [x] Removed all unused imports (MemoryFS, OverlayFS)
- [x] Made test helper functions generic over filesystem type where
needed
- [x] **Fixed doc tests:**
  - Added missing `use std::io::Write;` import to doc examples
- Removed doc examples that write to physical filesystem (new_physical
and new_overlay)
  - Simplified to only show `new_memory()` usage
  - Removed unnecessary turbofish notation in test code
- [x] Verified all checks pass:
  - ✅ cargo fmt --all --check
  - ✅ cargo clippy --locked --workspace --all-targets --no-deps
  - ✅ cargo test --doc --workspace
  - ✅ All unit tests pass

All PR review feedback has been addressed.

<!-- START COPILOT ORIGINAL PROMPT -->



<details>

<summary>Original prompt</summary>

> 
> ----
> 
> *This section details on the original issue you should resolve*
> 
> <issue_title>VirtualStorageProvider: encourage use of `new_overlay` or
`new_memory`</issue_title>
> <issue_description>Make `VirtualStorageProvider::new` private.
> Also add VirtualStorageProvider<PhysicalFS>::new_physical<P:
AsRef<std::path::Path>>(path: P).
> Expose only new_overlay(), new_memory(), new_physical().
> 
> Examples like these should be replaced with
VirtualStorageProvider::new_memory():
>         let fs = OverlayFS::new(&[MemoryFS::default().into()]);
>         let storage_provider = VirtualStorageProvider::new(fs);
> 
> Original post:
> > Do you see any value in making `VirtualStorageProvider::new` much
harder to call to encourage use of `new_overlay` or `new_memory`? Maybe
with like a `new_physical` and then linting that with clippy?
> 
> _Originally posted by @hildebrandmw in
[#700](https://github.com/microsoft/DiskANN/pull/700/changes#r2722436904)_</issue_description>
> 
> ## Comments on the Issue (you are @copilot in this section)
> 
> <comments>
> </comments>
> 


</details>



<!-- START COPILOT CODING AGENT SUFFIX -->

- Fixes #703

<!-- START COPILOT CODING AGENT TIPS -->
---

💬 We'd love your input! Share your thoughts on Copilot coding agent in
our [2 minute survey](https://gh.io/copilot-coding-agent-survey).

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: arrayka <1551741+arrayka@users.noreply.github.com>
Co-authored-by: arrayka <arrayka@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants