Skip to content

Conversation

@ajroetker
Copy link
Contributor

vfs is a new storage abstraction layer that decouples Scorch's filesystem operations from the core indexing logic. This enables Scorch to use different storage backends (filesystem, S3, etc.) without modification.

  • Directory interface abstracting filesystem operations
  • FSDirectory: Drop-in replacement using local filesystem
  • Backward compatible with existing Scorch indexes
  • Allows custom storage backend implementations
  • Gets closer to enabling cloud-native and distributed architectures

I've also created a demonstration of the efficacy of the abstraction using an s3 backed minio client with local caching if that's of interest.

`vfs` is a new storage abstraction layer that decouples Scorch's filesystem
operations from the core indexing logic. This enables Scorch to use different
storage backends (filesystem, S3, etc.) without modification.

- Directory interface abstracting filesystem operations
- FSDirectory: Drop-in replacement using local filesystem
- Backward compatible with existing Scorch indexes
- Allows custom storage backend implementations
- Gets closer to enabling cloud-native and distributed architectures
@abhinavdangeti
Copy link
Member

Thanks for the contribution @ajroetker . Pinging the team to weigh in on this.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces a Virtual File System (VFS) abstraction layer for the Scorch index implementation to support pluggable storage backends beyond the local filesystem.

  • Adds a Directory interface that abstracts filesystem operations (Open, Create, Remove, Rename, Stat, ReadDir, Lock, Unlock)
  • Implements FSDirectory as a filesystem-based implementation of the Directory interface
  • Integrates VFS into the Scorch index to use the abstraction layer instead of direct filesystem calls

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
index/scorch/vfs/directory.go Defines the Directory and FileInfo interfaces for abstracting filesystem operations
index/scorch/vfs/fs_directory.go Implements FSDirectory using the local filesystem with file locking via syscall.Flock
index/scorch/vfs/fs_directory_test.go Tests for FSDirectory covering basic operations, directory operations, locking, and concurrent reads
index/scorch/vfs/directory_test.go Test suite for Directory interface compliance testing
index/scorch/scorch.go Integrates VFS directory into Scorch, allowing custom directory implementations via config
index/scorch/persister.go Updates file operations to use VFS directory, fixes lock scope in removeOldZapFiles

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +232 to +244
t.Errorf("Failed to open file: %v", err)
return
}
defer r.Close()

data, err := io.ReadAll(r)
if err != nil {
t.Errorf("Failed to read file: %v", err)
return
}

if string(data) != string(testData) {
t.Errorf("Data mismatch in concurrent read")
Copy link

Copilot AI Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using t.Errorf from within a goroutine is not safe as testing.T methods are not goroutine-safe. Consider using a different mechanism to capture errors from goroutines, such as collecting errors in a channel or using t.Error with proper synchronization.

Copilot uses AI. Check for mistakes.
@ajroetker
Copy link
Contributor Author

I think after implementing more of the necessary migration, I'll need to transition the vfs into a different package abstraction, possibly bleve_index_api? So that zapx can use the VFS for opening segments with the vfs interface.

@ajroetker
Copy link
Contributor Author

@ajroetker
Copy link
Contributor Author

blevesearch/zapx#343

@ajroetker
Copy link
Contributor Author

@abhinavdangeti I've tested all these prs together locally and everything passes, not sure how y'all handle the cross repo stuff in ci but happy to change whatever.

@abhinavdangeti
Copy link
Member

@abhinavdangeti I've tested all these prs together locally and everything passes, not sure how y'all handle the cross repo stuff in ci but happy to change whatever.

You can update the go.mod entries for bleve_index_api and zapx/v16 to <your_branch> and run a go mod tidy. Commit in the go.mod, go.sum changes to this PR for ci to pass. When we decide to move ahead with this proposal, we'll merge the PRs in the necessary order and update go.mods again at the time.

@ajroetker
Copy link
Contributor Author

ajroetker commented Nov 11, 2025

@abhinavdangeti Awesome thank you! I've done that to the zapx and this PR!

I'll take a look at any issues as they come up.

AJ Roetker added 2 commits November 11, 2025 15:47
Complete the VFS abstraction work by implementing CopyTo for online index
backups and fixing a bug where merge cancellation wasn't checked between
tasks.

- Implement snapshot_index.go CopyTo to copy bolt + segments via VFS
- Add cancellation check in merge loop between tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants