-
Notifications
You must be signed in to change notification settings - Fork 698
(feat) add vfs: storage abstraction layer for Scorch
#2237
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
(feat) add vfs: storage abstraction layer for Scorch
#2237
Conversation
`vfs` is a new storage abstraction layer that decouples Scorch's filesystem operations from the core indexing logic. This enables Scorch to use different storage backends (filesystem, S3, etc.) without modification. - Directory interface abstracting filesystem operations - FSDirectory: Drop-in replacement using local filesystem - Backward compatible with existing Scorch indexes - Allows custom storage backend implementations - Gets closer to enabling cloud-native and distributed architectures
|
Thanks for the contribution @ajroetker . Pinging the team to weigh in on this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a Virtual File System (VFS) abstraction layer for the Scorch index implementation to support pluggable storage backends beyond the local filesystem.
- Adds a
Directoryinterface that abstracts filesystem operations (Open, Create, Remove, Rename, Stat, ReadDir, Lock, Unlock) - Implements
FSDirectoryas a filesystem-based implementation of the Directory interface - Integrates VFS into the Scorch index to use the abstraction layer instead of direct filesystem calls
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 13 comments.
Show a summary per file
| File | Description |
|---|---|
| index/scorch/vfs/directory.go | Defines the Directory and FileInfo interfaces for abstracting filesystem operations |
| index/scorch/vfs/fs_directory.go | Implements FSDirectory using the local filesystem with file locking via syscall.Flock |
| index/scorch/vfs/fs_directory_test.go | Tests for FSDirectory covering basic operations, directory operations, locking, and concurrent reads |
| index/scorch/vfs/directory_test.go | Test suite for Directory interface compliance testing |
| index/scorch/scorch.go | Integrates VFS directory into Scorch, allowing custom directory implementations via config |
| index/scorch/persister.go | Updates file operations to use VFS directory, fixes lock scope in removeOldZapFiles |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| t.Errorf("Failed to open file: %v", err) | ||
| return | ||
| } | ||
| defer r.Close() | ||
|
|
||
| data, err := io.ReadAll(r) | ||
| if err != nil { | ||
| t.Errorf("Failed to read file: %v", err) | ||
| return | ||
| } | ||
|
|
||
| if string(data) != string(testData) { | ||
| t.Errorf("Data mismatch in concurrent read") |
Copilot
AI
Nov 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using t.Errorf from within a goroutine is not safe as testing.T methods are not goroutine-safe. Consider using a different mechanism to capture errors from goroutines, such as collecting errors in a channel or using t.Error with proper synchronization.
|
I think after implementing more of the necessary migration, I'll need to transition the vfs into a different package abstraction, possibly bleve_index_api? So that zapx can use the VFS for opening segments with the vfs interface. |
|
@abhinavdangeti I've tested all these prs together locally and everything passes, not sure how y'all handle the cross repo stuff in ci but happy to change whatever. |
You can update the go.mod entries for bleve_index_api and zapx/v16 to |
|
@abhinavdangeti Awesome thank you! I've done that to the zapx and this PR! I'll take a look at any issues as they come up. |
Complete the VFS abstraction work by implementing CopyTo for online index backups and fixing a bug where merge cancellation wasn't checked between tasks. - Implement snapshot_index.go CopyTo to copy bolt + segments via VFS - Add cancellation check in merge loop between tasks
vfsis a new storage abstraction layer that decouples Scorch's filesystem operations from the core indexing logic. This enables Scorch to use different storage backends (filesystem, S3, etc.) without modification.I've also created a demonstration of the efficacy of the abstraction using an s3 backed minio client with local caching if that's of interest.