Introduce a caching mechanism for files in Searchable Snapshot Directory #49934
Note: this draft pull request targets the `feature/searchable-snapshots` branch.

This pull request introduces a simple caching mechanism that operates at the Lucene file level of searchable snapshot directories.
Several new classes are introduced or changed since #49651:

The searchable snapshot directory (`SearchableSnapshotDirectory`) now contains a representation of the snapshotted shard files (`SearchableSnapshotShard`), which allows listing the files of a specific snapshot or reading one of them.

A basic implementation of a searchable snapshot shard is `BlobStoreSearchableSnapshotShard`, which directly accesses a remote blob store repository to list or read files. This implementation takes care of converting the names of Lucene files into blob names in the repository and of loading the appropriate chunks of blobs (the implementation is still very raw and error-prone and must be consolidated).

Another implementation of a searchable snapshot shard is `CachedSearchableSnapshotShard`, which caches segments (or portions) of files using a `CacheService`. This cache service uses the existing LRU `org.elasticsearch.common.cache.Cache` to cache file segments in memory. This cache is also very raw and should evolve into something more complex that caches segments of files on disk. The `CachedSearchableSnapshotShard` acts as a `FilterSearchableSnapshotShard`, so it delegates the listing or reading of files to another searchable snapshot shard when the file segment to read is not present in the cache (i.e., a cache miss). When the file segment requested by the searchable snapshot directory's index input is present in the cache, it is served directly.

Finally, this pull request reuses the tests added in #49651 to test the searchable snapshot directory implementation, randomly using the cache or not.
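
For readers unfamiliar with the filter-and-delegate pattern described above, here is a minimal, self-contained sketch of it. It is not the code in this pull request: `SegmentReader`, `InMemoryReader` and `CachingReader` are hypothetical names, the read signature is an assumption, and a `LinkedHashMap` in access order stands in for `org.elasticsearch.common.cache.Cache`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a shard that can read a byte range of a file. */
interface SegmentReader {
    byte[] read(String fileName, long position, int length);
}

/** Hypothetical direct reader over in-memory "blobs"; counts how often it is hit. */
final class InMemoryReader implements SegmentReader {
    private final Map<String, byte[]> files;
    int reads = 0;

    InMemoryReader(Map<String, byte[]> files) {
        this.files = files;
    }

    @Override
    public byte[] read(String fileName, long position, int length) {
        reads++;
        byte[] content = files.get(fileName);
        return Arrays.copyOfRange(content, (int) position, (int) position + length);
    }
}

/** Filter that serves file segments from an LRU cache and delegates on a cache miss. */
final class CachingReader implements SegmentReader {
    private final SegmentReader delegate;
    private final Map<String, byte[]> cache;

    CachingReader(SegmentReader delegate, int maxEntries) {
        this.delegate = delegate;
        // accessOrder=true keeps the least recently accessed entry first,
        // so removeEldestEntry evicts in LRU order once the cache is full.
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    @Override
    public synchronized byte[] read(String fileName, long position, int length) {
        // Assumed cache key: file name plus the exact byte range requested.
        String key = fileName + '@' + position + '+' + length;
        byte[] segment = cache.get(key);
        if (segment == null) {
            // Cache miss: fetch the segment from the wrapped reader and remember it.
            segment = delegate.read(fileName, position, length);
            cache.put(key, segment);
        }
        // Return a copy so callers cannot mutate the cached bytes.
        return segment.clone();
    }
}
```

Wrapping a direct reader in `CachingReader` means a repeated read of the same range never reaches the delegate until that entry is evicted. Keying on the exact range is the simplest choice for a sketch; a real cache would more likely key on fixed-size, aligned segments so that overlapping reads can share entries.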