Add searchable snapshots cache directory #50693

Conversation

tlrx
Member

@tlrx tlrx commented Jan 7, 2020

Note: this pull request targets the feature/searchable-snapshots branch

This pull request adds a new CacheDirectory implementation which allows caching ranges of bytes of Lucene files. This directory implementation uses a CacheService to retrieve and read ranges of bytes from cached files stored on disk. The cached files are created as sparse files on disk, and the availability of byte ranges is tracked through a SparseFileTracker instance (thanks David!).
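As a rough illustration of the range-tracking idea, here is a simplified, hypothetical sketch (the real SparseFileTracker also coordinates concurrent writers and notifies listeners when pending ranges complete):

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified illustration of tracking which byte ranges of a
// cached file are already present on disk.
class RangeTracker {
    private final long fileLength;
    // start of range -> end of range (exclusive), non-overlapping, sorted by start
    private final TreeMap<Long, Long> completedRanges = new TreeMap<>();

    RangeTracker(long fileLength) {
        this.fileLength = fileLength;
    }

    // Marks [start, end) as available in the cache file, merging with neighbouring ranges.
    synchronized void markAvailable(long start, long end) {
        assert 0 <= start && start <= end && end <= fileLength;
        Map.Entry<Long, Long> floor = completedRanges.floorEntry(end);
        while (floor != null && floor.getValue() >= start) {
            start = Math.min(start, floor.getKey());
            end = Math.max(end, floor.getValue());
            completedRanges.remove(floor.getKey());
            floor = completedRanges.floorEntry(end);
        }
        completedRanges.put(start, end);
    }

    // True if every byte in [start, end) has already been written to the cache file.
    synchronized boolean isAvailable(long start, long end) {
        Map.Entry<Long, Long> floor = completedRanges.floorEntry(start);
        return floor != null && floor.getValue() >= end;
    }
}
```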

@tlrx tlrx added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs labels Jan 7, 2020
@tlrx tlrx requested a review from DaveCTurner January 7, 2020 08:26
@elasticmachine
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@tlrx
Member Author

tlrx commented Jan 7, 2020

@elasticmachine run elasticsearch-ci/1

(Unrelated test failure)

Contributor

@DaveCTurner DaveCTurner left a comment

I've done a first pass and left some comments and questions.

Setting.Property.NodeScope);

public static final Setting<Boolean> SNAPSHOT_CACHE_INVALIDATE_ON_SHUTDOWN =
Setting.boolSetting("searchable.snapshot.cache.invalidate_on_shutdown", true, Setting.Property.NodeScope, Setting.Property.Dynamic);
Contributor

Could we remove this and always invalidate the cache on shutdown? I see that this might be useful for some early experiments, but I think if we want to assert things about the content of the cache we can do so in an ESIntegTestCase while the node(s) are running.

Member Author

Sure - I removed the setting and it now always invalidates on shutdown.

this.evicted = false;
}

private long estimateWeight() {
Contributor

I'm not very familiar with Cache, but it looks like the weight is not expected to change during the lifetime of an entry, which isn't the case with tracker.getLengthOfRanges(). I think for now it might be safer to use the length of the file.

Member Author

I think for now it might be safer to use the length of the file.

I agree, I pushed this change
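For illustration, a minimal sketch of what weighing entries by file length could look like (assuming the internal org.elasticsearch.common.cache.CacheBuilder API; `maxCacheSizeInBytes` and `getLength()` are illustrative names, not the actual ones from this PR):

```java
// Sketch only: weigh each entry by the full file length, which stays constant for
// the lifetime of the entry, rather than by tracker.getLengthOfRanges(), which
// grows as more ranges are written to the cache file.
private Cache<String, CacheEntry> createCache(long maxCacheSizeInBytes) {
    return CacheBuilder.<String, CacheEntry>builder()
        .setMaximumWeight(maxCacheSizeInBytes)          // total cache budget in bytes
        .weigher((path, entry) -> entry.getLength())    // hypothetical getLength() accessor
        .build();
}
```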

boolean success = false;
try {
logger.trace(() -> new ParameterizedMessage("creating new cache file for [{}] at [{}]", file.getFileName(), path));
final FileChannel channel = FileChannel.open(path, CACHE_FILE_OPEN_OPTIONS);
Contributor

If I understand correctly, this means every CacheEntry corresponds with an open file, even if the corresponding IndexInput is closed. I think this might cause problems at scale. Could we close the FileChannel whenever the IndexInput is closed?

while (remaining > 0) {
int len = (remaining < copyBufferSize) ? (int) remaining : copyBufferSize;
logger.trace(() -> new ParameterizedMessage("reading {} bytes from [{}]", len, input));
input.readBytes(copyBuffer, 0, len);
Contributor

Will this work ok with non-filesystem blob stores? Ideally we'd send a single 32MB GET request and then write it locally in 8k chunks, but it's not obvious how we will tell input to prepare for a large sequential read like that.

Member Author

I agree, this is defeating the purpose of requesting large chunks of data from the underlying SearchableSnapshotIndexInput, which is today implemented as a BufferedIndexInput. With the current implementation, only chunks of BufferedIndexInput's internal buffer size (i.e. 1024 bytes) are requested.

I think the right thing to do is to have SearchableSnapshotIndexInput extend IndexInput directly and maintain an internal InputStream which is opened from the BlobContainer's readBlob(name, position, len) method where len is 32 MB.

The SearchableSnapshotIndexInput should take care of opening the InputStream at the required position (using byte-range S3 requests), and of closing and reopening it when the IndexInput seeks. It should also take care of draining/aborting the InputStream in case not all bytes were read.
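As a rough sketch of that stream-management idea (a hypothetical helper, not the actual SearchableSnapshotIndexInput; it assumes the readBlob(name, position, length) method mentioned above and ignores reads that cross the opened window):

```java
import java.io.IOException;
import java.io.InputStream;
import org.elasticsearch.common.blobstore.BlobContainer;

// Hypothetical sketch: keep one InputStream opened at a given position with a
// large length (e.g. 32 MB), serve sequential reads from it, and reopen it only
// when a seek moves away from the current stream position.
class BlobStreamReader {
    private final BlobContainer blobContainer;
    private final String name;
    private final long readAheadLength; // e.g. 32 * 1024 * 1024

    private InputStream stream;   // null when no stream is open
    private long streamPosition;  // absolute position of the next byte in `stream`

    BlobStreamReader(BlobContainer blobContainer, String name, long readAheadLength) {
        this.blobContainer = blobContainer;
        this.name = name;
        this.readAheadLength = readAheadLength;
    }

    // Reads len bytes starting at the absolute position pos into b[off...].
    void read(long pos, byte[] b, int off, int len) throws IOException {
        if (stream == null || pos != streamPosition) {
            closeStream(); // on a seek, drop the stream (a real impl would drain or abort it)
            stream = blobContainer.readBlob(name, pos, readAheadLength);
            streamPosition = pos;
        }
        while (len > 0) {
            final int read = stream.read(b, off, len);
            if (read == -1) {
                throw new IOException("unexpected end of stream for [" + name + "]");
            }
            off += read;
            len -= read;
            streamPosition += read;
        }
    }

    void closeStream() throws IOException {
        if (stream != null) {
            stream.close();
            stream = null;
        }
    }
}
```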

public void readFromCache(final Path filePath, final long fileLength, final CheckedSupplier<IndexInput, IOException> fileSource,
final long position, final byte[] b, int off, int len) throws IOException {
long pos = position;
while (len > 0 && lifecycleState() == Lifecycle.State.STARTED) {
Contributor

We might exit this loop with len > 0; if we do so I think we have to throw an exception.

Member Author

Done in this change

while (len > 0 && lifecycleState() == Lifecycle.State.STARTED) {
final CacheEntry cacheEntry = getOrAddToCache(filePath, fileLength, fileSource);
if (cacheEntry.tryIncRef() == false) {
continue;
Contributor

I'm concerned about thrashing here. We're trying to handle the case where the entry is evicted between getOrAddToCache and tryIncRef, but if we have multiple threads all adding entries that get immediately evicted (might happen if the cache is too small and/or almost full) I think we could end up stuck here. Maybe it'd be better to give up on the cache in this case and read directly from the fileSource?

Member Author

You're right, and this is a good suggestion.

I changed the test so that it randomly uses a smaller cache, and changed the readFromCache() method so that it serves the read operation directly from the source if the picked-up cache file is already evicted. It does a similar thing later in the process if the cache file has been evicted just before writing the new range on disk.
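A sketch of that fallback, reusing the names from the snippet above (simplified; the surrounding read loop and error handling are omitted):

```java
// Sketch of the fallback: if the entry was evicted between the cache lookup and
// tryIncRef, stop retrying the cache and serve this read directly from the source.
final CacheEntry cacheEntry = getOrAddToCache(filePath, fileLength, fileSource);
if (cacheEntry.tryIncRef() == false) {
    try (IndexInput input = fileSource.get()) {   // direct read from the snapshot source
        input.seek(pos);
        input.readBytes(b, off, len);
    }
    return;
}
try {
    // ... read the range from the cached file as before ...
} finally {
    cacheEntry.decRef();
}
```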

@tlrx
Member Author

tlrx commented Jan 14, 2020

@DaveCTurner Thanks for your feedback. As said via another channel, I addressed the concern about open file handles by extracting the previous CacheEntry inner class into a CacheFile (no better name) outer class that implements RefCounted and opens/closes the file channel when needed. When a non-evicted CacheFile is used to read data, it is returned to the CacheBufferedIndexInput as a Releasable. It is then the responsibility of the IndexInput to notify that it is done with the last cache entry by releasing it when the input is closed.

Sadly, there is a file leak reported by the test framework and I haven't found the culprit yet:

```
java.lang.RuntimeException: file handle leaks: [FileChannel(/home/tanguy/Dev/elastic/elasticsearch-searchable-snapshots/x-pack/plugin/searchable-snapshots/build/testrun/test/temp/org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshotsIntegTests_5C32FE860A9D9738-001/tempDir-002/node_sd3/d0/indices/bRsLBbR5Q1-6TK2P2eDbtA/0/snapshots/2USI9vDEQau9K0LQ3abr7g/MF2_TsUVRxq1vcMHB_Mn5w), FileChannel(/home/tanguy/Dev/elastic/elasticsearch-searchable-snapshots/x-pack/plugin/searchable-snapshots/build/testrun/test/temp/org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshotsIntegTests_5C32FE860A9D9738-001/tempDir-002/node_sd4/d0/indices/WoK6PayuQL6_V9Q4yvWVzA/0/snapshots/v6rc2UOlSyOHq9TaCnUWcw/LA7wlmXZT0y4Rowdb3LplA), FileChannel(/home/tanguy/Dev/elastic/elasticsearch-searchable-snapshots/x-pack/plugin/searchable-snapshots/build/testrun/test/temp/org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshotsIntegTests_5C32FE860A9D9738-001/tempDir-002/node_sd3/d0/indices/hn_aBXC2TLaZnWHRP4hadg/0/snapshots/G7k5qu0UQYWd3__tc9e7AQ/OezzZN78SXe-g6tH064zkw)]
	at __randomizedtesting.SeedInfo.seed([5C32FE860A9D9738]:0)
	at org.apache.lucene.mockfile.LeakFS.onClose(LeakFS.java:63)
	at org.apache.lucene.mockfile.FilterFileSystem.close(FilterFileSystem.java:77)
	at org.apache.lucene.mockfile.FilterFileSystem.close(FilterFileSystem.java:78)
	at org.apache.lucene.util.TestRuleTemporaryFilesCleanup.afterAlways(TestRuleTemporaryFilesCleanup.java:228)
	at com.carrotsearch.randomizedtesting.rules.TestRuleAdapter$1.afterAlways(TestRuleAdapter.java:31)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:43)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at org.apache.lucene.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
	at org.apache.lucene.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:47)
	at org.apache.lucene.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:64)
	at org.apache.lucene.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:54)
	at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
	at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:368)
	at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: java.lang.Exception
	at org.apache.lucene.mockfile.LeakFS.onOpen(LeakFS.java:46)
	at org.apache.lucene.mockfile.HandleTrackingFS.callOpenHook(HandleTrackingFS.java:81)
	at org.apache.lucene.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:197)
	at org.apache.lucene.mockfile.HandleTrackingFS.newFileChannel(HandleTrackingFS.java:166)
	at java.base/java.nio.channels.FileChannel.open(FileChannel.java:292)
	at org.elasticsearch.xpack.searchablesnapshots.cache.CacheFile.tryIncRef(CacheFile.java:78)
	at org.elasticsearch.xpack.searchablesnapshots.cache.CacheService.readFromCache(CacheService.java:151)
	at org.elasticsearch.xpack.searchablesnapshots.cache.CacheDirectory$CacheBufferedIndexInput.readInternal(CacheDirectory.java:91)
	at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:342)
	at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
	at org.apache.lucene.store.DataInput.readInt(DataInput.java:102)
	at org.apache.lucene.store.BufferedIndexInput.readInt(BufferedIndexInput.java:183)
	at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:194)
	at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:255)
	at org.apache.lucene.codecs.lucene50.Lucene50CompoundReader.(Lucene50CompoundReader.java:81)
	at org.apache.lucene.codecs.lucene50.Lucene50CompoundFormat.getCompoundReader(Lucene50CompoundFormat.java:70)
	at org.apache.lucene.index.SegmentCoreReaders.(SegmentCoreReaders.java:101)
	at org.apache.lucene.index.SegmentReader.(SegmentReader.java:84)
	at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:69)
	at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:61)
	at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:640)
	at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:84)
	at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:137)
	at org.elasticsearch.index.engine.ReadOnlyEngine.open(ReadOnlyEngine.java:176)
	at org.elasticsearch.index.engine.ReadOnlyEngine.(ReadOnlyEngine.java:118)
	at org.elasticsearch.xpack.searchablesnapshots.SearchableSnapshots.lambda$getEngineFactory$0(SearchableSnapshots.java:99)
	at org.elasticsearch.index.shard.IndexShard.innerOpenEngineAndTranslog(IndexShard.java:1592)
	at org.elasticsearch.index.shard.IndexShard.openEngineAndRecoverFromTranslog(IndexShard.java:1558)
	at org.elasticsearch.index.shard.StoreRecovery.lambda$restore$7(StoreRecovery.java:464)
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63)
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:89)
	at org.elasticsearch.repositories.blobstore.FileRestoreContext.lambda$restore$0(FileRestoreContext.java:159)
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63)
	at org.elasticsearch.action.ActionListener.lambda$map$2(ActionListener.java:146)
	at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:63)
	at org.elasticsearch.action.support.GroupedActionListener.onResponse(GroupedActionListener.java:66)
	at org.elasticsearch.action.ActionListener$2.onResponse(ActionListener.java:89)
	at org.elasticsearch.action.ActionRunnable$1.doRun(ActionRunnable.java:46)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:688)
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	... 1 more
```

Any help, comments and feedback from you will be greatly appreciated.
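For illustration, a rough sketch of the ref-counting idea described above (hypothetical and simplified; the actual CacheFile also tracks ranges and eviction listeners):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: the file channel is opened when the first reference is
// acquired and closed when the last one is released, so an idle cache file does
// not pin an open file handle.
class RefCountedCacheFile {
    private final Path file;
    private int refCount;           // guarded by `this`
    private FileChannel channel;    // non-null only while refCount > 0

    RefCountedCacheFile(Path file) {
        this.file = file;
    }

    synchronized FileChannel acquire() throws IOException {
        if (refCount++ == 0) {
            channel = FileChannel.open(file, StandardOpenOption.READ, StandardOpenOption.WRITE);
        }
        return channel;
    }

    synchronized void release() {
        assert refCount > 0;
        if (--refCount == 0) {
            try {
                channel.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                channel = null;
            }
        }
    }
}
```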

Contributor

@DaveCTurner DaveCTurner left a comment

As we discussed, I think there's a subtle issue with thread-safety in CacheBufferedIndexInput - see comments below.

this.ioContext = ioContext;
this.offset = offset;
this.end = offset + length;
this.clones = new ArrayList<>();
Contributor

I think we clone and/or slice from different threads so we should make this threadsafe:

Suggested change
this.clones = new ArrayList<>();
this.clones = new CopyOnWriteArrayList<>();

@Override
public CacheBufferedIndexInput clone() {
final CacheBufferedIndexInput clone = (CacheBufferedIndexInput) super.clone();
clone.clones = new ArrayList<>();
Contributor

I think we clone and/or slice from different threads so we should make this threadsafe:

Suggested change
clone.clones = new ArrayList<>();
clone.clones = new CopyOnWriteArrayList<>();


@Override
public void close() {
if (closed == false) {
Contributor

Maybe use an AtomicBoolean for thread-safety, and then

Suggested change
if (closed == false) {
if (closed.compareAndSet(false, true)) {

Probably wise to assert closed.get() == false in the other methods too.
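A small sketch of the suggestion (illustrative only; field and method names are hypothetical):

```java
// Illustrative only: make close() idempotent and thread-safe with an AtomicBoolean,
// and assert in other methods that the input has not been closed.
private final AtomicBoolean closed = new AtomicBoolean(false);

@Override
public void close() {
    if (closed.compareAndSet(false, true)) {
        // release the cache file reference and close the clones exactly once
    }
}

private void ensureOpen() {
    assert closed.get() == false : "input is already closed";
}
```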

public void close() {
if (closed == false) {
closed = true;
if (releasable != null) {
Contributor

We can avoid this null check:

Suggested change
if (releasable != null) {
Releasables.close(releasable);

if (releasable != null) {
Releasables.close(releasable);
}
if (clones != null) {
Contributor

I think this is always true.

Releasables.close(releasable);
}
if (clones != null) {
for (IndexInput clone : clones) {
Contributor

Maybe IOUtils.close(clones)?

final CacheBufferedIndexInput clone = (CacheBufferedIndexInput) super.clone();
clone.clones = new ArrayList<>();
clone.releasable = null;
clones.add(clone);
Contributor

I'm concerned that if we keep an IndexInput open for an extended period of time then we will accumulate a lot of clones here, all of which have been closed (or maybe just discarded). I think that e.g. with a compound segment we might keep it open for the life of the shard, and add at least one clone or slice for each search.

@DaveCTurner DaveCTurner mentioned this pull request Jan 14, 2020
@tlrx
Member Author

tlrx commented Jan 20, 2020

@elasticmachine update branch

@tlrx
Member Author

tlrx commented Jan 20, 2020

@elasticmachine update branch

@tlrx
Member Author

tlrx commented Jan 20, 2020

@DaveCTurner I updated this branch according to our latest discussions. This is not yet perfect, but I think it is much closer to how we think this cache system should work. I'm running more tests on my side; in the meantime it would be great if you could have a look at the changes.

Contributor

@DaveCTurner DaveCTurner left a comment

Great work. I will need to think about this more carefully tomorrow as the interlocking ref counting is pretty subtle, but I have done a superficial scan and think the structure looks about right.

CacheFile cacheFile = null;
try {
cacheFile = getOrAcquire();
final CacheFile.FileChannelRefCounted channelRef = cacheFile.getChannelRefCounter();
Contributor

cacheFile may be null here if there's a concurrent eviction I think, but in that case we don't read directly.

Member Author

Yes, this.cacheFile may be null, but in that case the local variable is not assigned because an exception is thrown. I added an assertion to check the non-nullity of the local cacheFile.

refCounter.decRef();
}
}
if (evictionListeners != null) {
Contributor

nit: always true.

Member Author

Right, this is a leftover.

try {
ensureOpen();
final Set<EvictionListener> newListeners = new HashSet<>(listeners);
if (newListeners.add(Objects.requireNonNull(listener)) == false) {
Contributor

Can we assert this (and also that listener != null)?

Member Author

👍

}
if (evictionListeners != null) {
evictionListeners.forEach(listener -> listener.onEviction(this));
}
Contributor

Can we assert here that listeners.isEmpty() and that refCounter.refCount() == 0?

Member Author

I don't think we can? The close() method does not clean up the list of listeners but instead captures the registered listeners at closing time (and sets the closed flag so that no new listeners can be registered). It is the duty of each listener to release and remove itself once it is notified of the eviction (and once the listener is done with any cache file resource).
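For illustration, a hypothetical sketch of that contract (the acquire/release methods are illustrative names, not necessarily the ones in this PR):

```java
// Hypothetical sketch: the cache file does not clear its listeners on eviction;
// each listener is responsible for releasing its reference and deregistering
// itself once it has been notified.
class CacheFileReferenceListener implements CacheFile.EvictionListener {
    private final CacheFile cacheFile;

    CacheFileReferenceListener(CacheFile cacheFile) {
        this.cacheFile = cacheFile;
        cacheFile.acquire(this); // registers the listener and increments the ref count
    }

    @Override
    public void onEviction(CacheFile evictedCacheFile) {
        assert evictedCacheFile == cacheFile;
        // done with the cache file: release our reference and remove ourselves
        cacheFile.release(this);
    }
}
```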

channelRefCounter.deleteAfterClose();
} else {
try {
Files.deleteIfExists(file);
Contributor

I got a test failure here due to WindowsFS rejecting deletes of files that are still open. I'm not quite sure why yet.

Member Author

Raaaah. I ran the tests many times over the weekend and didn't get such a failure. I'll try to reproduce it and let you know my findings.

@tlrx tlrx requested a review from DaveCTurner January 24, 2020 13:02
@tlrx
Member Author

tlrx commented Jan 24, 2020

@DaveCTurner As said via another channel, this is ready for your review. I'm still running SearchableSnapshotsIntegTests in a loop and have 500+ successful test executions so far.

@tlrx
Member Author

tlrx commented Jan 25, 2020

@elasticmachine update branch

Contributor

@DaveCTurner DaveCTurner left a comment

Looks great. I left only minor comments.

I think there might be value in extracting the parts of CacheBufferedIndexInput that only apply with parent == null since they're kind of a different concept from the actual IndexInput.

I have studied the locking in detail and see that we always obtain locks in the order CacheService -> CacheBufferedIndexInput -> CacheFile, so the deadlock situation looks good.


private @Nullable AtomicReference<CacheFile> cacheFile;
private @Nullable CacheBufferedIndexInput parent;
private ReleasableLock cacheEvictionLock;
Contributor

I think we don't need a special kind of lock here any more, synchronized (this) should be ok here?

Member Author

@tlrx tlrx Jan 27, 2020

Right, synchronized blocks should be enough. I pushed 0aece73


private volatile Set<EvictionListener> listeners;
private volatile FileChannel channel;
private volatile boolean closed;
Contributor

Suggest renaming this to evicted now.

Member Author

Makes sense, I pushed 48fa923

'}';
}

private int copySource(long start, long end, byte[] buffer, int offset) throws IOException {
Contributor

Suggest renaming to something like readDirectly

Member Author

This is a good suggestion, thanks. I pushed 391633f

@SuppressForbidden(reason = "Use positional writes on purpose")
void writeCacheFile(FileChannel fc, long start, long end) throws IOException {
assert assertFileChannelOpen(fc);
final byte[] copyBuffer = new byte[Math.toIntExact(Math.min(8192L, end - start))];
Contributor

I think we should extract 8192L as a named constant (used here and in copySource).

Member Author

Sure, I added COPY_BUFFER_SIZE in 99a9968

}

@Override
public IndexInput slice(String sliceDescription, long offset, long length) throws IOException {
Contributor

Apparently this doesn't throw IOException.

Member Author

Right -> 348fac8


// Prevent new CacheFile objects to be added to the cache while the cache is being fully or partially invalidated
// This can happen because CacheFile objects might execute listeners at eviction time, potentially forcing more
// objects (like CacheDirectory's index inputs) to get new CacheFile.
Contributor

I don't follow this. We invalidate the cache when stopping, and I expected that in that case any attempt to get things from the cache would result in an exception. Can you clarify?

Member Author

Hum, I'm confused too 😕 I remember I added this because of some kind of deadlock in integration tests (not in manual tests, though) where the full invalidation ran concurrently with some file accesses in the Lucene directory (executed by the nodes list shard stores action). The invalidation blocked the cache, but the file accesses tried to get/compute new entries in the cache, resulting in deadlocks. I've run more tests without these locks and everything is fine, so I now think that the deadlocks were just a manifestation of some previously buggy code.

void onEviction(CacheFile evictedCacheFile);
}

static final int RANGE_SIZE = 1 << 15;
Contributor

I think we should run tests with a smaller (and non-power-of-two) range size here just to be certain we have good coverage, and use a larger range size in production. Can we make this something that gets passed in?

(I thought I'd already made this comment but it seems to have been lost, sorry for the duplicate if so)

Member Author

I thought I'd already made this comment but it seems to have been lost,

Yes you did, and no, it's not lost :) I replied in #50693 (comment), but basically I agree with you and I think it should be another setting (we might want to play with the cache range size in tests). It requires a bit of plumbing in tests, so I propose to do it in a follow-up PR (but I can do it in this one if you prefer).

@tlrx
Member Author

tlrx commented Jan 27, 2020

Thanks @DaveCTurner. I've addressed and/or replied to your comments.

I think there might be value in extracting the parts of CacheBufferedIndexInput that only apply with parent == null since they're kind of a different concept from the actual IndexInput.

That sounds like a good suggestion, but the part that only applies with parent == null is getOrAcquire(), which is tightly coupled with how the root IndexInput's cacheFile instance is updated, so I don't think we should extract it. Unless you were thinking of something different?

@DaveCTurner
Contributor

Perhaps it's clearer to show what I mean by the extraction:

tlrx/elasticsearch@add-searchable-snapshots-cache-directory...DaveCTurner:2020-01-27-cachefilereference-wip

WDYT?

@tlrx
Member Author

tlrx commented Jan 27, 2020

@DaveCTurner Thanks, I better understand what you meant now. It's indeed clearer with the CacheFileReference. If that's OK with you, I borrowed your suggestion and included the comments you added that I find valuable. Let me know if we can merge this now or if you have more comments, thanks!

Contributor

@DaveCTurner DaveCTurner left a comment

LGTM, nice work @tlrx

@tlrx tlrx merged commit 714c480 into elastic:feature/searchable-snapshots Jan 27, 2020
@tlrx
Member Author

tlrx commented Jan 27, 2020

Thanks David!

tlrx added a commit that referenced this pull request Jan 28, 2020
This commit changes the current cache range size from 32 KB to 32 MB
and makes it configurable through a globally defined
xpack.searchable.snapshot.cache.range_size setting. This setting is set to
low values in unit tests (see #50693) but to a bit higher values in
integration tests so that they don't take too much time to complete.
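For illustration, a sketch of what such a setting could look like (assuming Setting.byteSizeSetting; the default and bounds shown are illustrative only):

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.unit.ByteSizeUnit;
import org.elasticsearch.common.unit.ByteSizeValue;

// Sketch only: a byte-size setting for the cache range size, with a small lower
// bound so that tests can pick small, odd range sizes.
public static final Setting<ByteSizeValue> SNAPSHOT_CACHE_RANGE_SIZE_SETTING = Setting.byteSizeSetting(
    "xpack.searchable.snapshot.cache.range_size",
    new ByteSizeValue(32, ByteSizeUnit.MB),                 // production default
    new ByteSizeValue(4, ByteSizeUnit.KB),                  // minimum
    new ByteSizeValue(Long.MAX_VALUE, ByteSizeUnit.BYTES),  // maximum
    Setting.Property.NodeScope);
```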
tlrx added a commit that referenced this pull request Jan 31, 2020
Today cache files are identified in the cache using a string representing
an absolute path to a file on disk. This path is a subdirectory of the
current shard data path and as such already contains identification
bits like the current index id and the shard id. It also contains the
snapshot id that is passed at CacheDirectory creation time.

While this was done for quick prototyping and has already been
improved in #51520, it feels wrong to rely on a path converted to
a string as a cache key. Instead we should have a distinct CacheKey
object to identify a CacheFile in the cache.

Relates #50693
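For illustration, a hypothetical sketch of such a key (the field choice mirrors the identification bits mentioned above: snapshot id, index/shard id and file name):

```java
import java.util.Objects;
import org.elasticsearch.index.shard.ShardId;
import org.elasticsearch.snapshots.SnapshotId;

// Hypothetical sketch of a dedicated cache key, instead of a path converted to a string.
public final class CacheKey {
    private final SnapshotId snapshotId;
    private final ShardId shardId;
    private final String fileName;

    public CacheKey(SnapshotId snapshotId, ShardId shardId, String fileName) {
        this.snapshotId = Objects.requireNonNull(snapshotId);
        this.shardId = Objects.requireNonNull(shardId);
        this.fileName = Objects.requireNonNull(fileName);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        CacheKey other = (CacheKey) o;
        return snapshotId.equals(other.snapshotId)
            && shardId.equals(other.shardId)
            && fileName.equals(other.fileName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(snapshotId, shardId, fileName);
    }
}
```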
tlrx added a commit that referenced this pull request Jan 31, 2020
This commit is a first step towards adding instrumentation to the CacheDirectory
added in #50693. It adds a new mutable IndexInputStats object that allows
tracking various information about how CacheBufferedIndexInput interacts
with the underlying cache file to satisfy read operations. It keeps track
of small/large forward/backward seeks as well as the total number of
bytes read from the IndexInput and the total number of bytes read from
and written to the CacheFile.

Note that the stats do not reflect the exact usage that Lucene makes of the
IndexInputs opened through a CacheDirectory: IndexInputStats is not aware
of the read operations that are served directly at a higher level by the
internal BufferedIndexInput's buffer. Instead it tracks what really hits the disk.
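For illustration, a hypothetical sketch of what such per-input counters could look like (names and the seek threshold logic are illustrative only):

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch of mutable per-input statistics: counters for seeks and for
// the bytes read from the input and read from / written to the cache file.
public class IndexInputStats {
    private final LongAdder smallForwardSeeks = new LongAdder();
    private final LongAdder largeForwardSeeks = new LongAdder();
    private final LongAdder smallBackwardSeeks = new LongAdder();
    private final LongAdder largeBackwardSeeks = new LongAdder();
    private final LongAdder bytesRead = new LongAdder();
    private final LongAdder cacheBytesRead = new LongAdder();
    private final LongAdder cacheBytesWritten = new LongAdder();

    public void incrementBytesRead(long bytes) {
        bytesRead.add(bytes);
    }

    public void incrementCacheBytesRead(long bytes) {
        cacheBytesRead.add(bytes);
    }

    public void incrementCacheBytesWritten(long bytes) {
        cacheBytesWritten.add(bytes);
    }

    // Classifies a seek as small/large and forward/backward relative to a threshold.
    public void incrementSeeks(long currentPosition, long newPosition, long threshold) {
        final long delta = newPosition - currentPosition;
        if (delta == 0) {
            return;
        } else if (delta > 0) {
            (delta <= threshold ? smallForwardSeeks : largeForwardSeeks).increment();
        } else {
            (-delta <= threshold ? smallBackwardSeeks : largeBackwardSeeks).increment();
        }
    }
}
```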