Description
**Elasticsearch version** (`bin/elasticsearch --version`): 5.5.1

**Plugins installed**: [repository-gcs, discovery-gce]

**JVM version** (`java -version`): 1.8.0_131

**OS version** (`uname -a` if on a Unix-like system): Linux XXX 4.10.0-27-generic #30~16.04.2-Ubuntu SMP Thu Jun 29 16:07:46 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
**Description of the problem including expected versus actual behavior**:

Creating a snapshot fails on certain shards; retrying with a new snapshot works. For me it fails on roughly 10% of shards (testing with 51 shards: 4 failed on the last run, 2 when I retried, and finally 0 on the third try).

The exception is:

`IndexShardSnapshotFailedException[BlobStoreException[Failed to check if blob [__79.part4] exists]; nested: SocketTimeoutException[Read timed out];]; nested: BlobStoreException[Failed to check if blob [__79.part4] exists]; nested: SocketTimeoutException[Read timed out];`
This is using GCS cold storage.

I see that there are further options I can give the plugin, mainly `http.connect_timeout` and `http.read_timeout`, but I'm not sure whether they are relevant to the other exception that shows up in the logs below: `java.io.IOException: insufficient data written`.
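For reference, this is roughly how I would try raising those timeouts when re-registering the repository. This is only a sketch: the timeout values are placeholders, and I'm assuming `http.read_timeout`/`http.connect_timeout` are accepted as repository settings on 5.5.

```sh
# Sketch: re-register the "gcs" repository with explicit HTTP timeouts.
# Setting names are the ones mentioned above; the values are guesses.
curl -XPUT 'http://localhost:9200/_snapshot/gcs' -H 'Content-Type: application/json' -d '{
  "type": "gcs",
  "settings": {
    "bucket": "XXXX",
    "compress": "true",
    "http.read_timeout": "2m",
    "http.connect_timeout": "30s"
  }
}'
```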
I wouldn't mind this failing if I could detect it and retry. Could I do that by deleting the snapshot and recreating it? From what I understand, the shards that were backed up successfully will not be deleted if I do this?
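Concretely, the detect-and-retry workflow I have in mind looks something like this (the snapshot name `snap_1` is just an example):

```sh
# See which shards failed in the partially successful snapshot
curl -XGET 'http://localhost:9200/_snapshot/gcs/snap_1/_status?pretty'

# Delete the failed snapshot...
curl -XDELETE 'http://localhost:9200/_snapshot/gcs/snap_1'

# ...then take it again and wait for the result
curl -XPUT 'http://localhost:9200/_snapshot/gcs/snap_1?wait_for_completion=true'
```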
**Steps to reproduce**:

 - Create a snapshot in a GCS repository with these settings (curl sketch below):
   `{"gcs":{"type":"gcs","settings":{"bucket":"XXXX","compress":"true"}}}`
**Provide logs (if relevant)**:

```
org.elasticsearch.index.snapshots.IndexShardSnapshotFailedException: Failed to perform snapshot (index files)
at org.elasticsearch.repositories.blobstore.BlobStoreRepository$SnapshotContext.snapshot(BlobStoreRepository.java:1377) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.repositories.blobstore.BlobStoreRepository.snapshotShard(BlobStoreRepository.java:972) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:382) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.snapshots.SnapshotShardsService.access$200(SnapshotShardsService.java:88) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.snapshots.SnapshotShardsService$1.doRun(SnapshotShardsService.java:335) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.5.1.jar:5.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: java.io.IOException: insufficient data written
at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.close(HttpURLConnection.java:3540) ~[?:?]
at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:81) ~[?:?]
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:972) ~[?:?]
at com.google.api.client.googleapis.media.MediaHttpUploader.executeCurrentRequestWithoutGZip(MediaHttpUploader.java:545) ~[?:?]
at com.google.api.client.googleapis.media.MediaHttpUploader.resumableUpload(MediaHttpUploader.java:417) ~[?:?]
at com.google.api.client.googleapis.media.MediaHttpUploader.upload(MediaHttpUploader.java:336) ~[?:?]
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:427) ~[?:?]
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352) ~[?:?]
```