Increment google cloud apis to 1.52 #35459

Closed
clandry94 wants to merge 2 commits into elastic:master from clandry94:bump_google_api_versions

clandry94 commented Nov 12, 2018

A bug in the Google Cloud Java API caused IOExceptions when uploading snapshots to GCS. It was reported in googleapis/google-cloud-java#3410 and fixed in googleapis/google-cloud-java#3433. The same bug has been reported against Elasticsearch in #35229. Let's bump to the latest release of the Google Cloud Java APIs.

Closes #35229

cc @ywelsch

jtibshirani added the :Distributed/Snapshot/Restore (Anything directly related to the `_snapshot/*` APIs) and >bug labels Nov 12, 2018
ywelsch (Contributor) commented Nov 12, 2018

@elasticmachine test this please

alpar-t (Contributor) commented Nov 12, 2018

@clandry94 whenever you update dependencies, you have to run updateSHAs and check in the results.
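
To illustrate why the checked-in results change with a version bump, here is a minimal sketch. It assumes the build stores a SHA-1 checksum for each dependency jar and that updateSHAs regenerates those checksums; that reading of the task is an assumption, and `Sha1Sketch` and `sha1Hex` are hypothetical names that are not part of the Elasticsearch build.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of the Elasticsearch build.
final class Sha1Sketch {

    // Returns the lowercase hex SHA-1 digest of the given bytes.
    static String sha1Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(data);
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // %02x renders each byte as two hex characters.
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-1.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Prints the checksum of each file path passed on the command line.
    public static void main(String[] args) throws IOException {
        for (String path : args) {
            byte[] bytes = Files.readAllBytes(Paths.get(path));
            System.out.println(sha1Hex(bytes) + "  " + path);
        }
    }
}
```

A new jar version has different bytes, so its digest differs from the stored one. Under the assumption above, that is why the regenerated checksums have to be committed together with the version change.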

clandry94 force-pushed the bump_google_api_versions branch from fe60f65 to bbfe98b on November 12, 2018 20:44
clandry94 (Author) commented

@elasticmachine test this please

ywelsch (Contributor) commented Nov 13, 2018

@elasticmachine retest this please

imrimt commented Feb 5, 2019

What workaround do you recommend for the IOException issue, given that 6.7 is not out yet and we won't be able to upgrade to it anytime soon? The failure is especially likely when we take a snapshot at scale (~80 GB of data): I have run it 4 times and every attempt failed.

We are currently on Elasticsearch 6.4.2. Would replacing the client jars directly in the plugin resolve this issue (i.e., would ES 6.4.2 work with a newer GCS Java client)?

Here's an example of the stack trace that we are seeing:

[2019-02-04T22:37:33,028][WARN ][o.e.s.SnapshotShardsService] [localhost] [[tamr_datasets][13]][test_snapshot1:snapshot_1/ZMzfA4DUTB6y_NolLaL2_Q] failed to snapshot shard
org.elasticsearch.index.snapshots.IndexShardSnapshotFailedException: com.google.cloud.storage.StorageException: Error writing request body to server
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.snapshotShard(BlobStoreRepository.java:858) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.snapshots.SnapshotShardsService.snapshot(SnapshotShardsService.java:410) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.snapshots.SnapshotShardsService.access$200(SnapshotShardsService.java:97) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.snapshots.SnapshotShardsService$1.doRun(SnapshotShardsService.java:354) [elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:723) [elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.4.2.jar:6.4.2]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: com.google.cloud.storage.StorageException: Error writing request body to server
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.translate(HttpStorageRpc.java:220) ~[?:?]
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.write(HttpStorageRpc.java:703) ~[?:?]
	at com.google.cloud.storage.BlobWriteChannel$1.run(BlobWriteChannel.java:51) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_111]
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:89) ~[?:?]
	at com.google.cloud.RetryHelper.run(RetryHelper.java:74) ~[?:?]
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:51) ~[?:?]
	at com.google.cloud.storage.BlobWriteChannel.flushBuffer(BlobWriteChannel.java:47) ~[?:?]
	at com.google.cloud.BaseWriteChannel.flush(BaseWriteChannel.java:122) ~[?:?]
	at com.google.cloud.BaseWriteChannel.write(BaseWriteChannel.java:149) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore$2.lambda$write$0(GoogleCloudStorageBlobStore.java:238) ~[?:?]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_111]
	at org.elasticsearch.repositories.gcs.SocketAccess.doPrivilegedIOException(SocketAccess.java:44) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore$2.write(GoogleCloudStorageBlobStore.java:238) ~[?:?]
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[?:1.8.0_111]
	at java.nio.channels.Channels.writeFully(Channels.java:101) ~[?:1.8.0_111]
	at java.nio.channels.Channels.access$000(Channels.java:61) ~[?:1.8.0_111]
	at java.nio.channels.Channels$1.write(Channels.java:174) ~[?:1.8.0_111]
	at org.elasticsearch.core.internal.io.Streams.copy(Streams.java:55) ~[elasticsearch-core-6.4.2.jar:6.4.2]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.writeBlobResumable(GoogleCloudStorageBlobStore.java:224) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.writeBlob(GoogleCloudStorageBlobStore.java:203) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobContainer.writeBlob(GoogleCloudStorageBlobContainer.java:68) ~[?:?]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository$SnapshotContext.snapshotFile(BlobStoreRepository.java:1331) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository$SnapshotContext.snapshot(BlobStoreRepository.java:1266) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.snapshotShard(BlobStoreRepository.java:852) ~[elasticsearch-6.4.2.jar:6.4.2]
	... 8 more
Caused by: java.io.IOException: Error writing request body to server
	at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.checkError(HttpURLConnection.java:3518) ~[?:?]
	at sun.net.www.protocol.http.HttpURLConnection$StreamingOutputStream.write(HttpURLConnection.java:3501) ~[?:?]
	at com.google.api.client.util.ByteStreams.copy(ByteStreams.java:55) ~[?:?]
	at com.google.api.client.util.IOUtils.copy(IOUtils.java:94) ~[?:?]
	at com.google.api.client.http.AbstractInputStreamContent.writeTo(AbstractInputStreamContent.java:72) ~[?:?]
	at com.google.api.client.http.javanet.NetHttpRequest.execute(NetHttpRequest.java:80) ~[?:?]
	at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:981) ~[?:?]
	at com.google.cloud.storage.spi.v1.HttpStorageRpc.write(HttpStorageRpc.java:684) ~[?:?]
	at com.google.cloud.storage.BlobWriteChannel$1.run(BlobWriteChannel.java:51) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_111]
	at com.google.api.gax.retrying.DirectRetryingExecutor.submit(DirectRetryingExecutor.java:89) ~[?:?]
	at com.google.cloud.RetryHelper.run(RetryHelper.java:74) ~[?:?]
	at com.google.cloud.RetryHelper.runWithRetries(RetryHelper.java:51) ~[?:?]
	at com.google.cloud.storage.BlobWriteChannel.flushBuffer(BlobWriteChannel.java:47) ~[?:?]
	at com.google.cloud.BaseWriteChannel.flush(BaseWriteChannel.java:122) ~[?:?]
	at com.google.cloud.BaseWriteChannel.write(BaseWriteChannel.java:149) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore$2.lambda$write$0(GoogleCloudStorageBlobStore.java:238) ~[?:?]
	at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_111]
	at org.elasticsearch.repositories.gcs.SocketAccess.doPrivilegedIOException(SocketAccess.java:44) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore$2.write(GoogleCloudStorageBlobStore.java:238) ~[?:?]
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[?:1.8.0_111]
	at java.nio.channels.Channels.writeFully(Channels.java:101) ~[?:1.8.0_111]
	at java.nio.channels.Channels.access$000(Channels.java:61) ~[?:1.8.0_111]
	at java.nio.channels.Channels$1.write(Channels.java:174) ~[?:1.8.0_111]
	at org.elasticsearch.core.internal.io.Streams.copy(Streams.java:55) ~[elasticsearch-core-6.4.2.jar:6.4.2]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.writeBlobResumable(GoogleCloudStorageBlobStore.java:224) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobStore.writeBlob(GoogleCloudStorageBlobStore.java:203) ~[?:?]
	at org.elasticsearch.repositories.gcs.GoogleCloudStorageBlobContainer.writeBlob(GoogleCloudStorageBlobContainer.java:68) ~[?:?]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository$SnapshotContext.snapshotFile(BlobStoreRepository.java:1331) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository$SnapshotContext.snapshot(BlobStoreRepository.java:1266) ~[elasticsearch-6.4.2.jar:6.4.2]
	at org.elasticsearch.repositories.blobstore.BlobStoreRepository.snapshotShard(BlobStoreRepository.java:852) ~[elasticsearch-6.4.2.jar:6.4.2]
	... 8 more
[2019-02-04T22:47:03,848][INFO ][o.e.s.SnapshotsService   ] [localhost] snapshot [test_snapshot1:snapshot_1/ZMzfA4DUTB6y_NolLaL2_Q] completed with state [PARTIAL]

Development

Successfully merging this pull request may close these issues.

Snapshots on large indices fail on some shards when master election occurs
