Fix Concurrent Snapshot Create+Delete + Delete Index #61770
Conversation
We had a bug here where we put a `null` value into the shard assignment mapping when reassigning work after a snapshot delete had gone through. This only affects partial snapshots but essentially deadlocks the snapshot process. Closes elastic#61762
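For context, here is a minimal sketch (hypothetical names, not the Elasticsearch implementation) of why a `null` entry in the shard assignment map deadlocks the snapshot: finalization only runs once every shard reports a terminal status, and a `null` entry can never transition, so the check stays false forever.

```java
import java.util.HashMap;
import java.util.Map;

class NullShardStatusSketch {
    enum ShardState { INIT, SUCCESS, FAILED, MISSING }

    // Finalization gate: every shard must carry a terminal status.
    static boolean readyToFinalize(Map<String, ShardState> shards) {
        return shards.values().stream()
            .allMatch(s -> s == ShardState.SUCCESS || s == ShardState.FAILED || s == ShardState.MISSING);
    }

    public static void main(String[] args) {
        Map<String, ShardState> shards = new HashMap<>();
        shards.put("[index-1][0]", ShardState.SUCCESS);
        // The bug: reassignment after a concurrent delete stored null
        // instead of a terminal status for the deleted index's shard.
        shards.put("[deleted-index][0]", null);
        System.out.println(readyToFinalize(shards)); // false, forever
    }
}
```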
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)
```diff
@@ -1715,8 +1723,7 @@ public static ClusterState updateWithSnapshots(ClusterState state,
         IndexMetadata indexMetadata = metadata.index(indexName);
         if (indexMetadata == null) {
             // The index was deleted before we managed to start the snapshot - mark it as missing.
-            builder.put(new ShardId(indexName, IndexMetadata.INDEX_UUID_NA_VALUE, 0),
-                new SnapshotsInProgress.ShardSnapshotStatus(null, ShardState.MISSING, "missing index", null));
+            builder.put(new ShardId(indexName, IndexMetadata.INDEX_UUID_NA_VALUE, 0), ShardSnapshotStatus.MISSING);
```
Before concurrent snapshots, this spot covered all possible scenarios, because beyond this point we would only ever be dealing with shard ids for indices that still exist in the repo. If an index was deleted after assignment, the shard snapshot would simply fail in the SnapshotShardsService and things would work out that way. But with concurrent snapshots, where indices can be deleted out from under a queued-up shard snapshot, we have to deal with this situation explicitly.
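To make that scenario concrete, here is a hedged, self-contained sketch with simplified stand-in types (the real code lives in SnapshotsService and uses ShardId and ShardSnapshotStatus): a queued shard snapshot whose index has since been deleted is resolved with an explicit terminal MISSING status rather than a `null` entry.

```java
import java.util.Set;

class MissingIndexGuardSketch {
    enum ShardState { QUEUED, MISSING }

    record ShardSnapshotStatus(String nodeId, ShardState state, String reason) {
        // Terminal status for shards of deleted indices (mirrors the
        // ShardSnapshotStatus.MISSING constant the fix switches to).
        static final ShardSnapshotStatus MISSING =
            new ShardSnapshotStatus(null, ShardState.MISSING, "missing index");
    }

    static ShardSnapshotStatus statusFor(String indexName, Set<String> liveIndices) {
        if (liveIndices.contains(indexName) == false) {
            // Index deleted while the shard snapshot was queued: resolve it
            // with a terminal status so the snapshot can still finalize.
            return ShardSnapshotStatus.MISSING;
        }
        return new ShardSnapshotStatus("node-1", ShardState.QUEUED, null);
    }

    public static void main(String[] args) {
        System.out.println(statusFor("deleted-index", Set.of("live-index")));
    }
}
```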
LGTM
Thanks Yannick!
Linking this to #56911 so that I can find it again in future.
We had a bug here that is new to 7.9 where we put a `null` value into the shard assignment mapping when reassigning work after a snapshot delete had gone through. This only affects partial snapshots but essentially deadlocks the snapshot process.

Closes #61762