Fix queued snapshot assignments after partial snapshot fails due to delete #88470
…elete We can't just assume that the snapshot queued after a failed snapshot is assigned correctly; we must re-compute the right node, or whether the shard even still exists. closes elastic#86724
Pinging @elastic/es-distributed (Team:Distributed)
Hi @original-brownbear, I've created a changelog YAML for you.
Jenkins run elasticsearch-ci/part-2
final SnapshotsInProgress.ShardSnapshotStatus shardSnapshotStatus = startedSnapshot.shards().get(routingShardId);
assertThat(shardSnapshotStatus.state(), is(SnapshotsInProgress.ShardState.INIT));
assertThat(shardSnapshotStatus.nodeId(), is(dataNodeId));
assertThat(shardSnapshotStatus.state(), is(SnapshotsInProgress.ShardState.MISSING));
This was broken before: the shard isn't assigned, so it must not move to INIT.
Since there are no other shards, the snapshot must also complete right away.
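To illustrate why the snapshot finishes immediately, here is a minimal sketch, assuming a simplified shard-state model in which MISSING counts as a completed state and INIT does not. The class `ShardStateSketch`, its `ShardState` enum, and `snapshotCompleted` are hypothetical stand-ins for illustration, not the actual `SnapshotsInProgress` code.

```java
import java.util.List;

public class ShardStateSketch {
    // Hypothetical, simplified stand-in for SnapshotsInProgress.ShardState.
    // The boolean marks states in which no more work is pending for the shard.
    enum ShardState {
        INIT(false), QUEUED(false), SUCCESS(true), FAILED(true), MISSING(true);

        final boolean completed;

        ShardState(boolean completed) {
            this.completed = completed;
        }
    }

    // Assumption: a snapshot is done once every one of its shards is in a completed state.
    static boolean snapshotCompleted(List<ShardState> shardStates) {
        return shardStates.stream().allMatch(s -> s.completed);
    }

    public static void main(String[] args) {
        // A single unassigned shard recorded as MISSING: nothing is left to wait for.
        System.out.println(snapshotCompleted(List.of(ShardState.MISSING)));
        // The same shard wrongly recorded as INIT would keep the snapshot open.
        System.out.println(snapshotCompleted(List.of(ShardState.INIT)));
    }
}
```

Under these assumptions, recording the unassigned shard as INIT would leave the snapshot waiting on a shard that no node is going to process.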
updatedState.generation(),
entry.shardId(repoShardId)
);
startShardSnapshot(repoShardId, updatedState.generation());
No big change here really: I extracted the code that re-computes where to run the snapshot,
since we don't need to do the isQueued check twice, and used it here.
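The idea of re-computing the assignment when a queued shard snapshot is started can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: `QueuedShardAssignmentSketch`, `ShardStatus`, `recomputeAssignment`, and the map of shard id to primary node id are all hypothetical, and only two outcomes (INIT and MISSING) are modelled.

```java
import java.util.HashMap;
import java.util.Map;

public class QueuedShardAssignmentSketch {
    // Hypothetical two-state model; the real enum has more states.
    enum ShardState { INIT, MISSING }

    // Hypothetical status holder: the state plus the node that should run the shard snapshot.
    static final class ShardStatus {
        final ShardState state;
        final String nodeId; // null when there is no node to run on

        ShardStatus(ShardState state, String nodeId) {
            this.state = state;
            this.nodeId = nodeId;
        }
    }

    // Look the shard up in the current routing information instead of reusing
    // whatever node the previous snapshot of this shard ran on.
    static ShardStatus recomputeAssignment(String shardId, Map<String, String> primaryNodeByShard) {
        String nodeId = primaryNodeByShard.get(shardId);
        if (nodeId == null) {
            // Shard was deleted or is unassigned: it cannot start, so mark it MISSING.
            return new ShardStatus(ShardState.MISSING, null);
        }
        return new ShardStatus(ShardState.INIT, nodeId);
    }

    public static void main(String[] args) {
        Map<String, String> routing = new HashMap<>();
        routing.put("index-a[0]", "node-1");
        ShardStatus assigned = recomputeAssignment("index-a[0]", routing);
        System.out.println(assigned.state + " on " + assigned.nodeId);

        // Simulate the index being deleted while the snapshot was queued.
        routing.remove("index-a[0]");
        ShardStatus gone = recomputeAssignment("index-a[0]", routing);
        System.out.println(gone.state);
    }
}
```

Keeping the lookup in one helper means every code path that starts a queued shard snapshot consults the same, current view of where the shard lives.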
LGTM
Thanks Tanguy!
We can't just assume that the snapshot queued after a failed snapshot is
assigned correctly; we must re-compute the right node, or whether the shard
even still exists.
closes #86724