Fix queued snapshot assignments after partial snapshot fails due to delete #88470

original-brownbear · 2022-07-12T13:02:34Z

We can't just assume that the snapshot queued after another snapshot is
assigned correctly. We must re-compute the right node, or whether the
shard even still exists.

closes #86724

elasticmachine · 2022-07-12T13:02:50Z

Pinging @elastic/es-distributed (Team:Distributed)

elasticsearchmachine · 2022-07-12T13:03:11Z

Hi @original-brownbear, I've created a changelog YAML for you.

original-brownbear · 2022-07-12T13:28:42Z

Jenkins run elasticsearch-ci/part-2

original-brownbear · 2022-07-12T13:57:54Z

server/src/test/java/org/elasticsearch/snapshots/SnapshotsServiceTests.java

        final SnapshotsInProgress.ShardSnapshotStatus shardSnapshotStatus = startedSnapshot.shards().get(routingShardId);
-        assertThat(shardSnapshotStatus.state(), is(SnapshotsInProgress.ShardState.INIT));
-        assertThat(shardSnapshotStatus.nodeId(), is(dataNodeId));
+        assertThat(shardSnapshotStatus.state(), is(SnapshotsInProgress.ShardState.MISSING));


This assertion was broken before: the shard isn't assigned, so it must not move to INIT.
Since there are no other shards, the snapshot must also complete right away.
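To illustrate the reasoning behind the changed assertion, here is a minimal, self-contained sketch. It is not the actual `SnapshotsInProgress` code: `ShardStateSketch`, `ShardStatus`, `initialStatus` and `snapshotCompleted` are hypothetical names, and the state enum is reduced to three values for the example.

```java
import java.util.List;
import java.util.Optional;

// Simplified model, NOT the real Elasticsearch classes.
class ShardStateSketch {

    // Reduced state set; MISSING and SUCCESS count as completed states here.
    enum ShardState {
        INIT(false),
        MISSING(true),
        SUCCESS(true);

        final boolean completed;

        ShardState(boolean completed) {
            this.completed = completed;
        }
    }

    // Hypothetical stand-in for a per-shard snapshot status.
    static final class ShardStatus {
        final ShardState state;
        final String nodeId; // null when no node holds the shard

        ShardStatus(ShardState state, String nodeId) {
            this.state = state;
            this.nodeId = nodeId;
        }
    }

    // An unassigned shard (empty Optional) must not move to INIT.
    static ShardStatus initialStatus(Optional<String> assignedNodeId) {
        if (assignedNodeId.isPresent()) {
            return new ShardStatus(ShardState.INIT, assignedNodeId.get());
        }
        return new ShardStatus(ShardState.MISSING, null);
    }

    // The snapshot is done once every shard is in a completed state.
    static boolean snapshotCompleted(List<ShardStatus> shards) {
        return shards.stream().allMatch(s -> s.state.completed);
    }
}
```

In this model, `initialStatus(Optional.empty())` yields `MISSING`, and `snapshotCompleted` over a list holding only that status returns `true`, which mirrors the "must complete right away" point above.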

original-brownbear · 2022-07-12T13:58:46Z

server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java

-                                updatedState.generation(),
-                                entry.shardId(repoShardId)
-                            );
+                            startShardSnapshot(repoShardId, updatedState.generation());


No big change here really: I just extracted the code that re-computes where to run the snapshot into its own method and used it here, since we don't need to do the isQueued check twice.
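A rough sketch of the idea, under stated assumptions: the real `startShardSnapshot` works against the cluster state and routing table, while the version below uses a plain map from shard name to the node holding its primary. `QueuedShardStartSketch`, `Status`, and the map-based routing are all hypothetical and only illustrate re-computing the assignment at hand-off time instead of reusing a stale one.

```java
import java.util.Map;

// Simplified model, NOT the real SnapshotsService code.
class QueuedShardStartSketch {

    enum State { INIT, MISSING }

    // Hypothetical stand-in for a per-shard snapshot status.
    static final class Status {
        final State state;
        final String nodeId;     // null when the shard cannot be snapshotted
        final String generation; // shard generation carried over from the finished snapshot

        Status(State state, String nodeId, String generation) {
            this.state = state;
            this.nodeId = nodeId;
            this.generation = generation;
        }
    }

    // Assumption: a missing key means the index was deleted or the primary is
    // unassigned. The lookup always uses the CURRENT routing, so a queued shard
    // never inherits the node that the previous snapshot ran on.
    static Status startShardSnapshot(Map<String, String> primaryNodeByShard, String shard, String generation) {
        final String nodeId = primaryNodeByShard.get(shard);
        if (nodeId == null) {
            return new Status(State.MISSING, null, generation);
        }
        return new Status(State.INIT, nodeId, generation);
    }
}
```

With a routing map that places `idx[0]` on `node-2`, the helper returns `INIT` on `node-2`. Once the entry is removed from the map (for example because the index was deleted in the meantime), the same call returns `MISSING`.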

tlrx

LGTM

original-brownbear · 2022-07-27T08:12:05Z

Thanks Tanguy!

Fix queued snapshot assignments after partial snapshot fails due to d…

58bb09c

…elete We can't just assume that snapshot after snapshot is assigned right, we must re-compute the right node or whether or not the shard even exists still. closes elastic#86724

original-brownbear added the >bug, :Distributed Coordination/Snapshot/Restore, and v8.4.0 labels Jul 12, 2022

original-brownbear marked this pull request as ready for review July 12, 2022 13:02

elasticmachine added the Team:Distributed (Obsolete) label Jul 12, 2022

Update docs/changelog/88470.yaml

7df02b3

original-brownbear commented Jul 12, 2022

original-brownbear requested review from DaveCTurner and tlrx July 12, 2022 14:27

elasticsearchmachine changed the base branch from master to main July 22, 2022 23:05

tlrx approved these changes Jul 26, 2022

original-brownbear merged commit 0e8f5e4 into elastic:main Jul 27, 2022

original-brownbear deleted the 86724-one-more-time branch July 27, 2022 08:12
