
Stop Setting "Completed" States on Snapshots in CS #54433


Conversation

original-brownbear (Contributor)

`SnapshotsService` will end snapshots once all their shards are in a final state, regardless of the state stored in the `SnapshotsInProgress.Entry`.
There is no point in setting `SUCCESS` when that happens. All that does is create
the strange situation where a snapshot shows as `SUCCESS` in the snapshot status APIs
when it's not yet done, and it loses the information about whether or not the snapshot
was aborted.

This change to the state machine is fully BwC and enables smarter snapshot abort logic in a follow-up that does not need to finalize an aborted snapshot only to then delete it again (keeping this in a separate PR since this change is BwC while the smarter abort logic wouldn't be).
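To make the masking problem concrete, here is a minimal, self-contained sketch (the enum and variable names are illustrative, not the real `SnapshotsInProgress` types): overwriting the entry-level state with `SUCCESS` once all shards finish both reports success before finalization has actually run and erases a prior `ABORTED` state.

```java
// Illustrative sketch only; names do not match the real Elasticsearch types.
enum State { STARTED, ABORTED, SUCCESS, FAILED }

public class EntryStateDemo {
    public static void main(String[] args) {
        State entryState = State.ABORTED; // the user aborted this snapshot
        boolean allShardsDone = true;     // its shards have since reached final states

        // Old behaviour: once all shards are done, unconditionally flip to SUCCESS.
        if (allShardsDone) {
            entryState = State.SUCCESS;   // the abort information is lost here
        }

        // Status APIs reading this state now report SUCCESS even though
        // finalization (writing the repository metadata) has not happened yet.
        System.out.println(entryState);   // prints SUCCESS, not ABORTED
    }
}
```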

@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@original-brownbear original-brownbear changed the title Stop Setting completed States on Snapshots in CS Stop Setting "Completed" States on Snapshots in CS Mar 30, 2020
@original-brownbear (Contributor, Author)

Jenkins test this

@original-brownbear (Contributor, Author)

Jenkins run elasticsearch-ci/2 (unrelated security fail)

@ywelsch (Contributor) commented Apr 1, 2020

I wonder if we should introduce a new state called `FINALIZE` which shows that all shard-level actions have completed (and the snapshot was not aborted; otherwise we would keep the `ABORTED` state) and that only finalization is needed. This would make it easier to track the transitions at a high level. I'm also wondering why this change works, as `endSnapshot` is triggered by checking `entry.state().completed()`. Doing the full scan of `completed(entry.shards().values())` in `SnapshotsService.applyClusterState` is much more expensive, especially if it needs to be done on every CS update that happens while a snapshot is pending completion.
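A self-contained sketch of the cost concern raised here (the types and names are illustrative, not the actual `SnapshotsService` code): deriving completion purely from the shard states means re-scanning every shard entry of every in-progress snapshot on each cluster-state update.

```java
import java.util.List;
import java.util.Map;

// Illustrative model only; not the real SnapshotsService implementation.
public class FinalizationScanDemo {
    enum ShardState { STARTED, SUCCESS, FAILED }

    record Entry(String snapshot, Map<String, ShardState> shards) {}

    // Called for every cluster-state update while snapshots are in progress.
    static void applyClusterState(List<Entry> inProgress) {
        for (Entry entry : inProgress) {
            // Full O(#shards) scan on every update, even if nothing changed.
            boolean allShardsCompleted = entry.shards().values().stream()
                .allMatch(s -> s != ShardState.STARTED);
            if (allShardsCompleted) {
                endSnapshot(entry); // kick off finalization
            }
        }
    }

    static void endSnapshot(Entry entry) {
        System.out.println("finalizing " + entry.snapshot());
    }

    public static void main(String[] args) {
        applyClusterState(List.of(
            new Entry("snap-1", Map.of("shard-0", ShardState.SUCCESS, "shard-1", ShardState.STARTED)),
            new Entry("snap-2", Map.of("shard-0", ShardState.SUCCESS))
        ));
        // Only snap-2 is finalized; snap-1 still has a running shard.
    }
}
```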

@original-brownbear (Contributor, Author)

@ywelsch

> Doing the full scan of `completed(entry.shards().values())` in `SnapshotsService.applyClusterState` is much more expensive, especially if this needs to be done on every CS update

This already happens anyway. The only situation that gets more expensive is when the snapshot has actually finished, because that short-circuits the condition we use for finalization:

```java
entry.state().completed()
    || initializingSnapshots.contains(entry.snapshot()) == false
        && (entry.state() == State.INIT || completed(entry.shards().values()))
```

That said, I pushed f62c701, which exploits our logic that keeps track of which snapshots are currently being finalized so that we skip the check for snapshots we're already finalizing, so now it's always cheap :)
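A hedged sketch of that optimization (the field and method names are assumptions for illustration, not the verified contents of commit f62c701): once a snapshot has been handed off for finalization, remember it in a set and skip the per-shard scan on later cluster-state updates, so the full scan runs at most once per snapshot.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch; not the actual diff of commit f62c701.
public class EndingSnapshotsDemo {
    enum ShardState { STARTED, SUCCESS, FAILED }

    // Snapshots for which finalization is already running (hypothetical field).
    private final Set<String> endingSnapshots = Collections.synchronizedSet(new HashSet<>());

    void maybeEndSnapshot(String snapshot, Collection<ShardState> shards) {
        if (endingSnapshots.contains(snapshot)) {
            return; // already finalizing: skip the expensive shard scan
        }
        boolean allShardsCompleted = shards.stream().allMatch(s -> s != ShardState.STARTED);
        // add() returns false if another caller won the race, so we
        // trigger finalization at most once per snapshot.
        if (allShardsCompleted && endingSnapshots.add(snapshot)) {
            endSnapshot(snapshot);
        }
    }

    private void endSnapshot(String snapshot) {
        System.out.println("finalizing " + snapshot);
    }
}
```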

> I'm wondering why this change here works

See above, we're doing the check already anyway.

> This makes it easier at the high level to track the transitions

I'd rather not do that for now, to be honest; it's a lot of added complexity when we'll have some BwC-related changes for concurrent snapshots anyway. (This also changes what the API returns unless we add some logic to keep the output the same, so we couldn't easily backport the change to 7.x, I guess.)

@original-brownbear (Contributor, Author)

@ywelsch could you take another look here if you have a sec? Should be a quick one :)
Having this in would make some other work easier (and apparently stabilize some ILM/SLM integ tests). Thanks!

@tlrx (Member) left a comment

LGTM

@rjernst rjernst added the Team:Distributed (Obsolete) Meta label for distributed team (obsolete). Replaced by Distributed Indexing/Coordination. label May 4, 2020
@ywelsch ywelsch removed their request for review July 23, 2020 07:26
@andreidan andreidan removed the v7.10.0 label Oct 7, 2020
@original-brownbear (Contributor, Author)

Closing this; it has become irrelevant now that we have concurrent snapshots and all of this code has changed. The general idea is still a possible optimization, though.

Labels: :Distributed Coordination/Snapshot/Restore, >non-issue, Team:Distributed (Obsolete)