-
Notifications
You must be signed in to change notification settings - Fork 25.3k
Stop Setting "Completed" States on Snapshots in CS #54433
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stop Setting "Completed" States on Snapshots in CS #54433
Conversation
`SnapshotsService` will end snapshots once all their shards are in a final state. There is no point in setting `SUCCESS` when that happens. All that does is create the strange situation where a snapshot shows as `SUCCESS` in snapshot status APIs when its not yet done and loses the information as to whether or not a snapshot was aborted.
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore) |
Jenkins test this |
Jenkins run elasticsearch-ci/2 (unrelated security fail) |
I wonder if we should introduce a new state called |
This already happens anyway. The only situation that gets more expensive is when the snapshot actually finished, because that short circuits the condition we use for finalization: entry.state().completed()
|| initializingSnapshots.contains(entry.snapshot()) == false
&& (entry.state() == State.INIT || completed(entry.shards().values())) That said, I pushed f62c701 which exploits our logic that keeps track of what snapshots are currently being finalization to not do the check for snapshots we're already finalizing so now it's always cheap :)
See above, we're doing the check already anyway.
I'd rather not do that for now to be honest, just a lot of added complexity when we'll have some BwC related changes for concurrent snapshots anyway? (this also changes the API returns unless we add some logic to keep the output the same so we can't easily backport the change to 7.x I guess) |
@ywelsch could you take another look here if you have a sec? Should be a quick one :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Closing here, this has become irrelevant now that we have concurrent snapshots and all of this code has changed. The general idea is still a possible optimization though. |
SnapshotsService
will end snapshots once all their shards are in a final state, regardless of theirstate
stored in theSnapshotsInProgress.Entry
.There is no point in setting
SUCCESS
when that happens. All that does is createthe strange situation where a snapshot shows as
SUCCESS
in snapshot status APIswhen its not yet done and loses the information as to whether or not a snapshot
was aborted.
This change in the state machine is fully BwC and enables a smarter snapshot abort logic in a follow-up that does not need to finalize an aborted snapshot only to then delete it again (keeping this in a separate PR since it's BwC and smarter abort logic wouldn't be).