-
Notifications
You must be signed in to change notification settings - Fork 112
Add support for managing a fixed number of retained snapshots #3942
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
7559a12 to
2c0667f
Compare
00de3b0 to
a15f9db
Compare
2c0667f to
69048c2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for creating this PR @pcholakov. I think it will be great improvement for our users no longer having to manage the snapshots themselves.
Before diving into the details, what was the motivation to explicitly keep track of deletions and retained snapshots from the perspective of the latest snapshot? The extra bookkeeping adds a bit of complexity which could not be necessary if we had a simple periodic snapshot cleaner that periodically lists the snapshot repository and deletes everything except for the latest retained snapshots. Did you want to save S3 get calls? Would there be a problem with reporting the archived lsn (or the lsn that no snapshot refers to anymore)? Or is this a preparation for things that will become necessary once we add support for incremental snapshots? Or is the idea that in the future there can be different retention policies which makes us want to track exactly which snapshots to retain and which ones to delete?
| snapshot_lsn = %metadata.min_applied_lsn, | ||
| archived_lsn = %archived_lsn.get_archived_lsn(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between snapshot lsn and archived lsn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I assume that the latter is the snapshot lsn of the earliest retained snapshot.
| /// A stream that tracks the last reported durable Lsn, replica-set durable points, and | ||
| /// last archived lsn and emits a [`PartitionDurability`] when the durable Lsn changes. | ||
| /// A stream that tracks the last reported durable Lsn, replica-set durable points, and archived LSN | ||
| /// (from snapshot repository), and emits a [`PartitionDurability`] when the durable lSN changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/lSN/LSN/
|
|
||
| pending_snapshots: HashMap<PartitionId, PendingSnapshotTask>, | ||
| latest_snapshots: HashMap<PartitionId, SnapshotCreated>, | ||
| latest_snapshots: HashMap<PartitionId, LatestSnapshot>, // NB: latest snapshot min LSN != archived LSN, necessarily |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment holds only true if num retained snapshots > 1, right? And even then, snapshots might have the same lsn (admittedly this is a corner case).
|
|
||
| pending_snapshots: HashMap<PartitionId, PendingSnapshotTask>, | ||
| latest_snapshots: HashMap<PartitionId, SnapshotCreated>, | ||
| latest_snapshots: HashMap<PartitionId, LatestSnapshot>, // NB: latest snapshot min LSN != archived LSN, necessarily |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the difference between latest_snapshots and archived_lsns? The latter seems to contain a subset of the information of the former. Could this be unified?
| #[default] | ||
| V1, | ||
| /// V2 adds support for a fixed number of retained snapshots | ||
| // todo(v1.7): make this the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's create a Github issue with the release-blocker label to help us remember this.
| .map(|filename| { | ||
| self.get_snapshot_file(&metadata, filename.name.trim_start_matches("/")) | ||
| }) | ||
| .chain(vec![metadata_path].into_iter()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::iter::once might be better than vec!.into_iter()
| info!( | ||
| %partition_id, | ||
| snapshot_id = %snapshot_ref.snapshot_id, | ||
| "Failed to clean up old snapshot; repeated failures will impact the ability to create new snapshots" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure whether failed cleanups should affect future snapshot creations.
| use super::{PartitionSnapshotMetadata, SnapshotFormatVersion}; | ||
|
|
||
| #[restate_core::test] | ||
| #[traced_test] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's just about enabling logging, then we have a different pattern in #[test_log::test(restate_core::test)] in the code base as well. traced_test does a bit more by giving tests access to the loggging, I believe. Would be great to unify those approaches at some point.
| snapshots.push(snapshot); | ||
| } | ||
|
|
||
| tokio::time::sleep(tokio::time::Duration::from_millis(100)).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this sleep here?
| let archived_lsn = repository.get_latest_archived_lsn(PartitionId::MIN).await?; | ||
| assert_eq!(archived_lsn.get_archived_lsn(), Lsn::new(1000)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have to admit that this is tad bit confusing. When calling get_latest_archived_lsn I kind of would expect to get the archived lsn of the latest snapshot. I do understand that the archived lsn is now something else (like the snapshot lsn of th earliest retained snapshot) and maybe method names should reflect this?
a15f9db to
ba47e39
Compare
With this change, Restate can be configured to retain a set of recent snapshots, and automatically delete older snapshots that are no longer needed. Previously, older snapshots were no longer managed by Retate. This saves users from having to implement an external lifecycle policy. When explicit snapshot retention is specified, the reported Archived LSN will be that of the earliest retained snapshot. Together with the durability setting, this influences the automatic log trim behavior. When auto trim respects the Archived LSN, the latest partition snapshot reference can be (manually) updated to any snapshot within the retention window, e.g. to deal with snapshot corruption.
69048c2 to
afba6c6
Compare
With this change, Restate can be configured to retain a set of recent snapshots, and automatically delete older snapshots that are no longer needed. Previously, older snapshots were not managed by Restate and users are expected to figure out how to safely clean them up. This change saves users from having to implement an external lifecycle policy that respects the latest snapshot necessary for bootstrap.
When explicit snapshot retention is specified, the reported Archived LSN will be that of the earliest retained snapshot. Together with the durability setting, this influences the automatic log trim behavior. When auto trim respects the Archived LSN, any retained snapshot can be used to bootstrap a partition. For now, falling back to an earlier snapshot requires that the partition's
latest.jsonfile is manually updated to point to an earlier snapshot id if necessary, e.g. to deal with corruption.This change builds on #3918.
Sample minimal configuration file:
This configuration means: