-
Notifications
You must be signed in to change notification settings - Fork 101
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix missing alternate full snapshots #272
Fix missing alternate full snapshots #272
Conversation
Signed-off-by: Shreyas Rao <shreyas.sriganesh.rao@sap.com>
Thanks for the PR, @shreyas-s-rao! The change itself looks good. But as discussed elsewhere, how about addressing this by tweaking the gardener to consider hibernation schedule while computing the backup schedule string? It is already considering maintenance schedule. |
@amshuman-kr as discussed, we can go ahead with this PR as it's a quick win, while we do acknowledge that it's only a temporary solution, and we need a more generic approach to determine whether a "recent enough" previous full snapshot was taken, according to the given cron schedule. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Have you tested the case? I haven't :-)
@amshuman-kr I have tested this locally with a helm chart, where I simulated a delayed startup (by 10 minutes) of etcd on day 1. Without this PR, the day 2 etcdbr startup missed the full snapshot. With the PR, it was able to account for that 10-minute delay the previous day, and took the full snapshot on etcdbr startup. Not sure if we can have an automated test case for this, since the config values are more Gardener-specific (like 24 hours). Until we automate that, I don't see a way to add a test case for this scenario. I'll open an issue to move away from the hard-coded 24-hour value here towards a more cron-schedule-aware method to calculate the period between two consecutive full snapshots. For the time-being, I think we can go ahead and merge this PR. WDYT? |
Signed-off-by: Shreyas Rao shreyas.sriganesh.rao@sap.com
What this PR does / why we need it:
This PR fixes missing full snapshots on shoot unhibernation. Setting the previous full snapshot existence check to 23.5 hours provides a buffer of 30 minutes for the pod to become ready to take the first full snapshot everyday in the case where full snapshot schedule lies within the shoot's hibernation period.
Which issue(s) this PR fixes:
Fixes #258
Special notes for your reviewer:
Release note: