-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Lazy upgrade of the Spilo image #859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I welcome suggestions on how forcing the rolling upgrade should behave. The PR allows an administrator to force an update for the entire k8s cluster by changing the operator's configuration and restarting the operator pod. To support such behavior, the operator checks at every Currently the operator triggers the rolling update based on the results of comparison the actual and desired statefulssets' Specs. So it will not automatically catch pods running old images when the lazy update is disable because that action alone leaves the statefulset Spec intact. Things to consider:
|
pkg/cluster/sync.go
Outdated
| } | ||
| } | ||
|
|
||
| if !podsRollingUpdateRequired { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if !podsRollingUpdateRequired { | |
| if c.OpConfig.EnableLazySpiloUpgrade && !podsRollingUpdateRequired { |
I would move the check for lazy update here instead of doing it within mustUpdatePodsAfterLazyUpdate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this part here is really necessary? When lazy update is active it would create another API call for every StatefulSet. Wouldn't the next sync fix outdated images?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move the check for lazy update here instead of doing it within mustUpdatePodsAfterLazyUpdate
Changed. Note the condition is !c.OpConfig.EnableLazySpiloUpgrade , that is, we only need to do it when lazy update is disabled.
Wouldn't the next sync fix outdated images?
No. Sync() relies on stateful set spec to figure out if rolling update is needed. With the lazy upgrade, stateful set itself will be up-to-date (pod template will contain the most recent image), but some of the pods may still be running old images just because they were not yet restarted. Operator will look in the stateful set spec, see the most recent image and happily skip the rolling upgrade for this cluster even though some pods run outdated images.
|
👍 |
1 similar comment
|
👍 |
| // until they are re-created for other reasons, for example node rotation | ||
| if c.OpConfig.EnableLazySpiloUpgrade && !reflect.DeepEqual(c.Statefulset.Spec.Template.Spec.Containers[0].Image, statefulSet.Spec.Template.Spec.Containers[0].Image) { | ||
| needsReplace = true | ||
| needsRollUpdate = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needsRollUpdate = !c.OpConfig.EnableLazySpiloUpgrade would also have been enough here.
The two codes parts just for the error message I am not sure are great. But done is done.
In certain situation (think node rotation) we know pods re-start for reason independent of the operator. When doing planned updates of the Spilo image, we piggyback on this observation and skip the rolling update after the image update in the statefulset. That save us some downtime.