Improve calculation of NodeDrainTimeout & NodeVolumeDetachTimeout exceeded #11126
/triage accepted
+1
/assign
One small note: wondering if we should call it status.deletion instead of termination. I think in CAPI we usually talk about "deletion". Also, NodeVolumeDetachStartTime should be WaitForNodeVolumeDetachStartTime.
No strong opinions about the name.
Yup, we just use "deletion" in a lot of places and we don't have things like terminationGracePeriod for Machines (but we have a deletionTimestamp). We're also always talking about Machine deletion; I've never really used/heard "Machine termination".
Today, we check whether the NodeDrainTimeout & NodeVolumeDetachTimeout are exceeded by comparing time.Now() with the LastTransitionTime of the corresponding conditions.
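For illustration, a minimal sketch of this style of check, assuming the v1beta1 Machine API and the util/conditions package (the helper name `nodeDrainTimeoutExceeded` is illustrative, not necessarily the controller's actual code):

```go
package example

import (
	"time"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/cluster-api/util/conditions"
)

// nodeDrainTimeoutExceeded checks the drain timeout against the
// LastTransitionTime of the DrainingSucceeded condition.
func nodeDrainTimeoutExceeded(machine *clusterv1.Machine) bool {
	// No timeout configured means we wait indefinitely.
	if machine.Spec.NodeDrainTimeout == nil || machine.Spec.NodeDrainTimeout.Duration <= 0 {
		return false
	}

	// Fragile: LastTransitionTime is rewritten whenever the condition changes
	// status, so it does not reliably record when draining actually started.
	condition := conditions.Get(machine, clusterv1.DrainingSucceededCondition)
	if condition == nil {
		return false
	}

	return time.Since(condition.LastTransitionTime.Time) >= machine.Spec.NodeDrainTimeout.Duration
}
```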
This is fragile for multiple reasons:
- cluster-api/util/conditions/setter.go, line 54 (at 4f1637e)
- cluster-api/util/conditions/setter.go, line 199 (at 4f1637e)
It would already be a good improvement if the following were implemented:
But I think that instead of relying on the LastTransitionTime, we should have additional status fields tracking when we started to drain and when we started to wait for volume detach, and then calculate whether the timeouts are exceeded based on those (see the sketch below).
(Note: this is not relevant for NodeDeletionTimeout, because for that one we use the deletionTimestamp field.)
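A rough sketch of what the proposed fields and check could look like, using the field names discussed in the comments above (the names and where they live, e.g. under status.deletion, are still open; nothing here is final API):

```go
package example

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// Hypothetical new status fields (names taken from the discussion above; the
// final naming and placement are still undecided).
type MachineDeletionStatus struct {
	// NodeDrainStartTime is when the controller started draining the Node.
	NodeDrainStartTime *metav1.Time `json:"nodeDrainStartTime,omitempty"`
	// WaitForNodeVolumeDetachStartTime is when the controller started waiting
	// for volumes to be detached from the Node.
	WaitForNodeVolumeDetachStartTime *metav1.Time `json:"waitForNodeVolumeDetachStartTime,omitempty"`
}

// nodeDrainTimeoutExceeded computes the timeout from the recorded start time
// instead of a condition's LastTransitionTime.
func nodeDrainTimeoutExceeded(nodeDrainTimeout *metav1.Duration, deletion *MachineDeletionStatus) bool {
	if nodeDrainTimeout == nil || nodeDrainTimeout.Duration <= 0 {
		return false
	}
	if deletion == nil || deletion.NodeDrainStartTime == nil {
		// Draining has not started yet, so the timeout cannot be exceeded.
		return false
	}
	return time.Since(deletion.NodeDrainStartTime.Time) >= nodeDrainTimeout.Duration
}
```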