-
Notifications
You must be signed in to change notification settings - Fork 40k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NodeLifecycleController treats node lease renewal as a heartbeat signal #69241
Conversation
Still work in progress. Need to add tests. Will notify once it's done. For visibility: |
@wangzhen127: GitHub didn't allow me to request PR reviews from the following users: wasylkowski-a. Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/kind feature |
/assign |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks reasonable, but it requires tests.
Also, as I mentioned in one comment, if you could move renames to a separate PR, this PR would be so much easier to review (and renames are no-controversial at all and doesn't require tests, so we could merge them immediately).
@@ -132,10 +137,11 @@ const ( | |||
retrySleepTime = 20 * time.Millisecond | |||
) | |||
|
|||
type nodeStatusData struct { | |||
type nodeHealthData struct { | |||
probeTimestamp metav1.Time | |||
readyTransitionTimestamp metav1.Time | |||
status v1.NodeStatus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're touching this code anyway - can we change this to pointer? [potentially in a separate PR if that's easier - you can do all the renames there too]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After reading this PR more, if you could move just renames to a separate PR, that would make this PR so much easier.
Thanks!
nodeStatusMap map[string]nodeStatusData | ||
// Per Node map storing last observed healthy data, including Status and Lease, | ||
// together with a local time when it was observed. | ||
nodeHealthMap map[string]nodeHealthData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're touching it anyway - this should probably be map[string]*nodeHealthData.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see #69305 for the PR doing the renames. I will rebase and add tests after that PR gets merged.
/cc @cheftako |
/sig node |
@wangzhen127 - this is ready for rebase I think. |
Yes. I am adding tests. Will push another version soon. |
Ping @gmarek :) |
I talked offline with @gmarek. He won't have time to look into that in the next 2 weeks or so. But he is fine with merging without his review and he will take a look some time towards end of October and if needed we will apply his comments then. @yujuhong - will you be able to take a look in that case (in addition to my review)? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. I have some questions about the initialization of the nodeHealth map and various conditions, but most probably because I'm not that familiar with the code
if utilfeature.DefaultFeatureGate.Enabled(features.NodeLease) { | ||
// Always update the probe time if node lease is renewed. | ||
observedLease, _ = nc.leaseLister.Leases(v1.NamespaceNodeLease).Get(node.Name) | ||
if observedLease != nil && (!found || savedLease == nil || savedLease.Spec.RenewTime.Before(observedLease.Spec.RenewTime)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is it necessary to check found
? It was set 50 lines ago and savedLease == nil
seems to be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. We do not need to check found
. Updated.
probeTimestamp: node.CreationTimestamp, | ||
readyTransitionTimestamp: node.CreationTimestamp, | ||
} | ||
if _, found := nc.nodeHealthMap[node.Name]; found { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When do we expect the node to be in the nodeHealthMap
but does not have the ready condition set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another question is whether it's possible that nc.nodeHealthMap[node.Name]
is set to nil?
It's not obvious to me that when setting the value in line 929 below nc.nodeHealthMap[node.Name] = savedNodeHealth
, the savedNodeHealth
will always be non-nil...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When do we expect the node to be in the
nodeHealthMap
but does not have the ready condition set?
Look at this example:
- time t0: Node is created.
- time t1: Node renew lease, but NodeStatus does not have ready condition. And
found
is false. We create and save a fake ready condition (unknown) here. And we also save the lease below. - time t2: Still, NodeStatus does not have ready condition. So
currentReadyCondition
is still nil. We saved the node at t1 with unknown ready condition. Sofound
is true.
Another question is whether it's possible that
nc.nodeHealthMap[node.Name]
is set to nil?
savedNodeHealth
is always not nil at line 929. So once this function runs at least once, nc.nodeHealthMap[node.Name]
is always non-nil.
Consider 2 cases:
- If currentReadyCondition is nil, we either use the existing one or create a fake one. Then
savedNodeHealth
on line 855 is not nil. - If currentReadyCondition is not nil, and if at this point we do not have any saved copy, i.e.,
nc.nodeHealthMap[node.Name]
is nil for now. Then at line 855, we check it again, we will havefound
to be false, so we create asavedNodeHealth
at line 879, and then assign it tonc.nodeHealthMap[node.Name]
at line 929.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explaining. I think the code here is not very easy to follow (not this PR's fault).
One case that seems a bit strange to me is that even if the node doesn't have status, as long as the it continues renewing the lease, the controller will take no action. I think this is by design, but still feels strange :\
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In terms of heartbeat, I think renewing the lease satisfy the need of making sure node is alive.
@wojtek-t, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the situation where a node does not have status, but keeps renewing lease depends on the value of node-status-update-period.
So what we have been thinking is to switch:
- set lease refresh period (however we call it) to how frequently we update node status now
- decrease frequency of node status updates to something like 1 minute
- but still compute node statuses with the same frequency (equal of lease refresh period)
So if we make leases and node-statuses independent, then yes - there may be small period when there is no node status (this can pretty much happen only after node registration though, because after that it should always be there).
But if we would make them somehow correlated (I'm not saying we should do that - I'm saying it's theoretically possible), then it would only happen on races.
But yeah - I agree the latter probably doesn't make much sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yujuhong, does this small period that there is no node status sound ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was more interested in the pathological case where kubelet could not post the status ever, while it could still sending heartbeats. I think this theoretically could happen if we decouple heartbeat and status updates in kubelet (which is probably the right thing to do to not add more complexity and dependency). I don't think we need to deal with the pathological case now, but in case we want to put a safeguard in the future, let's settle on leaving a comment saying something like If kubelet never posted the node status, but continues renewing the heartbeat leases, the node controller will assume the node is healthy and take no action
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Added the comments at line 923 now. PTAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. PTAL
probeTimestamp: node.CreationTimestamp, | ||
readyTransitionTimestamp: node.CreationTimestamp, | ||
} | ||
if _, found := nc.nodeHealthMap[node.Name]; found { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When do we expect the node to be in the
nodeHealthMap
but does not have the ready condition set?
Look at this example:
- time t0: Node is created.
- time t1: Node renew lease, but NodeStatus does not have ready condition. And
found
is false. We create and save a fake ready condition (unknown) here. And we also save the lease below. - time t2: Still, NodeStatus does not have ready condition. So
currentReadyCondition
is still nil. We saved the node at t1 with unknown ready condition. Sofound
is true.
Another question is whether it's possible that
nc.nodeHealthMap[node.Name]
is set to nil?
savedNodeHealth
is always not nil at line 929. So once this function runs at least once, nc.nodeHealthMap[node.Name]
is always non-nil.
Consider 2 cases:
- If currentReadyCondition is nil, we either use the existing one or create a fake one. Then
savedNodeHealth
on line 855 is not nil. - If currentReadyCondition is not nil, and if at this point we do not have any saved copy, i.e.,
nc.nodeHealthMap[node.Name]
is nil for now. Then at line 855, we check it again, we will havefound
to be false, so we create asavedNodeHealth
at line 879, and then assign it tonc.nodeHealthMap[node.Name]
at line 929.
if utilfeature.DefaultFeatureGate.Enabled(features.NodeLease) { | ||
// Always update the probe time if node lease is renewed. | ||
observedLease, _ = nc.leaseLister.Leases(v1.NamespaceNodeLease).Get(node.Name) | ||
if observedLease != nil && (!found || savedLease == nil || savedLease.Spec.RenewTime.Before(observedLease.Spec.RenewTime)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. We do not need to check found
. Updated.
2bf8515
to
1934437
Compare
/lgtm |
1934437
to
ec0cda7
Compare
ec0cda7
to
e35d808
Compare
I took one more final look and it also looks good to me now. /lgtm |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: wangzhen127, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@wangzhen127 @yujuhong @wojtek-t I took a look at this PR and IMO the change is technically correct. What I'm wondering about is the decision to not set the NodeReady condition to true when NC observes a lease. Can someone explain me the reasoning behind this decision? I feel that users might depend on NodeReady=True as a signal to detect when Nodes are up (e.g. kubeUp depended on that at some point). |
I think that's a missing thing - if NodeReady != True and NC observed a Lease, I think we should set NodeReady = True. Would that solve all your concerns? |
Yup. |
@wangzhen127 - will you be able to add this? ^^ |
Yes, will create another PR for it. |
@wangzhen127 - thanks! |
I thought more on this and also talked to @yujuhong offline. I think we should not let NC change NodeReady status to True if it observed a lease renewal. Because node lease is just a heartbeat signal. It says nothing about whether the node is ready. Let's look at all 4 cases below. Case 1: When node just starts, there was no status: Case 2, When node already has status as Case 3, When node already has status as Case 4, When node already has status as |
Make periodic NodeStatus updates cheaper cherry-pick master, create lease api, kubernetes#64246 master, node lifecycle controller, kubernetes#69241 kubelet, kubernetes#66257 See merge request !184866
What this PR does / why we need it:
This PR makes NodeLifecycleController treat node lease renewal as a heartbeat signal, in addition to NodeStatus update. This is part of KEP-0009, feature #589 and issue #14733.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Release note: