Clarify node drain requirement for kubelet minor version upgrades #46359
base: main
Conversation
Welcome @avestuk!
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
✅ Pull request preview available for checking. Built without sensitive environment variables.
This change clarifies the requirement that nodes are drained before a minor version upgrade takes place. The old wording implied that a minor version upgrade of the kubelet was not supported at all.
Co-authored-by: Tim Bannister <tim+github@scalefactory.com>
Co-authored-by: Xander Grzywinski <xandergrzyw@gmail.com>
Force-pushed from 0f3eba4 to 3f622d2
@sftim @salaxander are you both happy with the latest version? Also, thank you for the reviews!
@@ -217,7 +217,7 @@ Optionally upgrade `kubelet` instances to **{{< skew currentVersion >}}** (or th

 {{< note >}}
 Before performing a minor version `kubelet` upgrade, [drain](/docs/tasks/administer-cluster/safely-drain-node/) pods from that node.
-In-place minor version `kubelet` upgrades are not supported.
+In-place minor version `kubelet` upgrades, to a newer minor version of Kubernetes, without a node drain, are not supported.
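For illustration, the drain-first flow that the changed note describes might look like the following sketch. The node name and the upgrade step are hypothetical placeholders; the real package commands are distro-specific and covered by the kubeadm upgrade tasks. The script only prints the commands rather than executing them:

```shell
# Sketch: drain-first minor version kubelet upgrade, as discussed in this PR.
# "node-1" is a hypothetical node name; the package upgrade step depends on
# your distribution. Commands are printed, not executed against a cluster.
NODE="node-1"
DRAIN_CMD="kubectl drain ${NODE} --ignore-daemonsets --delete-emptydir-data"

echo "${DRAIN_CMD}"
echo "# upgrade the kubelet package on the node (distro-specific)"
echo "systemctl restart kubelet"
echo "kubectl uncordon ${NODE}"
```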
I think the "to a newer minor version of Kubernetes" bit is redundant.
I also think the content of kubelet-managed directories is not necessarily cross-minor-version compatible. I'd like sig-node's input on highlighting the need to clear those directories before an in-place upgrade, something like this:
-In-place minor version `kubelet` upgrades, to a newer minor version of Kubernetes, without a node drain, are not supported.
+In-place minor version `kubelet` upgrades without draining pods and clearing the content of the kubelet-managed directories are not supported.
OK, so combining the thinking so far, I suggest:
-In-place minor version `kubelet` upgrades, to a newer minor version of Kubernetes, without a node drain, are not supported.
+In-place `kubelet` upgrades, without first draining pods from the node **and** clearing the content
+of the kubelet-managed directories, are only supported for upgrades to a newer patch release within
+the same minor version of Kubernetes.
This makes for a problem, though: can we hyperlink “clearing the content of the kubelet-managed directories” to the specific text about how to do that? If not, we're almost certainly going to get issues filed about adding more detail.
More information about this clearing of directories would definitely be good because the current documentation doesn't state anything about needing to do this - see https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/upgrading-linux-nodes/
@liggitt Is there somebody from sig-node you'd recommend asking for a consult on this?
probably folks in https://github.com/kubernetes/community/tree/master/sig-node#technical-leads or delegates
cc @kubernetes/sig-node-pr-reviews
/cc @dchen1107 @derekwaynecarr @mrunalp
@mrunalp @dchen1107 @derekwaynecarr anything I can do to help move this PR forward?
Theoretically those files should be compatible unless specified otherwise in the release notes. But after a drain has happened, keeping some of them makes little sense. Still, committing to compatibility is not ideal.
I think the ask here is to come up with a docs page on what files the kubelet may be using. Is there any suggestion on where it would fit best: an admin actions section, or perhaps kubelet configuration? Any advice would be helpful.
Off the top of my head, the files include:
- Static Pod manifests: /etc/kubernetes/manifests (must be compatible, unless some fields were removed from the Pod API, which is rare (never?)).
- Topology manager checkpoints (little sense to keep these once the node was drained).
- The kubelet configuration file and additional drop-in configuration files. Mostly compatible.
- The device plugin directory: /var/lib/kubelet/device-plugins/. May not be possible to delete, as device plugins may be implemented as DaemonSets that ignore the bad node state; those will keep running and holding the socket there.
- The kubelet lock file: /var/run/kubelet.lock. Likely will not change, but safe to clean up if it was left over.
I can bring this to tomorrow's meeting to find a volunteer to help document this.
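As a rough companion to the list above, an operator could enumerate these paths for inspection before attempting an in-place upgrade. This is a sketch only: it checks for existence and deletes nothing, since whether any of these should be cleared is exactly what is under discussion in this thread:

```shell
# Sketch: check for the kubelet-related paths named in this thread.
# Inspection only; nothing is deleted. Paths are defaults/conventions
# and may differ on your nodes.
KUBELET_PATHS="/etc/kubernetes/manifests /var/lib/kubelet/device-plugins /var/run/kubelet.lock"

for p in ${KUBELET_PATHS}; do
  if [ -e "$p" ]; then
    echo "exists: $p"
  else
    echo "absent: $p"
  fi
done
```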
Best place for a list of files that the kubelet uses: within https://kubernetes.io/docs/reference/node/
Note that many of these are defaults or conventions; e.g. the kubelet even used to be able to fetch static Pod manifests via HTTP, and /etc/kubernetes/manifests is just the most common place to put them.
old thread on this topic:
#12326 (comment)
but clearing the contents of kubelet-managed directories was never mentioned before?
That includes /var/lib/kubelet/pki, which I don't think users would want to clear.
> I think the ask here is to come up with the docs page on what files kubelet may be using.
CPU checkpoint files too maybe?
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cpumanager/state/checkpoint.go#L31-L33
Maybe we also need a new task page: prepare a node for an in-place upgrade.
/cc
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs. This bot triages PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
Addresses the issue raised in #46356 where I suggested that the language here could be made more clear.