-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-4004: Deprecate the kubeProxyVersion field of v1.Node #4005
Conversation
Welcome @HirazawaUi! |
Hi @HirazawaUi. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/retitle KEP-4004: Deprecate the kubeProxyVersion field of v1.Node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. A few procedural notes, and some suggestions...
owning-sig: sig-node | ||
participating-sigs: [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
owning-sig: sig-node | |
participating-sigs: [] | |
owning-sig: sig-network | |
participating-sigs: | |
- sig-node |
which implies moving the files to keps/sig-network
too.
Even though this is about a change to Node, if sig-network is going to be driving this, then we should be the owning-sig.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right
# The following PRR answers are required at alpha release | ||
# List the feature gate name and the components for which it must be enabled | ||
feature-gates: | ||
- name: kubeProxyVersion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
feature gates start with a capital letter, and should be more verbose than that. Maybe DisableNodeKubeProxyVersion
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DisableNodeKubeProxyVersion used to be one of the options I was thinking about, and it would really be more relevant (although I discarded it at the last minute....)
|
||
## Motivation | ||
|
||
This field is not accurate, the field is set by kubelet, which does not actually know the kube-proxy version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This field is not accurate, the field is set by kubelet, which does not actually know the kube-proxy version | |
This field is not accurate, the field is set by kubelet, which does not actually know the kube-proxy version, | |
or even if kube-proxy is running. |
|
||
### Goals | ||
|
||
Deprecate the `status.nodeInfo.kubeProxyVersion` field and then do the hard work of finding all the places using it and getting them off in time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I'd phrase that as:
- Mark `status.nodeInfo.kubeProxyVersion` deprecated.
- Find any places use it and get them to stop.
- Make kubelet stop setting the field.
## Proposal | ||
|
||
- Deprecate the `status.nodeInfo.kubeProxyVersion` field of v1.Node | ||
- Remove the gcp cloud provider check for this field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, yeah, I feel like this is more "Goals" than "Proposal"...
Also, wherever it is that you mention the GCP cloud provider, you need to be clear that that's just one specific example, and that we may find others.
(Also, maybe link to the PRs I already filed: kubernetes/cloud-provider-gcp#533, kubernetes/kubernetes#117806.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, yeah, I feel like this is more "Goals" than "Proposal"...
Also, wherever it is that you mention the GCP cloud provider, you need to be clear that that's just one specific example, and that we may find others.
(Also, maybe link to the PRs I already filed: kubernetes/cloud-provider-gcp#533, kubernetes/kubernetes#117806.)
I have made some changes, please see if it is more appropriate
- Else: | ||
- `status.nodeInfo.kubeProxyVersion` will be not empty | ||
|
||
### Graduation Criteria |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need this section filled in.
For Alpha, we should have created the feature gate, disabled by default, and started looking for components that might be using the deprecated field.
For Beta, we want to make sure that we fixed the GCP cloud provider and any other components we found that use the deprecated field, and we need to make sure that upgrades to the Beta version will work across supported levels of version skew. eg, if there are components in 1.28 that depend on kubeProxyVersion
being set, and it's allowable for a cluster to have a 1.29 kubelet and a 1.28 version of the component that depends on kubeProxyVersion
, then we can't safely go to beta in 1.29.
### Upgrade / Downgrade Strategy | ||
|
||
### Version Skew Strategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upgrade/downgrade is basically "N/A". For skew, see what I said above.
|
||
###### How can this feature be enabled / disabled in a live cluster? | ||
|
||
- [ ] Feature gate (also fill in values in `kep.yaml`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[X]
this, and delete the [ ] Other
section
|
||
###### Are there any tests for feature enablement/disablement? | ||
|
||
No, these will be introduced in the Alpha phase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be "Yes", not "No" because you're supposed to include the work being done for Alpha in the answers here.
But I think this should be "No" anyway; "tests for feature enablement/disablement" means "tests where you have a running cluster with the feature disabled and then you enable it", or vice versa. And there's no reason to do that here; we need to make sure no one is looking at the deprecated field ever, so we don't need to test what happens when the value of the field changes, since if anyone was actually looking at it all, then we're already failing.
/ok-to-test |
069f5dc
to
8975c42
Compare
@danwinship I made some changes to the suggestion, can I bother you to continue reviewing this PR? |
|
||
## Proposal | ||
|
||
- The proposal is to deprecate the kubeProxyVersion field on NodeStatus, which would involve modifying the Kubernetes API and would affect GCP cloud providers (e.g. https://github.com/kubernetes/cloud-provider-gcp/pull/533, https://github.com/kubernetes/kubernetes/pull/117806) or other components, so we will use a feature gates to protect it until we fix the GCP cloud provider and any other components we find using the deprecated fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- The proposal is to deprecate the kubeProxyVersion field on NodeStatus, which would involve modifying the Kubernetes API and would affect GCP cloud providers (e.g. https://github.com/kubernetes/cloud-provider-gcp/pull/533, https://github.com/kubernetes/kubernetes/pull/117806) or other components, so we will use a feature gates to protect it until we fix the GCP cloud provider and any other components we find using the deprecated fields | |
- The proposal is to deprecate the kubeProxyVersion field on NodeStatus, which would involve modifying the Kubernetes API and would affect GCP cloud providers (e.g. https://github.com/kubernetes/cloud-provider-gcp/pull/533, https://github.com/kubernetes/kubernetes/pull/117806) or other components, so we will use a feature gates to protect it until we fix the GCP cloud provider and any other components we find using the deprecated fields |
can we be more explicit? what do we mean by "modifying the Kubernetes API "
I thought that we were going to stop setting the field only and set as "deprecated" on the generated API docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we be more explicit? what do we mean by "modifying the Kubernetes API " I thought that we were going to stop setting the field only and set as "deprecated" on the generated API docs
Sorry, setting the field to deprecated shouldn't count as an API change, I'll change the wording
/sig network |
|
||
``` | ||
// Deprecated: Kubelet Version reported by the node. | ||
KubeletVersion string `json:"kubeletVersion" protobuf:"bytes,7,opt,name=kubeletVersion"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this say Kubelet ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry, that was a stupid mistake
- "@danwinship" | ||
- "@thockin" | ||
approvers: | ||
- "@sig-node-leads" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@danwinship I think this is you, yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
|
||
1. Alpha: v1.28, Created the feature gate, disabled by default, and started looking for components that might be using the deprecated field. | ||
2. Beta: v1.29, Make sure that we fixed the GCP cloud provider and any other components we found that use the deprecated field, and we need to make sure that upgrades to the Beta version will work across supported levels of [version skew](https://kubernetes.io/releases/version-skew-policy/). | ||
3. GA: v1.30 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be separate sections; see the example in the template README
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be separate sections; see the example in the template README
all right
1. Alpha: v1.28, Created the feature gate, disabled by default, and started looking for components that might be using the deprecated field. | ||
2. Beta: v1.29, Make sure that we fixed the GCP cloud provider and any other components we found that use the deprecated field, and we need to make sure that upgrades to the Beta version will work across supported levels of [version skew](https://kubernetes.io/releases/version-skew-policy/). | ||
3. GA: v1.30 | ||
keps/sig-node/4004-deprecate-kube-proxy-version/kep.yaml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this line got accidentally pasted here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes...
|
||
###### Are there any tests for feature enablement/disablement? | ||
|
||
No, we don't need to test what happens when the value of the field changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
People will push back against that claim; you need to explain why. We don't need to test what happens when the value changes, because our goal is to make sure no one is looking at the value at all, so no components should notice if the value changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
People will push back against that claim; you need to explain why. We don't need to test what happens when the value changes, because our goal is to make sure no one is looking at the value at all, so no components should notice if the value changes.
thanks for reminder
- "@danwinship" | ||
- "@thockin" | ||
approvers: | ||
- "@sig-node-leads" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
47553ed
to
d1a9416
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, let's get this in for 1.29...
|
||
## Proposal | ||
|
||
- The proposal is to deprecate the kubeProxyVersion field on NodeStatus, which would affect GCP cloud providers (e.g. https://github.com/kubernetes/cloud-provider-gcp/pull/533, https://github.com/kubernetes/kubernetes/pull/117806) or other components, so we will use a feature gates to protect it until we fix the GCP cloud provider and any other components we find using the deprecated fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, both of those PRs have merged; the cloud-provider-gcp fix will be in 1.29. So...
- The proposal is to deprecate the kubeProxyVersion field on NodeStatus, which would affect GCP cloud providers (e.g. https://github.com/kubernetes/cloud-provider-gcp/pull/533, https://github.com/kubernetes/kubernetes/pull/117806) or other components, so we will use a feature gates to protect it until we fix the GCP cloud provider and any other components we find using the deprecated fields | |
The proposal is to deprecate the `kubeProxyVersion` field of `NodeStatus`, and to stop setting it in the future. | |
This field was used by the GCP cloud provider up until 1.28 for the legacy built-in cloud provider ([kubernetes #117806], and up until 1.29 for the external cloud-provider ([cloud-provider-gcp #533]). It may also be used by other components. Thus, we will use a feature gate to protect it until all components are fixed. | |
[kubernetes #117806]: https://github.com/kubernetes/kubernetes/pull/117806 | |
[cloud-provider-gcp #533]: https://github.com/kubernetes/cloud-provider-gcp/pull/533 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, those versions are off by one; the bug was fixed in cloud-provider-gcp as of 1.29, so that means it used the field up until 1.28. And likewise 1.27 for the legacy provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All right
##### Integration tests | ||
|
||
|
||
- N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
##### Integration tests | |
- N/A | |
##### Integration tests | |
- N/A |
|
||
#### Alpha | ||
|
||
- [ ] Created the feature gate, disabled by default. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, we don't normally bother with checkboxes here...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
||
#### Beta | ||
|
||
- [ ] Make sure that we fixed the GCP cloud provider and any other components we found that use the deprecated field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GCP cloud provider is already done at this point so that doesn't need to be mentioned here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
### Rollout, Upgrade and Rollback Planning | ||
|
||
###### How can a rollout or rollback fail? Can it impact already running workloads? | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a component that we didn't know about using kubeProxyVersion
, then it may fail in some unknown way when the administrator upgrades to a version with that field disabled. It should be possible to just roll back in this case.
|
||
###### What specific metrics should inform a rollback? | ||
|
||
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
answer these questions with "N/A" or "No", as appropriate; when they're empty it looks like you just haven't gotten around to filling in this section of the KEP yet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
||
###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.? | ||
|
||
### Monitoring Requirements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can just put "N/A" here and delete the sub-sections of this section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
# The most recent milestone for which work toward delivery of this KEP has been | ||
# done. This can be the current (upcoming) milestone, if it is being actively | ||
# worked on. | ||
latest-milestone: "v1.28" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
v1.29 now
milestone: | ||
alpha: "v1.28" | ||
beta: "v1.29" | ||
stable: "v1.30" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should be able to do alpha in 1.29, beta in 1.31 (3 releases after 1.28, to avoid version skew issues with cloud-provider-gcp), and stable in 1.33 (your graduation criteria say "no issues reported in 2 releases")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
248d582
to
a668f4b
Compare
|
||
###### What specific metrics should inform a rollback? | ||
|
||
No |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No | |
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
||
###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested? | ||
|
||
No |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No | |
N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
oh, sorry, just noticed that this is still in |
Okay, it's moved to sig-network. |
/remove-sig node |
/lgtm |
➕ /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danwinship, HirazawaUi, shaneutt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@danwinship After the PR merger, may I start working for it now? |
yes |
One-line PR description: This KEP proposes for deprecate the
status.nodeInfo.kubeProxyVersion
field of v1.NodeIssue link: #4004
Other comments: kubernetes/kubernetes#117756