Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-2400: Update swap KEP for 1.23 beta #2858
KEP-2400: Update swap KEP for 1.23 beta #2858
Changes from 1 commit
20a8885
18a9bee
8277301
908266a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone does an in-place upgrade on a node (stopping kubelet, starting a new kubelet on the same server), can that fail? How?
If it could fail then an upgrade might for example take out the nodes where the control plane ought to be running as static pods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The in-place upgrade would not fail unless swap access was added while the node was still online. Normally turning swap on and off at runtime isn't considered best practice for a production environment, I'd expect a node to be reimaged and rebooted, but I can mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like the spot to provide those suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but we don't have them yet (comment below).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I tell two nodes apart:
failOnSwap: false
andmemorySwap
set toswapBehavior: LimitedSwap
, with theNodeSwap
feature gate enabledfailOnSwap: false
and theNodeSwap
feature gate disabledvia the kubernetes API? If so, I'd mention how to distinguish them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if that is bubbled up to the API Server. The purpose of this question is for beta, when the feature gate is defaulted on, so you can't rely on it being turned on as a sign that the feature is in use. We might be able to check if
swapBehavior
is explicitly set, but empty string is equivalent to LimitedSwap.Realistically, this KEP iterates on the existed unsupported configuration with
failOnSwap: false
. Because it was previously unsupported, I am assuming here that a production environment would not have it set if it were not using this feature.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once this is beta we should assume that people have this feature gate set to a value of their choice. For alpha it was different: you need to be a little more brave to try it and most clusters run with the default, ie feature enabled.
The switch from unsupported to “mostly supported, but it's still beta” is why I'm asking about observability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The beta criteria (including this metric) is listed as tentative. Can you commit to the beta criteria about metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Figuring out what these metrics are is blocking for beta graduation, but we need end-user data/a release cycle of testing to determine them, which will happen during the dev cycle. The graduation of swap is a little different from a regular Kubernetes feature in that we need to be able to look at metrics node-wide.
This is why it might be a stretch to graduate this for beta this release, but I don't want to block the other required dev work for beta just because we don't have this list yet.