Report zone via well-known topology key in NodeGetInfo #1931
Conversation
Code Coverage Diff: This PR does not change the code coverage.
force-pushed from f59568d to b32f203
Thanks for starting to resolve this paper-cut. I had a small nit about renaming WellKnownTopologyKey, but realized it is CO-agnostic and has existed for three years already.
Hey, so I took some time to review the history here, and my current understanding is that this PR is not necessary. I'll explain.
First, I'll lay out some context that both the author of this PR and the reviewers already know, but that will help future maintainers and external contributors follow this discussion.
There are 2 places where topology labels come into the picture:
- Controller service - creates PVs with a topology label.
- Node service - adds a topology label on the node.
Kubernetes or a CO will then use that topology information to ensure that a volume is accessible from a given node when scheduling workloads (wording directly from CSI spec).
That is to say, we need to ensure that dynamically provisioned PVs have a label that nodes can satisfy; if the label keys don't match, the volumes will not be scheduled.
Back around 2021, maintainers of this project tried to switch to the well-known label (which was not so well-known at the time) and made the mistake of adding both the custom label AND the well-known label to provisioned PVs on the controller side. An important detail here, as pointed out by Michelle, is that the node selector matchExpressions are ANDed, which I understand to mean that in order to schedule those volumes, both labels (the custom and the well-known) must exist on the node.
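To make that ANDing concrete, here is a minimal sketch (my own illustration, not code from this repo; the zone value is made up) of a PV node affinity carrying both keys as separate match expressions, so a node must have both labels to satisfy it:

package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Hypothetical PV node affinity with both the custom and the well-known
	// key. Match expressions within a single NodeSelectorTerm are ANDed, so
	// a node needs BOTH labels (with a matching value) to satisfy it.
	affinity := &corev1.VolumeNodeAffinity{
		Required: &corev1.NodeSelector{
			NodeSelectorTerms: []corev1.NodeSelectorTerm{{
				MatchExpressions: []corev1.NodeSelectorRequirement{
					{Key: "topology.ebs.csi.aws.com/zone", Operator: corev1.NodeSelectorOpIn, Values: []string{"us-east-1a"}},
					{Key: "topology.kubernetes.io/zone", Operator: corev1.NodeSelectorOpIn, Values: []string{"us-east-1a"}},
				},
			}},
		},
	}
	fmt.Println(affinity)
}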
Now, the reason this PR is not necessary: the CCM already adds the well-known label to all nodes, and it does not make sense for the driver to override that. I have also manually tested and confirmed this to be true.
In short, we can rely on the CCM / kubelet to add the well-known label. The important bit on the node side is that we must continue adding the custom label; otherwise, PVs may not be scheduled on nodes that lack it, and the ebs-plugin node service is the only component adding that label to nodes.
Given the above, I believe we can safely proceed with switching the label on the controller side over to the well-known label when running on Kubernetes. To illustrate:
Node Topology:
- Custom label: topology.ebs.csi.aws.com/zone (applied by the CSI driver)
- Well-known label: topology.kubernetes.io/zone (applied by the CCM / kubelet)
PV Topology:
- topology.ebs.csi.aws.com/zone (applied by the CSI driver)
On PVs created by the newer driver:
- topology.kubernetes.io/zone (applied by the CSI driver if on Kubernetes)
- topology.ebs.csi.aws.com/zone (applied by the CSI driver if not on Kubernetes)
That should fix all the issues and enable scaling from 0, which is explained in detail here. Please let me know if I missed anything and especially any compatibility concerns.
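Roughly, and purely as my own sketch of the proposal above (the zoneTopology helper and the onKubernetes flag are hypothetical names, not code from this repo; the IDs and zone are made up), the controller side would then pick the label key like this:

package main

import (
	"fmt"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// zoneTopology is a hypothetical helper: choose the topology key for newly
// provisioned volumes depending on whether the CO is Kubernetes.
func zoneTopology(zone string, onKubernetes bool) *csi.Topology {
	key := "topology.ebs.csi.aws.com/zone" // custom key for non-Kubernetes COs
	if onKubernetes {
		key = "topology.kubernetes.io/zone" // well-known key on Kubernetes
	}
	return &csi.Topology{Segments: map[string]string{key: zone}}
}

func main() {
	// Illustrative provisioned volume with a made-up ID and zone.
	vol := &csi.Volume{
		VolumeId:           "vol-0123456789abcdef0",
		AccessibleTopology: []*csi.Topology{zoneTopology("us-east-1a", true)},
	}
	fmt.Println(vol.GetAccessibleTopology()[0].GetSegments())
}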
Hi @torredil - unfortunately, that will not work, as the AWS CCM is not a requirement for running the EBS CSI Driver. Thus, while your change works on clusters with the AWS CCM installed, it doesn't work on clusters without it.
Suggest adding the following unit test assertion to node_test.go:
if at.Segments[WellKnownTopologyKey] != tc.availabilityZone {
t.Fatalf("Expected well-known topology %q, got %q", tc.availabilityZone, at.Segments[WellKnownTopologyKey])
}
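For context, here is a self-contained sketch of how that assertion could sit in a table-driven test; the scaffolding, test-case fields, and the constant's value are my assumptions, and the real node_test.go layout may differ:

package main

import (
	"testing"

	"github.com/container-storage-interface/spec/lib/go/csi"
)

// WellKnownTopologyKey mirrors the driver's constant; the value is assumed here.
const WellKnownTopologyKey = "topology.kubernetes.io/zone"

func TestNodeGetInfoWellKnownZone(t *testing.T) {
	testCases := []struct {
		name             string
		availabilityZone string
	}{
		{name: "reports zone under the well-known key", availabilityZone: "us-east-1a"},
	}
	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			// In the real test, at would be the AccessibleTopology returned by
			// NodeGetInfo on the driver under test; stubbed here for illustration.
			at := &csi.Topology{Segments: map[string]string{WellKnownTopologyKey: tc.availabilityZone}}
			if at.Segments[WellKnownTopologyKey] != tc.availabilityZone {
				t.Fatalf("Expected well-known topology %q, got %q", tc.availabilityZone, at.Segments[WellKnownTopologyKey])
			}
		})
	}
}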
/lgtm
/approve
/hold
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: torredil. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Signed-off-by: Connor Catlett <conncatl@amazon.com>
force-pushed from b32f203 to ae753df
Missed that, thanks! Ready for re-review.
/lgtm
/retest
4 similar comments
/retest
/retest
/retest
/retest
/retest
1 similar comment
/retest
/retest
1 similar comment
/retest
/unhold
Is this a bug fix or adding new feature?
Both?
What is this PR about? / Why do we need it?
See #729 (comment)
What testing is done?