-
Notifications
You must be signed in to change notification settings - Fork 39.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Respect PodTopologySpread after rolling upgrades #111441
Conversation
/sig scheduling |
6b1008e
to
4283ef2
Compare
4283ef2
to
87a2954
Compare
87a2954
to
bdfffb7
Compare
/assign @ahg-g |
MatchLabelKeys: []string{"/simple"}, | ||
}, | ||
}, | ||
wantFieldErrors: []*field.Error{field.Invalid(fieldPathMatchLabelKeys, "/simple", "prefix part must be non-empty")}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefix part must be non-empty
this error msg is hard to understand at the first glance until I checked the source which split the string by '/'.
you may want to add another test string, such as an empty string, I am fine with this as is though.
looks good to me @liggitt this is ready. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
API bits look good, just a few doc/lint/message nits
I didn't review the pkg/scheduler and test changes, will defer to scheduling reviewers/approvers for those changes
@@ -6490,6 +6490,7 @@ func validateTopologySpreadConstraints(constraints []core.TopologySpreadConstrai | |||
if err := validateNodeInclusionPolicy(subFldPath.Child("nodeTaintsPolicy"), constraint.NodeTaintsPolicy); err != nil { | |||
allErrs = append(allErrs, err) | |||
} | |||
allErrs = append(allErrs, validateMatchLabelKeys(subFldPath.Child("matchLabelKeys"), constraint.MatchLabelKeys, constraint.LabelSelector)...) | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-existing, but I noticed validateTopologySpreadConstraints doesn't seem to validate constraint.LabelSelector at all... @ahg-g, can you open a separate issue to track that (need to make sure the scheduler is robust against completely invalid selectors, and consider how to fix this validation in the least disruptive way possible)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some update on this comments, the validation is actually done when the labels.Selector
is built. e.g.
selector, err := metav1.LabelSelectorAsSelector(c.LabelSelector) |
And it is validated here,
kubernetes/staging/src/k8s.io/apimachinery/pkg/labels/selector.go
Lines 191 to 194 in 891cbed
for i := range vals { | |
if err := validateLabelValue(key, vals[i], valuePath.Index(i)); err != nil { | |
allErrs = append(allErrs, err) | |
} |
I think the original comment is still valid and the the validation mentioned above would mitigate the problem as the invalid label value won't be accepted at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that happens at time of use, not time of API write, which is not ideal (since the user has no idea the object they created is not working like they want it to)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pr sent: #112121
/milestone v1.25 |
/approve /hold looks like commits need squashing |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: denkensk, liggitt The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
Signed-off-by: Alex Wang <wangqingcan1990@gmail.com>
6cdf9bd
to
86a2a85
Compare
/lgtm |
/triage accepted |
What type of PR is this?
/kind feature
/kind api-change
What this PR does / why we need it:
PodTopologySpread is widely used in production environments, especially in service type workloads which employ Deployments. However, currently it has a limitation that manifests during rolling updates which causes the deployment to end up out of balance (98215, 105661,k8s-pod-topology spread is not respected after rollout).
The root cause is that PodTopologySpread constraints allow defining a key-value label selector, which applies to all pods in a Deployment irrespective of their owning ReplicaSet. As a result, when a new revision is rolled out, spreading will apply across pods from both the old and new ReplicaSets, and so by the time the new ReplicaSet is completely rolled out and the old one is rolled back, the actual spreading we are left with may not match expectations because the deleted pods from the older ReplicaSet will cause skewed distribution for the remaining pods.
Which issue(s) this PR fixes:
Fixes #98215
Fixes #105661
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: