
Add a Pod Disruption Budget for the CSI Controller #612

Closed · wants to merge 2 commits into from

Conversation

@risinger (Contributor) commented Nov 15, 2020

Is this a bug fix or adding new feature?
Could be considered either.

What is this PR about? / Why do we need it?
The Cluster Autoscaler cannot scale down a node where the CSI controller is scheduled, because the controller is a non-DaemonSet pod in kube-system (assuming the namespace specified in the README install instructions is used). With this chart-specific PR, the CSI controller pod can be rescheduled to another node, allowing the node it previously ran on to be scaled down.

What testing is done?
With the current master version of the chart, I ran 3 nodes, got CA to attempt to scale down the node hosting the CSI controller, and saw the scale-down hang indefinitely.
I then templated this PDB and installed it. The pod was rescheduled to another node, and CA successfully scaled down the old node.

Here's a run of the CA loop, which repeated consistently before installing the PDB:

I1115 19:34:27.535773       1 cluster.go:148] Fast evaluation: ip-10-65-8-55.us-west-2.compute.internal for removal
I1115 19:34:27.535865       1 cluster.go:306] Pod kube-system/ebs-csi-controller-6dcfd74dfc-zbwm5 can be moved to ip-10-65-13-238.us-west-2.compute.internal
I1115 19:34:27.535879       1 cluster.go:185] Fast evaluation: node ip-10-65-8-55.us-west-2.compute.internal may be removed
I1115 19:34:27.535906       1 static_autoscaler.go:479] ip-10-65-8-55.us-west-2.compute.internal is unneeded since 2020-11-15 19:24:24.34802611 +0000 UTC m=+1640484.617698385 duration 10m3.187154602s
I1115 19:34:27.535925       1 static_autoscaler.go:490] Scale down status: unneededOnly=false lastScaleUpTime=2020-11-15 19:16:41.389931417 +0000 UTC m=+1640021.659603640 lastScaleDownDeleteTime=2020-10-27 19:43:20.335330698 +0000 UTC m=+20.605002906 lastScaleDownFailTime=2020-10-27 19:43:20.335330782 +0000 UTC m=+20.605002995 scaleDownForbidden=false isDeleteInProgress=false scaleDownInCooldown=false
I1115 19:34:27.535945       1 static_autoscaler.go:503] Starting scale down
I1115 19:34:27.535978       1 scale_down.go:790] ip-10-65-8-55.us-west-2.compute.internal was unneeded for 10m3.187154602s
I1115 19:34:27.536001       1 cluster.go:148] Detailed evaluation: ip-10-65-8-55.us-west-2.compute.internal for removal
I1115 19:34:27.536126       1 cluster.go:306] Pod kube-system/ebs-csi-controller-6dcfd74dfc-zbwm5 can be moved to ip-10-65-13-238.us-west-2.compute.internal
I1115 19:34:27.536142       1 cluster.go:185] Detailed evaluation: node ip-10-65-8-55.us-west-2.compute.internal may be removed
I1115 19:34:27.536159       1 scale_down.go:930] Scale-down: removing node ip-10-65-8-55.us-west-2.compute.internal, utilization: {0.05357142857142857 0.019302317361953793 0 cpu 0.05357142857142857}, pods to reschedule: kube-system/ebs-csi-controller-6dcfd74dfc-zbwm5
I1115 19:34:27.536303       1 event.go:278] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"82e5457e-3a50-4e50-8be7-737f70319fa4", APIVersion:"v1", ResourceVersion:"32804081", FieldPath:""}): type: 'Normal' reason: 'ScaleDown' Scale-down: removing node ip-10-65-8-55.us-west-2.compute.internal, utilization: {0.05357142857142857 0.019302317361953793 0 cpu 0.05357142857142857}, pods to reschedule: kube-system/ebs-csi-controller-6dcfd74dfc-zbwm5
I1115 19:34:27.547201       1 delete.go:103] Successfully added ToBeDeletedTaint on node ip-10-65-8-55.us-west-2.compute.internal
I1115 19:34:27.548551       1 event.go:278] Event(v1.ObjectReference{Kind:"Node", Namespace:"", Name:"ip-10-65-8-55.us-west-2.compute.internal", UID:"f98d4608-63a8-42f8-b8a3-a3c400db1db1", APIVersion:"v1", ResourceVersion:"32803099", FieldPath:""}): type: 'Normal' reason: 'ScaleDown' marked the node as toBeDeleted/unschedulable
I1115 19:34:27.548659       1 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"ebs-csi-controller-6dcfd74dfc-zbwm5", UID:"143150c6-4550-4ac6-857f-664b0156746c", APIVersion:"v1", ResourceVersion:"32800180", FieldPath:""}): type: 'Normal' reason: 'ScaleDown' deleting pod for node scale down
E1115 19:34:27.566923       1 scale_down.go:1238] Not deleted yet kube-system/ebs-csi-controller-6dcfd74dfc-zbwm5
E1115 19:34:32.570610       1 scale_down.go:1238] Not deleted yet kube-system/ebs-csi-controller-6dcfd74dfc-zbwm5

Note that this only works in conjunction with #526, because the ToBeDeletedByClusterAutoscaler=:NoSchedule taint that CA applies is tolerated in the current version of the chart.
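For context, a blanket toleration along these lines (illustrative, not necessarily the chart's exact manifest) is what would let the controller pod ignore the taint CA places on a draining node, blocking the eviction:

```yaml
# Illustrative: a catch-all toleration matches every taint,
# including ToBeDeletedByClusterAutoscaler:NoSchedule,
# so the pod is never pushed off a node CA wants to remove.
tolerations:
  - operator: Exists
```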

I chose the PDB approach to ensure that at least one controller pod is running at all times. The "cluster-autoscaler.kubernetes.io/safe-to-evict": "true" annotation approach could create downtime.
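A minimal sketch of such a PDB (the name and selector labels here are assumptions and must match the labels the chart actually sets on the controller Deployment; on Kubernetes ≥ 1.21 the API group is policy/v1):

```yaml
# Sketch only: keeps at least one controller pod available, while still
# allowing CA to evict one replica so its node can be scaled down.
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: ebs-csi-controller
  namespace: kube-system
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: ebs-csi-controller
```

With two controller replicas, minAvailable: 1 permits one voluntary eviction at a time; with a single replica it would block eviction entirely, so the replica count matters here.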

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 15, 2020
@k8s-ci-robot (Contributor)

Hi @risinger. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: risinger
To complete the pull request process, please assign gnufied after the PR has been reviewed.
You can assign the PR to them by writing /assign @gnufied in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 15, 2020
@risinger risinger changed the title Add a pod disruption budget for the controller. Run generate-kustomize. Add a Pod Disruption Budget for the CSI Controller Nov 15, 2020
@coveralls

coveralls commented Nov 15, 2020

Pull Request Test Coverage Report for Build 1377

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 81.713%

Totals Coverage Status
Change from base Build 1375: 0.0%
Covered Lines: 1622
Relevant Lines: 1985

💛 - Coveralls

@bertinatto (Member)

Having a PDB makes sense to me, but I'll defer this to @wongma7 and @leakingtapan who maintain the chart.

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Nov 16, 2020
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 8, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 12, 2020
@risinger (Contributor, Author)

Can I get your opinions, @wongma7 @leakingtapan?

@PhilThurston (Contributor)

Would adding this to the kustomization base be out of scope of this PR?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 11, 2021
@k8s-ci-robot (Contributor)

@risinger: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@ayberk (Contributor) commented Apr 28, 2021

@vdhanan #857 supersedes this, correct?

@risinger (Contributor, Author)

Yeah it does.
Closing. PDB added in #857.

@risinger risinger closed this Apr 29, 2021