Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable Machine drain behavior #11240

Open
5 of 8 tasks
sbueringer opened this issue Sep 30, 2024 · 0 comments
Open
5 of 8 tasks

Configurable Machine drain behavior #11240

sbueringer opened this issue Sep 30, 2024 · 0 comments
Assignees
Labels
area/machine Issues or PRs related to machine lifecycle management kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@sbueringer
Copy link
Member

sbueringer commented Sep 30, 2024

Today, when Cluster API deletes a Machine it drains the corresponding Node to ensure all Pods running on the Node have
been gracefully terminated before deleting the corresponding infrastructure. The current drain implementation has
hard-coded rules to decide which Pods should be evicted. This implementation is aligned to kubectl drain (see
Machine deletion process
for more details).

With recent changes in Cluster API, we can now have finer control on the drain process, and thus we propose a new
MachineDrainRule CRD to make the drain rules configurable per Pod. Additionally, we're proposing annotations that
workload cluster admins can add to individual Pods to control their drain behavior.

This would be a huge improvement over the “standard” kubectl drain aligned implementation we have today and help to
solve a family of issues identified when running Cluster API in production.

More details can be found in the proposal PR.

Prior related discussions:

Tasks:

Follow-ups:

@sbueringer sbueringer added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. area/machine Issues or PRs related to machine lifecycle management triage/accepted Indicates an issue or PR is ready to be actively worked on. labels Sep 30, 2024
@k8s-ci-robot k8s-ci-robot added the needs-kind Indicates a PR lacks a `kind/foo` label and requires one. label Sep 30, 2024
@sbueringer sbueringer added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 2, 2024
@k8s-ci-robot k8s-ci-robot removed the needs-kind Indicates a PR lacks a `kind/foo` label and requires one. label Oct 2, 2024
@sbueringer sbueringer self-assigned this Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/machine Issues or PRs related to machine lifecycle management kind/feature Categorizes issue or PR as related to a new feature. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

No branches or pull requests

2 participants