Skip to content

fix: KubeadmControlPlane should allow orphans #2505

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

sethp-nr
Copy link
Contributor

@sethp-nr sethp-nr commented Mar 2, 2020

What this PR does / why we need it:

Check for the "orphan" finalizer and avoid deleting dependents if present.

It looks like the change in #2334 was incomplete: my controller manager was refusing to honor the orphan finalizer for some reason I couldn't discern, but the KCP controller was happy to ignore my intent and delete dependent resources.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

Check for the "orphan" finalizer and avoid deleting dependents if present.
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 2, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sethp-nr
To complete the pull request process, please assign luxas
You can assign the PR to them by writing /assign @luxas in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sethp-nr
Copy link
Contributor Author

sethp-nr commented Mar 2, 2020

I'm not too sure about this change – why are we deleting the dependents anyway and pre-empting the GC? Would it make more sense to only do the parallel deletion just in case the foregroundDeletion finalizer is set?

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 2, 2020
@detiber
Copy link
Member

detiber commented Mar 2, 2020

The current behavior is really no different than other places where we enforce deletion ordering in Cluster API (mainly to avoid the requirement that someone set foregroundDeletion = true) in order to be able to properly delete a Cluster and it's dependents without orphaning resources.

Do you have use cases in mind where deletion should orphan resources?

@sethp-nr
Copy link
Contributor Author

sethp-nr commented Mar 2, 2020

Oh deletes, such joy you bring me. Specifically the use case I had in mind is one where it's requested by the user, as in:

k delete kcp FOO --cascade=false

The way --cacade=false is implemented is that it sets an orphan finalizer on the parent object and then waves hands the owner references are removed. My controller manager took a nap, and so the KCP controller got there first and cascaded the delete despite my intent.

I could try to remember in the future that "cascade=false" doesn't work for CAPI objects, but that wouldn't be my preference. To that end, I'm wondering where the breakdown occurs if the CAPI controller doesn't issue the DELETE: perhaps kubernetes/kubernetes#86509 is a related concern here?

@detiber
Copy link
Member

detiber commented Mar 3, 2020

I'm not necessarily sure it would have an immediate impact on KubeadmControlPlane, but in the case of Cluster -> Machine, Machine -> InfraMachine, Cluster -> InfraCluster, and Cluster -> Machine -> InfraMachine, if the parent object is not present then we would currently have a lot of potential for either breakage or orphaning of cloud resources if we didn't enforce the deletion semantics we currently do.

@sethp-nr
Copy link
Contributor Author

sethp-nr commented Mar 3, 2020

Yeah, I'm beginning to understand that this is probably an issue with a larger scope than just a PR to the KCP controller: it sounds like we're hanging out at the intersection of at least one api-server bug, one confusingly named field (and/or poor default), and a tremendous number of moving parts.

Especially since this is not something unique to the KCP behavior, I'm thinking the right thing to do is close this PR and open an issue or CAEP for discussion post-v1alpha3. Does that sound right to you?

@detiber
Copy link
Member

detiber commented Mar 3, 2020

Especially since this is not something unique to the KCP behavior, I'm thinking the right thing to do is close this PR and open an issue or CAEP for discussion post-v1alpha3. Does that sound right to you?

sgtm, I think we definitely have to have a larger discussion and have a solid plan for any changes :)

@sethp-nr sethp-nr closed this Mar 3, 2020
@sethp-nr sethp-nr deleted the patch-2 branch March 3, 2020 01:28
@sethp-nr sethp-nr restored the patch-2 branch March 3, 2020 01:28
@sethp-nr sethp-nr mentioned this pull request Mar 9, 2020
7 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants