
Terminate pod on container completion #3582

Open
SergeyKanzhelev opened this issue Oct 5, 2022 · 21 comments
Labels
sig/node Categorizes an issue or PR as relevant to SIG Node.

Comments

@SergeyKanzhelev
Member

Enhancement Description

  • One-line enhancement description (can be used as a release note):

Allow configuring a container so that the pod terminates on that container's completion when no restart should be performed per restartPolicy (a hypothetical API sketch follows the checklist below).

/sig node

  • Enhancement target (which target equals to which milestone):
    • Alpha release target (x.y): 1.26
    • Beta release target (x.y): 1.27
    • Stable release target (x.y): 1.29
  • Alpha
    • KEP (k/enhancements) update PR(s):
    • Code (k/k) update PR(s):
    • Docs (k/website) update PR(s):
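As a rough illustration only (no API shape has been settled for this KEP), the per-container marking could look something like the Go sketch below; the container-level restartPolicy field and the TerminatePod value are hypothetical, not part of the Kubernetes API.

```go
// Hypothetical API sketch only; none of these fields or values exist in the
// Kubernetes API today.
package main

import "fmt"

// RestartPolicy mirrors the pod-level values, plus a hypothetical TerminatePod
// value that would tell the kubelet to terminate the whole pod when this
// container exits instead of restarting it.
type RestartPolicy string

const (
	RestartAlways    RestartPolicy = "Always"
	RestartOnFailure RestartPolicy = "OnFailure"
	RestartNever     RestartPolicy = "Never"
	TerminatePod     RestartPolicy = "TerminatePod" // hypothetical new value
)

// Container shows only the fields relevant to this sketch.
type Container struct {
	Name          string         `json:"name"`
	Image         string         `json:"image"`
	RestartPolicy *RestartPolicy `json:"restartPolicy,omitempty"` // hypothetical per-container override
}

func main() {
	p := TerminatePod
	c := Container{Name: "watchdog", Image: "example.com/watchdog:latest", RestartPolicy: &p}
	fmt.Printf("%s terminates the pod on exit: %v\n", c.Name, *c.RestartPolicy == TerminatePod)
}
```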
@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Oct 5, 2022
@sftim
Contributor

sftim commented Dec 22, 2022

The discussion uses the title Per Container Restart Policy override, which I prefer because you might want to terminate an entire Pod when a particular container fails.

@thockin
Member

thockin commented Jan 28, 2023

Since we have converged on #3759, I will close this.

@thockin thockin closed this as not planned Jan 28, 2023
@fpoirotte fpoirotte mentioned this issue Feb 1, 2023
@SergeyKanzhelev
Member Author

From @davidhadas in https://kubernetes.slack.com/archives/C0BP8PW9G/p1676531990732709

There are security related use cases in which a sidecar that is used to protect a pod, detects that the pod is compromised. In such cases it is necessary for the sidecar to initiate pod deletion.

Providing the sidecar with permission to delete pods is not in line with the principle of least privilege (as the sidecar should only have permission to delete its own pod and not all pods in the namespace).

Ideally, it would be helpful if the container could simply exit and the kubelet would decide to delete the pod instead of restarting the sidecar. This may require that the container is marked in the podSpec as one whose untimely demise should result in the Pod's untimely demise. (Naturally we rely here on existing mechanisms to bring up a new pod in place of our compromised pod once it gets deleted.)

We discussed this in sig-security (during our latest sig-security-tooling meeting) and wanted to bring this up with sig-node (edited)

Re-opening to collect more use cases to understand whether this feature is worth pursuing. Originally the thought was that the scenarios beyond sidecars are limited.
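For context on the least-privilege concern quoted above: the workaround available today is for the sidecar itself to call the API server and delete its own pod, which requires granting it the delete verb on pods in the namespace. A minimal client-go sketch of that workaround, assuming POD_NAME and POD_NAMESPACE are injected via the downward API:

```go
// Today's workaround: the sidecar deletes its own pod through the API server.
// This is exactly the extra privilege the quoted use case wants to avoid,
// since RBAC cannot easily scope "delete" to only the sidecar's own pod.
package main

import (
	"context"
	"log"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatalf("in-cluster config: %v", err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatalf("building clientset: %v", err)
	}
	// Pod name and namespace are assumed to be exposed via the downward API.
	ns, name := os.Getenv("POD_NAMESPACE"), os.Getenv("POD_NAME")
	if err := client.CoreV1().Pods(ns).Delete(context.Background(), name, metav1.DeleteOptions{}); err != nil {
		log.Fatalf("deleting pod %s/%s: %v", ns, name, err)
	}
}
```

With the feature discussed in this issue, the sidecar would simply exit and need no API access at all; the kubelet would terminate the pod, and the existing controller (e.g. a ReplicaSet) would bring up a replacement.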

@SergeyKanzhelev
Member Author

Scenario clarification from @davidhadas: the Pod that needs to be terminated has restartPolicy=Always, and it is one of many Pods controlled by a Deployment.

@SergeyKanzhelev
Member Author

Clarifications - copied from slack for history here:

I don't see the security implication regarding where it gets restarted. It is fine if the decision is to start it exactly where it was running (or somewhere else).

All that is needed is for the container to signal somehow to the kubelet to delete the pod. Apparently the simplest way is to exit from the container (maybe with a specific exit code, or with an error, or maybe any exit will do). Once it has exited, the kubelet needs to decide whether to restart the container (as done today) or delete the pod (the requested feature). To decide between the two, we probably need some indication in the podSpec that this container is marked to never be restarted and that the pod should instead be deleted when the container exits.

I can repeat this in GitHub, but there are so many GitHub issues and PRs on this and related topics that I am not really sure where it is most relevant.
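Purely as an illustration of the decision point described in the quoted message above (the marking name, exit-code handling, and behavior here are placeholders, not kubelet code):

```go
// Sketch of the restart-vs-terminate decision discussed above.
// "markedTerminatePod" stands in for whatever per-container indication the
// podSpec would carry; none of this exists in the kubelet.
package main

import "fmt"

type containerExitAction string

const (
	restartContainer containerExitAction = "restart the container"
	leaveStopped     containerExitAction = "leave the container stopped"
	terminatePod     containerExitAction = "terminate the pod"
)

func onContainerExit(markedTerminatePod bool, podRestartPolicy string, exitCode int) containerExitAction {
	if markedTerminatePod {
		// Any exit of a marked container (or, in a stricter variant, only a
		// specific exit code) terminates the whole pod instead of restarting
		// the container.
		return terminatePod
	}
	// Otherwise, fall back to today's behavior driven by the pod-level
	// restartPolicy: Always restarts, OnFailure restarts on non-zero exit.
	if podRestartPolicy == "Always" || (podRestartPolicy == "OnFailure" && exitCode != 0) {
		return restartContainer
	}
	return leaveStopped
}

func main() {
	fmt.Println(onContainerExit(true, "Always", 0)) // terminate the pod
}
```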

@davidhadas

davidhadas commented Feb 16, 2023

To be precise, the need identified (as quoted above) will be met if we support deletion of the pod (which may be followed by, for example, a ReplicaSet later starting a new pod wherever) when a specific container exits - i.e. it calls for some marking on the container indicating that if this container exits, the pod should be deleted (for example by supporting restartPolicy=Never on the container, but other markings could also be used).

Ideally, if we implement this feature, the decision of whether to restart on container exit should be made per container, so that users can decide which container's exit should result in pod termination and which should result in container restart.

@thockin
Member

thockin commented Feb 16, 2023

Is this the same as #3676, aka "keystone" containers?

@SergeyKanzhelev
Member Author

Is this the same as #3676, aka "keystone" containers?

I think the semantics of re-running init containers and of killing the entire Pod are different. Both have scenarios associated with them. But Terminate Pod is a more "destructive" kind of "keystone" than what is asked for in #3676; #3676 is more about dependencies between containers, the way I read it. It can be implemented via Terminate Pod, though.

@thockin
Member

thockin commented Feb 17, 2023 via email

@ceastman-r7

Use case: Kubernetes CronJobs/Jobs where a service mesh is involved. The service mesh sidecar container is designed to run forever, but in the case of a Kubernetes CronJob/Job the service mesh sidecar should terminate when the main container completes its job. istio/istio#11659

@thockin
Member

thockin commented Mar 20, 2023

@ceastman-r7 this is somewhat different - sidecars (which are alpha in 1.27) cover what you need, I think.

@ceastman-r7

@thockin Do you have a link to the sidecars documentation?

@SergeyKanzhelev
Member Author

@thockin Do you have a link to the sidecars documentation?

#753

@smarterclayton
Contributor

Now that graceful shutdown guarantees pods report their terminal status (via changes to how we report pod phase), I think we’ve established more context around what the node is allowed to do to a pod (indicate via shutdown that the pod is terminal regardless of restart policy). While I would be hesitant to allow pods to indicate they can be deleted, a workload already has the power to “give up, take its toys, and go home” by crash looping, which is not always the most effective mechanism.

For restart never containers (init or otherwise) giving up makes a lot of sense if the inputs are fixed and there is an expectation of a time window for retry before giving up. For restart always containers, I would suggest exploring questions like:

  1. What kind of container configuration states can be identified as “unsolvable” only by the workload? The format of an env var argument is wrong? The command line argument is impossible? The volume is missing data? A necessary config map directory was not mounted?
  2. Is there any way that workloads failing could interact more effectively with rollout? Ie, allowing workloads to signal to their controllers in a way that improves overall workload health?
  3. Are there any use cases where the workload should be able to decide the current node it is running on is deficient and refuse to run on it? Such as absence of implicit node apis (the ones we don’t namespace, like the type of processor, the shape of memory)? Is there value in workloads taking that into their own hands (perhaps more privileged tools), like eviction?

No matter what, we’d need to improve controller backoff when pods fail - I just reopened and froze two issues that such a feature would make significantly more dangerous (which is why this one caught my eye).

@SergeyKanzhelev
Member Author

All scenarios I have seen thus far were around Jobs (restart policy Never or OnFailure). For restart policy Always, some of the scenarios you listed with infinite restart backoff can be implemented via init containers.

How do you see risk vs. value for the TerminatePod behavior for finite Pods?

@SergeyKanzhelev
Member Author

This KEP has valid scenarios, but nobody to work on it. Likely not for the 1.28 release.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 19, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 18, 2024
@thockin
Member

thockin commented Feb 18, 2024

@tallclair

@thockin thockin removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 18, 2024

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 18, 2024
@thockin thockin removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 20, 2024

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 18, 2024
@thockin thockin removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 10, 2024
Projects
Status: Not for release
Development

No branches or pull requests

8 participants