Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

issue-20372 - Initial impl for restartable init-container #20647

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vincentywdeng
Copy link

Implementing native sidecar as kubernetes
https://kubernetes.io/blog/2023/08/25/native-sidecar-containers/

Does this PR introduce a user-facing change?


@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 10, 2023
Copy link
Contributor

openshift-ci bot commented Nov 10, 2023

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None label Nov 10, 2023
@rhatdan
Copy link
Member

rhatdan commented Nov 10, 2023

Please sign your commits.

git commit -a --amend -s
git push --force

@vincentywdeng vincentywdeng force-pushed the ywdeng-sidecar-20372 branch 2 times, most recently from 06ed6d0 to 210cd0e Compare November 13, 2023 05:57
@ygalblum
Copy link
Contributor

The code looks good. Please make sure to add test(s)

Copy link

Cockpit tests failed for commit 5fb6d9f. @martinpitt, @jelly, @mvollmer please check.

Copy link
Contributor

@ygalblum ygalblum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In addition, there is no test for the actual restarting of an init container

test/e2e/play_kube_test.go Outdated Show resolved Hide resolved
@martinpitt
Copy link
Contributor

The cockpit test failure detected a crun regression in podman-next, see containers/crun#1367 Unrelated to this PR.

@vincentywdeng
Copy link
Author

Thanks @ygalblum for the comment, I had updated my test.
On the other hand, for actual restarting of an init container, do you mean when sidecar container failed, it should be automatic restarted by Pod?
I haven't implement that yet as I don't know how to do it yet. Could I get some hint from expert for how to do that?

@ygalblum
Copy link
Contributor

ygalblum commented Dec 6, 2023

Thanks @ygalblum for the comment, I had updated my test.

Great, thanks

On the other hand, for actual restarting of an init container, do you mean when sidecar container failed, it should be automatic restarted by Pod? I haven't implement that yet as I don't know how to do it yet. Could I get some hint from expert for how to do that?

Yes, this is what I meant. I thought that once the container pass the check, podman will monitor and restart them. Have you tried it our manually?

@vincentywdeng
Copy link
Author

vincentywdeng commented Dec 6, 2023

@ygalblum
In my manual test, the restartable init container won't get restarted when exit. I tried

  1. podman kill
  2. Defining this command for init container, and then delete /shared/test to make it fail
      command:
        - sh
        - -c  
        - "while [ -e '/shared/test' ]; do echo 'File exists!'; sleep 5; done && exit 1"  

I assume podman didn't support it because podman won't continue until old init container exit. There is not such requirement to restart old init container.

@vincentywdeng vincentywdeng force-pushed the ywdeng-sidecar-20372 branch 2 times, most recently from e48556d to ec63837 Compare December 13, 2023 02:18
@vincentywdeng vincentywdeng marked this pull request as ready for review December 13, 2023 09:28
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 13, 2023
@vincentywdeng vincentywdeng changed the title [Draft] issue-20372 - Initial impl for restartable init-container issue-20372 - Initial impl for restartable init-container Dec 13, 2023
@vincentywdeng
Copy link
Author

Hi @ygalblum and @rhatdan
As mentioned above, it seems that podman currently didn't support re-starting init container when it fail and the requirement is not in the original description of the issue. Is it reasonable to create another enhancement for it and have current implementations merged?

@ygalblum
Copy link
Contributor

I fear that merging this code without the code to restart the init-container might mislead the users as they will expect that the init-container is restarted while in fact in won't.
Having said that, does Podman restart a init-container when the policy is on-failure? If not, then I think we can merge this PR while making sure to document that init-containers are not restarted at all. But, if yes, then we're back to my initial response.
@rhatdan WDYT?

@rhatdan
Copy link
Member

rhatdan commented Dec 21, 2023

No idea on the second question about the init container.

@vincentywdeng
Copy link
Author

According to https://github.com/containers/podman/blame/7dc7cbfd9b0440ddc86e210a2272fdaccd6376bb/pkg/domain/infra/abi/play.go#L803, the original init-container can not be restarted. So maybe we can just merge this PR with documentation?

Regarding to restarting, do you know how podman restart regular container. I didn't see how it is done in Podman code

@rhatdan
Copy link
Member

rhatdan commented Dec 24, 2023

Conmon restarts the container.

@ygalblum
Copy link
Contributor

According to https://github.com/containers/podman/blame/7dc7cbfd9b0440ddc86e210a2272fdaccd6376bb/pkg/domain/infra/abi/play.go#L803, the original init-container can not be restarted. So maybe we can just merge this PR with documentation?

Fine by me. @containers/podman-maintainers Are you OK with this approach?

@rhatdan
Copy link
Member

rhatdan commented Dec 27, 2023

Makes sense from my understanding of a pod, that an init container only happens once during creation of the pod.

@ygalblum
Copy link
Contributor

Makes sense from my understanding of a pod, that an init container only happens once during creation of the pod.

K8S is kinda piggybacking on init-containers to define side-car containers that must start successfully before the main container however unlike the conventional init-containers these ones are expected to keep running as long as the pod is running and therefore have a restartPolicy.
Without this change, Podman returns an error when trying to start a pod with an init-container that has a restartPolicy of always. This change allows Podman to consume such pods. However, it does not handle the actual restarting according to the policy.

The question was whether we should merge this change alone. The problem is that with this change, Podman ignores a configuration it accepts. However, from https://github.com/containers/podman/blame/7dc7cbfd9b0440ddc86e210a2272fdaccd6376bb/pkg/domain/infra/abi/play.go#L803 we see that even today Podman does not restart the init-container if the restartPolicy is on-failure.

So, I think we it's OK to merge this change with the current functionality. But, make sure the documentation is clear about it.

@vincentywdeng
Copy link
Author

Thanks, I updated docs/source/markdown/podman-kube-play.1.md.in for the restartable init container.

@rhatdan
Copy link
Member

rhatdan commented Dec 28, 2023

I am fine with merging, since we are already broken, and this makes us less broken. But we should fix the handling of init containers with restart policy in a separate PR.

@ygalblum
Copy link
Contributor

The LGTM. @vincentywdeng do you mind squashing the commits?

@vincentywdeng
Copy link
Author

Thanks, I squashed my commits. How can I proceed with the the PR? There are some failed tests which I belive is unrelated to my change.

@ygalblum
Copy link
Contributor

Thanks, I squashed my commits. How can I proceed with the the PR? There are some failed tests which I belive is unrelated to my change.

@vincentywdeng It seems that there is a failing check on the commit subject length: FAIL - commit subject exceeds 90 characters

@vincentywdeng
Copy link
Author

Thanks, I updated the commit title. The rest failures are all

[+1957s] Summarizing 2 Failures:
[+1957s]   [FAIL] Podman search [It] podman search image with --compatible
[+1957s]   /var/tmp/go/src/github.com/containers/podman/test/e2e/search_test.go:74
[+1957s]   [FAIL] Podman search [It] podman search image with description
[+1957s]   /var/tmp/go/src/github.com/containers/podman/test/e2e/search_test.go:65

@ygalblum
Copy link
Contributor

The code LGTM and it seems like the test failures are unrelated.

@vincentywdeng
Copy link
Author

vincentywdeng commented Jan 3, 2024

@rhatdan and @ygalblum
Should I merge with latest master to see if it fixes the failed test?
If yes, is the practice to merge or rebase?

@ygalblum
Copy link
Contributor

ygalblum commented Jan 3, 2024

Should I merge with latest master to see if it fixes the failed test?
If yes, is the practice to merge or rebase?

Yes. Just rebase on top of the current main

@vincentywdeng vincentywdeng force-pushed the ywdeng-sidecar-20372 branch 2 times, most recently from 582f341 to fa972b6 Compare January 5, 2024 02:17
Copy link
Contributor

@ygalblum ygalblum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the last minute nit. But, I think it can be beneficial

test/e2e/play_kube_test.go Outdated Show resolved Hide resolved
Supports sidecar container by setting restartPolicy of init container to 'always'
Closes containers#20372
Signed-off-by: Vincent Deng <ywdeng@tw.ibm.com>
Copy link
Contributor

@ygalblum ygalblum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

openshift-ci bot commented Jan 8, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincentywdeng, ygalblum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 8, 2024
@rhatdan
Copy link
Member

rhatdan commented Jan 8, 2024

@ashley-cui @umohnani8 PTAL

@ygalblum
Copy link
Contributor

ygalblum commented May 2, 2024

@containers/podman-maintainers PTAL. This will probably affect the discussion #22496

@rhatdan
Copy link
Member

rhatdan commented Aug 6, 2024

Lets bring this back alive.

@baude @Luap99 WDYT?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/release-note-label-needed Enforce release-note requirement, even if just None
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants