Move poststart hook from runc create to runc start#5186
Move poststart hook from runc create to runc start#5186kolyshkin wants to merge 3 commits intoopencontainers:mainfrom
Conversation
0aee5a8 to
2d88d2e
Compare
|
We will have to revert this if it breaks anyone. But I don't think hooks are so used (they were mostly unexported to upper layers), so hopefully we can do this and no one notices? |
|
I thought we decided to just keep the broken poststart hook as is and push people to use the new createRuntime and friends? EDIT: I think the issue is that opencontainers/runtime-spec#1169 missed a few changes when changing the explanation of |
|
I suspect the issue is we just need to update the lifecycle doc in the spec, since the description of
|
|
@cyphar this one is about poststart only. Only Currently runc implements I feel we need to fix the discrepancy in one way or another. |
|
@kolyshkin I built runc from your branch poststart (git commit 2d88d2e). |
|
can you add Co-authored add me @kolyshkin |
|
/cc @elezar : Evan, would it break any of the usages in nvidia container toolkit? |
|
The NVIDIA tooling uses only createContainer hooks in the CDI-based approach and the precreate hook in the legacy stack. Changing poststart behaviour should not affect this. |
halaney
left a comment
There was a problem hiding this comment.
We use a good bit of OCI hooks, but don't use this one (anymore, and digging thru the history it seems that's actually due to this bug I think).
This seems to make sense to me and matches the spec as I read it.
Rename c.signal to c.signalInit, and add c.signal which is a lock-less form of c.Signal. To be used by the next patch. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
No functional change. To be used by the next patch. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
I haven't actually used any of your code here (although I've reviewed it). In any case, thank you for your contribution! I have acknowledged it by adding the |
The runtime-spec [1] currently says: > 6. Runtime's start command is invoked with the unique identifier of > the container. > 7. The startContainer hooks MUST be invoked by the runtime. If any > startContainer hook fails, the runtime MUST generate an error, stop > the container, and continue the lifecycle at step 12. > 8. The runtime MUST run the user-specified program, as specified by > process. > 9. The poststart hooks MUST be invoked by the runtime. If any > poststart hook fails, the runtime MUST generate an error, stop the > container, and continue the lifecycle at step 12. > ... > 11. Runtime's delete command is invoked with the unique identifier of > the container. > 12. The container MUST be destroyed by undoing the steps performed > during create phase (step 2). > 13. The poststop hooks MUST be invoked by the runtime. If any poststop > hook fails, the runtime MUST log a warning, but the remaining hooks > and lifecycle continue as if the hook had succeeded. Currently, we do 9 before 8 (heck, even before 6), which is clearly against the spec and results in issues like the one described in [2]. Let's move running poststart hook to after the user-specified process has started. NOTE this patch only fixes the order and does not implement removing the container when the poststart hook failed (as this part of the spec is controversial -- destroy et al and should probably be, and currently are, part of "runc delete"). [1]: https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#lifecycle [2]: opencontainers#5182 Reported-by: ningmingxiao <ning.mingxiao@zte.com.cn> Reported-by: Erik Sjölund <erik.sjolund@gmail.com> Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
rata
left a comment
There was a problem hiding this comment.
I'm a little scared of this change, does this enables something that was not possible before?
But if the consensus is to do the change, as most big users seem are fine with this, I'm fine. We can always revert. LGTM
The runtime-spec 1 currently says:
Currently, we do 9 before 8 (heck, even before 6), which is clearly
against the spec and results in issues like the one described in 2.
Let's move running poststart hook to after the user-specified process
has started.
NOTE this patch only fixes the order and does not implement removing
the container when the poststart hook failed (as this part of the spec
is controversial -- destroy et al and should probably be, and currently
are, part of "runc delete").
Fixes: #5182
Fixes: #4347
Closes: #4348