-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Destroy container along with processes before stdio #646
Conversation
@@ -17,6 +18,10 @@ const signalBufferSize = 2048 | |||
// newSignalHandler returns a signal handler for processing SIGCHLD and SIGWINCH signals | |||
// while still forwarding all other signals to the process. | |||
func newSignalHandler(tty *tty) *signalHandler { | |||
// set us as the subpreaper before registering the signal handler for the container | |||
if err := system.SetSubreaper(1); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why set as the sub reaper by default? Can we add a runtime flag instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is needed to do a wait if you are running in PID host for grandchild processes of the container. I think it is a safe thing to set but we could have a flag or something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, should be safe, but I think it will be better to add a flag if someone wants to use their own higher level process as the reaper.
d919e72
to
0f16383
Compare
@mrunalp what do you think the default should be? |
@crosbymichael I think we can set the sub_reaper by default but have a flag to opt out. |
We need to make sure the container is destroyed before closing the stdio for the container. This becomes a big issues when running in the host's pid namespace because the other processes could have inherited the stdio of the initial process. The call to close will just block as they still have the io open. Calling destroy before closing io, especially in the host pid namespace will cause all additional processes to be killed in the container's cgroup. This will allow the io to be closed successfuly. This change makes sure the order for destroy and close is correct as well as ensuring that if any errors encoutered during start or exec will be handled by terminating the process and destroying the container. We cannot use defers here because we need to enforce the correct ordering on destroy. This also sets the subreaper setting for runc so that when running in pid host, runc can wait on the addiontal processes launched by the container, useful on destroy, but also good for reaping the additional processes that were launched. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
0f16383
to
fdb100d
Compare
@mrunalp updated with a flag for |
One thing to consider is that folks using init systems as pid 1 that expects to reap will have to disable this option.
|
@mrunalp the processes will still be reparented to init first so it should not be an issue. If the subreaper does not reap them then they still fall back to init as well. |
@crosbymichael It looks like the case where one runs a supervisor/init in the pid namespace, it would come into picture after the sub-reaper. Since we have the flag, they could pass the flag to disable this reaper and let their pid 1 in container handle the reaping. |
LGTM |
@mrunalp actually the pid 1 inside the container will reap first even with subreaper enabled. its always handled specially being pid 1 |
@crosbymichael okay, that is better 👍 Thanks! |
LGTM |
Destroy container along with processes before stdio
Perfect json content
We need to make sure the container is destroyed before closing the stdio
for the container. This becomes a big issues when running in the host's
pid namespace because the other processes could have inherited the stdio
of the initial process. The call to close will just block as they still
have the io open.
Calling destroy before closing io, especially in the host pid namespace
will cause all additional processes to be killed in the container's
cgroup. This will allow the io to be closed successfuly.
This change makes sure the order for destroy and close is correct as
well as ensuring that if any errors encoutered during start or exec will
be handled by terminating the process and destroying the container. We
cannot use defers here because we need to enforce the correct ordering
on destroy.
This also sets the subreaper setting for runc so that when running in
pid host, runc can wait on the addiontal processes launched by the
container, useful on destroy, but also good for reaping the additional
processes that were launched.
Signed-off-by: Michael Crosby crosbymichael@gmail.com