Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destroy container along with processes before stdio #646

Merged
merged 1 commit into from
Mar 17, 2016

Conversation

crosbymichael
Copy link
Member

We need to make sure the container is destroyed before closing the stdio
for the container. This becomes a big issues when running in the host's
pid namespace because the other processes could have inherited the stdio
of the initial process. The call to close will just block as they still
have the io open.

Calling destroy before closing io, especially in the host pid namespace
will cause all additional processes to be killed in the container's
cgroup. This will allow the io to be closed successfuly.

This change makes sure the order for destroy and close is correct as
well as ensuring that if any errors encoutered during start or exec will
be handled by terminating the process and destroying the container. We
cannot use defers here because we need to enforce the correct ordering
on destroy.

This also sets the subreaper setting for runc so that when running in
pid host, runc can wait on the addiontal processes launched by the
container, useful on destroy, but also good for reaping the additional
processes that were launched.

Signed-off-by: Michael Crosby crosbymichael@gmail.com

@@ -17,6 +18,10 @@ const signalBufferSize = 2048
// newSignalHandler returns a signal handler for processing SIGCHLD and SIGWINCH signals
// while still forwarding all other signals to the process.
func newSignalHandler(tty *tty) *signalHandler {
// set us as the subpreaper before registering the signal handler for the container
if err := system.SetSubreaper(1); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why set as the sub reaper by default? Can we add a runtime flag instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed to do a wait if you are running in PID host for grandchild processes of the container. I think it is a safe thing to set but we could have a flag or something else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, should be safe, but I think it will be better to add a flag if someone wants to use their own higher level process as the reaper.

@crosbymichael
Copy link
Member Author

@mrunalp what do you think the default should be?

@mrunalp
Copy link
Contributor

mrunalp commented Mar 15, 2016

@crosbymichael I think we can set the sub_reaper by default but have a flag to opt out.

We need to make sure the container is destroyed before closing the stdio
for the container.  This becomes a big issues when running in the host's
pid namespace because the other processes could have inherited the stdio
of the initial process.  The call to close will just block as they still
have the io open.

Calling destroy before closing io, especially in the host pid namespace
will cause all additional processes to be killed in the container's
cgroup.  This will allow the io to be closed successfuly.

This change makes sure the order for destroy and close is correct as
well as ensuring that if any errors encoutered during start or exec will
be handled by terminating the process and destroying the container.  We
cannot use defers here because we need to enforce the correct ordering
on destroy.

This also sets the subreaper setting for runc so that when running in
pid host, runc can wait on the addiontal processes launched by the
container, useful on destroy, but also good for reaping the additional
processes that were launched.

Signed-off-by: Michael Crosby <crosbymichael@gmail.com>
@crosbymichael
Copy link
Member Author

@mrunalp updated with a flag for no-subreaper

@mrunalp
Copy link
Contributor

mrunalp commented Mar 15, 2016

One thing to consider is that folks using init systems as pid 1 that expects to reap will have to disable this option.
From https://github.com/torvalds/linux/blob/master/kernel/exit.c#L479

/*
 * When we die, we re-parent all our children, and try to:
 * 1. give them to another thread in our thread group, if such a member exists
 * 2. give it to the first ancestor process which prctl'd itself as a
 *    child_subreaper for its children (like a service manager)
 * 3. give it to the init process (PID 1) in our pid namespace
 */

@crosbymichael
Copy link
Member Author

@mrunalp the processes will still be reparented to init first so it should not be an issue. If the subreaper does not reap them then they still fall back to init as well.

@mrunalp
Copy link
Contributor

mrunalp commented Mar 15, 2016

@crosbymichael It looks like the case where one runs a supervisor/init in the pid namespace, it would come into picture after the sub-reaper. Since we have the flag, they could pass the flag to disable this reaper and let their pid 1 in container handle the reaping.

@mrunalp
Copy link
Contributor

mrunalp commented Mar 15, 2016

LGTM

@crosbymichael
Copy link
Member Author

@mrunalp actually the pid 1 inside the container will reap first even with subreaper enabled. its always handled specially being pid 1

@mrunalp
Copy link
Contributor

mrunalp commented Mar 15, 2016

@crosbymichael okay, that is better 👍 Thanks!

@mrunalp
Copy link
Contributor

mrunalp commented Mar 16, 2016

@LK4D4 @hqhq ping

@LK4D4
Copy link
Contributor

LK4D4 commented Mar 17, 2016

LGTM

LK4D4 added a commit that referenced this pull request Mar 17, 2016
Destroy container along with processes before stdio
@LK4D4 LK4D4 merged commit bbde9c4 into opencontainers:master Mar 17, 2016
@crosbymichael crosbymichael deleted the pid-host-block branch March 17, 2016 21:53
stefanberger pushed a commit to stefanberger/runc that referenced this pull request Sep 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants