Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(jailer): Make "init" in new PID NS a session leader #5045

Merged

Conversation

zulinx86
Copy link
Contributor

@zulinx86 zulinx86 commented Feb 20, 2025

Changes & Reasons

https://man7.org/linux/man-pages/man7/pid_namespaces.7.html

A process in an ancestor namespace can send signals to the "init"
process of a child PID namespace only if the "init" process has
established a handelr for that signal.

Firecracker (i.e. the "init" process of the new PID namespace) sets up handlers for some signals including SIGHUP and jailer exits soon after spawning firecracker into the new PID namespace. If the jailer process is a session leader and its exit happens after firecracker configures the signal handlers, SIGHUP will be sent to firecracker and be caught by the handler unexpectedly.

In order to avoid the above issue, if jailer is a session leader, creates a new session and makes the child process (i.e. firecracker) become the leader of the new session to not get SIGHUP on the exit of jailer.

Note that this is the case only if --daemonize is not passed. This is because we use the double fork method to daemonize, making itself not a session leader.

Testing

We cannot test this reliably in unit tests / integration tests because it's race condition between firecracker's signal handler configuration and jailer's exit.

Inserting sleep before the jailer exits to make the situation reliably happen as follows, I confirmed the fix works.

$ git diff
diff --git a/src/jailer/src/env.rs b/src/jailer/src/env.rs
index 66880ac2e..7efe049ae 100644
--- a/src/jailer/src/env.rs
+++ b/src/jailer/src/env.rs
@@ -375,6 +375,9 @@ impl Env {
                 // Save the PID of the process running the exec file provided
                 // inside <chroot_exec_file>.pid file.
                 self.save_exec_file_pid(child_pid, chroot_exec_file)?;
+                // Sleep to gives firecracker time to configure signal handlers.
+                std::thread::sleep(std::time::Duration::from_millis(100));
+                println!("Now jailer exiting...");
                 // SAFETY: This is safe because 0 is valid input to exit.
                 unsafe { libc::exit(0) }
             }
$ sudo screen -L -Logfile screen.log -dm build/cargo_target/x86_64-unknown-linux-musl/release/jailer --new-pid-ns --exec-file build/cargo_target/x86_64-unknown-linux-musl/release/firecracker --gid 123 --uid 123 --id test --chroot-base-dir root/

$ tail -f screen.log 
2025-02-20T11:56:26.600062368 [test:main] Running Firecracker v1.11.0-dev
Now jailer exiting...
^C

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • [ ] If a specific issue led to this PR, this PR closes the issue.
  • [ ] When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • [ ] I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

Copy link

codecov bot commented Feb 20, 2025

Codecov Report

Attention: Patch coverage is 0% with 15 lines in your changes missing coverage. Please review.

Project coverage is 83.16%. Comparing base (2167247) to head (ea283ef).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
src/jailer/src/env.rs 0.00% 15 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5045      +/-   ##
==========================================
- Coverage   83.20%   83.16%   -0.05%     
==========================================
  Files         245      245              
  Lines       26631    26645      +14     
==========================================
  Hits        22159    22159              
- Misses       4472     4486      +14     
Flag Coverage Δ
5.10-c5n.metal 83.64% <0.00%> (-0.05%) ⬇️
5.10-m5n.metal 83.63% <0.00%> (-0.05%) ⬇️
5.10-m6a.metal 82.83% <0.00%> (-0.06%) ⬇️
5.10-m6g.metal 79.59% <0.00%> (-0.05%) ⬇️
5.10-m6i.metal 83.61% <0.00%> (-0.05%) ⬇️
5.10-m7g.metal 79.59% <0.00%> (-0.05%) ⬇️
6.1-c5n.metal 83.64% <0.00%> (-0.05%) ⬇️
6.1-m5n.metal 83.62% <0.00%> (-0.06%) ⬇️
6.1-m6a.metal 82.83% <0.00%> (-0.06%) ⬇️
6.1-m6g.metal 79.59% <0.00%> (-0.05%) ⬇️
6.1-m6i.metal 83.61% <0.00%> (-0.05%) ⬇️
6.1-m7g.metal 79.59% <0.00%> (-0.05%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zulinx86 zulinx86 force-pushed the exit_after_signal_handler branch from 65c152b to 04e9b0d Compare February 20, 2025 12:14
@zulinx86 zulinx86 self-assigned this Feb 20, 2025
@zulinx86 zulinx86 added the Status: Awaiting review Indicates that a pull request is ready to be reviewed label Feb 20, 2025
roypat
roypat previously approved these changes Feb 20, 2025
Copy link
Contributor

@roypat roypat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wohooo, nice find on the "init is special" thing!

I wonder if its worth it to skip the two syscalls (getpid, getsid) on the --daemonize path (because there we statically know we dont run into session leader issues), but probably not worth it.

@zulinx86
Copy link
Contributor Author

sudo screen -L -Logfile screen.log -dm build/cargo_target/x86_64-unknown-linux-musl/release/jailer --new-pid-ns --exec-file build/cargo_target/x86_64-unknown-linux-musl/release/firecracker --gid 123 --uid 123 --id test --chroot-base-dir root/

Good question. Inserting a single branch (whether --daemonize is passed or not) should not impact performance, even if the branch predictor mis-speculates to the path of calling those syscalls. Let's leverage the domain knowledge!

https://man7.org/linux/man-pages/man7/pid_namespaces.7.html
> A process in an ancestor namespace can send signals to the "init"
> process of a child PID namespace only if the "init" process has
> established a handelr for that signal.

Firecracker (i.e. the "init" process of the new PID namespace) sets up
handlers for some signals including SIGHUP and jailer exits soon after
spawning firecracker into the new PID namespace. If the jailer process
is a session leader and its exit happens after firecracker configures
the signal handlers, SIGHUP will be sent to firecracker and be caught by
the handler unexpectedly.

In order to avoid the above issue, if jailer is a session leader,
creates a new session and makes the child process (i.e. firecracker)
become the leader of the new session to not get SIGHUP on the exit of
jailer.

Note that this is the case only if `--daemonize` is not passed. This is
because we use the double fork method to daemonize, making itself not a
session leader.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
@zulinx86 zulinx86 force-pushed the exit_after_signal_handler branch from c136b5a to ea283ef Compare February 20, 2025 13:48
@zulinx86 zulinx86 merged commit 51bbc77 into firecracker-microvm:main Feb 20, 2025
6 of 7 checks passed
@zulinx86 zulinx86 deleted the exit_after_signal_handler branch February 20, 2025 14:48
@Manciukic
Copy link
Contributor

should we add an integration test to check all the combinations of --daemonize and --new-pid-ns to verify the fix?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Awaiting review Indicates that a pull request is ready to be reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants