Skip to content

Measure jailer startup performance #5282

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Jul 2, 2025

Conversation

ShadowCurse
Copy link
Contributor

@ShadowCurse ShadowCurse commented Jun 26, 2025

Changes

Update jailer requirements on the executable name and permissions. Now it does not need to contain firecracker in it's name, but must be marked as executable.
Add jailer startup performance test. The test is parametrized on the number of jailers started and the number of bind mounts present in the system.

Reason

We want to know what is the overhead of a jailer in isolation. Since usually there are multiple jailed VMs on the system, the test is parametrized by the number of jailers started.
The reason for adding measurements with bind mounts is because it slows down the jailer startup time. The more bind mounts are present, the more contention there is in the kernel. At the same time, bind mounts are frequently created per VM in the production environment.

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • I have read and understand CONTRIBUTING.md.
  • I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • I have mentioned all user-facing changes in CHANGELOG.md.
  • If a specific issue led to this PR, this PR closes the issue.
  • When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • I have linked an issue to every new TODO.

  • This functionality cannot be added in rust-vmm.

@ShadowCurse ShadowCurse force-pushed the jailer_tweaks branch 5 times, most recently from 3b16c6e to 2e99a59 Compare June 26, 2025 16:31
@ShadowCurse ShadowCurse marked this pull request as ready for review June 26, 2025 16:35
@ShadowCurse ShadowCurse added Status: Awaiting review Indicates that a pull request is ready to be reviewed Type: Enhancement Indicates new feature requests labels Jun 26, 2025
@ShadowCurse ShadowCurse self-assigned this Jun 26, 2025
@ShadowCurse ShadowCurse force-pushed the jailer_tweaks branch 2 times, most recently from 0b5d3aa to 65d8ee3 Compare June 27, 2025 11:43
Copy link

codecov bot commented Jun 27, 2025

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 82.91%. Comparing base (931a233) to head (8bf5e63).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
src/jailer/src/main.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5282      +/-   ##
==========================================
+ Coverage   82.86%   82.91%   +0.05%     
==========================================
  Files         250      250              
  Lines       26902    26897       -5     
==========================================
+ Hits        22292    22302      +10     
+ Misses       4610     4595      -15     
Flag Coverage Δ
5.10-c5n.metal 83.35% <0.00%> (+<0.01%) ⬆️
5.10-m5n.metal 83.35% <0.00%> (+<0.01%) ⬆️
5.10-m6a.metal 82.56% <0.00%> (+<0.01%) ⬆️
5.10-m6g.metal 79.17% <0.00%> (-0.01%) ⬇️
5.10-m6i.metal 83.35% <0.00%> (+<0.01%) ⬆️
5.10-m7a.metal-48xl 82.55% <0.00%> (?)
5.10-m7g.metal 79.17% <0.00%> (-0.01%) ⬇️
5.10-m7i.metal-24xl 83.30% <0.00%> (?)
5.10-m7i.metal-48xl 83.31% <0.00%> (?)
5.10-m8g.metal-24xl 79.17% <0.00%> (?)
5.10-m8g.metal-48xl 79.17% <0.00%> (?)
6.1-c5n.metal 83.40% <0.00%> (+<0.01%) ⬆️
6.1-m5n.metal 83.39% <0.00%> (-0.01%) ⬇️
6.1-m6a.metal 82.61% <0.00%> (+<0.01%) ⬆️
6.1-m6g.metal 79.17% <0.00%> (-0.01%) ⬇️
6.1-m6i.metal 83.39% <0.00%> (-0.01%) ⬇️
6.1-m7a.metal-48xl 82.60% <0.00%> (?)
6.1-m7g.metal 79.17% <0.00%> (-0.01%) ⬇️
6.1-m7i.metal-24xl 83.41% <0.00%> (?)
6.1-m7i.metal-48xl 83.40% <0.00%> (?)
6.1-m8g.metal-24xl 79.16% <0.00%> (?)
6.1-m8g.metal-48xl 79.17% <0.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ShadowCurse ShadowCurse force-pushed the jailer_tweaks branch 2 times, most recently from 1d2e38c to a9464ba Compare June 27, 2025 12:23
@Manciukic
Copy link
Contributor

do you have a link where I could see the metrics?

@roypat
Copy link
Contributor

roypat commented Jun 30, 2025

Do we actually need the special binary? Doesn't firecracker already report this as parent_cpu_time_us?

@ShadowCurse
Copy link
Contributor Author

@roypat I added it to make the test easier. Starting the whole FC and parsing it's output will be more complicated and slower. Considering this test is only for jailer, I think it is better to not touch FC.

@roypat
Copy link
Contributor

roypat commented Jun 30, 2025

@roypat I added it to make the test easier. Starting the whole FC and parsing it's output will be more complicated and slower. Considering this test is only for jailer, I think it is better to not touch FC.

It'd just be

def test_jailer_perf(microvm_factory):
    microvm = microvm_factory.spawn()
    metrics = microvm.flush_metrics()
    
	emit_metrics (metrics['parent_cpu_time_us'], "microseconds")

no? Looks a lot simpler to me than a whole new test binary and modifying the jailer tbh

@ShadowCurse
Copy link
Contributor Author

hmm, I didn't even remember these metrics exist. But there is still a ting of starting a lot of jails in a loop. With VMs, there is a whole setup process with jailing rootfs, kernel and so on which is totally unnecessary. So as I said, for jailer tests, it is better to use custom binaries (they are small and easy to understand anyway).

@roypat
Copy link
Contributor

roypat commented Jun 30, 2025

hmm, I didn't even remember these metrics exist. But there is still a ting of starting a lot of jails in a loop. With VMs, there is a whole setup process with jailing rootfs, kernel and so on which is totally unnecessary. So as I said, for jailer tests, it is better to use custom binaries (they are small and easy to understand anyway).

No, if you just start a Firecracker process it won't copy the rootfs and kernel image, it'll literally just start up a firecracker process and have it listen on the API server socket. The rest only happens when you boot the microvm, which you wouldn't do in this test. But arguably, for a real usecase, wouldn't we want to account for the jailing of rootfs/kernel image/snapshot anyway though?

@ShadowCurse
Copy link
Contributor Author

wouldn't we want to account for the jailing of rootfs/kernel image/snapshot anyway though?

Maybe in additional tests. Current one is only concerned with amount of bind mounts in a system.

@ShadowCurse ShadowCurse force-pushed the jailer_tweaks branch 2 times, most recently from 86ef0c4 to 50a71d1 Compare June 30, 2025 12:09
@ShadowCurse ShadowCurse force-pushed the jailer_tweaks branch 2 times, most recently from e4ca6ae to 52c3062 Compare June 30, 2025 12:41
@ShadowCurse
Copy link
Contributor Author

As it appeared, the flush_metrics is not available as pre_boot api call, so there is no way to just start FC without staring guest for this test.

@ShadowCurse ShadowCurse force-pushed the jailer_tweaks branch 3 times, most recently from e43f956 to 867621f Compare July 1, 2025 14:20
Now jailer will not complain if the executable does
not contain `firecracker` in it's name. This restriction
was unnecessary and it's removal is not a breaking change.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
@ShadowCurse ShadowCurse force-pushed the jailer_tweaks branch 3 times, most recently from a726b68 to 61f890c Compare July 1, 2025 18:45
Add note about new jailer executable requirements.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Remove explicit panic on error. Let the error be returned from
the main.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
This binary just outputs the start and end time of
the jailer startup.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
The test measures jailer startup time. It is parametrized
by the number of parallel jailers starting up and the number
of bind mount points present in the system.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
Add new entry for the jailer tests in the performance
bk pipeline.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>
@ShadowCurse ShadowCurse enabled auto-merge (rebase) July 2, 2025 14:26
@ShadowCurse ShadowCurse merged commit 033825d into firecracker-microvm:main Jul 2, 2025
6 of 7 checks passed
@ShadowCurse ShadowCurse deleted the jailer_tweaks branch July 2, 2025 14:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Awaiting review Indicates that a pull request is ready to be reviewed Type: Enhancement Indicates new feature requests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants