-
Notifications
You must be signed in to change notification settings - Fork 2.2k
fix: saving/restoring async IO engine transport state #5582
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
fix: saving/restoring async IO engine transport state #5582
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #5582 +/- ##
==========================================
- Coverage 83.23% 83.23% -0.01%
==========================================
Files 277 277
Lines 29263 29262 -1
==========================================
- Hits 24358 24357 -1
Misses 4905 4905
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
When saving the state of a microVM with one or more block devices backed by the async IO engine, we need to take a few steps extra steps before serializing the state to the disk, as we need to make sure that there aren't any pending io_uring requests that have not been handled by the kernel yet. For these types of devices that need that we call a prepare_save() hook before serializing the device state. If there are indeed pending requests, once we handle them we need to let the guest know, by adding the corresponding VirtIO descriptors to the used ring. Moreover, since we use notification suppression, this might or might not require us to send an interrupt to the guest. Now, when we save the state of a VirtIO device, we save the device specific state **and** the transport (MMIO or PCI) state along with it. There were a few issues with how we were doing the serialization: 1. We were saving the transport state before we run the prepare_save() hook. The transport state includes information such as the `interrupt_status` in MMIO or `MSI-X config` in PCI. prepare_save() in the case of async IO might change this state, so us running it after saving the transport state essentially looses information. 2. We were saving the devices states after saving the KVM state. This is problematic because, if prepare_save() sends an interrupt to the guest we don't save that "pending interrupt" bit of information in the snapshot. These two issues, were making microVMs with block devices backed by async IO freeze in some cases post snapshot resume, since the guest is stuck in the kernel waiting for some notification for the device emulation which never arrives. Currently, this is only a problem with virtio-block with async IO engine. The only other device using the prepare_save() hook is currently virtio-net, but this one doesn't modify any VirtIO state, neither sends interrupts. Fix this by ensuring the correct ordering of operations during the snapshot phase. Signed-off-by: Babis Chalios <bchalios@amazon.es>
3f71ea3 to
27c7be6
Compare
|
Maybe it makes sense to add the tests from my PR? |
Problem with this is that the tests are flaky. We can´t conceivably have 100 iterations of the tests. Our CI will take for ever. I think we'll merge it as is. Just waiting for @dobrac to confirm that the fix solves the problem for them. |
When saving the state of a microVM with one or more block devices backed by the async IO engine, we need to take a few steps extra steps before serializing the state to the disk, as we need to make sure that there aren't any pending io_uring requests that have not been handled by the kernel yet. For these types of devices that need that we call a prepare_save() hook before serializing the device state.
If there are indeed pending requests, once we handle them we need to let the guest know, by adding the corresponding VirtIO descriptors to the used ring. Moreover, since we use notification suppression, this might or might not require us to send an interrupt to the guest.
Now, when we save the state of a VirtIO device, we save the device specific state and the transport (MMIO or PCI) state along with it.
There were a few issues with how we were doing the serialization:
interrupt_statusin MMIO orMSI-X configin PCI. prepare_save() in the case of async IO might change this state, so us running it after saving the transport state essentially looses information.These two issues, were making microVMs with block devices backed by async IO freeze in some cases post snapshot resume, since the guest is stuck in the kernel waiting for some notification for the device emulation which never arrives.
Currently, this is only a problem with virtio-block with async IO engine. The only other device using the prepare_save() hook is currently virtio-net, but this one doesn't modify any VirtIO state, neither sends interrupts.
Fix this by ensuring the correct ordering of operations during the snapshot phase.
Fixes #5554
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.PR Checklist
tools/devtool checkbuild --allto verify that the PR passesbuild checks on all supported architectures.
tools/devtool checkstyleto verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md.Runbook for Firecracker API changes.
integration tests.
TODO.rust-vmm.