Add snapsafety features to VMClock device #5564
base: main
6818148 to 81dbeb2
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #5564 +/- ##
=======================================
Coverage 83.23% 83.24%
=======================================
Files 277 277
Lines 29263 29308 +45
=======================================
+ Hits 24358 24398 +40
- Misses 4905 4910 +5
115a1d6 to 99d68b0
# Enable PHYLIB and NETWORK_PHY_TIMESTAMPING to see the additional clocks.
#
CONFIG_PTP_1588_CLOCK_KVM=y
CONFIG_PTP_1588_CLOCK_VMCLOCK=y
I think we should have this in our patches on top of this base config, otherwise we risk removing these when we rebase the config.
Problem is that our patches apply in the Linux tree and this config doesn't exist inside it. We would need a separate set of patches which apply in the configs, right?
I am not following. We're adding patches to the linux tree to support this device on aarch64. Why wouldn't the configs apply to it?
We do this after checking out the tree and applying the patches:
# Concatenate all config files into one. olddefconfig will then resolve
# as needed. Later values override earlier ones.
cat "$@" >.config
make olddefconfig
So having it here or in vmclock.config should be equivalent, right?
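The "later values override earlier ones" behaviour described above can be illustrated with a rough Python model. This is only a sketch of the override rule for concatenated fragments: it does not model the dependency resolution that `make olddefconfig` performs, and the function name `merge_config_fragments` and the fragment contents are hypothetical.

```python
import re

# Matches lines such as "CONFIG_FOO=y" and "# CONFIG_FOO is not set".
_SET = re.compile(r"^(CONFIG_[A-Za-z0-9_]+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_[A-Za-z0-9_]+) is not set$")


def merge_config_fragments(*fragments):
    """Model `cat a b >.config`: the last assignment of a symbol wins."""
    merged = {}
    for fragment in fragments:
        for raw in fragment.splitlines():
            line = raw.strip()
            match = _SET.match(line)
            if match:
                merged[match.group(1)] = match.group(2)
                continue
            match = _UNSET.match(line)
            if match:
                merged[match.group(1)] = None  # explicitly disabled
    return merged


# Hypothetical fragments: a base config followed by an extra vmclock fragment.
base = "CONFIG_PTP_1588_CLOCK_KVM=y\n# CONFIG_PTP_1588_CLOCK_VMCLOCK is not set\n"
vmclock = "CONFIG_PTP_1588_CLOCK_VMCLOCK=y\n"
merged = merge_config_fragments(base, vmclock)
```

Under this model it makes no difference whether the option lives in the base config or in a later fragment, as long as no later fragment overrides it again.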
Oh right. I had forgotten about the extra configs we're keeping. I thought you meant having it as part of the Linux kernel source code patches (the ones I just added). Sure, I'll move it there.
done!
9a9ba72 to 74aaf01
- # IRQs are available from 5 to 23. We always use one IRQ for VMGenID device, so
- # the maximum number of devices supported at the same time is 18.
- return 18
+ # IRQs are available from 5 to 23. We always use one IRQ for VMGenID and VMClock
+ # devices, so the maximum number of devices supported at the same time is 17.
+ return 17
We should probably highlight this change in the release notes as it's potentially breaking customers using up to 18 devices. They can solve this by just switching to PCI, so I don't think it's a big deal.
I'll include it in the CHANGELOG entry. I was planning to make a follow-up PR with documentation and CHANGELOG.
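The arithmetic behind the change from 18 to 17 can be checked with a small sketch. It assumes the IRQ range 5 to 23 is inclusive and that VMGenID and VMClock each reserve one IRQ, which is how the diff comment reads; the constant and function names are hypothetical.

```python
FIRST_IRQ = 5   # first usable IRQ, per the diff comment
LAST_IRQ = 23   # last usable IRQ, assumed inclusive
RESERVED_DEVICES = ("VMGenID", "VMClock")  # assumed: one IRQ each


def max_irq_devices(first=FIRST_IRQ, last=LAST_IRQ, reserved=len(RESERVED_DEVICES)):
    """Number of IRQs left for other devices after the reserved ones."""
    total_irqs = last - first + 1  # 19 IRQs in the inclusive range 5..23
    return total_irqs - reserved


limit_before = max_irq_devices(reserved=1)  # only VMGenID reserved
limit_after = max_irq_devices()             # VMGenID and VMClock reserved
```

Under these assumptions the limit drops by exactly one device, which is the potentially breaking change the review asks to highlight.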
# Undo any patches previously applied, so that we can build the same kernel with different
# configs, e.g. no-acpi
git checkout -
git reset --hard HEAD
git clean -f -d
do we need to do any cleanup at all since we committed all the changes to the tmp branch? I think it'd be useful to leave the build folder as is so it's easy to debug what was built or change something on the fly and manually re-run make.
We definitely need to reset, otherwise git complains that checkout would overwrite changed files. Problem being that git apply does not commit the patches, just applies the changes. We also need the clean because new files added by the patches don't get removed by reset.
However, good catch. git checkout - needs to go last here.
actually, given that we need to do git reset & git clean I think we don't even need the git checkout
oh, I thought these were committed on top. We could actually commit those with git-am (but it may complain no committer is specified) and avoid all the cleanup code.
GIT_COMMITTER_NAME="temp" GIT_COMMITTER_EMAIL="temp@temp.com" git am ${patchset}/${KERNEL_VERSION}/*.patch
74aaf01 to f2f293f
Add support for applying patches while building kernels for our CI. For the time being we use this to apply patches for snapshot safety VMClock extensions [1] (backported to 5.10 and 6.1 AL kernels). [1] https://lkml.org/lkml/2025/12/3/653 Signed-off-by: Babis Chalios <bchalios@amazon.es>
Enable CONFIG_PTP_1588_CLOCK_VMCLOCK for both 5.10 and 6.1 kernels. Signed-off-by: Babis Chalios <bchalios@amazon.es>
Add support for vm_generation_counter and notifications. Keep this separately for now, as a patch to the VMClock bindings, since currently upstream bindings don't include it. We will recreate this for Linux headers once they are released, at which point we will need to drop the bindgen patch. Signed-off-by: Babis Chalios <bchalios@amazon.es>
Add support for the `vm_generation_counter` field in the VMClock ABI. This field is similar to `disruption_marker`, but it is only updated on snapshot loading events (not on live migration). It is meant to provide the guest with snapshot safety notifications. Moreover, add support for the notification capability. This capability requires us to send an ACPI notification every time we change the `seq_count` field to a new even value. This essentially means that we need to send a notification upon resuming from a snapshot, just before resuming vCPUs. Signed-off-by: Babis Chalios <bchalios@amazon.es>
Extend VMClock integration tests to also account for the vm_generation_counter field and notification support flag. Signed-off-by: Babis Chalios <bchalios@amazon.es>
Expose VMClock device to guest via DT and enable compiling the vmclock for ARM architectures. Keep VMClock tests only on x86 until we get support from the guest kernel. Signed-off-by: Babis Chalios <bchalios@amazon.es>
We now apply backported patches that add support for VMClock on Aarch64 systems via DT bindings, so enable tests for Aarch64 as well. Moreover, the backported patches add support for the poll() family of system calls, so enable the relevant test. Signed-off-by: Babis Chalios <bchalios@amazon.es>
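The ordering described in the commit messages above (update the fields, bring `seq_count` to a new even value, notify, then resume vCPUs) can be sketched as a toy model. This is not the Firecracker implementation: the class and method names are hypothetical, the "odd means update in progress" convention is assumed from the seqlock-style wording, and bumping `disruption_marker` on snapshot load is inferred from the commit message rather than stated outright.

```python
from dataclasses import dataclass, field


@dataclass
class VmClockModel:
    """Toy model of the fields discussed in this PR (hypothetical class)."""
    seq_count: int = 0
    disruption_marker: int = 0
    vm_generation_counter: int = 0
    events: list = field(default_factory=list)

    def on_snapshot_restore(self):
        # Assumption: an odd seq_count marks an update in progress.
        self.seq_count += 1
        # Assumption: both counters change when a snapshot is loaded.
        self.disruption_marker += 1
        self.vm_generation_counter += 1
        # Publish: seq_count reaches a new even value.
        self.seq_count += 1
        # Notify the guest before any vCPU runs again.
        self.events.append("notify_guest")
        self.events.append("resume_vcpus")


model = VmClockModel()
model.on_snapshot_restore()
```

The point of the ordering is that the guest never executes between the counter update and the notification, so it cannot run on stale state without a pending notification.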
f2f293f to f7111bf
Changes
Implement snapshot safety features for VMClock device as described in this Linux patchset: https://lore.kernel.org/lkml/20251203123539.7292-1-bchalios@amazon.es/.
Expose the `vm_generation_counter` field in the `vmclock_abi`. Also, implement the notification feature. This simply means that during snapshot resume we bump `vm_generation_counter` and inject an interrupt to the guest before resuming vCPUs.
Reason
Provide a means for the guest userspace to know that the microVM has been loaded from a snapshot.
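From the guest's point of view, detecting a snapshot load could look roughly like the sketch below. It models the shared page as a plain dictionary and assumes a seqlock-style read (retry while `seq_count` is odd or changes during the read). The function names are hypothetical, and a real guest would read the device's shared memory and wait with poll() instead of comparing values by hand.

```python
def read_generation(page):
    """Read vm_generation_counter consistently from a modelled VMClock page."""
    while True:
        before = page["seq_count"]
        if before % 2:  # assumed: odd means the host is mid-update
            continue
        generation = page["vm_generation_counter"]
        if page["seq_count"] == before:  # no update raced with the read
            return generation


def restored_since(page, last_seen_generation):
    """True if the counter moved, i.e. a snapshot was loaded in between."""
    return read_generation(page) != last_seen_generation


# Hypothetical page contents after one snapshot load.
page = {"seq_count": 2, "vm_generation_counter": 1}
was_restored = restored_since(page, last_seen_generation=0)
```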
License Acceptance
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check CONTRIBUTING.md.
PR Checklist
`tools/devtool checkbuild --all` to verify that the PR passes build checks on all supported architectures.
`tools/devtool checkstyle` to verify that the PR passes the automated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md.
Runbook for Firecracker API changes.
integration tests.
TODO.
rust-vmm.