Huge page support for guest memory #2139
As an additional data point, when running on a two-core A57, putting the kernel image and initrd into a hugetlb filesystem resulted in a 20% improvement. Running 'perf -ag' and looking at where the time is being spent, it is no longer spending all of its time faulting in devices::virtio::block::device::Block::process_queue. It still spends a lot of time handling faults from process_queue(), but at least it is no longer at the very top of the perf output. Also, the console on arm is extremely expensive; 'quiet' on the kernel command line is your friend.
Hi, @pclesr! Sorry for the delay, and thanks for logging this feature request. I think we are interested in exposing such capabilities. We're currently using the guest memory primitives from rust-vmm/vm-memory, so we need to contribute huge page support there first and release a new vm-memory version that Firecracker can consume. I've opened an issue on rust-vmm/vm-memory. We'll keep this issue here to track the progress on rust-vmm/vm-memory and the discussions around how Firecracker consumes and exposes the feature.
Hi, @iulianbarbu, I responded to the issue you opened on rust-vmm/vm-memory, expressing interest in it. We'd also like to work on the Firecracker end of that feature. Is there more information/context you can provide to help us get started?
Hi, @EmeraldShift! Personally, I think on the Firecracker end the question would be how exactly to expose the feature to the customer. Should it be an API call, or something else? We haven't discussed anything yet. In any case, we should wait for the rust-vmm/vm-memory implementation first; some aspects will likely depend on the design adopted there.
I did a simple proof of concept by patching
Obviously, this is not generic, but I just wanted to see if it would work. One possibility for Firecracker would be an option in the machine config that controls whether guest memory is backed by huge pages or normal pages.
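For anyone experimenting along the same lines, a minimal sketch of the anonymous-mapping variant might look like the code below. This is an illustration rather than the actual patch: the function name is made up, the `libc` crate is assumed as a dependency, and error handling is reduced to the bare minimum.

```rust
use std::io;
use std::ptr::null_mut;

/// Hypothetical helper: map `size` bytes of anonymous, huge-page-backed
/// memory that could serve as a guest memory region.
fn mmap_huge_anon(size: usize) -> io::Result<*mut u8> {
    // SAFETY: null hint, anonymous mapping; the kernel chooses the address.
    let addr = unsafe {
        libc::mmap(
            null_mut(),
            size,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANONYMOUS | libc::MAP_HUGETLB,
            -1,
            0,
        )
    };
    if addr == libc::MAP_FAILED {
        // Typically ENOMEM when no huge pages are reserved on the host
        // (e.g. vm.nr_hugepages is 0).
        return Err(io::Error::last_os_error());
    }
    Ok(addr as *mut u8)
}
```

MAP_HUGETLB only succeeds when the host has huge pages reserved, which is one more argument for a machine-config switch rather than always-on behavior.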
Hey all,
Why is this feature request important? What are the use cases? Please describe.
Page faulting can increase startup time. In the development of Nitro Hypervisor support for arm, supporting huge pages for guest memory made the difference between hitting the target SPEC numbers and not. Since there was no hugepage filesystem in Nitro Hypervisor, the only alternative was to make the kernel support huge PUD/PMD mappings for arm. I would not advocate that route; I would rather have the ability to use a mounted hugetlbfs.
For embedded environments where everything is restricted, being able to allocate a specific number of huge pages at boot that can be used for guests would decrease the startup time of the app by reducing faults.
Describe the desired solution
Since not every environment will have huge pages, provide either a build-time or run-time option so that the guest memory registered through the KVM_SET_USER_MEMORY_REGION ioctl is backed by huge pages. I would propose using hugetlbfs and mmap() to allocate the memory. Since I barely know Rust, I can't tell how guest memory is currently allocated, but I would assume it is some mmap() call.
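For illustration only, a sketch of that route might look like the following. The /dev/hugepages mount point and the file name are assumptions (any hugetlbfs mount would do), the `libc` crate is assumed as a dependency, and the returned host address would then be registered with KVM through the usual KVM_SET_USER_MEMORY_REGION path.

```rust
use std::fs::OpenOptions;
use std::io;
use std::os::unix::io::AsRawFd;
use std::ptr::null_mut;

/// Hypothetical helper: back a guest memory region with a file on a
/// mounted hugetlbfs instead of a plain anonymous mapping.
fn mmap_hugetlbfs_file(size: usize) -> io::Result<*mut u8> {
    // Assumes hugetlbfs is mounted at /dev/hugepages and that enough pages
    // are reserved (e.g. hugepages=N on the kernel command line for the
    // boot-time reservation mentioned above).
    let file = OpenOptions::new()
        .read(true)
        .write(true)
        .create(true)
        .open("/dev/hugepages/guest_mem")?;
    // On hugetlbfs the file length must be a multiple of the huge page size.
    file.set_len(size as u64)?;
    // SAFETY: shared, file-backed mapping of the hugetlbfs file.
    let addr = unsafe {
        libc::mmap(
            null_mut(),
            size,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_SHARED,
            file.as_raw_fd(),
            0,
        )
    };
    if addr == libc::MAP_FAILED {
        return Err(io::Error::last_os_error());
    }
    Ok(addr as *mut u8)
}
```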
Describe possible alternatives
perf tracing shows that a lot of time is spent page faulting, especially on arm. Obviously everything works without huge pages, but using them could reduce not only startup time but fault-handling time as well.
Additional context
Running on a poor, resource-starved 2-core A57 in an embedded environment. Since it's embedded, there is control over everything. The goal is to reduce startup time and overhead of the KVM calls.