mmap: try THP via madvise #113
base: v0.2.2_release
Huge pages bring performance benefits for memory-intensive applications. They can be used either through statically pre-reserved huge pages or through transparent huge pages (THP); THP is the simpler of the two. While some distributions enable transparent huge pages by default, others allow the feature only when the madvise system call is used.

This change adds the madvise system call to the memory allocation. On Unix systems, the invocation of mmap is followed by an madvise system call that asks the kernel to back the memory with transparent huge pages, if possible.

Note: this is a prototype implementation of huge page support. No performance testing has been performed yet. We expect results similar to those reported in https://arxiv.org/abs/2004.14378. Once this data is available, a configuration layer should be added so that this change can be enabled or disabled.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
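For readers who want to try the mmap-then-madvise sequence described above outside the crate, here is a minimal standalone sketch. It is not the code from this PR: the helper names `map_anonymous_with_thp` and `unmap` are hypothetical, the libc functions are declared by hand instead of going through the `libc` crate, and the numeric constants are assumed to be the 64-bit Linux values. A failing madvise is treated as non-fatal, since the advice is only a hint.

```rust
use std::ffi::c_void;
use std::io;

// Hand-written declarations so the sketch has no dependencies.
// The PR itself calls these through the `libc` crate.
unsafe extern "C" {
    fn mmap(addr: *mut c_void, len: usize, prot: i32, flags: i32, fd: i32, offset: i64)
        -> *mut c_void;
    fn madvise(addr: *mut c_void, len: usize, advice: i32) -> i32;
    fn munmap(addr: *mut c_void, len: usize) -> i32;
}

// Assumption: values as defined on 64-bit Linux (x86_64, aarch64).
const PROT_READ: i32 = 0x1;
const PROT_WRITE: i32 = 0x2;
const MAP_PRIVATE: i32 = 0x02;
const MAP_ANONYMOUS: i32 = 0x20;
const MADV_HUGEPAGE: i32 = 14;

/// Hypothetical helper: maps `size` bytes of anonymous memory and then asks
/// the kernel to back it with transparent huge pages.
/// Returns the mapping and whether the kernel accepted the advice.
fn map_anonymous_with_thp(size: usize) -> io::Result<(*mut u8, bool)> {
    let addr = unsafe {
        mmap(
            std::ptr::null_mut(),
            size,
            PROT_READ | PROT_WRITE,
            MAP_PRIVATE | MAP_ANONYMOUS,
            -1,
            0,
        )
    };
    // mmap signals failure with MAP_FAILED, which is (void *)-1.
    if addr as isize == -1 {
        return Err(io::Error::last_os_error());
    }
    // The advice is a hint: a kernel without THP support rejects it,
    // but the mapping itself stays usable with regular pages.
    let advised = unsafe { madvise(addr, size, MADV_HUGEPAGE) } == 0;
    Ok((addr as *mut u8, advised))
}

/// Hypothetical helper: releases a mapping created above.
fn unmap(addr: *mut u8, size: usize) -> io::Result<()> {
    if unsafe { munmap(addr as *mut c_void, size) } == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}
```

Unlike the `let _ret = ...` in the diff, the sketch reports whether the advice was accepted, which might be useful for logging once a configuration layer exists.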
We have done the same optimization outside of the vm-memory crate. It's really tricky to deal with preallocation and transparent huge pages. How about leaving this part to the VMM implementation?
Can you point to your implementation of this? I'd argue that being able to use this in a configurable way right in the crate might be easier for other consumers to adopt. Can you also share the benchmarks you ran to see the effect of this?
src/mmap_unix.rs
Outdated
@@ -154,6 +154,8 @@ impl MmapRegion {
            return Err(Error::Mmap(io::Error::last_os_error()));
        }

        let _ret = unsafe { libc::madvise(addr, size, libc::MADV_HUGEPAGE) };
Hi! Apologies for the late reply :( I was just wondering, what's the difference between using this madvise call and providing MAP_HUGETLB as part of flags?
MAP_HUGETLB uses hugetlbfs, while MADV_HUGEPAGE uses transparent huge pages for memfd or anonymous mappings.
We have run some benchmarks with an internal workload, and the improvement depends on several factors. In most cases it achieved a 10-20% latency improvement.
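To make the MAP_HUGETLB versus MADV_HUGEPAGE distinction concrete: the former is an mmap flag applied when the mapping is created, the latter is advice given to an existing mapping. The sketch below only computes the flag and advice values and makes no system calls. The `HugePagePolicy` enum and both functions are hypothetical names, and the constants are assumed to be the x86_64 Linux values.

```rust
/// Hypothetical policy type for illustration; not part of vm-memory.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum HugePagePolicy {
    /// Regular pages only.
    None,
    /// Transparent huge pages, requested with madvise after mmap.
    Transparent,
    /// Pre-reserved huge pages, requested with an mmap flag.
    HugeTlb,
}

// Assumption: values as defined on x86_64 Linux.
const MAP_PRIVATE: i32 = 0x02;
const MAP_ANONYMOUS: i32 = 0x20;
const MAP_HUGETLB: i32 = 0x40000;
const MADV_HUGEPAGE: i32 = 14;

/// Flags to pass to mmap for an anonymous private mapping.
/// Only the HugeTlb policy changes the mmap call itself.
fn mmap_flags(policy: HugePagePolicy) -> i32 {
    let base = MAP_PRIVATE | MAP_ANONYMOUS;
    match policy {
        HugePagePolicy::HugeTlb => base | MAP_HUGETLB,
        HugePagePolicy::None | HugePagePolicy::Transparent => base,
    }
}

/// Advice to pass to madvise after a successful mmap, if any.
/// Only the Transparent policy needs the follow-up call.
fn madvise_advice(policy: HugePagePolicy) -> Option<i32> {
    match policy {
        HugePagePolicy::Transparent => Some(MADV_HUGEPAGE),
        HugePagePolicy::None | HugePagePolicy::HugeTlb => None,
    }
}
```

One practical difference worth keeping in mind: an mmap with MAP_HUGETLB fails when no huge pages have been reserved, whereas a rejected or unfulfilled MADV_HUGEPAGE still leaves a working mapping backed by regular pages.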
Thanks for the details! It seems like there are some trade-offs to consider when using THP, so it shouldn't always be enabled. Does that sound right? In that case, since the
Allow to control whether we will use huge pages via options. Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
I added a cherry-picked commit from my current development tree, which is based on v0.3.0 of this package. To avoid moving the PR around, I decided to just add the change on top. The change introduces options in a minimal way, without properly forwarding them to the actual user of the created maps.

Do you suggest continuing the effort and lifting the options struct into the signature of all relevant functions, so that a caller can always choose how to set them? That seems to result in a lot of code (and interface) changes. Alternatives I can see would be a singleton for these properties, or something else that is global and allows controlling the behavior, such as environment variables. I currently prefer a singleton. Any comments?

Regarding #120, can we see the VMM change that uses huge pages? I guess that change does not support transparent huge pages, does it?
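To make the alternatives above easier to compare, here is a rough sketch of the singleton variant seeded from an environment variable. Everything in it is hypothetical: the `MmapOptions` struct, its field, the `VM_MEMORY_THP` variable name, and the accepted values are illustrations, not what the cherry-picked commit contains. The default of "disabled" is also an assumption.

```rust
use std::sync::OnceLock;

/// Hypothetical name of the environment variable controlling THP.
const THP_ENV_VAR: &str = "VM_MEMORY_THP";

/// Hypothetical options struct; the field name is an assumption.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct MmapOptions {
    /// Whether mmap should be followed by madvise(MADV_HUGEPAGE).
    transparent_huge_pages: bool,
}

impl MmapOptions {
    /// Builds options from the raw value of the environment variable.
    /// Unset or unrecognized values leave THP disabled.
    fn from_env_value(value: Option<&str>) -> Self {
        let enabled = match value.map(|v| v.trim().to_ascii_lowercase()) {
            Some(v) => matches!(v.as_str(), "1" | "true" | "on"),
            None => false,
        };
        MmapOptions { transparent_huge_pages: enabled }
    }

    /// Reads the environment variable and builds options from it.
    fn from_env() -> Self {
        Self::from_env_value(std::env::var(THP_ENV_VAR).ok().as_deref())
    }
}

/// Process-wide singleton, initialized from the environment on first use.
static GLOBAL_OPTIONS: OnceLock<MmapOptions> = OnceLock::new();

/// Returns the global options without changing any function signatures.
fn global_options() -> &'static MmapOptions {
    GLOBAL_OPTIONS.get_or_init(MmapOptions::from_env)
}
```

The singleton keeps existing signatures intact, at the cost of every region in the process sharing one setting. Passing `MmapOptions` explicitly would allow per-region control but touches every constructor, which is the interface churn mentioned above.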
Hmm, seems like there are several proposed changes in flight that would benefit from a better interface for building
From what I understand, #120 is sort of orthogonal to the transparent huge pages aspect (i.e. regions with THP are not considered to be backed by huge pages w.r.t. the semantics in there), but I might be wrong, and hopefully someone will correct me if that's the case.
Testing Done
So far, I have only compiled firecracker with the change. Next, I will run its test suite and add the results here, and also look into performance testing.
The test suite mostly passes, except for some tests that fail due to timeouts, which might be caused by my using a mobile CPU for testing.
Update dependencies
Actually Build
Required Changes in Firecracker (to pick up this change)
Place this vm-memory repository into "local-crates/vm-memory" in the firecracker repository.
In firecracker, use the following changes on top of the (current) master commit: