Touch pages in buffer allocations prior to running first algorithm to… #30
 * that can be passed to free). Touches each page so that each page is actually
 * physically allocated and mapped into the process.
 */
void *alloc_and_touch(size_t size, bool must_zero) {
This is the function that does the malloc or calloc and then touches each page. You might imagine that you could just do a memset(buf, 0, size) instead, but in fact tricky compilers like gcc can recognize this malloc + memset idiom and replace it with a calloc, and then we get the same behavior as before.
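For illustration, the idiom being described looks roughly like the sketch below. The function name alloc_with_memset is hypothetical and does not come from the PR. Whether a given compiler version actually rewrites it into a calloc call depends on the compiler and optimization level, so treat the comment as a possibility rather than a guarantee.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper showing the malloc + memset(0) idiom.
 * An optimizing compiler may treat this as equivalent to calloc(1, size),
 * in which case the pages can remain untouched until first use. */
void *alloc_with_memset(size_t size) {
    void *buf = malloc(size);
    if (buf != NULL) {
        memset(buf, 0, size);
    }
    return buf;
}
```

Either way, the caller sees a zeroed buffer, which is why the transformation is legal and why it cannot be relied on to page the buffer in.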
 * physically allocated and mapped into the process.
 */
void *alloc_and_touch(size_t size, bool must_zero) {
    void *buf = must_zero ? calloc(1, size) : malloc(size);
We select between calloc and malloc so we can preserve the old behavior of keeping the file and comp buffers un-zeroed, and the decomp buffer zeroed.
What about memset() with something other than 0?
Yeah, that could also work, although then in the must_zero case you have to re-memset it to zero afterwards, and the compiler is possibly smart enough to elide the first memset and again do the "calloc" transformation. Or even if compiler version X isn't smart enough, X+1 may be. A final advantage of this approach is that it's a bit more minimal (at runtime, not in the code), since it only touches 1 out of every 4096 bytes, so it's generally faster than memset, especially for smaller buffers (which may be cached).
@Cyan4973 - I should also add that memset was my initial approach, and that's how I discovered that, well, compilers are pretty smart.
Thanks. Looks good to me.
This is a straightforward fix for issue #29. The idea is to touch every 4096th byte after malloc or calloc so that the buffer is paged in, all PTEs are set up, etc. On some architectures the page size may be larger than 4096, but this still works, since every page is touched in that case too (any "extra" touches are harmless).
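As a rough sketch of the approach described above: the signature and the calloc/malloc selection are taken from the diff, while the loop body, the TOUCH_STRIDE name, the NULL check, and the use of a volatile pointer are assumptions made for illustration and may differ from the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed stride: 4096 bytes, as stated in the description. */
#define TOUCH_STRIDE 4096

/* Sketch only: everything after the allocation line is an assumption. */
void *alloc_and_touch(size_t size, bool must_zero) {
    void *buf = must_zero ? calloc(1, size) : malloc(size);
    if (buf == NULL) {
        return NULL;
    }
    /* Write through a volatile pointer so the stores are not optimized away.
     * Writing 0 keeps a calloc'd buffer zeroed; a malloc'd buffer has
     * indeterminate contents anyway, so overwriting one byte is harmless. */
    volatile char *p = (volatile char *)buf;
    for (size_t i = 0; i < size; i += TOUCH_STRIDE) {
        p[i] = 0;
    }
    return buf;
}
```

The result can be released with free as usual. A variant could query the real page size at run time (for example with POSIX sysconf(_SC_PAGESIZE)) instead of hard-coding 4096, but as noted above the fixed stride is still correct on systems with larger pages.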