Touch pages in buffer allocations prior to running first algorithm to… #30

Merged: 1 commit, Jan 16, 2017

travisdowns
Contributor

This is a straightforward fix for issue #29. The idea is to touch every 4096th byte after malloc or calloc so that the buffer is paged in, all PTEs are set up, etc. On some architectures the page size may be larger than 4096, but this still works since every page is still touched in that case (any "extra" touches are harmless).
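For readers who want the idea in one place, here is a minimal sketch of the page-touch approach described above. It is not the code merged in this PR: the name `alloc_and_touch_sketch`, the `TOUCH_STRIDE` constant, and the extra write to the final byte are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Assumed stride: 4096 bytes, the value quoted in the description above. */
#define TOUCH_STRIDE 4096

/*
 * Hypothetical sketch, not the PR's implementation: allocate `size` bytes
 * (zeroed if must_zero) and write one byte per stride so the pages backing
 * the buffer are faulted in before any timing starts.
 */
void *alloc_and_touch_sketch(size_t size, bool must_zero) {
    void *buf = must_zero ? calloc(1, size) : malloc(size);
    if (buf == NULL) {
        return NULL;
    }
    /* Stores through a volatile lvalue may not be optimized away. */
    volatile char *p = (volatile char *)buf;
    for (size_t i = 0; i < size; i += TOUCH_STRIDE) {
        p[i] = 0; /* writing 0 keeps a calloc'ed buffer all-zero */
    }
    /*
     * Assumption added here: also touch the final byte. malloc results are
     * usually not page-aligned, so the last page of the buffer can begin
     * after the final stride offset.
     */
    if (size > 0) {
        p[size - 1] = 0;
    }
    return buf;
}
```

The result is released with plain `free`, the same as any malloc or calloc result.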

* that can be passed to free). Touches each page so that each page is actually
* physically allocated and mapped into the process.
*/
void *alloc_and_touch(size_t size, bool must_zero) {

This is the function that does the malloc or calloc and then touches each page. You might imagine that you could just do a memset(buf, 0, size) instead, but in fact tricky compilers like gcc can recognize this malloc + memset idiom, replace it with a calloc, and then we get the same behavior as before.
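To make the idiom concrete, the sketch below shows the two forms side by side. The function names are hypothetical. As far as the C language is concerned the two are observably equivalent (both return a zero-filled buffer or NULL), which is what permits an optimizer to turn the first into the second; whether a given compiler version actually does so is not guaranteed either way.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * The malloc + memset(0) idiom. An optimizing compiler may rewrite this
 * as a single calloc call, in which case no page is written here.
 */
void *zeroed_via_memset(size_t size) {
    void *buf = malloc(size);
    if (buf != NULL) {
        memset(buf, 0, size);
    }
    return buf;
}

/* What the idiom can effectively become after that rewrite. */
void *zeroed_via_calloc(size_t size) {
    return calloc(1, size);
}
```

Both functions hand back identical contents, so a benchmark cannot tell from the data alone whether the pages were really written during allocation.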

* physically allocated and mapped into the process.
*/
void *alloc_and_touch(size_t size, bool must_zero) {
    void *buf = must_zero ? calloc(1, size) : malloc(size);

We select between calloc and malloc so we can preserve the old behavior of keeping the file and comp buffers un-zeroed, and the decomp buffer zeroed.

What about memset() with something other than 0?

travisdowns (Contributor Author) commented Jan 12, 2017

Yeah, that could also work, although then in the must_zero case you have to re-memset it to zero afterwards, and then the compiler is possibly smart enough to elide the first memset and again do the "calloc" transformation. Or even if compiler version X isn't smart enough, X+1 may be. A final advantage of this approach is that it's a bit more minimal (at runtime, not in the code) since it only touches 1 out of every 4096 bytes, so it's generally faster than memset, especially for smaller buffers (which may be cached).
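As a rough illustration of the runtime difference, the sketch below counts the bytes each approach stores. The helper names are hypothetical, and the memset count assumes the compiler elides nothing, which is exactly the part that cannot be relied on.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stride quoted in the comment above. */
#define TOUCH_STRIDE 4096

/* Bytes stored by a loop of the form: for (i = 0; i < size; i += TOUCH_STRIDE). */
size_t touch_store_count(size_t size) {
    return size / TOUCH_STRIDE + (size % TOUCH_STRIDE != 0);
}

/*
 * Bytes stored by the memset alternative: one non-zero pass over the
 * buffer, plus a second zeroing pass when must_zero is set.
 */
size_t memset_store_count(size_t size, bool must_zero) {
    return must_zero ? 2 * size : size;
}
```

For a 1 MiB buffer that is 256 single-byte stores for the touch loop against 1048576 bytes for one memset pass, or 2097152 when the buffer must be re-zeroed.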

@Cyan4973 - I should also add that memset was my initial approach, and that's how I discovered that, well, compilers are pretty smart.

inikep (Owner) commented Jan 16, 2017

Thanks. Looks good to me.

@inikep inikep merged commit 03ef34d into inikep:master Jan 16, 2017