Allow stress test to trigger at byte-level granularity #198
Conversation
There seems to be an issue with running workflows from a fork. I will work on that today.
Using secrets in a pull request from a fork: https://docs.github.com/en/free-pro-team@latest/actions/reference/events-that-trigger-workflows#pull_request_target. This is most likely what we need.
Thanks for the links. I have a seemingly working fix here: #200. The only issue I noticed is that it cannot post perf results as comments (as …). You can wait until #200 gets merged (should be tomorrow), and then merge with master. Hopefully things will work.
There are two issues with this PR.
- There are race conditions, and the implementation does not guarantee precise triggering per allocated byte. For example, suppose we want a stress GC for every 8 bytes allocated, and two mutator threads are each allocating 8-byte objects. In this line
  mmtk-core/src/util/alloc/bumpallocator.rs (Line 121 in 201d442)
  && (base.allocation_count.load(Ordering::Relaxed) > base.options.stress_factor)
  both threads may read allocation_count == 0 and pass the check. In that case, they both allocate from their local buffer and no stress GC is triggered even though 16 bytes have been allocated (a standalone sketch of this interleaving follows this list). To fix this, you would probably need to restructure the code a bit and use atomic operations to check and update the allocation count.
- The implementation is very specific to the bump allocator. For example, the large object allocator does not trigger a stress GC in this implementation. In my understanding, we should have some general code in allocator.rs which works for all allocators (e.g. checking and adding the allocated bytes), and some allocator-specific code that each allocator is required to implement (e.g. how to allocate when we are in the slow path but do not need to acquire a new block).
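The problematic interleaving can be reproduced with a small standalone program. This is only an illustrative sketch: the counter, threshold, and 8-byte allocations stand in for base.allocation_count, base.options.stress_factor, and the mutators' allocations; it is not the actual MMTk code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Stand-ins for base.allocation_count and base.options.stress_factor.
    let allocation_count = Arc::new(AtomicUsize::new(0));
    let stress_factor = 8; // we want a stress GC for every 8 bytes allocated

    let mut handles = Vec::new();
    for _ in 0..2 {
        let count = Arc::clone(&allocation_count);
        handles.push(thread::spawn(move || {
            // Racy pattern: the check and the update are two separate steps.
            // Both threads can load 0 here before either has added its 8 bytes;
            // in that schedule neither decides to trigger, yet 16 bytes end up
            // allocated without a stress GC.
            let trigger = count.load(Ordering::Relaxed) > stress_factor;
            count.fetch_add(8, Ordering::Relaxed);
            trigger
        }));
    }

    let triggers: Vec<bool> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    println!(
        "allocated {} bytes, triggers: {:?}",
        allocation_count.load(Ordering::Relaxed),
        triggers
    );
}
```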
Since all bets are off on performance here, we can address the first point by using sequential consistency; that will force the two threads to synchronize, in a sense. For the second, I'll refactor the allocation count update into allocator.rs.
Yeah, I hope so. It is a good test to see whether you have cleanly separated general code and allocator-specific code. The large object allocator does not have a fastpath, which means that once you have separated the code, you should need minimal changes to the large object allocator. If things do not go this way (i.e. you still need a lot of changes for the large object allocator), please let me know and I will look more into your code.
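As a rough illustration of such a separation (a self-contained sketch with hypothetical names, not the actual MMTk interfaces):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical shared state, corresponding to what would live in allocator.rs.
struct CommonAllocatorState {
    allocation_bytes: AtomicUsize,
    stress_factor: usize,
}

// Hypothetical trait: the general path does the byte accounting for every
// allocator; each allocator only supplies its own slow path.
trait StressAllocator {
    fn common(&self) -> &CommonAllocatorState;

    // Allocator-specific: how to allocate when we are in the slow path but do
    // not need to acquire a new block (bump allocator), or when there is no
    // fastpath at all (large object allocator).
    fn alloc_slow_once(&mut self, size: usize) -> *mut u8;

    // General code shared by all allocators: account for the allocated bytes
    // and decide whether a stress GC should be requested.
    fn alloc_slow(&mut self, size: usize) -> *mut u8 {
        let before = self
            .common()
            .allocation_bytes
            .fetch_add(size, Ordering::SeqCst);
        if before + size > self.common().stress_factor {
            // A real implementation would request a stress GC and reset the count.
            println!("stress GC requested after {} bytes", before + size);
        }
        self.alloc_slow_once(size)
    }
}

// Toy allocator, only here to exercise the shared default method.
struct ToyAllocator {
    common: CommonAllocatorState,
}

impl StressAllocator for ToyAllocator {
    fn common(&self) -> &CommonAllocatorState {
        &self.common
    }
    fn alloc_slow_once(&mut self, size: usize) -> *mut u8 {
        let _ = size;
        std::ptr::null_mut() // placeholder instead of a real allocation
    }
}

fn main() {
    let mut a = ToyAllocator {
        common: CommonAllocatorState {
            allocation_bytes: AtomicUsize::new(0),
            stress_factor: 8,
        },
    };
    for _ in 0..3 {
        a.alloc_slow(8);
    }
}
```

The point of the shared default method is that the byte accounting and the trigger decision live in one place, so an allocator without a fastpath, such as the large object allocator, would get the behaviour by implementing only its slow path.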
I believe all your concerns have been addressed. I have been unable to test the interaction of stress testing with the large object space, but since the count is updated in the general code in allocator.rs, it should work there as well.

EDIT: Apologies for the messy git log. If possible, please squash the formatting commits during the merge. I'll set up a reminder to run the ci-style script before every commit.
The race condition still exists. As long as reading and updating the allocation count are not a single atomic operation, two threads can still both miss the trigger. However, I think this PR is good to merge, and that race issue can be resolved in a separate PR.
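For reference, a minimal sketch of making the read and the update one operation, assuming a fetch_add-style read-modify-write is acceptable here (the helper name is hypothetical; this is not necessarily how the follow-up will fix it):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical helper: add `size` bytes to the shared count and report whether
// this particular allocation crossed the stress threshold. Because fetch_add
// is a single read-modify-write, two threads can no longer observe the same
// pre-update value, so at most one of them crosses the threshold.
fn crossed_stress_threshold(count: &AtomicUsize, stress_factor: usize, size: usize) -> bool {
    let before = count.fetch_add(size, Ordering::SeqCst);
    before <= stress_factor && before + size > stress_factor
}

fn main() {
    let count = AtomicUsize::new(0);
    // With an 8-byte stress factor, exactly one of these two 8-byte
    // "allocations" reports crossing the threshold.
    let first = crossed_stress_threshold(&count, 8, 8);
    let second = crossed_stress_threshold(&count, 8, 8);
    println!("first: {}, second: {}", first, second);
}
```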
Don't worry about that. We always do squash and merge.
LGTM
This PR overhauls the stress GC trigger mechanism in MMTk so that we can trigger a GC at byte-level granularity, as opposed to the page-level granularity we have right now. This mechanism will also be reused by the analysis tracer (such as sanity) in future PRs.
Note: I have only tested this for the openjdk binding. I have informally tested the mechanism on small test cases such as fop.