8353115: GenShen: mixed evacuation candidate regions need accurate live_data #24319
Conversation
👋 Welcome back kdnilsen! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request.
❗ This change is not yet ready to be integrated.
Webrevs
@@ -155,6 +155,23 @@ inline size_t ShenandoahHeapRegion::get_live_data_bytes() const {
  return get_live_data_words() * HeapWordSize;
}

inline size_t ShenandoahHeapRegion::get_mixed_candidate_live_data_bytes() const {
  assert(SafepointSynchronize::is_at_safepoint(), "Should be at Shenandoah safepoint");
Could we use shenandoah_assert_safepoint here (and other places) instead?
Good call. I'll make this change.
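Concretely, the change would look something like this (a before/after sketch, assuming the existing shenandoah_assert_safepoint helper mentioned above):

// Before: raw safepoint check
assert(SafepointSynchronize::is_at_safepoint(), "Should be at Shenandoah safepoint");

// After: the Shenandoah-specific helper suggested above
shenandoah_assert_safepoint();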
@@ -75,6 +75,7 @@ ShenandoahHeapRegion::ShenandoahHeapRegion(HeapWord* start, size_t index, bool c
  _plab_allocs(0),
  _live_data(0),
  _critical_pins(0),
  _mixed_candidate_garbage_words(0),
Do we need a new field to track this? During final_mark, we call increase_live_data_alloc_words to add top - TAMS to _live_data to account for objects allocated during mark. Could we "fix" get_live_data so that it always returned marked objects (counted by increase_live_data_gc_words) plus top - TAMS? This way, the live data would not become stale after final_mark and we wouldn't have another field to manage. What do you think?
This is a good idea. Let me experiment with this.
My experiment with an initial attempt at this failed with over 60 test failures. The "problem" is that we often consult get_live_data() in contexts where it is not appropriate to add (top - TAMS) to the atomic volatile ShenandoahHeapRegion::_live_data. I think most of these are asserts. I have so far confirmed at least two different places that need to be fixed; I'm not sure how many scenarios there are in total.
I'm willing to change the failing asserts to make this work. I think the code would be cleaner with your suggested refactor; it just makes this PR a little more far-reaching than the original.
See the most recent commit on this PR for the direction this would move us. Let me know if you think I should move forward with more refactoring, or revert this most recent change.
Thanks.
It does look simpler. Do you have an example of one of the failing asserts?
One thing I hadn't considered is how "hot" ShenandoahHeapRegion::get_live_data_words is. Is there going to be a significant performance hit if we make this method do more work? It does look like this method is called frequently.
Examples:
FullGC worker:
void ShenandoahMCResetCompleteBitmapTask::work(uint worker_id) {
  ShenandoahParallelWorkerSession worker_session(worker_id);
  ShenandoahHeapRegion* region = _regions.next();
  ShenandoahHeap* heap = ShenandoahHeap::heap();
  ShenandoahMarkingContext* const ctx = heap->complete_marking_context();
  while (region != nullptr) {
    if (heap->is_bitmap_slice_committed(region) && !region->is_pinned() && region->has_marked()) {
      // kelvin replacing has_live() with new method has_marked() because has_live() calls
      // get_live_data_words() and pointer_delta() asserts out because TAMS is not less than
      // top(). has_marked() does what has_live() used to do...
      ctx->clear_bitmap(region);
    }
    region = _regions.next();
  }
}
In ShenandoahInitMarkUpdateRegionStateClosure::heap_region_do():
- assert(!r->has_live(), "Region %zu should have no live data", r->index());
+ assert(!r->has_marked(), "Region %zu should have no marked data", r->index());
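A sketch of the shape has_marked() might take (an assumption based on the comment above, not necessarily the PR's exact body):

// Reports only the count accumulated by increase_live_data_gc_words(), and
// deliberately avoids the top - TAMS term that made has_live() assert out
// after a bitmap reset, when TAMS is no longer below top().
inline bool ShenandoahHeapRegion::has_marked() const {
  return Atomic::load(&_live_data) != 0;
}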
Not sure about the performance impact, short of implementing and testing...
I suspect the performance impact is minimal.
I've committed changes that endeavor to implement the suggested refactor. Performance impact does appear to be minimal. This broader refactoring does change behavior slightly. In particular:
- We now have a better understanding of the live memory evacuated during mixed evacuations. This allows the selection of old candidates for mixed evacuations to be more conservative: we'll select fewer old regions in order to honor the intended budget.
- Potentially, this will result in more mixed evacuations, but each mixed evacuation should take less time.
- There should be no impact on behavior of traditional Shenandoah.
On one recently completed test run, we observed the following impacts compared to tip:
Shenandoah
+80.69% specjbb2015/trigger_failure p=0.00000
Control: 58.250 (+/- 13.48 ) 110
Test: 105.250 (+/- 33.13 ) 30
GenShen
-19.46% jme/context_switch_count p=0.00176
Control: 117.420 (+/- 28.01 ) 108
Test: 98.292 (+/- 32.76 ) 30
Perhaps we need more data to decide whether this is "significant".
This result seems to be consistent. The effect on traditional Shenandoah is apparently to reduce the size of its collection sets as well, because certain regions that would previously have been collected are now rejected due to better awareness of how much live data will need to be copied. The amount of garbage attributed to candidate regions for the young collection set is reduced by the amount of allocations above TAMS; previously, those allocations had been erroneously reported as garbage. This delays reclamation of some garbage, resulting in an increase in allocation failures on the specjbb2015 workload.
We might argue that the original behavior was incorrect, in that it allowed violation of the intended evacuation budget. We were apparently getting away with this violation because we were able to flip mutator regions to collector space, and/or because the evacuation waste budget was sufficient to accommodate the unbudgeted evacuations.
Now that we have more accurate accounting of live memory, we could perhaps slightly reduce the default evacuation waste budget (to enable larger collection sets) as part of this PR, if we want to claw back the losses in specjbb performance.
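Illustrative arithmetic for the accounting change described above (a sketch, not code from this PR):

// Once live data includes allocations above TAMS, the garbage reported for a
// candidate region shrinks by exactly the (top - TAMS) byte count.
size_t reported_garbage_bytes(ShenandoahHeapRegion* r) {
  size_t live = r->get_live_data_bytes();  // now marked bytes + bytes above TAMS
  assert(r->used() >= live, "live data cannot exceed used");
  return r->used() - live;
}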
Redefine the way ShenandoahHeapRegion::get_live_data_<type> works to simplify changes.
Haven't started looking at these changes, but I do wonder if it might be worthwhile to also consider (and implement under a tunable flag) the alternative policy of never adding to the collection set any regions that are still "active" at the point when the collection set for a marking cycle is first assembled at the end of final mark. That way we don't have to do any re-computing, and the criterion for evacuation is garbage-first (or liveness-least), both of which remain invariant (and complements of each other) throughout the duration of evacuation, obviating entirely the need to recompute the goodness/choice metric afresh. The downside is that we may leave some garbage on the table in the active regions, but this is probably a minor price for most workloads and heap configurations, and it doesn't unnecessarily complicate or overengineer the solution. One question to consider is how G1 does this. Maybe regions placed in the collection set are retired (i.e. made inactive)? I prefer not to forcibly retire active regions, as this wastes space that might have been usable. Thoughts? (Can add this comment and discuss on the ticket if that is logistically preferable.)
@ysramakrishna: Interesting idea, definitely worthy of an experiment. On the upside, this can make GC more "efficient" by procrastinating until the GC effort maximizes the returns of allocatable memory. On the downside, this can allow garbage to hide out for arbitrarily long times in regions that are not fully used. It might also end up making it more difficult for the GC to reclaim garbage, in the case that allocations added to a region after we declined to evacuate it the first time need to be evacuated when we come back to that region later. I'd be in favor of proposing these experiments and possible feature enhancements in a separate JBS ticket.
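For concreteness, the policy floated above might look something like this (the function name, the exclude_active flag, and the "active" test are hypothetical stand-ins, not code from this PR):

// Assemble old candidates at final mark, optionally skipping regions that can
// still receive allocations so each candidate's liveness ranking stays invariant.
void assemble_old_candidates(ShenandoahHeapRegion** candidates, size_t count,
                             ShenandoahCollectionSet* cset, bool exclude_active) {
  for (size_t i = 0; i < count; i++) {
    ShenandoahHeapRegion* r = candidates[i];
    if (exclude_active && r->free() > 0) {  // "active": region can still receive allocations
      continue;  // leave this region's garbage on the table for a later cycle
    }
    cset->add_region(r);
  }
}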
@kdnilsen This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /keepalive command.
/keepalive
@ysramakrishna The pull request is being re-evaluated and the inactivity timeout has been reset.
The existing implementations of get_live_data_bytes() and get_live_data_words() do not always behave as might be expected. In particular, the value returned ignores any allocations that occur after the most recent marking effort that identified live data within the region. This is typically acceptable for young regions, where the amount of live data determines whether a region should be added to the collection set during the final-mark safepoint.
However, old-gen regions that are placed into the set of candidates for mixed evacuation are more complicated. In particular, by the time an old-gen region is added to a mixed evacuation, its live data may be much larger than it was when concurrent old marking ended.
This PR adds comments to clarify the shortcomings of the existing functions, and adds new functions that provide a more accurate accounting of live data for mixed-evacuation candidate regions.
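For reference, the added accessor might look roughly like this, reconstructed from the diff fragments earlier in this conversation (only the name, the safepoint assert, and the _mixed_candidate_garbage_words field appear in the hunks above; the body is an assumption):

inline size_t ShenandoahHeapRegion::get_mixed_candidate_live_data_bytes() const {
  shenandoah_assert_safepoint();
  // For a mixed-evacuation candidate, everything used in the region minus the
  // garbage identified by the most recent old mark counts as live.
  size_t used_words = used() / HeapWordSize;
  assert(used_words >= _mixed_candidate_garbage_words, "garbage cannot exceed used");
  return (used_words - _mixed_candidate_garbage_words) * HeapWordSize;
}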
Reviewing
Using git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/24319/head:pull/24319
$ git checkout pull/24319
Update a local copy of the PR:
$ git checkout pull/24319
$ git pull https://git.openjdk.org/jdk.git pull/24319/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 24319
View PR using the GUI difftool:
$ git pr show -t 24319
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/24319.diff
Using Webrev
Link to Webrev Comment