Skip to content

Conversation

iwanowww
Copy link
Contributor

@iwanowww iwanowww commented May 20, 2025

This PR introduces C2 support for Reference.reachabilityFence().

After JDK-8199462 went in, it was discovered that C2 may break the invariant the fix relied upon [1]. So, this is an attempt to introduce proper support for Reference.reachabilityFence() in C2. C1 is left intact for now, because there are no signs yet it is affected.

Reference.reachabilityFence() can be used in performance critical code, so the primary goal for C2 is to reduce its runtime overhead as much as possible. The ultimate goal is to ensure liveness information is attached to interfering safepoints, but it takes multiple steps to properly propagate the information through compilation pipeline without negatively affecting generated code quality.

Also, I don't consider this fix as complete. It does fix the reported problem, but it doesn't provide any strong guarantees yet. In particular, since ReachabilityFence is CFG-only node, nothing explicitly forbids memory operations to float past Reference.reachabilityFence() and potentially reaching some other safepoints current analysis treats as non-interfering. Representing ReachabilityFence as memory barrier (e.g., MemBarCPUOrder) would solve the issue, but performance costs are prohibitively high. Alternatively, the optimization proposed in this PR can be improved to conservatively extend referent's live range beyond ReachabilityFence nodes associated with it. It would meet performance criteria, but I prefer to implement it as a followup fix.

Another known issue relates to reachability fences on constant oops. If such constant is GCed (most likely, due to a bug in Java code), similar reachability issues may arise. For now, RFs on constants are treated as no-ops, but there's a diagnostic flag PreserveReachabilityFencesOnConstants to keep the fences. I plan to address it separately.

[1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/ref/Reference.java#L667
"HotSpot JVM retains the ref and does not GC it before a call to this method, because the JIT-compilers do not have GC-only safepoints."

Testing:

  • hs-tier1 - hs-tier8
  • hs-tier1 - hs-tier6 w/ -XX:+StressReachabilityFences -XX:+VerifyLoopOptimizations
  • java/lang/foreign microbenchmarks

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8290892: C2: Intrinsify Reference.reachabilityFence (Bug - P3)

Contributors

  • Tobias Holenstein <tholenstein@openjdk.org>
  • Vladimir Ivanov <vlivanov@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25315/head:pull/25315
$ git checkout pull/25315

Update a local copy of the PR:
$ git checkout pull/25315
$ git pull https://git.openjdk.org/jdk.git pull/25315/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25315

View PR using the GUI difftool:
$ git pr show -t 25315

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25315.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 20, 2025

👋 Welcome back vlivanov! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 20, 2025

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

@iwanowww iwanowww changed the title Intrinsify Reference.reachabilityFence 8290892: C2: Intrinsify Reference.reachabilityFence May 20, 2025
@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label May 20, 2025
@openjdk
Copy link

openjdk bot commented May 20, 2025

@iwanowww
The hotspot-compiler label was successfully added.

@iwanowww
Copy link
Contributor Author

/contributor add tholenstein

@openjdk
Copy link

openjdk bot commented May 20, 2025

@iwanowww
Contributor Tobias Holenstein <tholenstein@openjdk.org> successfully added.

@iwanowww
Copy link
Contributor Author

/contributor add vlivanov

@openjdk
Copy link

openjdk bot commented May 20, 2025

@iwanowww
Contributor Vladimir Ivanov <vlivanov@openjdk.org> successfully added.

@iwanowww iwanowww marked this pull request as ready for review May 20, 2025 02:05
@openjdk openjdk bot added the rfr Pull request is ready for review label May 20, 2025
@mlbridge
Copy link

mlbridge bot commented May 20, 2025

Copy link
Member

@shipilev shipilev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an impressive piece of work. It still makes me yearn for simpler solution. (I am impressed how much it snowballed from my original @DontInline implementation!)

Representing ReachabilityFence as memory barrier (e.g., MemBarCPUOrder) would solve the issue, but performance costs are prohibitively high.

How bad is it? MemBarCPUOrder pinches all memory, so I assume this breaks a lot of optimizations when RF is sitting in the hot loop? I remember we went through a similar exercise with Blackholes: JDK-8296545 -- and decided to pinch only the control. I guessing this is not enough to fix RF, or is it?

Some other comments:

@@ -659,12 +659,10 @@ protected Object clone() throws CloneNotSupportedException {
* {@code null}, this method has no effect.
* @since 9
*/
@ForceInline
@IntrinsicCandidate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like we also want to restore @DontInline to cover the case when intrinsic is not available / disabled for some compiler. I vaguely remember some intrinsic handling code checks whether method is prohibited from inlining (maybe affects only global -XX:-Inline, not sure), so it might be as straightforward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to use -XX:DisableIntrinsic=_Reference_reachabilityFence to switch to current behavior (no fence).
Also, @DontInline would require special handling in C1 to unconditionally inline it.

@ForceInline was there primarily to communicate the interaction with JVM. (Existing inlining heuristics should just unconditionally inline empty methods.) Once @IntrinsicCandidate is there, I don't see much value in any other annotations.

Copy link
Member

@shipilev shipilev May 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the whole point of this PR is that "current behavior" is incorrect, isn't it?

I think disabling _Reference_reachabilityFence intrinsic (or, failing to inline the intrinsic for some other reason) should fail-safe to non-inlined method, not fail-deadly to a broken RF. In other words, let's not rely on intrinsic to work for correctness; non-intrinsified version should be correct as well.

I agree @DontInline would require a bit of extra fiddling in C1, but I suspect it should be as easy as copy-pasting a few hunks around LIRGenerator::do_blackhole.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But the whole point of this PR is that "current behavior" is incorrect, isn't it?

Strictly speaking, current implementation has a defect and it requires a complete rewrite on C2 side to properly fix it.

Current implementation is part of JDK for a long time (since 11). It's highly unlikely it'll be backported all the way to JDK 11 and it's an open question whether it should be backported at all. So, for diagnostic purposes it makes sense to provide a way to compare old and new implementations irrespective of whether old implementation still has the bug.

In other words, let's not rely on intrinsic to work for correctness; non-intrinsified version should be correct as well.

A question for you: do you think we should test non-intrinsified case?

Personally, I consider such requirement as way too strong. In this particular case, the method is unconditionally intrinsified in C2. If no intrinsification takes place, it's a bug. (I'm fine with adding an assert & abort compilation if C2 ever observes Reference.reachabilityFence() to be inlined.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think two general principles apply here: a) intrinsics are performance optimizations, not correctness building blocks; b) when forced to choose, we prefer correctness over performance.

This is not about the backports, but about having the correct fallback when we need it. Imagine, for a second, this fix has a major bug discovered later. We instruct users to disable the intrinsic to avoid that bug. Users then ask: "Is it safe to disable the intrinsic? Would my application crash/misbehave without it? Would it run slower?". Right now we have a choice what we can answer. With @DontInline, we say "Yes, it would run slower, but RF would still work against misbehaviors". Without @DontInline, we say "Yes, there is a possibility of misbehavior, but at least it would run as fast". One of these answers goes against the general principle (b).

I also cannot remember any existing intrinsic that is a counter-example for (a). Without @DontInline, RF would be a first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a) intrinsics are performance optimizations, not correctness building blocks;

The fact that many intrinsics are performance optimizations doesn't mean all intrinsics should be.

I also cannot remember any existing intrinsic that is a counter-example for (a)

Reference::get.

This is not about the backports, but about having the correct fallback when we need it.

I provided it as a justification why we may need access to current implementation even if we know it's not 100% correct.

Users then ask: "Is it safe to disable the intrinsic? Would my application crash/misbehave without it? Would it run slower?"

There's no single answer possible here and it depending on what users expect. If there's a severe regression after the fix, as the stop-the-gap solution, it's common to restore original behavior users had before the change arrived (and that's -XX:DisableIntrinsic=_Reference_reachabilityFence). If somebody asks for a workaround which is 100% correct, it can be suggested to also disable method inlining (-XX:CompileCommand=dontinline,java.lang.ref.Reference::reachabilityFence), but it won't be the mode they run before and it'll introduce additional risks for them (mostly, performance-wise).

So, there's a way to achieve both. Also, -XX:DisableIntrinsic=_Reference_reachabilityFence covers the most common scenario we may need and it's the simplest one from implementation POV.

// Skip next transformation if compressed oops are not used.
if ((UseCompressedOops && !Matcher::gen_narrow_oop_implicit_null_checks()) ||
(!UseCompressedOops && !UseCompressedClassPointers))
return;

// Go over ReachabilityFence nodes to skip DecodeN nodes for referents.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a cute optimization. Does it happen in our code anywhere? I would have expected DecodeN to be near the heap loads, and suppose RF is mostly called on locals, which are already uncompressed?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I read the next hunk, should is_DecodeN be is_DecodeNarrowPtr to capture class loads (however unlikely that one is)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReachabilityFence accepts only OOPs as a referent and DecodeNKlass produces Klass pointer.

I suspect it may be the case for safepoints as well (and is_DecodeNarrowPtr() is a a leftover from PermGen world), but I didn't check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, nevermind about DecodeNKlass then. My question on heap loads still stands: do we actually get reachabilityFence(someField) from anywhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you asking specifically about ReachabilityFence -> DecodeN -> LoadN shape? Yes, it's common, especially after inlining.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iwanowww Can you add a code comment why this is safe to look through the ReachabilityFence?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

redundant_rfs.push(rf);
}
}
return found;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

found is always false here. I don't think this function even needs a return value, judging by the uses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to signal when not-yet-known redundant fences are found.
I prefer to keep it even if it is not used right now.
Fixed by updating found along with pushing newly discovered redundant node on the list.

}
}
}
redundant_rfs.push(rf);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see PhaseIdealLoop::optimize_reachability_fences asks for redundant_rfs.member(rf) before going into this analysis. Is it because we can have duplicate RFs in the C->reachability_fence list? Can we have duplicate here? Should we check .member(rf) here as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No duplicated entries are allowed in Compile::_reachability_fences.

eliminate_reachability_fences() is called at the end of loop opts after optimize_reachability_fences() is done (possibly, multiple times).

eliminate_reachability_fences() migrates all referents from RFs to safepoints. So, every RF node is placed on redundant_rfs list and there are no RF nodes left after eliminate_reachability_fences() is over (there's C->reachability_fences_count() == 0 assert at the end).

After safepoints are directly linked to the referent, the corresponding RF node becomes redundant. So, I kept redundant_rfs as a name. I can choose a different one if you find it confusing.

Another important aspect when it comes to determining RF redundancy is that eliminate_reachability_fences() takes all users into account while optimize_reachability_fences() trusts only ReachabilityFences.

Copy link
Contributor Author

@iwanowww iwanowww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback, Aleksey, Tobias, and Christian.

redundant_rfs.push(rf);
}
}
return found;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention was to signal when not-yet-known redundant fences are found.
I prefer to keep it even if it is not used right now.
Fixed by updating found along with pushing newly discovered redundant node on the list.

}
}
}
redundant_rfs.push(rf);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No duplicated entries are allowed in Compile::_reachability_fences.

eliminate_reachability_fences() is called at the end of loop opts after optimize_reachability_fences() is done (possibly, multiple times).

eliminate_reachability_fences() migrates all referents from RFs to safepoints. So, every RF node is placed on redundant_rfs list and there are no RF nodes left after eliminate_reachability_fences() is over (there's C->reachability_fences_count() == 0 assert at the end).

After safepoints are directly linked to the referent, the corresponding RF node becomes redundant. So, I kept redundant_rfs as a name. I can choose a different one if you find it confusing.

Another important aspect when it comes to determining RF redundancy is that eliminate_reachability_fences() takes all users into account while optimize_reachability_fences() trusts only ReachabilityFences.

@@ -659,12 +659,10 @@ protected Object clone() throws CloneNotSupportedException {
* {@code null}, this method has no effect.
* @since 9
*/
@ForceInline
@IntrinsicCandidate
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to use -XX:DisableIntrinsic=_Reference_reachabilityFence to switch to current behavior (no fence).
Also, @DontInline would require special handling in C1 to unconditionally inline it.

@ForceInline was there primarily to communicate the interaction with JVM. (Existing inlining heuristics should just unconditionally inline empty methods.) Once @IntrinsicCandidate is there, I don't see much value in any other annotations.

// Skip next transformation if compressed oops are not used.
if ((UseCompressedOops && !Matcher::gen_narrow_oop_implicit_null_checks()) ||
(!UseCompressedOops && !UseCompressedClassPointers))
return;

// Go over ReachabilityFence nodes to skip DecodeN nodes for referents.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReachabilityFence accepts only OOPs as a referent and DecodeNKlass produces Klass pointer.

I suspect it may be the case for safepoints as well (and is_DecodeNarrowPtr() is a a leftover from PermGen world), but I didn't check.

@@ -3897,6 +3908,20 @@ void Compile::final_graph_reshaping_main_switch(Node* n, Final_Reshape_Counts& f
//------------------------------final_graph_reshaping_walk---------------------
// Replacing Opaque nodes with their input in final_graph_reshaping_impl(),
// requires that the walk visits a node's inputs before visiting the node.

static bool has_non_debug_uses(Node* n) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved it to Node::has_non_debug_uses().

@iwanowww
Copy link
Contributor Author

Representing ReachabilityFence as memory barrier (e.g., MemBarCPUOrder) would solve the issue, but performance costs are prohibitively high.

How bad is it? MemBarCPUOrder pinches all memory, so I assume this breaks a lot of optimizations when RF is sitting in the hot loop? I remember we went through a similar exercise with Blackholes: JDK-8296545 -- and decided to pinch only the control. I guessing this is not enough to fix RF, or is it?

Yes, if a barrier stays inside loop body, it breaks a lot of important optimizations. It may end up almost as bad as a full-blown call (except a barrier can be moved around while a call can't). And moving a node when it depends both on control and memory is more complicated than just a CFG node. Moreover, as you can see in the proposed solution, even CFG-only representation is problematic for loop opts, so additional care is needed to ensure RFs are moved out of loops.

As an alternative approach, I thought about reifying RF as a data node (think of CastPP) and then linking its referent to all safepoints it dominates after loop opts are over. But that would only affect optimize_reachability_fences(). Everything else would stay the same. So, I decided to stay with CFG-only representation for now.

Comment on lines +76 to +83
product(bool, OptimizeReachabilityFences, true, DIAGNOSTIC, \
"Optimize reachability fences") \
\
product(bool, PreserveReachabilityFencesOnConstants, false, DIAGNOSTIC, \
"Preserve reachability fences on constants") \
\
product(bool, StressReachabilityFences, false, DIAGNOSTIC, \
"Randomly insert ReachabilityFence nodes") \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by sniping: what about a hello-world test where you test out these flags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Added one.

* @bug 8290892
* @summary Tests to ensure that reachabilityFence() correctly keeps objects from being collected prematurely.
* @modules java.base/jdk.internal.misc
* @run main/othervm -Xbatch compiler.c2.TestReachabilityFence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about some extra runs where you use your new flags?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This particular test is carefully crafted to provoke a failure when reachability fence effects aren't properly modeled. Stressing RF implementation doesn't help here.

Copy link
Contributor

@eme64 eme64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm on my first pass through the code, just leaving a few first observations / comments :)


uint endoff = call->jvms()->endoff();
if (C->inlining_incrementally()) {
assert(endoff == call->req(), ""); // assert in SafePointNode::grow_stack
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly are you asserting here? And what is the comment for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assert ensures there are no reachability edges present when incremental inlining takes place. Inlining logic doesn't expect any extra edges past debug info and the comment refers to the assert which fires the first.

assert(endoff == call->req(), ""); // assert in SafePointNode::grow_stack
} else {
if (call->req() > endoff) {
assert(OptimizeReachabilityFences, "");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a helpful assert message?

case CatchProjNode::fall_through_index: projs->fallthrough_catchproj = cpn; break;
case CatchProjNode::catch_all_index: projs->catchall_catchproj = cpn; break;
default: {
assert(cpn->_con > 1, ""); // exception table; rethrow case
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we please turn this into a helpful assert message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you quickly comment why you changed this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some call nodes inspected during expand_reachability_fences demonstrate this IR shape where some exception table projections are directly attached to the call node.

Looks like a missed case in CallNode::extract_projections we simply never hit before.

Comment on lines +495 to +497
// Are we guaranteed that this node is a safepoint? Not true for leaf calls and
// for some macro nodes whose expansion does not have a safepoint on the fast path.
virtual bool guaranteed_safepoint() { return true; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you only copied it. It makes me a little nervous when we call the "default" case safe. Because when you add more cases, you just assume it is safe... and if it is not we first have to discover that through a bug. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it's a SafePointNode class after all. I lifted it from CallNode subclass to avoid elaborate check on SafePoint nodes (!is_Call() || as_Call() && guaranteed_safepoint()`)).

If some node extends SafePointNode, but doesn't keep JVM state, it has to communicate it to users one way or another. And changing the default doesn't improve the situation IMO: reporting a safepoint node as a non-safepoint is still a bug.

Comment on lines 2521 to 2526
{ // No more loop opts. It is safe to eliminate reachability fence nodes.
TracePhase tp(_t_idealLoop);
PhaseIdealLoop::optimize(igvn, LoopOptsEliminateRFs);
print_method(PHASE_ELIMINATE_REACHABILITY_FENCES, 2);
if (failing()) return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will have to check the impact on compile time here. Running an extra round of loop opts might be significant.

Do you really need to use a loop opts phase for this? Or would something like process_for_post_loop_opts_igvn work for it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be helpful if you write in a comment if this eliminates all or just some of the reachability fences.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we limit it to cases where we actually have reachability fences?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. Done.

Comment on lines 3956 to 3958
Node* rf = C->reachability_fence(i);
Node* in = rf->in(1);
if (in->is_DecodeN()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not:

Suggested change
Node* rf = C->reachability_fence(i);
Node* in = rf->in(1);
if (in->is_DecodeN()) {
ReachabilityFence* rf = C->reachability_fence(i);
DecodeNNode* dn = rf->in(1)->isa_DecodeN();
if (dn != nullptr) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, reshaped as you suggested.

// Skip next transformation if compressed oops are not used.
if ((UseCompressedOops && !Matcher::gen_narrow_oop_implicit_null_checks()) ||
(!UseCompressedOops && !UseCompressedClassPointers))
return;

// Go over ReachabilityFence nodes to skip DecodeN nodes for referents.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iwanowww Can you add a code comment why this is safe to look through the ReachabilityFence?

@@ -377,6 +378,7 @@ class Compile : public Phase {
// of Template Assertion Predicates themselves.
GrowableArray<OpaqueTemplateAssertionPredicateNode*> _template_assertion_predicate_opaques;
GrowableArray<Node*> _expensive_nodes; // List of nodes that are expensive to compute and that we'd better not let the GVN freely common
GrowableArray<Node*> _reachability_fences; // List of reachability fences
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not:

Suggested change
GrowableArray<Node*> _reachability_fences; // List of reachability fences
GrowableArray<ReachabilityFenceNode*> _reachability_fences; // List of all reachability fences

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done.

Comment on lines 735 to 741
void add_reachability_fence(Node *n) {
_reachability_fences.append(n);
}

void remove_reachability_fence(Node* n) {
_reachability_fences.remove_if_existing(n);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also add the type ReachabilityFenceNode* here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 64 to 78
//------------------------------unique_loop_exit_or_null----------------------
// Return the loop-exit projection if it is unique.
Node* IdealLoopTree::unique_loop_exit_or_null() {
Node* unique_loop_exit = nullptr;
if (is_loop()) {
if (head()->is_BaseCountedLoop()) {
unique_loop_exit = head()->as_BaseCountedLoop()->loopexit()->proj_out_or_null(0 /* false */);
} else if (head()->is_OuterStripMinedLoop()) {
unique_loop_exit = head()->as_OuterStripMinedLoop()->outer_loop_exit();
} else {
// For now, conservatively report multiple loop exits exist.
}
}
return unique_loop_exit;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proj_out_or_null returns a ProjNode (it is probably a IfTrue or IfFalse, right?) and outer_loop_exit returns a IfFalseNode. So we should be able to return a IfProjNode from this method. What do you think?

What is the benefit of the unique_loop_exit variable here? Why not return immediately?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was easier to inspect it in the debugger. Reshaped as you suggested.

Comment on lines 1476 to 1485
bool optimize_reachability_fences();
bool eliminate_reachability_fences();

bool is_redundant_rf(Node* rf, bool rf_only);
bool find_redundant_rfs(Unique_Node_List& redundant_rfs);
void insert_rf(Node* ctrl, Node* referent);
void replace_rf(Node* old_node, Node* new_node);
void remove_rf(Node* rf);
#ifdef ASSERT
bool has_redundant_rfs(Unique_Node_List& ignored_rfs, bool rf_only);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer if all the method names spelled out reachability_fences instead of rf / rfs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The arguments are less important for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 2 types of methods here: internal ones (used solely in reachability.cpp) and those which are called from loop optimization code (optimize_reachability_fences and eliminate_reachability_fences).

IMO it's counter-productive to repeatedly spell out what "RF" means inside reachability.cpp, so I kept the names intact. I split the declarations into public and private ones to stress the distinction.

@@ -977,6 +979,8 @@ void PhaseMacroExpand::process_users_of_allocation(CallNode *alloc) {
}
}
_igvn._worklist.push(ac);
} else if (use->is_ReachabilityFence() && OptimizeReachabilityFences) {
_igvn.replace_input_of(use, 1, _igvn.makecon(TypePtr::NULL_PTR)); // reset; redundant fence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you quickly explain in a code comment how this does a "reset"? What happens with it next?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turned it into ReachabilityFenceNode::clear_referent(). Hope it makes it clearer.

Comment on lines 700 to 701
DEFINE_CLASS_ID(Initialize, MemBar, 0)
DEFINE_CLASS_ID(MemBarStoreStore, MemBar, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DEFINE_CLASS_ID(Initialize, MemBar, 0)
DEFINE_CLASS_ID(MemBarStoreStore, MemBar, 1)
DEFINE_CLASS_ID(Initialize, MemBar, 0)
DEFINE_CLASS_ID(MemBarStoreStore, MemBar, 1)

I don't think you needed to touch the lines above, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@@ -358,6 +358,7 @@ class Parse : public GraphKit {
bool _wrote_stable; // Did we write a @Stable field?
bool _wrote_fields; // Did we write any field?
Node* _alloc_with_final_or_stable; // An allocation node with final or @Stable field
Node* _stress_rf_hook; // StressReachabilityFences support
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could write out the rf

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid that. _stress_reachability_fence_hook is way too verbose IMO. The declaration and all the accesses are accompanied by StressReachabilityFences which should make it clear what rf refers to.

Comment on lines 373 to 379
if (StressReachabilityFences && type->isa_oopptr() != nullptr) {
Node* loc = local(index);
if (loc->bottom_type() != TypePtr::NULL_PTR) {
assert(loc->bottom_type()->isa_oopptr() != nullptr, "%s", Type::str(loc->bottom_type()));
_stress_rf_hook->add_req(loc);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a short code comment describing what you are doing here, please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 388 to 394
if (StressReachabilityFences && type->isa_oopptr() != nullptr) {
Node* stk = stack(index);
if (stk->bottom_type() != TypePtr::NULL_PTR) {
assert(stk->bottom_type()->isa_oopptr() != nullptr, "%s", Type::str(stk->bottom_type()));
_stress_rf_hook->add_req(stk);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A short code comment would be helpful

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -2187,6 +2218,15 @@ void Parse::return_current(Node* value) {
call_register_finalizer();
}

if (StressReachabilityFences) {
// Keep all oop arguments alive until method return.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Can you extend the comment a little?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Does it look better now?

Comment on lines 43 to 44
* (1) optimization pass during loop opts which eliminates redundant nodes and
* moves loop-invariant ones outside loops;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* (1) optimization pass during loop opts which eliminates redundant nodes and
* moves loop-invariant ones outside loops;
* (1) optimization pass during loop opts which eliminates redundant nodes and
* moves loop-invariant ones outside loops;

I'd prever consistent indentation, but optional/question of taste

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Comment on lines 45 to 46
* (2) reachability information is transferred to safepoint nodes (appended as edges after debug info);
* (3) reachability information from safepoints materialized as RF nodes attached to the safepoint node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand the explanation a little, please? I don't really understand. Why do you do this? What does it achieve?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be helpful if you wrote a paragraph (maybe at the top), about the interaction of SafePoint and ReachabilityFence. And you should also define "reachability information", I don't yet understand what that entails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I elaborated the description a bit and added more details. Let me know how it reads now.

Comment on lines +50 to +51
* It looks attractive to get rid of RF nodes early and transfer to safepoint-attached representation,
* but it is not correct until loop opts are done.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it not correct? What could go wrong? Why is it safe to do it after loop opts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Live ranges of values are routinely extended during loop opts. And it can break the invariant that all interfering safepoints contain the referent in their oop map. (If an interfering safepoint doesn't keep the referent alive, then it becomes possible for the referent to be prematurely GCed.)

After loop opts are over, it becomes possible to reliably enumerate all interfering safe points and ensure the referent present in their oop maps.

Comment on lines 53 to 55
* RF nodes may interfere with RA, so stand-alone RF nodes are eliminated and reachability information is
* transferred to corresponding safepoints. When safepoints are pruned during macro expansion, corresponding
* reachability info also goes away.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok that this info goes away?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what could currently go wrong with your implementation, or what would go wrong if we only used Safepoints?

@eme64
Copy link
Contributor

eme64 commented Jun 18, 2025

@iwanowww I reviewed half of the code. I left lots of minor comments, about code style and little code comments. They are not super important now.

I think most important to work on now is the description on the top of reachability.cpp. I struggle to understand the basic concepts / ideas. It would be nice if you described things in more detail, and added some definitions. I think that would be a better basis for me to jump into the implementation afterward :)

@bridgekeeper
Copy link

bridgekeeper bot commented Jul 16, 2025

@iwanowww This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@iwanowww
Copy link
Contributor Author

/keepalive

@openjdk
Copy link

openjdk bot commented Jul 24, 2025

@iwanowww The pull request is being re-evaluated and the inactivity timeout has been reset.

@openjdk
Copy link

openjdk bot commented Jul 24, 2025

@iwanowww this pull request can not be integrated into master due to one or more merge conflicts. To resolve these merge conflicts and update this pull request you can run the following commands in the local repository for your personal fork:

git checkout 8290892.rf
git fetch https://git.openjdk.org/jdk.git master
git merge FETCH_HEAD
# resolve conflicts and follow the instructions given by git merge
git commit -m "Merge master"
git push

@openjdk openjdk bot added the merge-conflict Pull request has merge conflict with target branch label Jul 24, 2025
@openjdk openjdk bot removed the merge-conflict Pull request has merge conflict with target branch label Aug 22, 2025
@eme64
Copy link
Contributor

eme64 commented Sep 3, 2025

@iwanowww Let me know whenever this is ready to review again 😊

@openjdk openjdk bot removed the rfr Pull request is ready for review label Sep 3, 2025
@iwanowww
Copy link
Contributor Author

iwanowww commented Sep 3, 2025

@eme64 please, take another look. Thanks!

@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

5 participants