8358821: patch_verified_entry causes problems, use nmethod entry barriers instead #25764

dean-long · 2025-06-12T01:51:09Z

This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier.

We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier.

The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap.

For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance.

This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels.

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8358821: patch_verified_entry causes problems, use nmethod entry barriers instead (Bug - P3)

Reviewers

Martin Doerr (@TheRealMDoerr - Reviewer)

Contributors

Martin Doerr <mdoerr@openjdk.org>
Amit Kumar <amitkumar@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764
$ git checkout pull/25764

Update a local copy of the PR:
$ git checkout pull/25764
$ git pull https://git.openjdk.org/jdk.git pull/25764/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25764

View PR using the GUI difftool:
$ git pr show -t 25764

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25764.diff

Using Webrev

Link to Webrev Comment

bridgekeeper · 2025-06-12T01:51:51Z

👋 Welcome back dlong! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-06-12T01:51:57Z

@dean-long This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8358821: patch_verified_entry causes problems, use nmethod entry barriers instead

Co-authored-by: Martin Doerr <mdoerr@openjdk.org>
Co-authored-by: Amit Kumar <amitkumar@openjdk.org>
Reviewed-by: mdoerr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 94 new commits pushed to the master branch:

2b94b70: 8359428: Test 'javax/swing/JTabbedPane/bug4499556.java' failed because after selecting one of L&F items, the test case automatically failed when clicking on L&F Menu button again
1130c1b: 6798061: The removal of System.out.println from KeyboardManager
2f63d3a: 7116990: (spec) Socket.connect(addr,timeout) not clear if IOException because of TCP timeout
... and 91 more: https://git.openjdk.org/jdk/compare/9586817cea3f1cad8a49d43e9106e25dafa04765...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk · 2025-06-12T01:52:25Z

@dean-long The following labels will be automatically applied to this pull request:

hotspot
shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

mlbridge · 2025-06-12T01:56:34Z

Webrevs

dean-long · 2025-06-12T02:03:06Z

/label hotspot-gc hotspot-runtime

openjdk · 2025-06-12T02:03:29Z

@dean-long
The hotspot-gc label was successfully added.

The hotspot-runtime label was successfully added.

TheRealMDoerr

I appreciate this PR. I like using the nmethod entry barrier and getting rid of not-entrant patching.
I have some change requests. In addition, we can get rid of much more code. I've put my proposal in a Commit: TheRealMDoerr@4aed569
Please take a look!

src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp

src/hotspot/share/gc/z/zBarrierSetNMethod.cpp

src/hotspot/share/gc/shared/barrierSetNMethod.cpp

dean-long · 2025-06-12T19:51:27Z

/contributor add @TheRealMDoerr
/contributor add mdoerr

openjdk · 2025-06-12T19:52:27Z

@dean-long TheRealMDoerr was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

/contributor add @openjdk-bot
/contributor add duke
/contributor add J. Duke <duke@openjdk.org>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

dean-long · 2025-06-12T19:54:28Z

Thanks @TheRealMDoer for the suggestions, I incorporated all of them, however for shenandoahCodeRoots.cpp I decided to make arm() public, so the caller code doesn't need to know about the magic value 0.

TheRealMDoerr · 2025-06-12T20:05:59Z

Thanks @TheRealMDoer for the suggestions, I incorporated all of them, however for shenandoahCodeRoots.cpp I decided to make arm() public, so the caller code doesn't need to know about the magic value 0.

Excellent! Thank you! I'll take another look over it and run tests.

openjdk · 2025-06-12T20:15:49Z

@dean-long
Contributor Martin Doerr <mdoerr@openjdk.org> successfully added.

TheRealMDoerr

LGTM, now. I think is_sigill_not_entrant should also be removed completely: TheRealMDoerr@f61f54a
I've run tests on some platforms, but will run on more.
@offamitkumar, @RealFYang, @bulasevich: You may also want to check your platforms.
@fisk: You may want to take a look at the nmethod entry barrier changes.

TheRealMDoerr

Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance.

fisk

Thanks for doing this! I have been wanting something like this for a while too and it looks great. I have some comments though...

src/hotspot/share/gc/z/zBarrierSetNMethod.cpp

fisk · 2025-06-16T00:54:48Z

src/hotspot/share/gc/z/zBarrierSetNMethod.cpp

+  ZLocker<ZReentrantLock> locker(ZNMethod::lock_for_nmethod(nm));
+  // Preserve the sticky bit
+  if (is_not_entrant(nm)) {
+    value |= not_entrant;


Is it possible to have a race where another thread sets an nmethod to not entrant and the thread calling this making the nmethod entry barrier not entrant?

If this was called to disarm a method and then enter it, it seems a bit sneaky in that case that we pass the nmethod entry barrier even though we under the lock see that it is not entrant. Probably okay but still feels like it might be more robust if the thread setting an nmethod to not entrant is always the one that arms the nmethod entry barrier.

If I understand your concern correctly, there is no race. The only caller of BarrierSetNMethod::make_not_entrant() is nmethod::make_not_entrant(), and it is done inside a NMethodState_lock critical section. After a call to nmethod::make_not_entrant(), the nmethod entry barrier is armed and stays that way.
And by design, a disarm only disarms at the inner nmethod_entry_barrier level, not the outer nmethod_stub_entry_barrier level.

My concern is that while thread 1 calls nmethod::make_not_entrant(), thread 2 racingly performs nmethod entry barrier; it makes the is_not_entrant check before it gets updated, but then it gets updated as the per nmethod lock is taken. The GC code "disarms" the GC barrier but in doing so finds that "oh this should be not entrant", but that's sort of not reflected as thread 2 will then proceed with entering the nmethod it just armed as not entrant in the nmethod entry barrier code.
Does that make sense?

offamitkumar · 2025-06-16T08:26:38Z

Just FYI, s390 build is broken with this change:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/amit/jdk/src/hotspot/share/gc/shared/barrierSetNMethod.cpp:196), pid=1779086, tid=1779117
#  assert(!nm->is_osr_method() || may_enter) failed: OSR nmethods should always be entrant after migration
#
# JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x)
# Problematic frame:
# V  [libjvm.so+0x40b196]  BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e
#
# Core dump will be written. Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h %d" (or dumping to /home/amit/jdk/make/core.1779086)
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

stack trace:

Stack: [0x000003ff9e580000,0x000003ff9e680000],  sp=0x000003ff9e67b068,  free space=1004k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x40b196]  BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e  (barrierSetNMethod.cpp:196)
v  ~StubRoutines::method_entry_barrier 0x000003ff9050cd18
J 282% c2 sun.nio.fs.UnixPath.initOffsets()V java.base (189 bytes) @ 0x000003ff90c4f0c8 [0x000003ff90c4f080+0x0000000000000048]
j  sun.nio.fs.UnixPath.getFileName()Lsun/nio/fs/UnixPath;+1 java.base
j  sun.nio.fs.UnixFileSystemProvider.isHidden(Ljava/nio/file/Path;)Z+6 java.base
j  java.nio.file.Files.isHidden(Ljava/nio/file/Path;)Z+5 java.base
j  jdk.internal.module.ModulePath.isHidden(Ljava/nio/file/Path;)Z+1 java.base
j  jdk.internal.module.ModulePath.lambda$explodedPackages$0(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Z+11 java.base
j  jdk.internal.module.ModulePath$$Lambda+0x00000000a105cbe0.test(Ljava/lang/Object;Ljava/lang/Object;)Z+12 java.base
j  java.nio.file.Files.lambda$find$0(Ljava/util/function/BiPredicate;Ljava/nio/file/FileTreeWalker$Event;)Z+9 java.base
j  java.nio.file.Files$$Lambda+0x00000000a10646c0.test(Ljava/lang/Object;)Z+8 java.base
....

TheRealMDoerr · 2025-06-16T12:00:13Z

@offamitkumar: The problem is probably the initialization to -1: z_cfi(Z_R0_scratch, /* to be patched */ -1);.
Should be 0.

TheRealMDoerr · 2025-06-16T13:56:30Z

Seems like arm32 has the same issue:

jdk/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp

Line 199 in 9d06057

__ emit_int32(0xDEADBEAF);

The init value shouldn't have the sticky bit set.

offamitkumar · 2025-06-16T15:04:23Z

@offamitkumar: The problem is probably the initialization to -1: z_cfi(Z_R0_scratch, /* to be patched */ -1);. Should be 0.

Thank you Martin for the suggestion.

@dean-long would you please add this diff, fixing s390x build. I ran tier1 test with fastdebug, test are clean;

diff --git a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
index e78906708af..2d663061aec 100644
--- a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
@@ -180,7 +180,7 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm) {
     __ z_lg(Z_R0_scratch, in_bytes(bs_nm->thread_disarmed_guard_value_offset()), Z_thread); // 6 bytes
 
     // Compare to current patched value:
-    __ z_cfi(Z_R0_scratch, /* to be patched */ -1); // 6 bytes (2 + 4 byte imm val)
+    __ z_cfi(Z_R0_scratch, /* to be patched */ 0); // 6 bytes (2 + 4 byte imm val)
 
     // Conditional Jump
     __ z_larl(Z_R14, (Assembler::instr_len((unsigned long)LARL_ZOPC) + Assembler::instr_len((unsigned long)BCR_ZOPC)) / 2); // 6 bytes
diff --git a/src/hotspot/cpu/s390/stubGenerator_s390.cpp b/src/hotspot/cpu/s390/stubGenerator_s390.cpp
index d3f6540a3ea..bb1d9ce6037 100644
--- a/src/hotspot/cpu/s390/stubGenerator_s390.cpp
+++ b/src/hotspot/cpu/s390/stubGenerator_s390.cpp
@@ -3197,7 +3197,7 @@ class StubGenerator: public StubCodeGenerator {
 
     // VM-Call: BarrierSetNMethod::nmethod_stub_entry_barrier(address* return_address_ptr)
     __ call_VM_leaf(CAST_FROM_FN_PTR(address, BarrierSetNMethod::nmethod_stub_entry_barrier));
-    __ z_ltr(Z_R0_scratch, Z_RET);
+    __ z_ltr(Z_RET, Z_RET);
 
     // VM-Call Epilogue
     __ restore_volatile_regs(Z_SP, frame::z_abi_160_size, true, false);

dean-long · 2025-06-16T23:30:17Z

/contributor add @offamitkumar

openjdk · 2025-06-16T23:30:47Z

@dean-long
Contributor Amit Kumar <amitkumar@openjdk.org> successfully added.

dean-long · 2025-06-16T23:31:48Z

Thanks @offamitkumar. Could you explain the __ z_ltr(Z_R0_scratch, Z_RET); change, for my curiosity? Thanks.

dean-long · 2025-06-16T23:35:43Z

Seems like arm32 has the same issue:

jdk/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp

Line 199 in 9d06057

__ emit_int32(0xDEADBEAF);

The init value shouldn't have the sticky bit set.

Thanks, I pushed a potential fix for that.

dean-long · 2025-06-17T00:11:02Z

Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance.

For ZGC I am using a per-nmethod lock:
ZLocker locker(ZNMethod::lock_for_nmethod(nm));

I don't know what benchmarks to run to check the performance for functions like Deoptimization::deoptimize_all_marked, so I welcome any help with this.

One possible optimization that might help is skipping the lock if the make_not_entrant call is done during a safepoint.

offamitkumar · 2025-06-17T03:44:05Z

Thanks @offamitkumar. Could you explain the __ z_ltr(Z_R0_scratch, Z_RET); change, for my curiosity? Thanks.

ltr instruction stands for "load and test" (32 bit). Initially we were loading the value from Z_RET to Z_R0_scratch and then it will be compared against 0. But in this case there is no requirement of loading the value in Z_R0, as it's not being used further. So we can load the value again in Z_RET and the compare it against 0. There is nothing wrong in previous solution, it's just killing Z_R0 for nothing.

TheRealMDoerr · 2025-06-17T12:50:18Z

Seems like arm32 has the same issue:

jdk/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp

Line 199 in 9d06057

__ emit_int32(0xDEADBEAF);

The init value shouldn't have the sticky bit set.

Thanks, I pushed a potential fix for that.

Unfortunately, 0xBEAFDEAD also has the MSB set. Shouldn't we better use 0 like on all other platforms?

dean-long · 2025-06-17T20:52:29Z

Seems like arm32 has the same issue:

jdk/src/hotspot/cpu/arm/gc/shared/barrierSetAssembler_arm.cpp

Line 199 in 9d06057

__ emit_int32(0xDEADBEAF);

The init value shouldn't have the sticky bit set.

Thanks, I pushed a potential fix for that.

Unfortunately, 0xBEAFDEAD also has the MSB set. Shouldn't we better use 0 like on all other platforms?

Oops, I was tripped up trying to be clever. Yes, I'm fine with using 0. I'll fix it.

TheRealMDoerr

Recent changes look good. Thanks!

dean-long added 2 commits June 10, 2025 20:35

wip

f2c9ab6

cleanup

b20fb26

openjdk bot added the rfr Pull request is ready for review label Jun 12, 2025

openjdk bot added hotspot hotspot-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels Jun 12, 2025

removed stale comment

824f1d6

... and stale code

0780d15

openjdk bot added hotspot-gc hotspot-gc-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org labels Jun 12, 2025

TheRealMDoerr suggested changes Jun 12, 2025

View reviewed changes

src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp Show resolved Hide resolved

src/hotspot/share/gc/z/zBarrierSetNMethod.cpp Outdated Show resolved Hide resolved

src/hotspot/share/gc/shared/barrierSetNMethod.cpp Outdated Show resolved Hide resolved

dean-long added 6 commits June 12, 2025 12:14

more cleanup

5a5dd6d

Fix Shenandoah build

924e87c

remove trailing space

726bddf

TheRealMDoerr suggestions

88c1d72

more TheRealMDoerr suggestions

40b9229

more cleanup

0950fc5

TheRealMDoerr reviewed Jun 13, 2025

View reviewed changes

remove is_sigill_not_entrant

c1ebde0

TheRealMDoerr approved these changes Jun 14, 2025

View reviewed changes

openjdk bot added the ready Pull request is ready to be integrated label Jun 14, 2025

fisk reviewed Jun 16, 2025

View reviewed changes

s390 fix courtesy of Amit Kumar

a7d784b

openjdk bot removed the ready Pull request is ready to be integrated label Jun 16, 2025

arm32 fix

c98f386

rename arm_with to guard_with

3ac6dec

2nd try at arm fix

3e306dd

TheRealMDoerr approved these changes Jun 18, 2025

View reviewed changes

openjdk bot added the ready Pull request is ready to be integrated label Jun 18, 2025

8358821: patch_verified_entry causes problems, use nmethod entry barriers instead #25764

Are you sure you want to change the base?

8358821: patch_verified_entry causes problems, use nmethod entry barriers instead #25764

Conversation

dean-long commented Jun 12, 2025 • edited by openjdk bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Progress

Issue

Reviewers

Contributors

Reviewing

Uh oh!

bridgekeeper bot commented Jun 12, 2025

Uh oh!

openjdk bot commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openjdk bot commented Jun 12, 2025

Uh oh!

mlbridge bot commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Webrevs

Uh oh!

dean-long commented Jun 12, 2025

Uh oh!

openjdk bot commented Jun 12, 2025

Uh oh!

TheRealMDoerr left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dean-long commented Jun 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

openjdk bot commented Jun 12, 2025

Uh oh!

dean-long commented Jun 12, 2025

Uh oh!

TheRealMDoerr commented Jun 12, 2025

Uh oh!

openjdk bot commented Jun 12, 2025

Uh oh!

TheRealMDoerr left a comment

Choose a reason for hiding this comment

Uh oh!

TheRealMDoerr left a comment

Choose a reason for hiding this comment

Uh oh!

fisk left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

fisk Jun 16, 2025

Choose a reason for hiding this comment

Uh oh!

dean-long Jun 17, 2025

Choose a reason for hiding this comment

Uh oh!

fisk Jun 18, 2025

Choose a reason for hiding this comment

Uh oh!

offamitkumar commented Jun 16, 2025

Uh oh!

TheRealMDoerr commented Jun 16, 2025

Uh oh!

TheRealMDoerr commented Jun 16, 2025

Uh oh!

offamitkumar commented Jun 16, 2025

Uh oh!

dean-long commented Jun 16, 2025

Uh oh!

openjdk bot commented Jun 16, 2025

Uh oh!

dean-long commented Jun 16, 2025

Uh oh!

dean-long commented Jun 16, 2025

Uh oh!

dean-long commented Jun 17, 2025

dean-long commented Jun 12, 2025 •

edited by openjdk bot

Loading

openjdk bot commented Jun 12, 2025 •

edited

Loading

mlbridge bot commented Jun 12, 2025 •

edited

Loading

dean-long commented Jun 12, 2025 •

edited

Loading