Skip to content

8358821: patch_verified_entry causes problems, use nmethod entry barriers instead #25764

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

dean-long
Copy link
Member

@dean-long dean-long commented Jun 12, 2025

This PR removes patching of the verified entry point and related code, and replaces it by refactoring the existing nmethod entry barrier.

We used to patch the verified entry point to make sure it was not_entrant. The patched entry point then redirected to SharedRuntime::handle_wrong_method(), either directly with a jump to a stub, or indirectly with an illegal instruction and the help of the signal handler. The not_entrant state is a final state, so once an nmethod becomes not_entrant, it stays not_entrant. We can do the same thing with a permanently armed nmethod entry barrier.

The solution I went with reserves one bit of the entry barrier guard value. This bit must remain set, so I call it a "sticky" bit. Setting the guard value now is effectively like setting a bitfield, so I needed to add a lock around it. The alternative would be to change the platform-specific code to do compare-and-swap.

For the lock, I introduced a new NMethodEntryBarrier_lock, whose only purpose is to make the update to the guard value atomic. For ZGC, I decided to use the existing per-nmethod lock ZNMethod::lock_for_nmethod(). I suspect we could do the same for Shenandoah, if needed for performance.

This change also makes it a bit clearer that the nmethod entry barrier effectively has two levels. Level 0 is the outer level or layer controlled by BarrierSetNMethod::nmethod_stub_entry_barrier(), and the inner layer controlled by BarrierSetNMethod::nmethod_entry_barrier(). This could be generalized if we decide we need more flavors of entry barriers. The inner barrier is mostly ignorant of the fact that the outer guard is multiplexing for both levels.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8358821: patch_verified_entry causes problems, use nmethod entry barriers instead (Bug - P3)

Reviewers

Contributors

  • Martin Doerr <mdoerr@openjdk.org>
  • Amit Kumar <amitkumar@openjdk.org>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25764/head:pull/25764
$ git checkout pull/25764

Update a local copy of the PR:
$ git checkout pull/25764
$ git pull https://git.openjdk.org/jdk.git pull/25764/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25764

View PR using the GUI difftool:
$ git pr show -t 25764

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25764.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented Jun 12, 2025

👋 Welcome back dlong! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Jun 12, 2025

@dean-long This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8358821: patch_verified_entry causes problems, use nmethod entry barriers instead

Co-authored-by: Martin Doerr <mdoerr@openjdk.org>
Co-authored-by: Amit Kumar <amitkumar@openjdk.org>
Reviewed-by: mdoerr

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 94 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr Pull request is ready for review label Jun 12, 2025
@openjdk
Copy link

openjdk bot commented Jun 12, 2025

@dean-long The following labels will be automatically applied to this pull request:

  • hotspot
  • shenandoah

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added hotspot hotspot-dev@openjdk.org shenandoah shenandoah-dev@openjdk.org labels Jun 12, 2025
@mlbridge
Copy link

mlbridge bot commented Jun 12, 2025

@dean-long
Copy link
Member Author

/label hotspot-gc hotspot-runtime

@openjdk openjdk bot added hotspot-gc hotspot-gc-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org labels Jun 12, 2025
@openjdk
Copy link

openjdk bot commented Jun 12, 2025

@dean-long
The hotspot-gc label was successfully added.

The hotspot-runtime label was successfully added.

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate this PR. I like using the nmethod entry barrier and getting rid of not-entrant patching.
I have some change requests. In addition, we can get rid of much more code. I've put my proposal in a Commit: TheRealMDoerr@4aed569
Please take a look!

@dean-long
Copy link
Member Author

dean-long commented Jun 12, 2025

/contributor add @TheRealMDoerr
/contributor add mdoerr

@openjdk
Copy link

openjdk bot commented Jun 12, 2025

@dean-long TheRealMDoerr was not found in the census.

Syntax: /contributor (add|remove) [@user | openjdk-user | Full Name <email@address>]. For example:

  • /contributor add @openjdk-bot
  • /contributor add duke
  • /contributor add J. Duke <duke@openjdk.org>

User names can only be used for users in the census associated with this repository. For other contributors you need to supply the full name and email address.

@dean-long
Copy link
Member Author

Thanks @TheRealMDoer for the suggestions, I incorporated all of them, however for shenandoahCodeRoots.cpp I decided to make arm() public, so the caller code doesn't need to know about the magic value 0.

@TheRealMDoerr
Copy link
Contributor

Thanks @TheRealMDoer for the suggestions, I incorporated all of them, however for shenandoahCodeRoots.cpp I decided to make arm() public, so the caller code doesn't need to know about the magic value 0.

Excellent! Thank you! I'll take another look over it and run tests.

@openjdk
Copy link

openjdk bot commented Jun 12, 2025

@dean-long
Contributor Martin Doerr <mdoerr@openjdk.org> successfully added.

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, now. I think is_sigill_not_entrant should also be removed completely: TheRealMDoerr@f61f54a
I've run tests on some platforms, but will run on more.
@offamitkumar, @RealFYang, @bulasevich: You may also want to check your platforms.
@fisk: You may want to take a look at the nmethod entry barrier changes.

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 14, 2025
Copy link
Contributor

@fisk fisk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this! I have been wanting something like this for a while too and it looks great. I have some comments though...

ZLocker<ZReentrantLock> locker(ZNMethod::lock_for_nmethod(nm));
// Preserve the sticky bit
if (is_not_entrant(nm)) {
value |= not_entrant;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to have a race where another thread sets an nmethod to not entrant and the thread calling this making the nmethod entry barrier not entrant?

If this was called to disarm a method and then enter it, it seems a bit sneaky in that case that we pass the nmethod entry barrier even though we under the lock see that it is not entrant. Probably okay but still feels like it might be more robust if the thread setting an nmethod to not entrant is always the one that arms the nmethod entry barrier.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand your concern correctly, there is no race. The only caller of BarrierSetNMethod::make_not_entrant() is nmethod::make_not_entrant(), and it is done inside a NMethodState_lock critical section. After a call to nmethod::make_not_entrant(), the nmethod entry barrier is armed and stays that way.
And by design, a disarm only disarms at the inner nmethod_entry_barrier level, not the outer nmethod_stub_entry_barrier level.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that while thread 1 calls nmethod::make_not_entrant(), thread 2 racingly performs nmethod entry barrier; it makes the is_not_entrant check before it gets updated, but then it gets updated as the per nmethod lock is taken. The GC code "disarms" the GC barrier but in doing so finds that "oh this should be not entrant", but that's sort of not reflected as thread 2 will then proceed with entering the nmethod it just armed as not entrant in the nmethod entry barrier code.
Does that make sense?

@offamitkumar
Copy link
Member

Just FYI, s390 build is broken with this change:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/amit/jdk/src/hotspot/share/gc/shared/barrierSetNMethod.cpp:196), pid=1779086, tid=1779117
#  assert(!nm->is_osr_method() || may_enter) failed: OSR nmethods should always be entrant after migration
#
# JRE version: OpenJDK Runtime Environment (26.0) (fastdebug build 26-internal-adhoc.amit.jdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 26-internal-adhoc.amit.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-s390x)
# Problematic frame:
# V  [libjvm.so+0x40b196]  BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e
#
# Core dump will be written. Default location: Core dumps may be processed with "/lib/systemd/systemd-coredump %P %u %g %s %t 9223372036854775808 %h %d" (or dumping to /home/amit/jdk/make/core.1779086)
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
#

stack trace:

Stack: [0x000003ff9e580000,0x000003ff9e680000],  sp=0x000003ff9e67b068,  free space=1004k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x40b196]  BarrierSetNMethod::nmethod_stub_entry_barrier(unsigned char**)+0x15e  (barrierSetNMethod.cpp:196)
v  ~StubRoutines::method_entry_barrier 0x000003ff9050cd18
J 282% c2 sun.nio.fs.UnixPath.initOffsets()V java.base (189 bytes) @ 0x000003ff90c4f0c8 [0x000003ff90c4f080+0x0000000000000048]
j  sun.nio.fs.UnixPath.getFileName()Lsun/nio/fs/UnixPath;+1 java.base
j  sun.nio.fs.UnixFileSystemProvider.isHidden(Ljava/nio/file/Path;)Z+6 java.base
j  java.nio.file.Files.isHidden(Ljava/nio/file/Path;)Z+5 java.base
j  jdk.internal.module.ModulePath.isHidden(Ljava/nio/file/Path;)Z+1 java.base
j  jdk.internal.module.ModulePath.lambda$explodedPackages$0(Ljava/nio/file/Path;Ljava/nio/file/attribute/BasicFileAttributes;)Z+11 java.base
j  jdk.internal.module.ModulePath$$Lambda+0x00000000a105cbe0.test(Ljava/lang/Object;Ljava/lang/Object;)Z+12 java.base
j  java.nio.file.Files.lambda$find$0(Ljava/util/function/BiPredicate;Ljava/nio/file/FileTreeWalker$Event;)Z+9 java.base
j  java.nio.file.Files$$Lambda+0x00000000a10646c0.test(Ljava/lang/Object;)Z+8 java.base
....

@TheRealMDoerr
Copy link
Contributor

@offamitkumar: The problem is probably the initialization to -1: z_cfi(Z_R0_scratch, /* to be patched */ -1);.
Should be 0.

@TheRealMDoerr
Copy link
Contributor

Seems like arm32 has the same issue:


The init value shouldn't have the sticky bit set.

@offamitkumar
Copy link
Member

@offamitkumar: The problem is probably the initialization to -1: z_cfi(Z_R0_scratch, /* to be patched */ -1);. Should be 0.

Thank you Martin for the suggestion.

@dean-long would you please add this diff, fixing s390x build. I ran tier1 test with fastdebug, test are clean;

diff --git a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
index e78906708af..2d663061aec 100644
--- a/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
+++ b/src/hotspot/cpu/s390/gc/shared/barrierSetAssembler_s390.cpp
@@ -180,7 +180,7 @@ void BarrierSetAssembler::nmethod_entry_barrier(MacroAssembler* masm) {
     __ z_lg(Z_R0_scratch, in_bytes(bs_nm->thread_disarmed_guard_value_offset()), Z_thread); // 6 bytes
 
     // Compare to current patched value:
-    __ z_cfi(Z_R0_scratch, /* to be patched */ -1); // 6 bytes (2 + 4 byte imm val)
+    __ z_cfi(Z_R0_scratch, /* to be patched */ 0); // 6 bytes (2 + 4 byte imm val)
 
     // Conditional Jump
     __ z_larl(Z_R14, (Assembler::instr_len((unsigned long)LARL_ZOPC) + Assembler::instr_len((unsigned long)BCR_ZOPC)) / 2); // 6 bytes
diff --git a/src/hotspot/cpu/s390/stubGenerator_s390.cpp b/src/hotspot/cpu/s390/stubGenerator_s390.cpp
index d3f6540a3ea..bb1d9ce6037 100644
--- a/src/hotspot/cpu/s390/stubGenerator_s390.cpp
+++ b/src/hotspot/cpu/s390/stubGenerator_s390.cpp
@@ -3197,7 +3197,7 @@ class StubGenerator: public StubCodeGenerator {
 
     // VM-Call: BarrierSetNMethod::nmethod_stub_entry_barrier(address* return_address_ptr)
     __ call_VM_leaf(CAST_FROM_FN_PTR(address, BarrierSetNMethod::nmethod_stub_entry_barrier));
-    __ z_ltr(Z_R0_scratch, Z_RET);
+    __ z_ltr(Z_RET, Z_RET);
 
     // VM-Call Epilogue
     __ restore_volatile_regs(Z_SP, frame::z_abi_160_size, true, false);

@openjdk openjdk bot removed the ready Pull request is ready to be integrated label Jun 16, 2025
@dean-long
Copy link
Member Author

/contributor add @offamitkumar

@openjdk
Copy link

openjdk bot commented Jun 16, 2025

@dean-long
Contributor Amit Kumar <amitkumar@openjdk.org> successfully added.

@dean-long
Copy link
Member Author

Thanks @offamitkumar. Could you explain the __ z_ltr(Z_R0_scratch, Z_RET); change, for my curiosity? Thanks.

@dean-long
Copy link
Member Author

Seems like arm32 has the same issue:

The init value shouldn't have the sticky bit set.

Thanks, I pushed a potential fix for that.

@dean-long
Copy link
Member Author

Tests look good on our side. I'm only a bit concerned that the lock may become a bottleneck when many Java threads need to patch all nmethods. Especially with ZGC which does that more often. I think we should check performance.

For ZGC I am using a per-nmethod lock:
ZLocker locker(ZNMethod::lock_for_nmethod(nm));

I don't know what benchmarks to run to check the performance for functions like Deoptimization::deoptimize_all_marked, so I welcome any help with this.

One possible optimization that might help is skipping the lock if the make_not_entrant call is done during a safepoint.

@offamitkumar
Copy link
Member

Thanks @offamitkumar. Could you explain the __ z_ltr(Z_R0_scratch, Z_RET); change, for my curiosity? Thanks.

ltr instruction stands for "load and test" (32 bit). Initially we were loading the value from Z_RET to Z_R0_scratch and then it will be compared against 0. But in this case there is no requirement of loading the value in Z_R0, as it's not being used further. So we can load the value again in Z_RET and the compare it against 0. There is nothing wrong in previous solution, it's just killing Z_R0 for nothing.

@TheRealMDoerr
Copy link
Contributor

Seems like arm32 has the same issue:

The init value shouldn't have the sticky bit set.

Thanks, I pushed a potential fix for that.

Unfortunately, 0xBEAFDEAD also has the MSB set. Shouldn't we better use 0 like on all other platforms?

@dean-long
Copy link
Member Author

Seems like arm32 has the same issue:

The init value shouldn't have the sticky bit set.

Thanks, I pushed a potential fix for that.

Unfortunately, 0xBEAFDEAD also has the MSB set. Shouldn't we better use 0 like on all other platforms?

Oops, I was tripped up trying to be clever. Yes, I'm fine with using 0. I'll fix it.

Copy link
Contributor

@TheRealMDoerr TheRealMDoerr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recent changes look good. Thanks!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hotspot hotspot-dev@openjdk.org hotspot-gc hotspot-gc-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org ready Pull request is ready to be integrated rfr Pull request is ready for review shenandoah shenandoah-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

4 participants