
8304824: NMT should not use ThreadCritical #20852


Closed
roberttoyonaga wants to merge 10 commits into master from the JDK-8304824 branch

Conversation

roberttoyonaga
Contributor

@roberttoyonaga roberttoyonaga commented Sep 4, 2024

Summary

This PR replaces ThreadCritical with a lock specific to NMT. ThreadCritical is a big lock and is unnecessary for the purposes of NMT. I've implemented the new lock with a semaphore so that it can be used early, before VM init. There is also the possibility of adding assertions in places where we expect NMT to have synchronization. I haven't added assertions in many places yet because some code paths, such as the NMT tests, don't lock yet. I think it makes sense to close any gaps in locking in another PR, in which I can also add more assertions.

Testing:

  • hotspot_nmt
  • gtest:VirtualSpace
  • tier1

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8304824: NMT should not use ThreadCritical (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/20852/head:pull/20852
$ git checkout pull/20852

Update a local copy of the PR:
$ git checkout pull/20852
$ git pull https://git.openjdk.org/jdk.git pull/20852/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 20852

View PR using the GUI difftool:
$ git pr show -t 20852

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/20852.diff

Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Sep 4, 2024

👋 Welcome back roberttoyonaga! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Sep 4, 2024

@roberttoyonaga This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8304824: NMT should not use ThreadCritical

Reviewed-by: stuefe, dholmes, jsjolen

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 588 new commits pushed to the master branch:

  • d49f210: 8342040: Further improve entry lookup performance for multi-release JARs
  • d2e716e: 8331958: Update PC/SC Lite for Suse Linux to 2.3.0
  • a95374f: 8343101: Rework BasicTest.testTemp test cases
  • 00fe9f7: 8343100: Consolidate EmptyFolderTest and EmptyFolderPackageTest jpackage tests into single java file
  • 9f6d5b4: 8343020: (fs) Add support for SecureDirectoryStream on macOS
  • 1341b81: 8341666: FileInputStream doesn't support readAllBytes() or readNBytes(int) on pseudo devices
  • 52382e2: 8338021: Support new unsigned and saturating vector operators in VectorAPI
  • e659d9d: 8342975: C2: Micro-optimize PhaseIdealLoop::Dominators()
  • 9f6211b: 8341371: CDS cannot load archived heap objects with -XX:+UseSerialGC -XX:-UseCompressedOops
  • 120a935: 8342561: Metaspace for generated reflection classes is no longer needed
  • ... and 578 more: https://git.openjdk.org/jdk/compare/f0ae90f30c346544e87217ef1832d6a350fe1985...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dholmes-ora, @gerard-ziemski, @jdksjolen, @tstuefe) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot changed the title from 8304824 to 8304824: NMT should not use ThreadCritical Sep 4, 2024
@openjdk

openjdk bot commented Sep 4, 2024

@roberttoyonaga To determine the appropriate audience for reviewing this pull request, one or more labels corresponding to different subsystems will normally be applied automatically. However, no automatic labelling rule matches the changes in this pull request. In order to have an "RFR" email sent to the correct mailing list, you will need to add one or more applicable labels manually using the /label pull request command.

Applicable Labels
  • build
  • client
  • compiler
  • core-libs
  • graal
  • hotspot
  • hotspot-compiler
  • hotspot-gc
  • hotspot-jfr
  • hotspot-runtime
  • i18n
  • ide-support
  • javadoc
  • jdk
  • jmx
  • kulla
  • net
  • nio
  • security
  • serviceability
  • shenandoah

@roberttoyonaga
Contributor Author

/label serviceability

@openjdk openjdk bot added the serviceability serviceability-dev@openjdk.org label Sep 4, 2024
@openjdk

openjdk bot commented Sep 4, 2024

@roberttoyonaga
The serviceability label was successfully added.

@roberttoyonaga roberttoyonaga force-pushed the JDK-8304824 branch 2 times, most recently from 7ac88d2 to a9731e2 Compare September 4, 2024 23:24
@roberttoyonaga
Contributor Author

The GHA macos-aarch64 / test - Test (tier1) job is failing in runtime/cds/DeterministicDump.java with the message java.lang.RuntimeException: File content different at byte #4, b0 = -126, b1 = -52.
But I don't think it's related to the changes in this PR; the same test is also failing on other recent PRs.

@roberttoyonaga roberttoyonaga marked this pull request as ready for review September 5, 2024 21:25
@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 5, 2024
@mlbridge

mlbridge bot commented Sep 5, 2024

Member

@dholmes-ora dholmes-ora left a comment


Okay in principle except for the reentrancy aspect - see below.

A few other minor nits.

Thanks

Comment on lines 39 to 40
Semaphore NmtGuard::_nmt_semaphore(1);
intx volatile NmtGuard::_owner((intx) -1);
Member

Please add explanatory comments.

Comment on lines 32 to 34
#include "runtime/os.hpp"
#include "runtime/atomic.hpp"
#include "runtime/javaThread.hpp"
Member

Please list in alphabetic order.

NmtGuard() {
intx const current = os::current_thread_id();
intx const owner = Atomic::load(&_owner);
assert(current != owner, "Lock is not reentrant");
Member
The reason will be displayed to describe this comment to others. Learn more.

ThreadCritical is reentrant though. We would need to do a lot of testing and/or code analysis to ensure we don't have the possibility of needing a reentrant lock with NMT. The most likely problems would be if we triggered a crash whilst in a locked section of code.

Member
The reason will be displayed to describe this comment to others. Learn more.

We already have a healthy number of tests. We need wide use of assert_locked and need to take a close look at error reporting. Locking in NMT is rather simple.

Error handling: My pragmatic "good enough for this rare scenario" proposal would be:

At the entry of error reporting, if the NMT mutex is locked by the crashing thread, just unlock it manually and try to continue. The chance is high that this will actually work well enough to survive the few mallocs we may do during error reporting.

Reasoning: We use the NMT mutex as we did ThreadCritical: rather widely scoped. The time windows where things are out of sync and reentrant use of the same functionality can bite us are small. Moreover, NMT's two subsystems - malloc tracker and mmap tracker - are mostly independent of each other, and even leaving the mmap tracker in an inconsistent state will not prevent us from doing malloc during error reporting. We don't use mmap during error reporting, just malloc. Finally, NMT's malloc code is comparatively simple and stable; the mmap code is more complex, and we plan to rework parts of it in the near future.

TL;DR I think just manually releasing the lock in error reporting has a high chance of succeeding. Especially if we disable the NMT report in the hs-err file I am not even sure this is necessary, though. We have safeguards against error reporting locking up (step timeouts and secondary crash guards).

Member

Thinking through this more, we can probably shorten my advice to this:

When entering error handling, if NMT lock is held by crashing thread and NMT level is detail, switch back to NMT level summary.

That's it.

Reasoning: we only do mallocs inside hs-err generation. For malloc, we only lock inside NMT if the NMT level is detail. And this should never change: malloc is too hot to justify a lock in summary mode.

The only point where possible crashes or deadlocks can occur in error reporting is when doing the NMT report itself - but chances for this to happen are remote, and we have secondary crash handling and error step timeouts to deal with this if it happens.

Contributor Author

Ok I've added some logic in VMError::report to switch to summary mode (I think this is the right place). However, if we need to do error reporting while the NmtGuard is already held, then I think we might still have reentrancy even in summary mode. After the latest commit that replaces MemoryFileTracker's locker with NmtGuard, MemBaseline::baseline_summary() now acquires NmtGuard too.

Because of this I've also made NmtGuard reentrant. Maybe handling reentrancy would be beneficial in future/unforeseen situations as well. But if this is unnecessary, I don't mind removing it.

Comment on lines 166 to 168
intx const current = os::current_thread_id();
intx const owner = Atomic::load(&_owner);
assert(current == owner, "NMT lock should be acquired in this section.");
Member

Please put the entire body in ifdef DEBUG so we don't execute anything in release builds.

Also the assert should report the values of owner and current when they are not as expected.

@tstuefe
Member

tstuefe commented Sep 6, 2024

I plan to look at this start of next week.

@jdksjolen
Contributor

Thank you Robert.

I think that the MemoryFileTracker's locker should probably also be replaced with this semaphore, as in the future we have plans to have a globally shared NativeCallStackStorage.

@tstuefe
Member

tstuefe commented Sep 6, 2024

Thank you Robert.

I think that the MemoryFileTracker's locker should probably also be replaced with this semaphore, as in the future we have plans to have a globally shared NativeCallStackStorage.

+1

@tstuefe
Member

tstuefe commented Sep 6, 2024

@roberttoyonaga Why don't we use a normal hotspot mutex here?

Comment on lines 165 to 168
static void assert_locked(){
intx const current = os::current_thread_id();
intx const owner = Atomic::load(&_owner);
assert(current == owner, "NMT lock should be acquired in this section.");


How about we try to re-use assert_locked() in constructor and destructor, something like:

    NmtGuard() {
      assert(!assert_locked(), "Lock is not reentrant");
      intx const current = os::current_thread_id();
      _nmt_semaphore.wait();
      Atomic::store(&_owner, current);
    }

    ~NmtGuard() {
      assert(assert_locked(), "NMT lock should be acquired in this section.");
      Atomic::store(&_owner, (intx) -1);
      _nmt_semaphore.signal();
    }

    static bool assert_locked(){
      intx const current = os::current_thread_id();
      intx const owner = Atomic::load(&_owner);
      return (current == owner);
    }

Contributor Author

I think it's a good idea to use it in the destructor. But based on the discussion above, it seems like we shouldn't assert the lock can't be reentered.

@roberttoyonaga
Contributor Author

@roberttoyonaga Why don't we use a normal hotspot mutex here?

Hi @tstuefe. I tried using a regular mutex along with MutexLocker, but it seems like the mutex isn't initialized early enough to be used during VM init. After building I get:

assert(NMT_lock != nullptr) failed: not initialized!

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x178448c]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x948  (nmtCommon.hpp:137)
V  [libjvm.so+0x1783aed]  (vmError.cpp:1611)
V  [libjvm.so+0xa8e2d2]  report_vm_status_error(char const*, int, char const*, int, char const*)+0x0  (debug.cpp:193)
V  [libjvm.so+0x8cb317]  NMTUtil::nmt_lock()+0x69  (nmtCommon.hpp:137)
V  [libjvm.so+0x13a0459]  MemTracker::record_virtual_memory_reserve(void*, unsigned long, NativeCallStack const&, MEMFLAGS)+0x36  (memTracker.hpp:128)
V  [libjvm.so+0x139d292]  os::reserve_memory(unsigned long, bool, MEMFLAGS)+0x80  (os.cpp:1877)
V  [libjvm.so+0xa8fc31]  initialize_assert_poison()+0x1f  (debug.cpp:712)
V  [libjvm.so+0x16e5720]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x1be  (threads.cpp:489)
V  [libjvm.so+0xf62555]  JNI_CreateJavaVM_inner(JavaVM_**, void**, void*)+0xbd  (jni.cpp:3582)
V  [libjvm.so+0xf629e3]  JNI_CreateJavaVM+0x32  (jni.cpp:3673)
C  [libjli.so+0x8c04]  InitializeJVM+0x14d  (java.c:1592)
C  [libjli.so+0x5337]  JavaMain+0xe8  (java.c:505)
C  [libjli.so+0xc61e]  ThreadJavaMain+0x27  (java_md.c:653)
C  [libc.so.6+0x97507]  start_thread+0x377

@roberttoyonaga
Contributor Author

Thank you Robert.

I think that the MemoryFileTracker's locker should probably also be replaced with this semaphore, as in the future we have plans to have a globally shared NativeCallStackStorage.

Hi @jdksjolen. No problem! Ok I've replaced it.

@tstuefe
Member

tstuefe commented Sep 11, 2024

@roberttoyonaga Why don't we use a normal hotspot mutex here?

Hi @tstuefe. I tried using a regular mutex along with MutexLocker, but it seems like the mutex isn't initialized early enough to be used during VM init. After building I get:

assert(NMT_lock != nullptr) failed: not initialized!

Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x178448c]  VMError::report_and_die(int, char const*, char const*, __va_list_tag*, Thread*, unsigned char*, void*, void*, char const*, int, unsigned long)+0x948  (nmtCommon.hpp:137)
V  [libjvm.so+0x1783aed]  (vmError.cpp:1611)
V  [libjvm.so+0xa8e2d2]  report_vm_status_error(char const*, int, char const*, int, char const*)+0x0  (debug.cpp:193)
V  [libjvm.so+0x8cb317]  NMTUtil::nmt_lock()+0x69  (nmtCommon.hpp:137)
V  [libjvm.so+0x13a0459]  MemTracker::record_virtual_memory_reserve(void*, unsigned long, NativeCallStack const&, MEMFLAGS)+0x36  (memTracker.hpp:128)
V  [libjvm.so+0x139d292]  os::reserve_memory(unsigned long, bool, MEMFLAGS)+0x80  (os.cpp:1877)
V  [libjvm.so+0xa8fc31]  initialize_assert_poison()+0x1f  (debug.cpp:712)
V  [libjvm.so+0x16e5720]  Threads::create_vm(JavaVMInitArgs*, bool*)+0x1be  (threads.cpp:489)
V  [libjvm.so+0xf62555]  JNI_CreateJavaVM_inner(JavaVM_**, void**, void*)+0xbd  (jni.cpp:3582)
V  [libjvm.so+0xf629e3]  JNI_CreateJavaVM+0x32  (jni.cpp:3673)
C  [libjli.so+0x8c04]  InitializeJVM+0x14d  (java.c:1592)
C  [libjli.so+0x5337]  JavaMain+0xe8  (java.c:505)
C  [libjli.so+0xc61e]  ThreadJavaMain+0x27  (java_md.c:653)
C  [libc.so.6+0x97507]  start_thread+0x377

Hi @roberttoyonaga,

sorry for the sluggish responses, I am swamped atm. Thanks a lot for your work here.

I dislike the self-written lock because of duplication and because Mutex gives us infrastructure benefits we don't have here. E.g. locking monitoring, deadlock detection, and things like assert_lock_strong.

You run into this problem at the initialization phase. I expected that, and here is where things get interesting - here and in deadlock detection.

I see two better solutions:

  1. During initialization, we don't need a lock. We are single-threaded (note that it may be different threads at different times, since different threads may handle CreateJavaVM and the initial invocation, but only one thread will run simultaneously). In any case, before the mutex system is initialized and the Mutexes are created, we don't need to lock.

So, you could use Mutex and MutexLocker but short-circuit locking if the mutex is still zero. This should work and should be lock/unlock balanced too. I believe MutexLocker already handles this case under the hood. You can call it with a nullptr mutex, and it will not lock. So you don't even have to do anything. Just don't assert.

  2. That place where we mmap is a rare (and I think probably the only) case where we mmap before mutex initialization. Pretty sure at least. This is just for poisoning the assertion page. This code could be moved downward a bit, since the assert poisoning is not essential for getting assertions to work, just a nice-to-have for added error info. That info does not really matter here; it's more for assertions in compiled code.

I prefer (1), since then, locking would work as it does in all other places in hotspot.

@roberttoyonaga
Contributor Author

Hi @tstuefe,

I prefer (1), since then, locking would work as it does in all other places in hotspot.

Ok I've switched to using a regular hotspot Mutex instead of my own lock based on a semaphore.

  • I had to decrease the rank of SharedDecoder_lock in order for it to be acquired after NMT_lock inside MemTracker::print_containing_region
  • I needed to add a check for Thread::current_or_null() != nullptr in the constructor/destructor for MutexLockerImpl because, during VM init, NMT_lock seems to be used by detached threads.
  • I'm still a little concerned about possible reentrancy due to the previous comment above so I manually unlock NMT_lock inside error reporting in addition to reducing tracking to summary level.

@roberttoyonaga
Contributor Author

After switching to a Hotspot Mutex, it looks like the windows-x64 / test (hs/tier1 common) GHA is failing because the test release_bad_ranges in test_os.cpp is expecting an assertion and an error message to be printed. However, when this printing happens, tty_lock gets acquired out of rank order with the already held NMT_lock, causing the lock rank assertion fail. One solution would be to lower the rank of tty_lock. I'm not sure that's the best solution because that might cause rank conflicts with other locks (and it makes sense to give locks the highest possible rank to minimize future conflicts).

@tstuefe
Member

tstuefe commented Sep 13, 2024

After switching to a Hotspot Mutex, it looks like the windows-x64 / test (hs/tier1 common) GHA is failing because the test release_bad_ranges in test_os.cpp is expecting an assertion and an error message to be printed. However, when this printing happens, tty_lock gets acquired out of rank order with the already held NMT_lock, causing the lock rank assertion fail. One solution would be to lower the rank of tty_lock. I'm not sure that's the best solution because that might cause rank conflicts with other locks (and it makes sense to give locks the highest possible rank to minimize future conflicts).

What code exactly locks tty_lock?

@tstuefe
Member

tstuefe commented Sep 13, 2024

After switching to a Hotspot Mutex, it looks like the windows-x64 / test (hs/tier1 common) GHA is failing because the test release_bad_ranges in test_os.cpp is expecting an assertion and an error message to be printed. However, when this printing happens, tty_lock gets acquired out of rank order with the already held NMT_lock, causing the lock rank assertion fail. One solution would be to lower the rank of tty_lock. I'm not sure that's the best solution because that might cause rank conflicts with other locks (and it makes sense to give locks the highest possible rank to minimize future conflicts).

What code exactly locks tty_lock?

This is weird. We print the error in this case via print_error_for_unit_test(), which takes care only to use raw fprintf(stderr) for this very reason... therefore I am interested in understanding the lock semantics, to know if deadlock detection found a real problem or if this is just a weird error setup.

Member

@tstuefe tstuefe left a comment


Good!

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 28, 2024
@roberttoyonaga
Contributor Author

Thank you everyone for the review!
/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Oct 28, 2024
@openjdk

openjdk bot commented Oct 28, 2024

@roberttoyonaga
Your change (at version 976e26c) is now ready to be sponsored by a Committer.

@tstuefe
Member

tstuefe commented Oct 29, 2024

/sponsor

@openjdk

openjdk bot commented Oct 29, 2024

Going to push as commit 0abfa3b.
Since your change was applied there have been 593 commits pushed to the master branch:

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Oct 29, 2024
@openjdk openjdk bot closed this Oct 29, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Oct 29, 2024
@openjdk

openjdk bot commented Oct 29, 2024

@tstuefe @roberttoyonaga Pushed as commit 0abfa3b.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@jdksjolen
Contributor

Hi @tstuefe, @roberttoyonaga

We saw some test failures in non-generational ZGC due to this change. That GC is removed now, so there's no worry there. I did however look at all of our remaining usages of ThreadCritical and saw that there are three left that are related to NMT:

src/hotspot/share/nmt/nmtUsage.cpp
56     ThreadCritical tc;

src/hotspot/share/nmt/mallocTracker.cpp
65     // Use ThreadCritical to make sure that mtChunks don't get deallocated while the
68     ThreadCritical tc;


src/hotspot/share/memory/arena.cpp
178      ThreadCritical tc;  // Free chunks under TC lock so that NMT adjustment is stable.

I am not that familiar with this code. Was leaving these as they are intentional, or have we introduced a potential bug?

Thank you.

@roberttoyonaga
Contributor Author

Hi @jdksjolen, I originally replaced those instances, but then put them back in because I saw that they are related to chunks allocation/deallocation code where it seemed ThreadCritical was used for other reasons.

I'm not certain, but looking at it again, it seems that the ThreadCritical uses in ChunkPool::deallocate_chunk and ChunkPool::prune() are only needed for NMT and are independent of the other ThreadCritical uses in that code.

@roberttoyonaga
Contributor Author

roberttoyonaga commented Oct 30, 2024

Also, those uses of ThreadCritical are not protecting NMT's model of virtual memory. They are only protecting the reporting of malloc info specifically. Although, it may still make sense to replace those instances with NmtVirtualMemory_lock since it is still related to NMT.

What do you think @tstuefe?

@jdksjolen
Contributor

I'm not certain, but looking at it again, it seems that the ThreadCritical uses in ChunkPool::deallocate_chunk and ChunkPool::prune() are only needed for NMT and are independent of the other ThreadCritical uses in that code.

At least for prune, that's needed for the chunk pool itself as it would otherwise be accessed concurrently.

Also, those uses of ThreadCritical are not protecting NMT's model of virtual memory. They are only protecting the reporting of malloc info specifically. Although, it may still make sense to replace those instances with NmtVirtualMemory_lock since it's is still related to NMT.

OK, I understand. That seems reasonable.

@tstuefe
Member

tstuefe commented Oct 30, 2024

We should get rid of this awful value correcting for arenas. I get headaches trying to think about this every time.

To remind everyone including me, some of the stuff is explained here: complex .

About locking around os::free - not sure if we can do this. Are all paths inside os::free lock free? It does call NMT. I think all cases inside NMT that it touches are lock free, but am not sure. Have to think about this a bit.

@tstuefe
Member

tstuefe commented Oct 30, 2024

About locking around os::free - not sure if we can do this. Are all paths inside os::free lock free? It does call NMT. I think all cases inside NMT that touches are lock free, but am not sure. Have to think about this a bit.

Ah okay should be fine. Both mallocs and frees should be lock free in NMT. In summary mode we just modify counters. In detail mode, we add to the malloc site table, but that one is lock free as well.

@roberttoyonaga
Contributor Author

roberttoyonaga commented Oct 30, 2024

Ok so there shouldn't be reentrancy if we use the NmtVirtualMemory_lock around os::free. But we also free in ChunkPool::prune() :

I'm not certain, but looking at it again, it seems that the ThreadCritical uses in ChunkPool::deallocate_chunk and ChunkPool::prune() are only needed for NMT and are independent of the other ThreadCritical uses in that code.

At least for prune, that's needed for the chunk pool itself as it would otherwise be accessed concurrently.

So that means we'd need to have both ThreadCritical and NmtVirtualMemory_lock in that method (if we were to do the other replacements). One to protect the chunks and one to protect the malloc accounting. It might also be good to rename NmtVirtualMemory_lock then too.

@tstuefe
Member

tstuefe commented Oct 31, 2024

So that means we'd need to have both ThreadCritical and NmtVirtualMemory_lock in that method (if we were to do the other replacements). One to protect the chunks and one to protect the malloc accounting. It might also be good to rename NmtVirtualMemory_lock then too.

Please split this code in two parts. First part removes retired chunks from the arena chunk list and adds them to a new temporary list. Second part goes through the temp list and deletes the chunks. First part under ThreadCritical, second part under NmtVirtualMemory_lock.

@tstuefe
Member

tstuefe commented Oct 31, 2024

I had to analyze this again, to understand why we need this locking, since my mind slips.

I updated my findings in https://bugs.openjdk.org/browse/JDK-8325890 . Please see details there.

But I don't think the current locking makes much sense, tbh (even before this patch). Or I don't get it.

We have three counters in play here:
A) the malloc[mtChunk] malloc counter for the mtChunk tag
B) the global malloc counter
C) the arena[xxx] arena counter for the mtXXX tag the arena is tagged with

Upon arena growth, we increment all three. In two steps - first (A) and (B) under os::malloc, then (C) in the arena code.

Upon arena death, we may delete the chunk, in which case we adjust all three counters as well.

But since we don't want (B) to include arena sizes, we have several places where we lazily adjust (B) and (A) from the sum of all arena counters (see JBS issue). These adjustments must be synchronized with arena allocations. But we don't do this. The lock does not include (C). For that, the lock would have to be at a different place, inside arena code. We only lock around (A) and (B).

I am not sure what the point of this whole thing is. I am also not convinced that it always works correctly. This whole "updating counters lazily" business makes the code brittle.

Also, the current locking does not guarantee correctness anyway. There are several places in the coding where we calculate e.g. the sum of all mallocs, but malloc counters are modified (rightly so) out of lock protection.

To make matters even more complex, we have not two locks here but three:

  • ThreadCritical
  • the new NmtVirtualMemory_lock
  • we also have the NMTQuery_lock, which guards NMT reports from jcmd exclusively

Not sure what the point of the latter even is. It certainly does not prevent us from getting reports while underlying data structures are being changed, unless I am overlooking something.

It would be very nice if someone other than me could analyze this.

@jdksjolen
Contributor

I had to analyze this again, to understand why we need this locking, since my mind slips.

I updated my findings in https://bugs.openjdk.org/browse/JDK-8325890 . Please see details there.

But I don't think the current locking makes much sense, tbh (even before this patch). Or I don't get it.

We have three counters in play here: A) the malloc[mtChunk] malloc counter for the mtChunk tag B) the global malloc counter C) the arena[xxx] arena counter for the mtXXX tag the arena is tagged with

Upon arena growth, we increment all three. In two steps - first (A) and (B) under os::malloc, then (C) in the arena code.

Upon arena death, we may delete the chunk, in which case we adjust all three counters as well.

But since we don't want (B) to include arena sizes, we have several places where we lazily adjust (B) and (A) from the sum of all arena counters (see JBS issue). These adjustments must be synchronized with arena allocations. But we don't do this. The lock does not include (C). For that, the lock would have to be at a different place, inside arena code. We only lock around (A) and (B).

I am not sure what the point of this whole thing is. I am also not convinced that it always works correctly. This whole "updating counters lazily" business makes the code brittle.

Also, the current locking does not guarantee correctness anyway. There are several places in the coding where we calculate e.g. the sum of all mallocs, but malloc counters are modified (rightly so) out of lock protection.

To make matters even more complex, we have not two locks here but three:

* ThreadCritical

* the new `NmtVirtualMemory_lock`

* we also have the `NMTQuery_lock`, which guards NMT reports from jcmd exclusively

Not sure what the point of the latter even is. It certainly does not prevent us from getting reports while underlying data structures are being changed, unless I am overlooking something.

It would be very nice if someone other than me could analyze this.

If this is so brittle and complex, then I wonder what you even get out of us doing the ThreadCritical trick. In other words, if we just removed it, would anyone notice and be able to discern what's occurred? Open question, I might do that change and run the tests on it.

@tstuefe
Member

tstuefe commented Oct 31, 2024

If this is so brittle and complex, then I wonder what you even get out of us doing the ThreadCritical trick. In other words, if we just removed it, would anyone notice and be able to discern what's occurred? Open question, I might do that change and run the tests on it.

@jdksjolen good question. I am not sure who introduced that, may have been Afshin as a fix for some NMT test.

@roberttoyonaga
Contributor Author

roberttoyonaga commented Oct 31, 2024

Hi @tstuefe,

But since we don't want (B) to include arena sizes, we have several places where we lazily adjust (B) and (A) from the sum of all arena counters (see JBS issue). These adjustments must be synchronized with arena allocations. But we don't do this. The lock does not include (C). For that, the lock would have to be at a different place, inside arena code

Yes, updating all three counters (A/B/C) is not atomic when growing or destroying an arena. I think the ThreadCritical that's meant to keep the accounting in sync needs to be further up the stack when destroying arenas. What's strange is that there's no ThreadCritical protecting the growing operation at all. Why was there only an attempt to protect the freeing?

I think this is also confusing because it's not consistent whether we count arena sizes as part of malloc. In the NMT report total, arena is counted toward malloc (so we have to subtract mtChunk). But in the reported individual categories, arena is disjoint from malloc.

The solution you mentioned in https://bugs.openjdk.org/browse/JDK-8325890 seems like it would solve this consistency problem while also avoiding the double counting problem with mtChunk. I guess the main problem with this is that the change will break things if people are relying on the reported arena values to not overlap with malloc.

@roberttoyonaga
Contributor Author

The easiest action to take would be to remove the ThreadCritical that attempts to keep NMT accounting in sync in arena.cpp. It doesn't actually cover all cases, so we might as well not bother locking at all, for slightly better performance and a reduced risk of bad lock ordering.

The next best solution would be to make arena size a subset of malloc size, and make mtChunk only responsible for free chunks not yet associated with other NMT categories. Then there would be no need for special adjustments when reporting, and thus no need for the special ThreadCritical locking here. And when calculating total sizes, we can simplify the calculation to summing up all "malloc" in MemoryCounter. No need to make a distinction between when it is/isn't appropriate to add on arena sizes. This would be a breaking change since it changes the meaning of arena, malloc, and mtChunk.

The third and highest effort solution would be to remove the chunk pooling altogether as suggested in https://bugs.openjdk.org/browse/JDK-8325890. This shares the benefits of solution 2, but also requires performance testing and has consequences that extend beyond NMT.

Also, the current locking does not guarantee correctness anyway. There are several places in the coding where we calculate e.g. the sum of all mallocs, but malloc counters are modified (rightly so) out of lock protection.

Yes, still none of the three above solutions solve this. Even with solution 2 or 3 it would be possible to update the arena sizes for each category concurrently with reporting (unless we introduce new locking).

@@ -28,6 +28,7 @@
#include "memory/allocation.hpp"
#include "runtime/flags/flagSetting.hpp"
#include "runtime/mutex.hpp"
#include "runtime/thread.hpp"

@mgronlun mgronlun Nov 1, 2024


This include is not needed because there are no uses that require the definition of Thread.

After pulling, I am getting circularity errors: runtime/thread.hpp -> jfr/support/jfrThreadLocal.hpp, which now needs runtime/mutexLocker.hpp (-> runtime/thread.hpp).

The include of "runtime/thread.hpp" can instead be placed in "nmt/nmtCommon.hpp" which has a use in the NmtVirtualMemoryLocker constructor.

Contributor


This include is not needed because there are no uses that require the definition of Thread.

Right, seems like the forward declaration used to be provided by memory/allocation.hpp. Let's get rid of the include and use a forward declaration of our own instead.

Contributor Author


Should I open another PR to remove it? I can also remove the locking related to NMT in arena.cpp in the same PR as well (as per this comment).

Contributor


I've fixed that particular bug and integrated it already :-).

Labels: integrated (Pull request has been integrated), serviceability (serviceability-dev@openjdk.org)