Skip to content

Conversation

@txominpelu
Copy link
Contributor

@txominpelu txominpelu commented May 30, 2024

Print the stack traces of mounted virtual threads when calling jcmd <pid> Thread.print.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8330846: Add stacks of mounted virtual threads to the HotSpot thread dump (Enhancement - P3)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/19482/head:pull/19482
$ git checkout pull/19482

Update a local copy of the PR:
$ git checkout pull/19482
$ git pull https://git.openjdk.org/jdk.git pull/19482/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 19482

View PR using the GUI difftool:
$ git pr show -t 19482

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/19482.diff

Webrev

Link to Webrev Comment

@bridgekeeper
Copy link

bridgekeeper bot commented May 30, 2024

👋 Welcome back txominpelu! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented May 30, 2024

@txominpelu This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8330846: Add stacks of mounted virtual threads to the HotSpot thread dump

Reviewed-by: dholmes, alanb

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 41 new commits pushed to the master branch:

  • 81083a0: 8299487: Test java/net/httpclient/whitebox/SSLTubeTestDriver.java timed out
  • 81ca0ec: 8334028: HttpClient: NPE thrown from assert statement
  • bd750b6: 8319933: Disable tests for JDK-8280481 on Graal
  • c80e2eb: 8333886: Explicitly specify that asSlice and reinterpret return a memory segment backed by the same region of memory.
  • a0318bc: 8334077: Fix problem list entries for compiler tests
  • a7e4ab9: 8333730: ubsan: FieldIndices/libFieldIndicesTest.cpp:276:11: runtime error: null pointer passed as argument 2, which is declared to never be null
  • abbf45b: 8332699: ubsan: jfrEventSetting.inline.hpp:31:43: runtime error: index 163 out of bounds for type 'jfrNativeEventSetting [162]'
  • bd046d9: 8222884: ConcurrentClassDescLookup.java times out intermittently
  • 1c80ddb: 8333940: Ensure javax/swing/TestUngrab.java run on all platforms
  • a7205cc: 8333926: Shenandoah: Lower default immediate garbage threshold
  • ... and 31 more: https://git.openjdk.org/jdk/compare/a941397327972f130e683167a1b429f17603df46...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@dholmes-ora, @AlanBateman) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the rfr Pull request is ready for review label May 30, 2024
@openjdk
Copy link

openjdk bot commented May 30, 2024

@txominpelu The following labels will be automatically applied to this pull request:

  • hotspot-runtime
  • serviceability

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing lists. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added serviceability serviceability-dev@openjdk.org hotspot-runtime hotspot-runtime-dev@openjdk.org labels May 30, 2024
@mlbridge
Copy link

mlbridge bot commented May 30, 2024

@AlanBateman
Copy link
Contributor

Thanks for take this one. Here's the result with the changes in 1a75277.

"ForkJoinPool-1-worker-1" #25 [33795] daemon prio=5 os_prio=31 cpu=46574.42ms elapsed=47.15s tid=0x00007f81670d1a00  [0x000070000e9a4000]
   Carrying virtual thread #24
	at jdk.internal.vm.Continuation.run(java.base/Continuation.java:262)
	at java.lang.VirtualThread.runContinuation(java.base/VirtualThread.java:283)
	at java.lang.VirtualThread$$Lambda/0x00000001220b2868.run(java.base/Unknown Source)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base/ForkJoinTask.java:1726)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base/ForkJoinTask.java:1717)
	at java.util.concurrent.ForkJoinTask$InterruptibleTask.exec(java.base/ForkJoinTask.java:1641)
	at java.util.concurrent.ForkJoinTask.doExec(java.base/ForkJoinTask.java:507)
	at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base/ForkJoinPool.java:1455)
	at java.util.concurrent.ForkJoinPool.runWorker(java.base/ForkJoinPool.java:2031)
	at java.util.concurrent.ForkJoinWorkerThread.run(java.base/ForkJoinWorkerThread.java:189)
   Carrying virtual thread #24
	at Main.lambda$main$0(Main.java:7)
	at java.lang.VirtualThread.run(java.base/VirtualThread.java:381)

Note that the line "Carrying virtual thread #24" is printed twice. Also it's not immediately clear that there are two stack traces.

You'll likely get different opinions on how mounted virtual threads should be presented. A few things to try

  • indent the stack trace of the mounted virtual thread
  • list the mounted virtual threads at the end

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you look at JavaThread::print_vthread_stack_on? The last thing we need is yet-another-version of stack printing code.

@dholmes-ora
Copy link
Member

I'd probably give preference to the stack of the virtual thread, as the stack of the carrier when a vthread is mounted is generally quite uninteresting:

"ForkJoinPool-1-worker-1" #25 [33795] daemon prio=5 os_prio=31 cpu=46574.42ms elapsed=47.15s tid=0x00007f81670d1a00  [0x000070000e9a4000]
       Carrying virtual thread #24
             at Main.lambda$main$0(Main.java:7)
             at java.lang.VirtualThread.run(java.base/VirtualThread.java:381)
	at jdk.internal.vm.Continuation.run(java.base/Continuation.java:262)
	at java.lang.VirtualThread.runContinuation(java.base/VirtualThread.java:283)
	at java.lang.VirtualThread$$Lambda/0x00000001220b2868.run(java.base/Unknown Source)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base/ForkJoinTask.java:1726)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base/ForkJoinTask.java:1717)
	at java.util.concurrent.ForkJoinTask$InterruptibleTask.exec(java.base/ForkJoinTask.java:1641)
	at java.util.concurrent.ForkJoinTask.doExec(java.base/ForkJoinTask.java:507)
	at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base/ForkJoinPool.java:1455)
	at java.util.concurrent.ForkJoinPool.runWorker(java.base/ForkJoinPool.java:2031)
	at java.util.concurrent.ForkJoinWorkerThread.run(java.base/ForkJoinWorkerThread.java:189)

@txominpelu
Copy link
Contributor Author

Thanks for your comments @AlanBateman and @dholmes-ora !

  • The format proposed by @dholmes-ora definitely makes sense and I'll of course drop the duplicated "Carrying" statement.

  • Regarding using JavaThread::print_vthread_stack_on, I agree that it'd be good to use it instead of creating a new way of printing stack traces. However print_vthread_stack_on does not have the const modifier so it cannot be called directly from Thread.print_on. Making print_vthread_stack_on more strict, seems to be fairly complicated and I don't know if I'd be OK with relaxing the const constraint in Thread.print_on for the purpose of reusing print_vthread_stack_on

@AlanBateman
Copy link
Contributor

  • The format proposed by @dholmes-ora definitely makes sense

You will different get different opinions on how it is presented. Can you also try putting a new section that lists the mounted virtual threads and their stack trace. If the virtual thread has a name then it can be shown too.

@txominpelu
Copy link
Contributor Author

txominpelu commented Jun 3, 2024

In ae690b2

I've :

  • Removed the duplicated "Carrying" sentence.
  • Indented the virtual thread stack

This is the way that the output looks like with the new changes:

"ForkJoinPool-1-worker-1" #21 [32771] daemon prio=5 os_prio=31 cpu=12975.11ms elapsed=13.11s tid=0x000000015b820210  [0x0000000172491000]
Thread: 0x000000015b820210  [0x8003] State: _at_safepoint _at_poll_safepoint 1
   JavaThread state: _thread_blocked
   Carrying virtual thread #20
		at Main$FunkyLambda.compute(Main.java:38)
		at Main$FunkyLambda.computeFunkyName(Main.java:31)
		at java.lang.VirtualThread.run(java.base@23-internal/VirtualThread.java:329)
	at jdk.internal.vm.Continuation.run(java.base@23-internal/Continuation.java:248)
	at java.lang.VirtualThread.runContinuation(java.base@23-internal/VirtualThread.java:245)
	at java.lang.VirtualThread$$Lambda/0x0000780001046740.run(java.base@23-internal/Unknown Source)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base@23-internal/ForkJoinTask.java:1726)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base@23-internal/ForkJoinTask.java:1717)
	at java.util.concurrent.ForkJoinTask$InterruptibleTask.exec(java.base@23-internal/ForkJoinTask.java:1641)
	at java.util.concurrent.ForkJoinTask.doExec(java.base@23-internal/ForkJoinTask.java:507)
	at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base@23-internal/ForkJoinPool.java:1489)
	at java.util.concurrent.ForkJoinPool.scan(java.base@23-internal/ForkJoinPool.java:2071)
	at java.util.concurrent.ForkJoinPool.runWorker(java.base@23-internal/ForkJoinPool.java:2033)
	at java.util.concurrent.ForkJoinWorkerThread.run(java.base@23-internal/ForkJoinWorkerThread.java:189)

@tstuefe
Copy link
Member

tstuefe commented Jun 3, 2024

I also find the duplication of the stack printing code unfortunate. It would be nice to reuseJavaThread::print_vthread_stack_on. I don't understand why it cannot be const?

@dholmes-ora
Copy link
Member

dholmes-ora commented Jun 3, 2024

  • The format proposed by @dholmes-ora definitely makes sense

You will different get different opinions on how it is presented. Can you also try putting a new section that lists the mounted virtual threads and their stack trace. If the virtual thread has a name then it can be shown too.

That suggests to me that we have to iterate the list of all threads twice ?? If so that seems not ideal.

@dholmes-ora
Copy link
Member

I also find the duplication of the stack printing code unfortunate. It would be nice to reuseJavaThread::print_vthread_stack_on. I don't understand why it cannot be const?

Just what I was about to query :) I'm not sure what the const issue is. Printing a stack certainly should not modify anything.

txominpelu and others added 2 commits June 3, 2024 13:18
@txominpelu
Copy link
Contributor Author

Sorry, I don't know what I did wrong last week, but when I tried to rely on print_vthread_stack_on I was getting compilation errors.

In any case, it seems to be working now with 59b18db. Thanks for insisting, and sorry for the confusion.

I also find the duplication of the stack printing code unfortunate. It would be nice to reuseJavaThread::print_vthread_stack_on. I don't understand why it cannot be const?

Just what I was about to query :) I'm not sure what the const issue is. Printing a stack certainly should not modify anything.

@txominpelu
Copy link
Contributor Author

  • The format proposed by @dholmes-ora definitely makes sense

You will different get different opinions on how it is presented. Can you also try putting a new section that lists the mounted virtual threads and their stack trace. If the virtual thread has a name then it can be shown too.

That suggests to me that we have to iterate the list of all threads twice ?? If so that seems not ideal.

I think that both approaches suggested by @AlanBateman would work, but I personally like the fact that having the virtual thread being displayed on top of the carrier thread allows seeing the full stack trace together. I've tried to preserve that approach while calling print_vthread_stack_on in 6a3b779.

@pron
Copy link
Member

pron commented Jun 3, 2024

About the output format:

  1. The text Carrying virtual thread #N should appear, as it does, in the header of the output for the platform thread.
  2. The stack for the mounted virtual thread should appear, indented after the stack of the platform thread, with the header Mounted virtual thread #N.

@AlanBateman
Copy link
Contributor

I don't think showing the frames of the mounted virtual thread before the carrier thread frames is the best way to show this. For one thing, it appears immediately after the carrier's thread name/details and state. So I think Ron's suggestion to clearly label it as the mounted virtual thread after the carrier stack trace would be good to try.

@txominpelu
Copy link
Contributor Author

txominpelu commented Jun 3, 2024

Thanks @AlanBateman and @pron ! Based on your comments I've added 9acbf29.

That provides the following output:

"ForkJoinPool-1-worker-1" #21 [27907] daemon prio=5 os_prio=31 cpu=29561.73ms elapsed=29.71s tid=0x0000000147089c10  [0x000000017178d000]
   Carrying virtual thread #20
Thread: 0x0000000147089c10  [0x6d03] State: _at_safepoint _at_poll_safepoint 1
   JavaThread state: _thread_blocked
	at jdk.internal.vm.Continuation.run(java.base@23-internal/Continuation.java:248)
	at java.lang.VirtualThread.runContinuation(java.base@23-internal/VirtualThread.java:245)
	at java.lang.VirtualThread$$Lambda/0x0000300001046740.run(java.base@23-internal/Unknown Source)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base@23-internal/ForkJoinTask.java:1726)
	at java.util.concurrent.ForkJoinTask$RunnableExecuteAction.compute(java.base@23-internal/ForkJoinTask.java:1717)
	at java.util.concurrent.ForkJoinTask$InterruptibleTask.exec(java.base@23-internal/ForkJoinTask.java:1641)
	at java.util.concurrent.ForkJoinTask.doExec(java.base@23-internal/ForkJoinTask.java:507)
	at java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(java.base@23-internal/ForkJoinPool.java:1489)
	at java.util.concurrent.ForkJoinPool.scan(java.base@23-internal/ForkJoinPool.java:2071)
	at java.util.concurrent.ForkJoinPool.runWorker(java.base@23-internal/ForkJoinPool.java:2033)
	at java.util.concurrent.ForkJoinWorkerThread.run(java.base@23-internal/ForkJoinWorkerThread.java:189)
   	Mounted virtual thread #20
		at Main$FunkyLambda.compute(Main.java:38)
		at Main$FunkyLambda.computeFunkyName(Main.java:31)
		at Main$$Lambda/0x0000300001000c50.run(Unknown Source)
		at java.lang.Thread.runWith(java.base@23-internal/Thread.java:1588)
		at java.lang.VirtualThread.run(java.base@23-internal/VirtualThread.java:329)
		at java.lang.VirtualThread$VThreadContinuation$1.run(java.base@23-internal/VirtualThread.java:209)
		at jdk.internal.vm.Continuation.enter0(java.base@23-internal/Continuation.java:320)
		at jdk.internal.vm.Continuation.enter(java.base@23-internal/Continuation.java:312)

Copy link
Member

@dholmes-ora dholmes-ora left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. Leaving the indentation to another PR is fine. Thanks.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Jun 12, 2024
output.shouldMatch(".*at " + Pattern.quote(DummyRunnable.class.getName()) + "\\.run.*");
output.shouldMatch(".*at " + Pattern.quote(DummyRunnable.class.getName()) + "\\.compute.*");
output.shouldMatch("Mounted virtual thread " + "\"Dummy Vthread\"" + " #" + vthread.threadId());
shouldFinish.set(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One other suggestion is to use a try-finally block here. Put L48-53 in the block and set shouldFinish in the finally block. That way if the test fails then it won't leave a spinning thread to disrupt the next test that runs in the agent VM.

Also just to say that we've mostly used JUnit for new tests in recent releases, moving away from TestNG for new tests.

Copy link
Contributor Author

@txominpelu txominpelu Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @AlanBateman ! In commits 05d861c and 4d6a8cc I've incorporated your suggestions.

* @summary Test of diagnostic command Thread.print with virtual threads
* @requires vm.continuations
* @library /test/lib
* @modules java.base
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @modules line isn't needed, it's not possible to have a run-time image without java.base.

Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking feedback, I don't have any other comments.

@txominpelu
Copy link
Contributor Author

Thanks for taking feedback, I don't have any other comments.

Thanks a lot for all your comments ! 🙇

@txominpelu
Copy link
Contributor Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Jun 12, 2024
@openjdk
Copy link

openjdk bot commented Jun 12, 2024

@txominpelu
Your change (at version e2b8551) is now ready to be sponsored by a Committer.

@dholmes-ora
Copy link
Member

/sponsor

@openjdk
Copy link

openjdk bot commented Jun 12, 2024

Going to push as commit fcedde8.
Since your change was applied there have been 47 commits pushed to the master branch:

  • 74468bc: 8209092: Remove outdated wording from RC5ParameterSpec
  • b697b48: 8315655: [macos] Right click and dragging over a component with a popup menu will open the popup
  • 2c1da6c: 8332139: SymbolTableHash::Node allocations allocates twice the required memory
  • ba67ad6: 8319457: Update jpackage to support WiX v4 and v5 on Windows
  • 2c9185e: 8321308: AArch64: Fix matching predication for cbz/cbnz
  • 5a8a9fd: 8333382: [s390x] Enhance popcnt Instruction to use Z15 facilities
  • 81083a0: 8299487: Test java/net/httpclient/whitebox/SSLTubeTestDriver.java timed out
  • 81ca0ec: 8334028: HttpClient: NPE thrown from assert statement
  • bd750b6: 8319933: Disable tests for JDK-8280481 on Graal
  • c80e2eb: 8333886: Explicitly specify that asSlice and reinterpret return a memory segment backed by the same region of memory.
  • ... and 37 more: https://git.openjdk.org/jdk/compare/a941397327972f130e683167a1b429f17603df46...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Jun 12, 2024
@openjdk openjdk bot closed this Jun 12, 2024
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Jun 12, 2024
@openjdk
Copy link

openjdk bot commented Jun 12, 2024

@dholmes-ora @txominpelu Pushed as commit fcedde8.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@dholmes-ora
Copy link
Member

I should have asked before sponsoring this but what testing was done with this? Do all our serviceability tests pass with this change? (I guess I will find out later today in our CI.)

@AlanBateman
Copy link
Contributor

I've created JDK-8334215 with issues found when trying to sync up the loom repo. The tests in that repo are typically run with JTREG_TEST_THREAD_FACTORY=Virtual and can tickle issue that we don't always see immediately in main line.

@txominpelu
Copy link
Contributor Author

Thanks for the follow up @AlanBateman. I can take care of creating the PR with the proposed fix.

I'm still struggling to understand this sentence though:

The test also fixes a potential issue in the test where CountDownLatch.countDown unparks the main (virtual) thread and the main thread observes the dummy thread is transition.

Would you mind explaining why unparking the main (virtual) thread can end up with the main thread observing the dummy thread in transition ?

@AlanBateman
Copy link
Contributor

Would you mind explaining why unparking the main (virtual) thread can end up with the main thread observing the dummy thread in transition ?

The unpark of main requires queueing the task for main. This can trigger a new FJP worker to be started and there are several points where it may park. For reason that are complicated to go into here, this happens in the context of the carrier. The thread dump at this point is seeing a mounted continuation but JavaThread._vthread is pointing to self temporarily.

@txominpelu
Copy link
Contributor Author

In that case the one that would transition would be the "dummy vthread" or the "main virtual thread" ?

Based on the explanation it sounds like the main virtual thread is the one having to be scheduled and that may go through transitions but in the JBS description I understood that the one that could be subject to transitions after the countdown call was the dummy vthread.

@AlanBateman
Copy link
Contributor

It's the countDown by "dummy" thread will unpark "main". The queuing and starting of the second FJ worker thread is happening in the context of dummy's carrier (not dummy itself). We want to eventually remove these temporary transitions, at least for the default scheduler, but aren't there yet. I've attached an initial patch to JDK-8334215 for the issues.

@txominpelu
Copy link
Contributor Author

Thanks a lot @AlanBateman ! That helps me to get a better understanding of what's going on.

I've created a PR with the patch that you shared in JDK-8334215

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

hotspot-runtime hotspot-runtime-dev@openjdk.org integrated Pull request has been integrated serviceability serviceability-dev@openjdk.org

Development

Successfully merging this pull request may close these issues.

6 participants