Hotspot test serviceability/sa/ClhsdbCDSCore.java hangs on adoptium infra #3745

Open
zzambers opened this issue Sep 18, 2024 · 5 comments

zzambers commented Sep 18, 2024

This test hangs on Adoptium infra and is killed on timeout (this seems to happen reliably):
serviceability/sa/ClhsdbCDSCore.java

I see this both in the dev.openjdk run and when the test is run in Grinder.

Output:

Starting ClhsdbCDSCore test
Command line: [/home/jenkins/workspace/Grinder/jdkbinary/j2sdk-image/bin/java -cp /home/jenkins/workspace/Grinder/aqa-tests/TKG/output_17265736059645/hotspot_custom_0/work/classes/0/serviceability/sa/ClhsdbCDSCore.d:/home/jenkins/workspace/Grinder/aqa-tests/openjdk/openjdk-jdk/test/hotspot/jtreg/serviceability/sa:/home/jenkins/workspace/Grinder/aqa-tests/TKG/output_17265736059645/hotspot_custom_0/work/classes/0/test/lib:/home/jenkins/workspace/Grinder/aqa-tests/openjdk/openjdk-jdk/test/lib:/home/jenkins/workspace/Grinder/jvmtest/openjdk/jtreg/lib/javatest.jar:/home/jenkins/workspace/Grinder/jvmtest/openjdk/jtreg/lib/jtreg.jar -ea -esa -Xmx512m -XX:+UseCompressedOops -Xshare:dump -Xlog:cds,cds+hashtables -XX:SharedArchiveFile=./ArchiveForClhsdbCDSCore.jsa ]
[2024-09-17T11:46:52.145720Z] Gathering output for process 25719
[ELAPSED: 447 ms]
[logging stdout to serviceability.sa.ClhsdbCDSCore.java-0000-dump.stdout]
[logging stderr to serviceability.sa.ClhsdbCDSCore.java-0000-dump.stderr]
[STDERR]

[2024-09-17T11:46:52.603422Z] Waiting for completion for process 25719
[2024-09-17T11:46:52.603687Z] Waiting for completion finished for process 25719
Command line: [/home/jenkins/workspace/Grinder/jdkbinary/j2sdk-image/bin/java -cp /home/jenkins/workspace/Grinder/aqa-tests/TKG/output_17265736059645/hotspot_custom_0/work/classes/0/serviceability/sa/ClhsdbCDSCore.d:/home/jenkins/workspace/Grinder/aqa-tests/openjdk/openjdk-jdk/test/hotspot/jtreg/serviceability/sa:/home/jenkins/workspace/Grinder/aqa-tests/TKG/output_17265736059645/hotspot_custom_0/work/classes/0/test/lib:/home/jenkins/workspace/Grinder/aqa-tests/openjdk/openjdk-jdk/test/lib:/home/jenkins/workspace/Grinder/jvmtest/openjdk/jtreg/lib/javatest.jar:/home/jenkins/workspace/Grinder/jvmtest/openjdk/jtreg/lib/jtreg.jar -ea -esa -Xmx512m -XX:+UseCompressedOops -Xmx512m -XX:+UnlockDiagnosticVMOptions -XX:SharedArchiveFile=ArchiveForClhsdbCDSCore.jsa -XX:+CreateCoredumpOnCrash -Xshare:auto -XX:+ProfileInterpreter --add-exports=java.base/jdk.internal.misc=ALL-UNNAMED -XX:-AlwaysPreTouch CrashApp ]
[2024-09-17T11:46:52.610596Z] Gathering output for process 25735
[2024-09-17T11:46:52.611510Z] Waiting for completion for process 25735
[2024-09-17T11:46:52.628039Z] Waiting for completion finished for process 25735
Run test with ulimit -c: unlimited
[2024-09-17T11:46:52.630845Z] Gathering output for process 25738
Timeout signalled after 19200 seconds

Notes:
I have tried to reproduce this locally and on our infra, both by invoking jtreg manually and through aqa-tests, but could not reproduce it. Maybe it is an infra/environment issue? The test first intentionally crashes the VM using the Unsafe class to produce a core file. However, this step hangs when run on Adoptium infra. Maybe it is something to do with the core dump settings? I don't know.
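To compare core dump settings between a machine where the test passes and one where it hangs, a small diagnostic along the following lines could be run on each agent. This is only a sketch, not part of the test: the class name `CoreDumpSettings` and its helpers are hypothetical, it assumes a Linux host with `/proc/sys/kernel/core_pattern` and an `sh` on the PATH, and it has not been run on the Adoptium machines. A `core_pattern` that starts with `|` means the kernel pipes the core to a user-space handler (some Ubuntu installations do this with a crash reporter such as apport) rather than writing a file. Whether that is what causes the hang here is not established.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of the OpenJDK test or aqa-tests.
final class CoreDumpSettings {

    // Classifies a kernel core_pattern value: a leading '|' means the core is
    // piped to a user-space handler instead of being written to a file.
    static String classify(String corePattern) {
        String p = corePattern.trim();
        if (p.isEmpty()) {
            return "empty";
        }
        return p.startsWith("|") ? "piped" : "file";
    }

    // Returns the trimmed file content, or a placeholder if it cannot be read
    // (for example on non-Linux systems).
    static String readOrUnavailable(Path path) {
        try {
            return new String(Files.readAllBytes(path), StandardCharsets.UTF_8).trim();
        } catch (IOException | SecurityException e) {
            return "<unavailable: " + e.getClass().getSimpleName() + ">";
        }
    }

    // Asks the shell for the soft core-file size limit of child processes.
    static String coreUlimit() {
        try {
            Process p = new ProcessBuilder("sh", "-c", "ulimit -c")
                    .redirectErrorStream(true)
                    .start();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
                String line = r.readLine();
                p.waitFor();
                return line == null ? "<no output>" : line.trim();
            }
        } catch (IOException e) {
            return "<unavailable: " + e.getClass().getSimpleName() + ">";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "<interrupted>";
        }
    }

    public static void main(String[] args) {
        String pattern = readOrUnavailable(Paths.get("/proc/sys/kernel/core_pattern"));
        System.out.println("os.name:       " + System.getProperty("os.name"));
        System.out.println("os.version:    " + System.getProperty("os.version"));
        System.out.println("core_pattern:  " + pattern);
        System.out.println("pattern kind:  " + classify(pattern));
        System.out.println("ulimit -c:     " + coreUlimit());
    }
}
```

Running it on a passing and a failing agent and comparing the `core_pattern` and `ulimit -c` lines would show whether the two environments handle core files differently.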

zzambers commented Sep 18, 2024

This could be related to JDK-8283410, but on Adoptium infra it seems to affect Linux (not Windows?).

sophia-guo commented Sep 23, 2024

@zzambers I ran ClhsdbCDSCore.java on a different agent and it passed: https://ci.adoptium.net/view/Test_grinder/job/Grinder/10970/ (the failed one is due to no test being selected). So it might be infra-related, since you can't reproduce it in your environment. Could you please move it to the infra repo? Or I can move it if you agree.

zzambers commented

@sophia-guo By moving, do you mean filing the same issue there and closing this one?

sophia-guo commented

There is a "Transfer issue" link on the right side of the issue.
[Screenshot 2024-09-25 at 9:37:10 AM]

I'm not sure whether it's clickable for you, as that may depend on permissions. I'll just do the transfer myself.

@sophia-guo sophia-guo transferred this issue from adoptium/aqa-tests Sep 25, 2024

sxa commented Oct 5, 2024

> @zzambers I did run it on a different agent ClhsdbCDSCore.java and it passed https://ci.adoptium.net/view/Test_grinder/job/Grinder/10970/ ( failed one is due to no test selected.) So it might be related with infra as you can't reproduce it on your environment. Could you please move it to infra repo? Or I can move it if you agree?

@sophia-guo Can you get a list of the machines/distributions it passes and fails on? Yours was run on RHEL. Both of zzambers' runs were on an old, out-of-support Ubuntu distribution (although neither was in a container). At the moment I'm not sure we have enough information to take action on this one in the infrastructure repo, since it's not clear what is needed to resolve it.
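One way to collect the distribution of each agent alongside a pass/fail result is to read `/etc/os-release` on the machine. The following is a sketch under assumptions: the class `DistroInfo` is hypothetical, and it assumes the agents are Linux systems that provide `/etc/os-release` in the usual `KEY=value` format.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for recording which distribution a test agent runs.
final class DistroInfo {

    // Parses KEY=value lines in os-release format, skipping comments and
    // blank lines and stripping one pair of surrounding quotes from values.
    static Map<String, String> parseOsRelease(List<String> lines) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            int eq = line.indexOf('=');
            if (line.isEmpty() || line.startsWith("#") || eq <= 0) {
                continue;
            }
            String value = line.substring(eq + 1).trim();
            boolean doubleQuoted = value.startsWith("\"") && value.endsWith("\"");
            boolean singleQuoted = value.startsWith("'") && value.endsWith("'");
            if (value.length() >= 2 && (doubleQuoted || singleQuoted)) {
                value = value.substring(1, value.length() - 1);
            }
            fields.put(line.substring(0, eq), value);
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields;
        try {
            fields = parseOsRelease(
                    Files.readAllLines(Paths.get("/etc/os-release"), StandardCharsets.UTF_8));
        } catch (IOException e) {
            // File missing or unreadable: fall back to an empty result.
            fields = new LinkedHashMap<>();
        }
        System.out.println("distro:     " + fields.getOrDefault("PRETTY_NAME", "<unknown>"));
        System.out.println("os.name:    " + System.getProperty("os.name"));
        System.out.println("os.version: " + System.getProperty("os.version"));
        System.out.println("os.arch:    " + System.getProperty("os.arch"));
    }
}
```

Recording this output next to each Grinder result would give the pass/fail-by-distribution list asked for above.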

@sxa sxa added the testFail label Nov 5, 2024