JITServer failures with vmState=0x0005ffff #10663

Closed
mpirvu opened this issue Sep 21, 2020 · 19 comments
Labels
bug comp:jitserver Artifacts related to JIT-as-a-Service project

Comments

@mpirvu
Contributor

mpirvu commented Sep 21, 2020

Recently we have started to see a relatively large number of failures in JITServer mode with a vmState of 0x0005ffff.
Some examples:
Java8

cmdLineTester_decompilationTests_1 
variation: Mode612-OSR
JVM_OPTIONS: -XX:+UseJITServer -Xcompressedrefs -Xgcpolicy:gencon -Xjit:enableOSR,count=0 
Testing: decomp001
...
Unhandled exception
Type=Segmentation error vmState=0x0005ffff
...
Module=/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/openjdkbinary/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
Module_base_address=00007F5D041BD000

Method_being_compiled=java/util/TreeSet.contains(Ljava/lang/Object;)Z

cmdLineTester_jvmtitests_hcr_SE80_7
JVM_OPTIONS: -XX:+UseJITServer -Xgcpolicy:metronome -Xcompressedrefs 
Testing: rc019a
...
Unhandled exception
Type=Segmentation error vmState=0x0005ffff
...
Module=/home/jenkins/workspace/Test_openjdk8_j9_sanity.functional_x86-64_linux_jit_Personal_testList_1/openjdkbinary/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
Module_base_address=00007FEDA8136000

Method_being_compiled=java/lang/StringBuilder.setLength(I)V

Java11

JCL_Test_2
variation: -XX:+CompactStrings
JVM_OPTIONS: -XX:+UseJITServer -XX:+CompactStrings 
...
Unhandled exception
Type=Segmentation error vmState=0x0005ffff
...
Module=/home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_jit_Personal_testList_0/openjdkbinary/j2sdk-image/lib/compressedrefs/libj9jit29.so
Module_base_address=00007EFF7B1BA000

Method_being_compiled=java/util/zip/ZipFile$ZipFileInputStream.initDataOffset()J

pthreadDestructor_1
variation: Mode610
JVM_OPTIONS: -XX:+UseJITServer -Xcompressedrefs -Xjit -Xgcpolicy:gencon
...
Type=Segmentation error vmState=0x0005ffff

cmdLineTester_jvmtitests_hcr_7
variation: Mode351
JVM_OPTIONS: -XX:+UseJITServer -Xgcpolicy:metronome -Xcompressedrefs 
Testing: rc019a

Running command: "/home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_jit_Personal_testList_1/openjdkbinary/j2sdk-image/bin/java" -XX:+UseJITServer -Xgcpolicy:metronome -Xcompressedrefs  -Xdump    -agentlib:jvmtitest=test:rc019a -cp "/home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_jit_Personal_testList_1/openjdk-tests/TKG/../../jvmtest/functional/cmdLineTests/jvmtitests/jvmtitest.jar:/home/jenkins/workspace/Test_openjdk11_j9_sanity.functional_x86-64_linux_jit_Personal_testList_1/openjdk-tests/TKG/../TKG/lib/asm-all.jar" com.ibm.jvmti.tests.util.TestRunner
Unhandled exception
Type=Segmentation error vmState=0x0005ffff

I believe a JITServer-incompatible change was delivered to the code base, and we need to find out which one.
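For readers triaging similar logs, the vmState value can be split into two 16-bit halves. The sketch below assumes the usual OpenJ9 convention that the upper half identifies the VM component and that 0x0005 denotes the JIT compiler, which is consistent with the crashes above occurring in libj9jit29.so while a method was being compiled. The names `VmStateParts` and `splitVmState` are hypothetical helpers, not part of OpenJ9.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: the two halves of a vmState word.
struct VmStateParts {
    std::uint32_t component;  // upper 16 bits (assumed: 0x0005 means JIT)
    std::uint32_t detail;     // lower 16 bits (component-specific detail)
};

// Split a vmState value as printed in the crash banner.
constexpr VmStateParts splitVmState(std::uint32_t vmState) {
    return { vmState >> 16, vmState & 0xFFFFu };
}
```

Under that assumption, 0x0005ffff splits into component 0x0005 and detail 0xffff.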

@mpirvu mpirvu added bug comp:jitserver Artifacts related to JIT-as-a-Service project labels Sep 21, 2020
@mpirvu
Contributor Author

mpirvu commented Sep 21, 2020

Attn: @dmitry-ten

@mpirvu
Contributor Author

mpirvu commented Sep 22, 2020

Several tests crash in sanity. One of them is cmdLineTester_SCHelperCompatibilityTests_unix_1, though the reproduction rate drops considerably when it is run on its own.
A stack trace of a crash at the client:

#12 <signal handler called>
#13 0x00007fc903399b36 in TR_J9VM::getBaseComponentClass(TR_OpaqueClassBlock*, int&) ()
   from /home/mpirvu/FullVM/openj9-openjdk-jdk8/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
#14 0x00007fc9034e38ea in JITServerHelpers::packRemoteROMClassInfo[abi:cxx11](J9Class*, J9VMThread*, TR_Memory*, bool) ()
   from /home/mpirvu/FullVM/openj9-openjdk-jdk8/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
#15 0x00007fc9034a2042 in handleServerMessage(JITServer::ClientStream*, TR_J9VM*, JITServer::MessageType&) ()
   from /home/mpirvu/FullVM/openj9-openjdk-jdk8/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
#16 0x00007fc9034afc4a in remoteCompile(J9VMThread*, TR::Compilation*, TR_ResolvedMethod*, J9Method*, TR::IlGeneratorMethodDetails&, TR::CompilationInfoPerThreadBase*) ()
   from /home/mpirvu/FullVM/openj9-openjdk-jdk8/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
#17 0x00007fc903354f41 in TR::CompilationInfoPerThreadBase::compile(J9VMThread*, TR::Compilation*, TR_ResolvedMethod*, TR_J9VMBase&, TR_OptimizationPlan*, TR::SegmentAllocator const&) ()
   from /home/mpirvu/FullVM/openj9-openjdk-jdk8/build/linux-x86_64-normal-server-release/images/j2sdk-image/jre/lib/amd64/compressedrefs/libj9jit29.so
#18 0x00007fc903355cfa in TR::CompilationInfoPerThreadBase::wrappedCompile(J9PortLibrary*, void*) ()

The class that TR_J9VM::getBaseComponentClass works on looks bogus.

@dmitry-ten
Contributor

Last time I saw the crash in this place, it was due to #10397

@dmitry-ten
Contributor

dmitry-ten commented Sep 22, 2020

I got cmdLineTester_SCHelperCompatibilityTests_unix_1 to segfault on the server, inside TR::SymbolValidationManager::isClassWorthRemembering. One of the classes it passes to _fej9->isSameOrSuperClass seems to be invalid.
So #10644 is probably the culprit: it is one of the few changes between the passing and failing nightly builds, and it reworked the above method.
I'm running sanity.functional testing on a build from before #10644 was merged; so far Java 8 has passed and Java 11 is still running.
However, if this is the problematic PR, I haven't yet found anything in it that could cause problems for JITServer.
The only significant change is that java/lang/StringBuffer is now considered a class not worth remembering, but that class doesn't even trigger the isSameOrSuperClass check.

@pshipton
Member

@dmitry-ten #10664 isn't merged yet, is there a typo?

@dmitry-ten
Contributor

Oops, yeah, I meant #10644.

@dmitry-ten
Contributor

Java 11 sanity tests also passed without the SVM change.

@mpirvu
Contributor Author

mpirvu commented Sep 22, 2020

I think this line can create problems for JITServer:
static SystemClassNotWorthRemembering _systemClassesNotWorthRemembering[];
Because it is static, it is shared by all clients connected to the server, and a class pointer that is valid for one client may be a bogus pointer for another client.
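A minimal sketch of this hazard, under simplifying assumptions: client class addresses are modeled as plain integers, and `Client`, `CachedSystemClass`, `g_sharedCache` and `lookupThroughSharedCache` are hypothetical stand-ins rather than OpenJ9 types. The first client to trigger the lookup populates the process-wide table, and every later client then receives that first client's address.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a class address inside one client JVM.
using ClientClassAddr = std::uintptr_t;

struct CachedSystemClass {
    const char *name;
    ClientClassAddr clazz;  // 0 means "not resolved yet"
};

// One process-wide table, analogous to a static member shared by all clients.
static CachedSystemClass g_sharedCache[] = { {"java/lang/StringBuffer", 0} };

// Hypothetical client: each JVM loads the same class at a different address.
struct Client {
    std::unordered_map<std::string, ClientClassAddr> loadedClasses;
};

// Resolve lazily, then reuse the cached address regardless of which client asks.
ClientClassAddr lookupThroughSharedCache(const Client &client, CachedSystemClass &entry) {
    if (entry.clazz == 0)
        entry.clazz = client.loadedClasses.at(entry.name);
    return entry.clazz;
}

// Returns true when client B is handed client A's address instead of its own.
bool sharedCacheLeaksAcrossClients() {
    Client a, b;
    a.loadedClasses["java/lang/StringBuffer"] = 0x1000;
    b.loadedClasses["java/lang/StringBuffer"] = 0x2000;
    lookupThroughSharedCache(a, g_sharedCache[0]);          // A fills the cache
    return lookupThroughSharedCache(b, g_sharedCache[0]) != 0x2000;
}
```

In this model the address returned for client B is stale from B's point of view; dereferencing such a value in a real server is what would produce a segmentation error.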

@dmitry-ten
Contributor

Oh, true. Yeah, then it has to be due to system class addresses persisting across multiple clients.

@mpirvu
Contributor Author

mpirvu commented Sep 22, 2020

One possible fix would be to store the _systemClassesNotWorthRemembering[] array in the client session and have a function that returns a pointer to this array. For non-JITServer it could return the address of the static field as defined by Irwin, and for JITServer it could return a pointer to a per-client array stored in the session data.
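A rough sketch of that accessor idea, not the actual change: `ClientSession`, `SystemClassTable`, `makeEmptyTable` and `getSystemClassesNotWorthRemembering` are hypothetical names, class addresses are modeled as integers, and the second table entry is a placeholder class name. A null session stands for the non-JITServer case.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using ClassAddr = std::uintptr_t;  // hypothetical stand-in for a class pointer

struct SystemClassEntry {
    const char *className;
    ClassAddr clazz;          // 0 until resolved
    bool checkIsSuperClass;
};

constexpr std::size_t kNumEntries = 2;
using SystemClassTable = std::array<SystemClassEntry, kNumEntries>;

// Fresh, unresolved table; the second name is a placeholder for illustration.
inline SystemClassTable makeEmptyTable() {
    return {{ {"java/lang/StringBuffer", 0, false},
              {"example/PlaceholderClass", 0, true} }};
}

// Hypothetical per-client session data: each client owns its own copy.
struct ClientSession {
    SystemClassTable systemClasses = makeEmptyTable();
};

// Process-wide table, used only when there is no client session.
static SystemClassTable g_localTable = makeEmptyTable();

// Single access point: per-client table on the server, static table otherwise.
SystemClassTable &getSystemClassesNotWorthRemembering(ClientSession *session) {
    return session ? session->systemClasses : g_localTable;
}
```

With this shape, an address resolved for one session never becomes visible to another session, while callers that do not use JITServer keep using the single static table.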

@dmitry-ten
Contributor

Yeah, that's what I was thinking as well; I'll make a PR.

@dmitry-ten
Contributor

I think we could safely cache the classes not worth remembering in VMInfo: they are all system classes, so they should always be loaded.

@dmitry-ten
Contributor

@dsouzai does SVM need to know if a class not worth remembering gets redefined? If a class gets redefined, its J9Class pointer might change, although I am not sure whether this applies to system classes. The current implementation of _systemClassesNotWorthRemembering does not handle this case, so I assume it's not an issue.

@dsouzai
Contributor

dsouzai commented Sep 23, 2020

> @dsouzai does SVM need to know if the class not worth remembering gets redefined?

The class not worth remembering is a heuristic; if the class gets redefined, the _systemClassesNotWorthRemembering array could be updated, but I don't think it's worth the effort.

@mpirvu
Contributor Author

mpirvu commented Sep 23, 2020

In the following code:

      if (systemClassNotWorthRemembering->_checkIsSuperClass)
         {
         if (systemClassNotWorthRemembering->_clazz &&
             _fej9->isSameOrSuperClass((J9Class *)systemClassNotWorthRemembering->_clazz, (J9Class *)clazz))
            {
            if (_comp->getOption(TR_TraceRelocatableDataCG))
               traceMsg(_comp, "isClassWorthRemembering: clazz %p is or inherits from %s (%p)\n",
                        clazz, systemClassNotWorthRemembering->_className, systemClassNotWorthRemembering->_clazz);

            worthRemembering = false;
            }

we call _fej9->isSameOrSuperClass((J9Class *)systemClassNotWorthRemembering->_clazz, (J9Class *)clazz).
If there is a chance of systemClassNotWorthRemembering->_clazz becoming bogus due to redefinition, things may blow up.

@dsouzai
Contributor

dsouzai commented Sep 23, 2020

> we call _fej9->isSameOrSuperClass((J9Class *)systemClassNotWorthRemembering->_clazz, (J9Class *)clazz).
> If there is a chance of systemClassNotWorthRemembering->_clazz becoming bogus due to redefinition, things may blow up.

When classes are redefined, they get redefined in place. So even if systemClassNotWorthRemembering->_clazz is now a completely different class, it will still be a valid J9Class * pointer.

So what I said in #10663 (comment) might not even be necessary to do.

@mpirvu
Contributor Author

mpirvu commented Sep 23, 2020

> they get redefined in place.

That's what I wanted to hear! I thought that might happen, but wasn't sure.

@dmitry-ten
Contributor

This issue has been resolved and should be closed.

@mpirvu
Contributor Author

mpirvu commented Oct 8, 2020

Fixed by #10682

@mpirvu mpirvu closed this as completed Oct 8, 2020