JDK17 Segmentation error vmState=0x00020019 #15592

Closed
connglli opened this issue Jul 21, 2022 · 4 comments

Comments

@connglli

Java -version output

openjdk version "17.0.4-internal" 2022-07-19
OpenJDK Runtime Environment (build 17.0.4-internal+0-adhoc..openj9-openjdk-jdk17)
Eclipse OpenJ9 VM (build master-79f0b73fa, JRE 17 Linux amd64-64-Bit Compressed References 20220701_000000 (JIT enabled, AOT enabled)
OpenJ9   - 79f0b73fa
OMR      - d018241d7
JCL      - c6e2f71170b based on jdk-17.0.4+7)

Summary of problem

This test is a bit tricky. It looks quite similar (see vMeth) to the tests in #15575 (see vMeth1) and #15474 (see vMeth1). This time, however, it triggers a different crash, although the crash is nondeterministic and somewhat similar to theirs.

This also looks like a JIT bug (it cannot be reproduced with -Xint).

class Test {
  int N = 256;
  double[][] dArrFld = new double[N][N];

  {
    init(dArrFld, 124.75520);
  }

  void vMeth(double d) {
    int ax$7 = 0x26cb6487;
    byte[] ax$6 = new byte[ax$7];
    int ax$9 = ax$6.length;
    for (; ax$9 > 0; ax$9--) {
      int ax$8 = ax$6.length - ax$9;
      ax$6[ax$8] = (byte) 0xff;
    }
  }

  void mainTest(String[] strArr1) {
    int iArr1[][] = new int[N][N];
    try {
      vMeth(0.18146873927708507);
    } catch (Throwable ax$12) {
    } finally {
    }
  }

  public static void main(String[] strArr) {
    Test _instance = new Test();
    for (; ; ) _instance.mainTest(strArr);
  }

  public static void init(double[] a, double seed) {
    for (int j = 0; j < a.length; j++) {
      a[j] = (j % 2 == 0) ? seed + j : seed - j;
    }
  }

  public static void init(double[][] a, double seed) {
    for (int j = 0; j < a.length; j++) {
      init(a[j], seed);
    }
  }
}
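
For reference, the loop in vMeth is an ordinary forward fill: ax$8 runs from 0 to length - 1, so a correct execution never writes outside the array. Below is a minimal sketch of the same loop shape with a small length. The names FillSketch and fillLikeVMeth and the length 16 are made up for illustration, and this sketch is not expected to reproduce the crash.

```java
import java.util.Arrays;

public class FillSketch {
    // Same loop shape as vMeth, but with a caller-chosen (small) length.
    public static byte[] fillLikeVMeth(int length) {
        byte[] buf = new byte[length];
        int remaining = buf.length;
        for (; remaining > 0; remaining--) {
            int index = buf.length - remaining; // runs 0, 1, ..., length - 1
            buf[index] = (byte) 0xff;
        }
        return buf;
    }

    public static void main(String[] args) {
        byte[] expected = new byte[16];
        Arrays.fill(expected, (byte) 0xff);
        // Compares the countdown loop against a plain Arrays.fill.
        System.out.println(Arrays.equals(fillLikeVMeth(16), expected));
    }
}
```

In other words, the loop should behave like Arrays.fill(ax$6, (byte) 0xff) on the whole array and nothing more.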

Diagnostic files

By issuing

$ java -Xmx1G -Xshareclasses:none Test

the following crash log is produced (the stack trace is similar to, but different from, the one in #15474):

Unhandled exception
Type=Segmentation error vmState=0x00020019
J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000001
Handler1=00007F906D936EF0 Handler2=00007F906D138D70 InaccessibleAddress=0000000000000004
RDI=FFFFFFFFFFFFFFFC RSI=0000000000000000 RAX=0000000000000000 RBX=FFFFFFFFFFFFFFFC
RCX=0000000000000001 RDX=0000000000000000 R8=0000000000000000 R9=00000000E6DCB938
R10=0000000000000001 R11=0000000000000000 R12=FFFFFFFFFFFFFFFF R13=00007F906813B690
R14=0000000026CB6498 R15=0000000000000000
RIP=00007F9066C896D0 GS=0000 FS=0000 RSP=00007F906E1C4830
EFlags=0000000000010282 CS=0033 RBP=0000000000000000 ERR=0000000000000004
TRAPNO=000000000000000E OLDMASK=0000000000000000 CR2=0000000000000004
xmm0 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm1 00000000ff000000 (f: 4278190080.000000, d: 2.113707e-314)
xmm2 0000000048d067ff (f: 1221617664.000000, d: 6.035593e-315)
xmm3 bfdff557482d234b (f: 1210917760.000000, d: -4.993494e-01)
xmm4 000000003f804000 (f: 1065369600.000000, d: 5.263625e-315)
xmm5 bff0000000000000 (f: 0.000000, d: -1.000000e+00)
xmm6 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm7 0000000042150c66 (f: 1108675712.000000, d: 5.477586e-315)
xmm8 6332313578766100 (f: 2021024000.000000, d: 6.865676e+169)
xmm9 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm10 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm11 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm12 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm13 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm14 0000000000000000 (f: 0.000000, d: 0.000000e+00)
xmm15 0000000000000000 (f: 0.000000, d: 0.000000e+00)
Module=/zdata/congli/OpenJ9/jdk17/lib/default/libj9gc29.so
Module_base_address=00007F9066AD1000
Target=2_90_20220701_000000 (Linux 5.4.0-122-generic)
CPU=amd64 (128 logical CPUs) (0x3ee84da000 RAM)
----------- Stack Backtrace -----------
_ZN36MM_MemoryPoolSplitAddressOrderedList16internalAllocateEP18MM_EnvironmentBasembP27MM_LargeObjectAllocateStats+0x260 (0x00007F9066C896D0 [libj9gc29.so+0x1b86d0])
_ZN40MM_MemoryPoolSplitAddressOrderedListBase14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescription+0x38 (0x00007F9066C8EEA8 [libj9gc29.so+0x1bdea8])
_ZN25MM_MemoryPoolLargeObjects14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescription+0x9b (0x00007F9066C8587B [libj9gc29.so+0x1b487b])
_ZN24MM_MemorySubSpaceGeneric14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP17MM_MemorySubSpaceS5_b+0x22a (0x00007F9066C92E3A [libj9gc29.so+0x1c1e3a])
_ZN21MM_MemorySubSpaceFlat14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP17MM_MemorySubSpaceS5_b+0x70 (0x00007F9066C91230 [libj9gc29.so+0x1c0230])
_ZN29MM_MemorySubSpaceGenerational14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP17MM_MemorySubSpaceS5_b+0x1ed (0x00007F9066CABB0D [libj9gc29.so+0x1dab0d])
_ZN26MM_MemorySubSpaceSemiSpace14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP17MM_MemorySubSpaceS5_b+0x70 (0x00007F9066CAD430 [libj9gc29.so+0x1dc430])
_ZN24MM_MemorySubSpaceGeneric14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP17MM_MemorySubSpaceS5_b+0x203 (0x00007F9066C92E13 [libj9gc29.so+0x1c1e13])
_ZN26MM_MemorySubSpaceSemiSpace14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP17MM_MemorySubSpaceS5_b+0x70 (0x00007F9066CAD430 [libj9gc29.so+0x1dc430])
_ZN12MM_Collector14garbageCollectEP18MM_EnvironmentBaseP17MM_MemorySubSpaceP22MM_AllocateDescriptionjP28MM_ObjectAllocationInterfaceS3_P20MM_AllocationContext+0x1eb (0x00007F9066BDE94B [libj9gc29.so+0x10d94b])
_ZN26MM_MemorySubSpaceSemiSpace23allocationRequestFailedEP18MM_EnvironmentBaseP22MM_AllocateDescriptionN17MM_MemorySubSpace14AllocationTypeEP28MM_ObjectAllocationInterfacePS4_S8_+0x1ed (0x00007F9066CACF9D [libj9gc29.so+0x1dbf9d])
_ZN24MM_MemorySubSpaceGeneric14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP17MM_MemorySubSpaceS5_b+0x37b (0x00007F9066C92F8B [libj9gc29.so+0x1c1f8b])
_ZN25MM_TLHAllocationInterface14allocateObjectEP18MM_EnvironmentBaseP22MM_AllocateDescriptionP14MM_MemorySpaceb+0x19e (0x00007F9066C0808E [libj9gc29.so+0x13708e])
_Z21OMR_GC_AllocateObjectP12OMR_VMThreadP25MM_AllocateInitialization+0xca (0x00007F9066C0E28A [libj9gc29.so+0x13d28a])
J9AllocateIndexableObject+0x7f7 (0x00007F9066B1A757 [libj9gc29.so+0x49757])
old_slow_jitNewArray+0xba (0x00007F90678FE0CA [libj9jit29.so+0x9620ca])
 (0x00007F9067911F41 [libj9jit29.so+0x975f41])
---------------------------------------
JVMDUMP039I Processing dump event "gpf", detail "" at 2022/07/21 13:09:41 - please wait.
JVMDUMP032I JVM requested System dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/aaa/core.20220721.130941.983468.0001.dmp' in response to an event
JVMDUMP010I System dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/aaa/core.20220721.130941.983468.0001.dmp
JVMDUMP032I JVM requested Java dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/aaa/javacore.20220721.130941.983468.0002.txt' in response to an event
JVMDUMP010I Java dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/aaa/javacore.20220721.130941.983468.0002.txt
JVMDUMP032I JVM requested Snap dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/aaa/Snap.20220721.130941.983468.0003.trc' in response to an event
JVMDUMP010I Snap dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/aaa/Snap.20220721.130941.983468.0003.trc
JVMDUMP032I JVM requested JIT dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/aaa/jitdump.20220721.130941.983468.0004.dmp' in response to an event
JVMDUMP051I JIT dump occurred in 'main' thread 0x000000000004DD00
JVMDUMP053I JIT dump is recompiling Test.vMeth(D)V

Note, however, that this crash does not always happen. Sometimes you trigger the same crash as in #15575, and sometimes a different assertion failure, like:

15:21:24.375 0x4dd00    j9mm.107    *   ** ASSERTION FAILED ** at /root/hostdir/openj9-openjdk-jdk17/omr/gc/base/MemoryPoolSplitAddressOrderedList.cpp:402: ((false && (env->getExtensions()->objectModel.isDeadObject((omrobjectptr_t)freeEntry))))
Bad scan type for object pointer 00000000E6D8E610
15:21:24.375 0x221c00    j9mm.141    *   ** ASSERTION FAILED ** at /root/hostdir/openj9-openjdk-jdk17/openj9/runtime/gc_glue_java/ScavengerDelegate.cpp:392: ((false))
JVMDUMP039I Processing dump event "traceassert", detail "" at 2022/07/20 17:21:24 - please wait.
JVMDUMP039I Processing dump event "traceassert", detail "" at 2022/07/20 17:21:24 - please wait.
JVMDUMP032I JVM requested System dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/core.20220720.172124.39739.0001.dmp' in response to an event
JVMDUMP010I System dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/core.20220720.172124.39739.0001.dmp
JVMDUMP032I JVM requested Java dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/javacore.20220720.172124.39739.0003.txt' in response to an event
JVMDUMP032I JVM requested System dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/core.20220720.172124.39739.0002.dmp' in response to an event
JVMDUMP010I Java dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/javacore.20220720.172124.39739.0003.txt
JVMDUMP010I System dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/core.20220720.172124.39739.0002.dmp
JVMDUMP032I JVM requested Java dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/javacore.20220720.172124.39739.0005.txt' in response to an event
JVMDUMP032I JVM requested Snap dump using '/zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/Snap.20220720.172124.39739.0004.trc' in response to an event
JVMDUMP010I Snap dump written to /zdata/congli/ax-exp/ax-eval/2-ax-only/110.openj9/mutant/red/Snap.20220720.172124.39739.0004.trc
JVMDUMP013I Processed dump event "traceassert", detail "".

You may need to run it several times.
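
Since the crash is intermittent, a small rerun harness may help. The sketch below is only an illustration: the class RerunSketch, its method names, the 10 attempts, and the 120 second timeout are all assumptions, and it assumes a crashed JVM exits with a non-zero status. Because Test loops forever, a run that outlives the timeout is killed and counted as "no crash".

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

public class RerunSketch {
    // Returns the 1-based attempt whose exit code was non-zero, or -1 if none was.
    public static int firstFailingAttempt(int maxAttempts, IntSupplier runOnce) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (runOnce.getAsInt() != 0) {
                return attempt;
            }
        }
        return -1;
    }

    // Runs the command once. A run that exceeds the timeout is killed and reported as 0.
    public static int runWithTimeout(List<String> command, long timeoutSeconds) {
        try {
            Process p = new ProcessBuilder(command).inheritIO().start();
            if (!p.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                p.destroyForcibly().waitFor();
                return 0;
            }
            return p.exitValue();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Same command line as above; attempt count and timeout are arbitrary choices.
        List<String> cmd = List.of("java", "-Xmx1G", "-Xshareclasses:none", "Test");
        int hit = firstFailingAttempt(10, () -> runWithTimeout(cmd, 120));
        System.out.println(hit < 0 ? "no crash observed" : "crash on attempt " + hit);
    }
}
```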

You can also try the unreduced test in openj9-bug-110.tar.gz, but it is not deterministic either. The archive contains all the logs (core, snap, etc.), the reduced test (Test.java, Test.class), and the unreduced test (Test.java.orig, FuzzerUtils.java).

@dmitripivkine
Contributor

dmitripivkine commented Jul 21, 2022

The assertions are most likely triggered by heap corruption:

15:21:24.375 0x4dd00    j9mm.107    *   ** ASSERTION FAILED ** at /root/hostdir/openj9-openjdk-jdk17/omr/gc/base/MemoryPoolSplitAddressOrderedList.cpp:402: ((false && (env->getExtensions()->objectModel.isDeadObject((omrobjectptr_t)freeEntry))))

This assertion is triggered on the object allocation path, during an attempt to get memory from the Memory Pool. The Memory Pool expects to find a Linked Free Header (aka a Dead Object) at the location of the next list element, but it is not there.

Bad scan type for object pointer 00000000E6D8E610
15:21:24.375 0x221c00    j9mm.141    *   ** ASSERTION FAILED ** at /root/hostdir/openj9-openjdk-jdk17/openj9/runtime/gc_glue_java/ScavengerDelegate.cpp:392: ((false))

The Scavenger expects an object at 0xE6D8E610 but cannot recognize the object type (most likely garbage).

The crash is caused by heap corruption as well.
Again, this is the object allocation path.
The Memory Pool expects a Linked Free Header at 0xE6DCB938, but there is garbage there:

> !j9modronfreelist 0x00007F906813CB30
J9ModronFreeList at 0x7f906813cb30 {
  Fields for J9ModronFreeList:
	0x0: class MM_LightweightNonReentrantLock _lock = !mm_lightweightnonreentrantlock 0x00007F906813CB30
	0x148: class MM_HeapLinkedFreeHeader* _freeList = !mm_heaplinkedfreeheader 0x00000000E6DCB938 <--------
	0x150: U64 _timesLocked = 0x0000000000000004 (4)
	0x158: U64 _freeSize = 0x000000000C373AC8 (204946120)
	0x160: U64 _freeCount = 0x0000000000000001 (1)
	0x168: struct J9ModronAllocateHint* _hintActive = !j9modronallocatehint 0x0000000000000000
	0x170: struct J9ModronAllocateHint* _hintInactive = !j9modronallocatehint 0x00007F906813CD88
	0x178: struct J9ModronAllocateHint[] _hintStorage = !j9x 0x00007F906813CCA8
	0x278: U64 _hintLru = 0x0000000000000001 (1)
}

0xE6DCB920 :  00000000 00000000 00000000 00000000 [ ................ ]
0xE6DCB930 :  00000000 ffff0000 ffffffff ffffffff [ ................ ] <-----
0xE6DCB940 :  ffffffff ffffffff ffffffff ffffffff [ ................ ]
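
One possible reading of these numbers, as a hedged sketch only: the 8 bytes at 0xE6DCB938 are all 0xff, which is the byte value the test writes. If the allocator treats that word as a tagged link to the next free entry and then reads a field 8 bytes into it, the arithmetic lands on the values in the crash log (RDI/RBX=FFFFFFFFFFFFFFFC, InaccessibleAddress=0000000000000004). The tag mask of 0x3 and the field offset of 8 are assumptions made for this illustration, not taken from the GC source, and the class and method names are hypothetical.

```java
public class FaultAddressSketch {
    // ASSUMED for illustration: low two bits of the link word are tag bits.
    static final long ASSUMED_TAG_MASK = 0x3L;
    // ASSUMED for illustration: the field read next sits 8 bytes into the header.
    static final long ASSUMED_FIELD_OFFSET = 8L;

    // Strips the assumed tag bits from a link word.
    public static long followLink(long linkWord) {
        return linkWord & ~ASSUMED_TAG_MASK;
    }

    // Address of the assumed field; long arithmetic wraps modulo 2^64.
    public static long fieldAddress(long headerAddress) {
        return headerAddress + ASSUMED_FIELD_OFFSET;
    }

    public static void main(String[] args) {
        long linkWord = 0xFFFFFFFFFFFFFFFFL; // the 8 bytes seen at 0xE6DCB938
        long next = followLink(linkWord);
        long fault = fieldAddress(next);
        System.out.printf("next=%016X fault=%016X%n", next, fault);
    }
}
```

Under these assumptions the sketch gives next=FFFFFFFFFFFFFFFC and fault=0000000000000004, which would be consistent with the free list header having been overwritten by the test's own 0xff fill.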

Most likely all three issues (this one, #15474, #15575) are manifestations of the same problem.

@0xdaryl FYI

@hzongaro
Member

I can reproduce this easily with 0.33 but not with 0.32, so it could be a new problem. I'll take a closer look. As @dmitripivkine said, it might be the same problem as #15474 and #15575.

@hzongaro
Member

Duplicate of #15474

@hzongaro hzongaro marked this as a duplicate of #15474 Sep 12, 2022
@hzongaro hzongaro self-assigned this Sep 12, 2022
@hzongaro
Member

Fixed by pull request #15870.
