Skip to content

Segmentation fault during tests of Node 10.15.3 on PowerPC musl #27068

Closed
@awilfox

Description

@awilfox

I've been doing some really deep digging on this issue to attempt to make it as easy as possible for an expert to fix, but sadly, I am not that expert; I've never done anything with V8 before.

I'm the project lead of Adélie Linux, and we're hoping to bring Node.js to our distribution. We are using the musl libc instead of glibc (like Void/musl and others). One of our primary CPU architectures is the 64-bit PowerPC. Eight tests fail on this platform:

=== release test-fs-watch-close-when-destroyed ===                            
Path: parallel/test-fs-watch-close-when-destroyed
internal/fs/watchers.js:173
    throw error;
    ^

Error: EMFILE: too many open files, watch '/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.16/watched-directory'
    at FSWatcher.start (internal/fs/watchers.js:165:26)
    at Object.watch (fs.js:1253:11)
    at Object.<anonymous> (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-close-when-destroyed.js:15:20)
    at Module._compile (internal/modules/cjs/loader.js:701:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
    at Module.load (internal/modules/cjs/loader.js:600:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
    at Function.Module._load (internal/modules/cjs/loader.js:531:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:754:12)
    at startup (internal/bootstrap/node.js:283:19)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-close-when-destroyed.js
=== release test-fs-watch-encoding ===                                  
Path: parallel/test-fs-watch-encoding
internal/fs/watchers.js:173
    throw error;
    ^

Error: EMFILE: too many open files, watch '/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.19'
    at FSWatcher.start (internal/fs/watchers.js:165:26)
    at Object.watch (fs.js:1253:11)
    at Object.<anonymous> (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-encoding.js:45:21)
    at Module._compile (internal/modules/cjs/loader.js:701:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
    at Module.load (internal/modules/cjs/loader.js:600:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
    at Function.Module._load (internal/modules/cjs/loader.js:531:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:754:12)
    at startup (internal/bootstrap/node.js:283:19)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-encoding.js
=== release test-fs-watch-enoent ===                              
Path: parallel/test-fs-watch-enoent
assert.js:85
  throw new AssertionError(obj);
  ^

AssertionError [ERR_ASSERTION]: Input A expected to strictly equal input B:
+ expected - actual

- 'EMFILE: too many open files, watch \'/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.2/non-existent\''
+ 'ENODEV: no such device, watch \'/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.2/non-existent\''
    at Object.validateError (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-enoent.js:27:14)
    at expectedException (assert.js:570:19)
    at expectsError (assert.js:659:16)
    at Function.throws (assert.js:690:3)
    at Object.<anonymous> (/usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-enoent.js:36:10)
    at Module._compile (internal/modules/cjs/loader.js:701:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:712:10)
    at Module.load (internal/modules/cjs/loader.js:600:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:539:12)
    at Function.Module._load (internal/modules/cjs/loader.js:531:3)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watch-enoent.js
=== release test-fs-watchfile ===                             
Path: parallel/test-fs-watchfile
internal/fs/watchers.js:173
    throw error;
    ^

Error: EMFILE: too many open files, watch '/usr/src/packages/user/node/src/node-v10.15.3/test/.tmp.12/watch'
    at FSWatcher.start (internal/fs/watchers.js:165:26)
    at Object.watch (fs.js:1253:11)
    at /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watchfile.js:95:8
    at /usr/src/packages/user/node/src/node-v10.15.3/test/common/index.js:340:15
    at FSReqWrap.args [as oncomplete] (fs.js:140:20)
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-fs-watchfile.js
=== release test-process-euid-egid ===                                        
Path: parallel/test-process-euid-egid
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-process-euid-egid.js
--- CRASHED (Signal: 11) ---
=== release test-process-uid-gid ===                                          
Path: parallel/test-process-uid-gid
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/parallel/test-process-uid-gid.js
--- CRASHED (Signal: 11) ---
=== release test-querywrap ===                                                
Path: async-hooks/test-querywrap
Command: out/Release/node --expose-gc /usr/src/packages/user/node/src/node-v10.15.3/test/async-hooks/test-querywrap.js
--- CRASHED (Signal: 11) ---
=== release test-tlswrap ===                                 
Path: async-hooks/test-tlswrap
Command: out/Release/node /usr/src/packages/user/node/src/node-v10.15.3/test/async-hooks/test-tlswrap.js
--- CRASHED (Signal: 11) ---
[03:14|% 100|+ 2423|-   8]: Done                                              

I believe the EMFILE ones are possibly related to customisations in the musl library that we have done and this issue does not cover them. This issue is about test-querywrap.

This crash happens every time. A core dump, from Release or Debug node binaries, can be provided if helpful. The crash happens with the following backtrace:

(gdb) bt
#0  __pthread_mutex_lock (m=0x5870) at src/thread/pthread_mutex_lock.c:5
#1  0x000000010144dc50 in v8::base::LockNativeHandle (mutex=<optimized out>) at ../deps/v8/src/base/platform/mutex.cc:130
#2  v8::base::RecursiveMutex::Lock (this=<optimized out>) at ../deps/v8/src/base/platform/mutex.cc:130
#3  0x0000000100cca420 in v8::internal::ExecutionAccess::Lock (isolate=<optimized out>) at ../deps/v8/src/isolate.h:1773
#4  v8::internal::ExecutionAccess::ExecutionAccess (isolate=<optimized out>, this=<synthetic pointer>) at ../deps/v8/src/isolate.h:1769
#5  v8::internal::StackGuard::CheckAndClearInterrupt (this=0x101d8ef40, flag=<optimized out>) at ../deps/v8/src/execution.cc:409
#6  0x0000000100cca9e8 in v8::internal::StackGuard::HandleInterrupts (this=0x101d8ef40) at ../deps/v8/src/execution.cc:509
#7  0x0000000100fdbe28 in v8::internal::NativeRegExpMacroAssembler::CheckStackGuardState (isolate=0x101d896a0, start_index=<optimized out>, is_direct_call=<optimized out>, return_address=0x3fffffffd040, re_code=0x2af21ec6ae1, subject=0x3fffffffd080, 
    input_start=0x3fffffffd090, input_end=0x3fffffffd098) at ../deps/v8/src/isolate.h:898
#8  0x0000000101212afc in v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=<optimized out>, re_code=<optimized out>, re_frame=<error reading variable: value has been optimized out>)
    at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1156
#9  0x000002af21c88190 in ?? ()

Breaking at the CheckStackGuardState function:

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffd3c0, re_code=0x3c2bdfc6ae1, re_frame=70368744166464) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1156
1156    static T* frame_entry_address(Address re_frame, int frame_offset) {
(gdb) x/2x re_frame+176
0x3fffffffd4f0: 0x00000001
0x3fffffffd4f4: 0x01d48140
(gdb) c
Continuing.

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffd0d0, re_code=0x3c2bdfc6ae1, re_frame=70368744165712) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1156
1156    static T* frame_entry_address(Address re_frame, int frame_offset) {
(gdb) x/2x re_frame+176
0x3fffffffd200: 0x00000001
0x3fffffffd204: 0x01d896a0
(gdb) c
Continuing.

Thread 1 "node" received signal SIGSEGV, Segmentation fault.
__pthread_mutex_lock (m=0x5870) at src/thread/pthread_mutex_lock.c:5
5               if ((m->_m_type&15) == PTHREAD_MUTEX_NORMAL

This shows that the isolate_ pointer in the frame is being corrupted. Next I tried using the Debug build. I had to merge a single line patch from Node master (which removed the ra != r0 assertion from D_TYPE instructions in the PPC assembler), and also correct the V8 memory allocator to align properly on PPC (the hint at heap/spaces.cc:130 had to be changed from GetRandomMmapAddr() to (GetRandomMmapAddr() & ~(alignment-1)) for debugging purposes only). Using this new Debug build and hardware watchpoints, I was able to find:

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffcfd0, re_code=0x384bb6107a1, re_frame=70368744165456) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1165
1165          frame_entry<Isolate*>(re_frame, kIsolate),
(gdb) x/2x re_frame+176
0x3fffffffd100: 0x00000001      0x04608de0
(gdb) watch *0x3fffffffcb04
Hardware watchpoint 2: *0x3fffffffcb04
(gdb) c
Continuing.
Thread 1 "node" hit Hardware watchpoint 5: *0x3fffffffcb04

Old value = -13472
New value = 73009584
0x0000000102c28c3c in v8::internal::RegExpStack::stack_base (this=0x0) at ../deps/v8/src/regexp/regexp-stack.h:47
47        Address stack_base() {
(gdb) bt
#0  0x0000000102c28c3c in v8::internal::RegExpStack::stack_base (this=0x0) at ../deps/v8/src/regexp/regexp-stack.h:47
#1  0x0000000102c2a424 in v8::internal::NativeRegExpMacroAssembler::Execute (code=0x3afd98107a1, input=0x9c967027b9, start_offset=13, input_start=0x9c967027dd "\356\375\257", input_end=0x9c967027dd "\356\375\257", output=0x104611b6c, output_size=2, isolate=0x104608de0)
    at ../deps/v8/src/regexp/regexp-macro-assembler.cc:286
#2  0x0000000102c2a33c in v8::internal::NativeRegExpMacroAssembler::Match (regexp_code=..., subject=..., offsets_vector=0x104611b6c, offsets_vector_length=2, previous_index=13, isolate=0x104608de0) at ../deps/v8/src/regexp/regexp-macro-assembler.cc:263
#3  0x0000000102bec8dc in v8::internal::RegExpImpl::IrregexpExecRaw (regexp=..., subject=..., index=13, output=0x104611b6c, output_size=2) at ../deps/v8/src/regexp/jsregexp.cc:478
#4  0x0000000102becbf0 in v8::internal::RegExpImpl::IrregexpExec (regexp=..., subject=..., previous_index=13, last_match_info=...) at ../deps/v8/src/regexp/jsregexp.cc:571
#5  0x0000000102beae60 in v8::internal::RegExpImpl::Exec (regexp=..., subject=..., index=13, last_match_info=...) at ../deps/v8/src/regexp/jsregexp.cc:196
#6  0x0000000102d00890 in v8::internal::__RT_impl_Runtime_RegExpExec (args=..., isolate=0x104608de0) at ../deps/v8/src/runtime/runtime-regexp.cc:918
#7  0x0000000102d003f0 in v8::internal::Runtime_RegExpExec (args_length=4, args_object=0x3fffffffd110, isolate=0x104608de0) at ../deps/v8/src/runtime/runtime-regexp.cc:906
#8  0x000003afd96e7250 in ?? ()
(gdb) x/i $pc
=> 0x102c28c3c <v8::internal::RegExpStack::stack_base()+24>:    std     r31,-8(r1)
(gdb) x/i $pc-4
   0x102c28c38 <v8::internal::RegExpStack::stack_base()+20>:    std     r30,-16(r1)
(gdb) info registers
r0             0x102c2a424         4341277732
r1             0x3fffffffcb10      70368744164112
r2             0x1045c5d00         4368129280
r3             0x10464a920         4368673056
r4             0x0                 0
r5             0xd                 13
r6             0x9c967027dd        672538830813
r7             0x9c967027dd        672538830813
r8             0x2                 2
r9             0x10464a920         4368673056
r10            0x104681b20         4368898848
r11            0x3fffffffd118      70368744165656
r12            0x3ffff7fa2c00      70368609577984
r13            0x3ffff80069a8      70368609986984
r14            0x4                 4
r15            0x102d002a0         4342153888
r16            0x1                 1
r17            0xd00000000         55834574848
r18            0x1                 1
r19            0x0                 0
r20            0xd00000000         55834574848
r21            0x17                23
r22            0x44                68
r23            0x4                 4
r24            0x4                 4
r25            0xf                 15
r26            0x85d28fe519        574763296025
r27            0x104608de0         4368403936
r28            0x3afd96b9348       4053801866056
r29            0x3afd98107a1       4053803272097
r30            0x1045a09b0         4367976880
r31            0x3fffffffcb10      70368744164112
pc             0x102c28c3c         0x102c28c3c <v8::internal::RegExpStack::stack_base()+24>
msr            0x900000004800d032  10376293542669635634
cr             0x40000222          1073742370
lr             0x102c2a424         0x102c2a424 <v8::internal::NativeRegExpMacroAssembler::Execute(v8::internal::Code*, v8::internal::String*, int, unsigned char const*, unsigned char const*, int*, int, v8::internal::Isolate*)+144>
ctr            0x3ffff7fa2c00      70368609577984
xer            0x0                 0
fpscr          0xa3004100          2734702848
vscr           0x10000             65536
vrsave         0xffffffff          -1
orig_r3        0xc00000000000c120  -4611686018427338464
trap           0xd00               3328
(gdb) c
Continuing.

Thread 1 "node" hit Breakpoint 1, v8::internal::RegExpMacroAssemblerPPC::CheckStackGuardState (return_address=0x3fffffffc9d0, re_code=0x3afd98107a1, re_frame=70368744163920) at ../deps/v8/src/regexp/ppc/regexp-macro-assembler-ppc.cc:1165
1165          frame_entry<Isolate*>(re_frame, kIsolate),
(gdb) x/2x re_frame+176
0x3fffffffcb00: 0x00000001      0x045a09b0
(gdb) x/60x re_frame
0x3fffffffca50: 0x00000000      0x0000000f      0x00000085      0xd28fe519
0x3fffffffca60: 0x00000001      0x04608de0      0x000003af      0xd96b9348
0x3fffffffca70: 0x000003af      0xd98107a1      0x00000001      0x045a09b0
0x3fffffffca80: 0x00003fff      0xffffca90      0x00000001      0x02c2a81c
0x3fffffffca90: 0x00003fff      0xffffcb10      0x00000001      0x04608de0
0x3fffffffcaa0: 0x000003af      0xd9810800      0x00000001      0x045c5d00
0x3fffffffcab0: 0x00003fff      0xffffcb10      0x00003fff      0xffffcac0
0x3fffffffcac0: 0x00003fff      0xffffcb10      0x00003fff      0xffffcad0
0x3fffffffcad0: 0x00000001      0x02c2a75c      0x00000001      0x0464a920
0x3fffffffcae0: 0x00000001      0x02c38e48      0x000003af      0xd98107a1
0x3fffffffcaf0: 0x00000001      0x04608de0      0x000003af      0xd98107a1
0x3fffffffcb00: 0x00000001      0x045a09b0      0x00003fff      0xffffcb10
0x3fffffffcb10: 0x00003fff      0xffffcc10      0x00003fff      0xffffcb20
0x3fffffffcb20: 0x00000001      0x02c2a48c      0x00003fff      0xffffcb90
0x3fffffffcb30: 0x00003fff      0xffffcbd8      0x0000009c      0x967027b9
(gdb) x/2x re_frame+16
0x3fffffffca60: 0x00000001      0x04608de0

It appears that the frame pointer is being improperly adjusted at some point. The Isolate pointer is being written at re_frame+16, when it belongs at re_frame+176.

In a Release build, the object being created at re_frame+176 is a RegExpStackScope; its first member is capacity, which is set to 0; trying to use this object as an Isolate is causing the NULL pointer dereference. The Debug build crashes with a different backtrace, because the object is a RegExpStack, which has valid pointers (but not correct pointers):

(gdb) bt
#0  0x0000000101b3a2a4 in v8::internal::HandleScope::HandleScope (this=0x3fffffffc8a0, isolate=0x1045a09b0) at ../deps/v8/src/handles-inl.h:34
#1  0x0000000102c29b80 in v8::internal::NativeRegExpMacroAssembler::CheckStackGuardState (isolate=0x1045a09b0, start_index=21, is_direct_call=false, return_address=0x3fffffffc9d0, re_code=0xe2164907a1, subject=0x3fffffffca10, input_start=0x3fffffffca20, input_end=0x3fffffffca28) at ../deps/v8/src/regexp/regexp-macro-assembler.cc:168

If I manually set the Isolate pointer at re_frame+176 to the correct value, and continue, then the third time it checks the StackGuard state, the Isolate point is correct when it enters the method. The fourth time, it is incorrect again. This repeats every even numbered time the StackGuard state is checked. If I manually correct the value on each even-numered attempt, the test passes correctly.

I am unsure on how to proceed to actually fix this bug. The V8 regexp code is, frankly, a twisty maze that is very hard for me to follow; it is spread out amongst multiple files and even some headers. I am hopeful that the Node community may know of a way to find the root issue so that we may fix it.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ppcIssues and PRs related to the Power architecture.v8 engineIssues and PRs related to the V8 dependency.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions