- Version: v10.15.0
- Platform: Docker (Alpine Linux image) on CentOS
- Subsystem: BackgroundRunner ?
Node.js v10.15.0 segfault in BackgroundRunner → CancelableTask::Run → ConcurrentMarking::Run
We are running Node.js in Docker on CentOS nodes:
$ uname -a
Linux *redacted* 3.10.0-862.14.4.el7.x86_64 #1 SMP Wed Sep 26 15:12:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
$ docker --version
Docker version 18.06.1-ce, build e68fc7a
$ cat /etc/centos-release
CentOS Linux release 7.5.1804 (Core)
Recently, we migrated our image to a new Node.js version:
FROM node:8.12.0-alpine → FROM node:10.15.0-alpine
We then started to observe many segfaults in production:
[Wed Jan 30 13:16:31 2019] node[19293]: segfault at 55717a726770 ip 000055717a726770 sp 00007f7e965317f8 error 15
[Wed Jan 30 13:16:31 2019] node[19292]: segfault at 55717a726770 ip 000055717a726770 sp 00007f7e96d347f8 error 15
[Wed Jan 30 13:29:02 2019] node[2609]: segfault at 560f719ce130 ip 0000560f719ce130 sp 00007f895ffe17f8 error 15
[Wed Jan 30 13:29:02 2019] node[2608]: segfault at 560f719ce130 ip 0000560f719ce130 sp 00007f89607e47f8 error 15
[Wed Jan 30 13:29:02 2019] node[2607]: segfault at 560f719ce130 ip 0000560f719ce130 sp 00007f8960fe77f8 error 15
[Wed Jan 30 13:29:02 2019] node[2610]: segfault at 560f719ce130 ip 0000560f719ce130 sp 00007f895f7de7f8 error 15
[Wed Jan 30 13:42:49 2019] node[30532]: segfault at 55ef5d41a090 ip 000055ef5d41a090 sp 00007f7910e378e8 error 15
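For readers triaging similar kernel log lines, here is a minimal sketch of how one might decode them. It assumes the kernel prints the `error` field in hexadecimal and that the bits follow the x86 page-fault error-code layout (names borrowed from the Linux `X86_PF_*` flags). The helper name `decode_segfault` and the regular expression are illustrative only and are not part of any tool used in this report.

```python
import re

# x86 page-fault error-code bits (names follow the Linux kernel's X86_PF_* flags).
PF_BITS = {
    0x01: "PROT",   # page was present: protection violation rather than a missing page
    0x02: "WRITE",  # access was a write
    0x04: "USER",   # access came from user mode
    0x08: "RSVD",   # reserved bit set in a page-table entry
    0x10: "INSTR",  # access was an instruction fetch
}

# Matches lines such as:
# node[19293]: segfault at 55717a726770 ip 000055717a726770 sp 00007f7e965317f8 error 15
LINE_RE = re.compile(
    r"node\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_segfault(line):
    """Parse one segfault log line; return None if the line does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)  # assumption: the error code is printed in hex
    addr = int(m.group("addr"), 16)
    ip = int(m.group("ip"), 16)
    return {
        "pid": int(m.group("pid")),
        "addr": addr,
        "ip": ip,
        "flags": [name for bit, name in PF_BITS.items() if err & bit],
        "fault_at_ip": addr == ip,  # True when the faulting address is the instruction pointer
    }


sample = (
    "[Wed Jan 30 13:16:31 2019] node[19293]: segfault at 55717a726770 "
    "ip 000055717a726770 sp 00007f7e965317f8 error 15"
)
info = decode_segfault(sample)
print(info["pid"], info["flags"], info["fault_at_ip"])
```

Under those assumptions, every line above decodes to a user-mode instruction fetch from a present page, with the faulting address equal to the instruction pointer. That would be consistent with a jump into non-executable memory, but this is an interpretation of the log format, not something confirmed by the core dumps.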
We use Node.js to spawn a lot of Puppeteer scrapers (mentioning this because puppeteer/puppeteer#2872 may be related).
I was able to get a few core dumps from inside the container. Here is the stack:
(llnode) bt all
* thread #1, name = 'node', stop reason = signal SIGSEGV
* frame #0: 0x00007fa93ebebae0 node`node::PromiseWrap::~PromiseWrap()
frame #1: 0x000055fcfe72a87f node`v8::internal::ConcurrentMarking::Run(int, v8::internal::ConcurrentMarking::TaskState*) + 9855
frame #2: 0x000055fcfe42af7d node`v8::internal::CancelableTask::Run() + 61
frame #3: 0x000055fcfe1b37fd node`node::BackgroundRunner(void*) + 317
thread #2, stop reason = signal 0
frame #0: 0x00007fa93e65e5e4 node
thread #3, stop reason = signal 0
frame #0: 0x00007fa93e65e5e4 node
thread #4, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #5, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #6, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #7, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #8, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #9, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #10, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #11, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #12, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #13, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #14, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #15, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #16, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #17, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #18, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #19, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #20, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #21, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #22, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #23, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #24, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #25, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #26, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #27, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #28, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #29, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #30, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #31, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #32, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #33, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #34, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #35, stop reason = signal 0
frame #0: 0x00007fa93e65e5e4 node
thread #36, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #37, stop reason = signal 0
frame #0: 0x00007fa93e65e5e4 node
thread #38, stop reason = signal 0
frame #0: 0x00007fa93e65e5e4 node
thread #39, stop reason = signal 0
frame #0: 0x00007fa93ebebae0 node`node::PromiseWrap::~PromiseWrap()
frame #1: 0x000055fcfe72a87f node`v8::internal::ConcurrentMarking::Run(int, v8::internal::ConcurrentMarking::TaskState*) + 9855
frame #2: 0x000055fcfe42af7d node`v8::internal::CancelableTask::Run() + 61
frame #3: 0x000055fcfe1b37fd node`node::BackgroundRunner(void*) + 317
thread #40, stop reason = signal 0
frame #0: 0x00007fa93e62adc3 node
frame #1: node`uv_run(loop=0xffffffffffffffff, mode=UV_RUN_DEFAULT) at core.c:370
frame #2: 0x000055fcfe1b7029 node`node::BackgroundTaskRunner::DelayedTaskScheduler::Start()::'lambda'(void*)::_FUN(void*) + 137
thread #41, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #42, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #43, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #44, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #45, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #46, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #47, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #48, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #49, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #50, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #51, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #52, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #53, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #54, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #55, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #56, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #57, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #58, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #59, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #60, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #61, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #62, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #63, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #64, stop reason = signal 0
frame #0: 0x00007fa93e65c636 node
thread #65, stop reason = signal 0
frame #0: 0x00007fa93ebebae0 node`node::PromiseWrap::~PromiseWrap()
frame #1: 0x000055fcfe72a87f node`v8::internal::ConcurrentMarking::Run(int, v8::internal::ConcurrentMarking::TaskState*) + 9855
frame #2: 0x000055fcfe42af7d node`v8::internal::CancelableTask::Run() + 61
frame #3: 0x000055fcfe1b37fd node`node::BackgroundRunner(void*) + 317
Other core dumps also showed ConcurrentMarking::Run in the crashing thread's stack, but ~PromiseWrap was not always present.
Env parameters that may be useful:
PUPPETEER_NO_SANDBOX=1
--ulimit nofile=100000:100000
UV_THREADPOOL_SIZE=64