Description
This has recently started happening a lot on the macOS hosts in CI:
A test times out. For example, in https://ci.nodejs.org/job/node-test-commit-osx/nodes=osx1010/17983/console, sequential/test-benchmark-http times out.
As a result, a stray subprocess is left running, which causes subsequent jobs to fail. For example, in https://ci.nodejs.org/job/node-test-commit-osx/nodes=osx1010/17988/console:
# Clean up any leftover processes, error if found.
ps awwx | grep Release/node | grep -v grep | cat
79201 ?? R 145:42.29 /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/out/Release/node /Users/iojs/build/workspace/node-test-commit-osx/nodes/osx1010/benchmark/http/cluster.js c=1 len=1 type=asc benchmarker=test-double chunkedEnc=true chunks=0 dur=0.1 key="" method=write n=1 res=normal
make[1]: *** [test-ci] Error 1
To fix this, someone from the Build WG (in this specific case, me) logs in and runs kill -9 on the PID. In theory, the process should have been terminated by one of the instances of xargs kill that appear in the Makefile. My guess (which I keep forgetting to test when this comes up) is that xargs kill needs to be xargs kill -9 to be effective in these cases on the macOS hosts.
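
One way to test that guess without always reaching for SIGKILL is to escalate: send the default SIGTERM first, and use kill -9 only on PIDs that survive a grace period. The sketch below is an illustration under assumptions, not the cleanup code in the Makefile. The function name kill_with_escalation and the grace period are hypothetical, and whether the stray benchmark process really survives SIGTERM is still the untested guess above.

```shell
# Hypothetical helper (not from the Node.js Makefile): send SIGTERM to each
# PID, wait a grace period, then send SIGKILL to any PID that still exists.
kill_with_escalation() {
  grace=$1   # seconds to wait before escalating (assumed value)
  shift
  if [ "$#" -eq 0 ]; then
    return 0                         # nothing to clean up
  fi
  kill "$@" 2>/dev/null || true      # polite request first (SIGTERM)
  sleep "$grace"
  for pid in "$@"; do
    # kill -0 sends no signal; it only checks that the PID still exists.
    if kill -0 "$pid" 2>/dev/null; then
      kill -9 "$pid" 2>/dev/null || true
    fi
  done
}

# Possible use in the cleanup step (the 5 second grace period is an assumption):
#   kill_with_escalation 5 $(ps awwx | grep Release/node | grep -v grep | awk '{print $1}')
```

If the stray process exits on SIGTERM, the second pass has nothing to do. If it does not, the kill -9 pass removes it, which would also confirm that plain xargs kill is not enough on these hosts.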