Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Weird hanging bug with --kill-others #104

Open
Maistho opened this issue Apr 17, 2017 · 12 comments
Open

Weird hanging bug with --kill-others #104

Maistho opened this issue Apr 17, 2017 · 12 comments
Labels

Comments

@Maistho
Copy link

Maistho commented Apr 17, 2017

Hi,

I've found a superweird bug that I'd like some help to figure out what to do with.

When running concurrently in my node-alpine docker container, PIDs are (consistently) one off.

To test this, I modified concurrently to print the PID first when spawning a child process, and also when trying to kill them in killOtherProcesses. http-server is constantly one higher than the saved PID.

Example output:

bash-4.3# ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 bash
 1293 root       0:00 ps aux

bash-4.3# concurrently --kill-others "http-server ." "npm --version" &
[1] 1294
1300
1307
[0] Starting up http-server, serving .
[0] Available on:
[0]   http://127.0.0.1:8080
[0]   http://172.17.0.3:8080
[0] Hit CTRL-C to stop the server
[1] 4.4.4
[1] npm --version exited with code 0
--> Sending SIGTERM to other processes..
1300
bash-4.3#  ps
PID   USER     TIME   COMMAND
    1 root       0:00 bash
 1294 root       0:00 node /usr/bin/concurrently --kill-others http-server . npm --version
 1301 root       0:00 node /usr/bin/http-server .
 1315 root       0:00 ps

So it's trying to kill http-server on 1300, but http-server is actually on 1301. If I change the code to kill child.pid+1 instead, all is fine...

Any ideas? I'm stumped for ideas where to look further.

@Maistho
Copy link
Author

Maistho commented Apr 17, 2017

Oh, just when I've filed an issue I find out what the problem is, as always.

Apparently alpine uses busybox as /bin/sh, and spawn-command always uses /bin/sh to launch commands, and busybox shell forks all commands.

Solution? I'm not sure. Perhaps spawn-command shouldn't wrap in /bin/sh? Or perhaps I should just replace busybox shell?

@Maistho
Copy link
Author

Maistho commented Apr 17, 2017

So just when I've made an issue I find out what the problem is, as always.

alpine uses busybox ash for /bin/sh, which is the shell that spawn-command will always use.

Busybox ash will fork any commands given by -c, so the pid changes.

Solution?

Replace /bin/sh with something else, e.g. bash

Or change spawn-command to not spawn inside a shell

Not sure which would be the best idea.

Not a concurrently issue, but may be related to other issues here

@Maistho Maistho closed this as completed Apr 18, 2017
@gustavohenke
Copy link
Member

Great that you found a solution, @Maistho.
But if there's something feasible we could do on our side, it would be great to let us know.

By the way, you could try concurrently v3.2 or older; they used another dependency to spawn commands: spawn-default-shell

@cuipengfei
Copy link

Hi,

I am seeing the same behavior when running in a alpine docker container.

The same npm script, same node_modules directory, just mapped into a ubuntu docker container, will kill the other process.

@Maistho
Copy link
Author

Maistho commented Sep 1, 2017

@cuipengfei yup, you'll need to use bash instead of the builtin shell in alpine.

I have a docker image that uses bash instead, kadadev/node-small, that you might use. If you want to run your own image you can take a look at my dockerfile to see how to switch to bash
https://github.com/KadaDev/node-small/blob/master/Dockerfile

roschaefer referenced this issue in Human-Connection/Nitro-Backend Dec 3, 2018
I give up with running docker-compose on Travis CI. Using `bash` instead
of Linux alpine's `sh` does **not** fix the issue as suggested here:
https://github.com/kimmobrunfeldt/concurrently/issues/104
@OrangeDog
Copy link

OrangeDog commented Dec 3, 2019

@Maistho linking /bin/sh to /bin/bash doesn't appear to work. It still hangs at "Sending SIGTERM to other processes". As far as I can tell, /bin/ash doesn't fork any more as you describe.

@OrangeDog
Copy link

@gustavohenke should this be reopened?

@gustavohenke gustavohenke reopened this Dec 15, 2019
@gustavohenke
Copy link
Member

Ideas on how to fix this in concurrently?

@cben
Copy link

cben commented Apr 2, 2020

Untested workaround: concurrently 'exec cmd1' 'exec cmd2'. IIUC, this result in /bin/sh -c 'exec cmd1' which should make any posix-ish shell replace itself rather than forking?

BTW I'm confused why OP's shell exits after forking. If the shell stuck around, concurrently should be able to kill both it and its children (it uses https://github.com/pkrumins/node-tree-kill).
But the ps output proves it forked, then exited!
I can reproduce such behavior by backgrounding: concurrently 'sleep 100 &', results in sleep being orphaned and reparented to pid 1, but I don't think people use & under concurrently 😄

Untested idea: what if concurrently put each command in its own process group, then sent signal to whole groups rather than specific pids it saved? This is roughly how job control works in shells, and how Ctrl+C is delivered reliably.
(Of course a this won't catch a process deliberately "daemonizing" itself, but that takes more than a simple fork by a shell.)

@cben
Copy link

cben commented Apr 2, 2020

FWIW, concurrently --kill-others 'sleep 100 &' 'sleep 2' reproduces the hanging.
(normal linux with bash, Fedora 30)

@akauppi
Copy link

akauppi commented Jun 29, 2021

I was bit by this, and it took a while to narrow down on concurrently and then to notice that the behaviour was different in Docker than in desktop.

In my case (details below), it was about a piping command making concurrently -k not kill the process, but only on Alpine.

    "start": "firebase emulators:start --project=demo-1 | $(filter-emulators)",

concurrently is such a nice and reliable workhorse; it would be nice if this can somehow be addressed.

Steps:

You need: Node 14 or 16; npm 7.7+, Docker

$ git clone https://github.com/akauppi/firebase-jest-testing.git
$ cd firebase-jest-testing
$ git checkout next
$ npm install

Works in desktop (anything with bash and grep):

$ npm test    # should eventually exit to command prompt

To try with Alpine, let's build a Docker image.

$ pushd ci/firebase-ci-builder.sub
$ ./build
$ popd

..and run it

$ docker run -it --rm -v $(pwd):/workspace -w /workspace firebase-ci-builder:9.12.1-node16-npm7 /bin/bash

# npm test

This gets stuck, after the tests.

Work-around:

Have separate npm test (does filtering) and npm run ci (doesn't), so that I can run CI within Alpine.


If you get time outs, comment out the testTimeout line in sample/jest.config.default.js. Docker seems to need a bit more time, at least on my machine.

@Bonnev
Copy link

Bonnev commented May 24, 2024

Still reproducible in 7.6.0. I noticed that referencing another npm script (i.e. "npm:dev") causes the hanging, but writing the script directly seems to work fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants