IPC can freeze the process #7657

Closed
@cvillemure

Since we upgraded our app from Node.js 0.10.38 to v6, we have experienced a lot of problems with IPC messaging.

Essentially, we have a web application with a few workers that handle all the requests. The workers contain caches to speed up requests, and these caches are synchronized between processes with IPC messaging. We were also using log4js as our logging library, with the clustered appender, which uses IPC to send all child logs back to the master so that a single process handles the logs.

Everything worked fine under 0.10.38, but when we upgraded to 6.0.0 (and then 6.2.0), our app kept crashing under various circumstances.

We soon realized that sending too much data (or sending it too fast) through IPC froze our application.
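
One way to limit how fast data enters the IPC channel is to wait for each send to complete before issuing the next one. The following is a minimal sketch of that idea, not a confirmed fix for the freeze described here. It assumes a Node.js version whose `process.send` / `worker.send` accepts a completion callback; `createThrottledSender` is a hypothetical helper name.

```javascript
// Hypothetical helper: queues messages and sends the next one only after the
// previous send's callback has fired, so at most one message is in flight.
// `send` is expected to have the shape send(message, callback), like
// process.send or worker.send when called with a completion callback.
function createThrottledSender(send) {
  const queue = [];
  let inFlight = false;

  function pump() {
    if (inFlight || queue.length === 0) return;
    inFlight = true;
    const message = queue.shift();
    send(message, (err) => {
      inFlight = false;
      if (err) console.error('IPC send failed:', err.message);
      pump(); // move on to the next queued message, if any
    });
  }

  return {
    // Enqueue a message; it is sent as soon as the channel is idle.
    push(message) {
      queue.push(message);
      pump();
    },
    // Number of messages queued or currently being sent.
    pending() {
      return queue.length + (inFlight ? 1 : 0);
    },
  };
}

// Possible wiring inside a forked worker (assumption, untested against this bug):
// const sender = createThrottledSender((msg, cb) => process.send(msg, cb));
// sender.push({ type: 'log', text: 'hello' });
```

The queue itself can still grow without bound if the producer is faster than the channel, so `pending()` would need to be checked by the caller to drop or delay work.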

We began refactoring our entire app to keep IPC usage to a strict minimum:

  • We created a custom logging process that receives logs over TCP instead of IPC.
  • We refactored our entire master/worker setup so that the workers load all the information on their own, and IPC is restricted to "trigger" messages instead of carrying all the data.
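
The "trigger" message approach from the second bullet can be sketched as follows. This is an illustration under assumptions, not the application's actual code: the message shape `{ type: 'cache-invalidate', key }` and the name `makeTriggerHandler` are hypothetical. The message carries only a key, and the worker reloads the value itself the next time it is requested.

```javascript
// Hypothetical message shape: { type: 'cache-invalidate', key: string }.
// The message never carries the cached value, only the key to drop.
function makeTriggerHandler(cache) {
  return function onMessage(message) {
    if (!message || message.type !== 'cache-invalidate') return false;
    // Drop the stale entry; the worker reloads it from its own data source
    // (database, file, ...) on the next request that needs it.
    cache.delete(message.key);
    return true;
  };
}

// Possible wiring (assumption):
// In a worker:   process.on('message', makeTriggerHandler(cache));
// In the master: worker.send({ type: 'cache-invalidate', key: 'user:42' });
```

Each IPC message stays small and constant in size regardless of how large the cached data is.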

All those changes are good for our application, since they reduced the dependencies between master and workers and gave us a better separation of responsibilities. I still see this as a flaw in Node.js, though: IPC is a fairly simple mechanism for exchanging information between workers, but it now seems so fragile that we are afraid of using it.

I attached a simple script that reproduces the problem. It is not a real scenario, just a test case I created to reproduce the application no longer responding.

ipc_test_scripts.zip

On my laptop, the app crashes at startup (or before the first log) with 5 forks (maybe because I have 4 physical cores).

At first I tested with 3 workers, and it froze after 5-10 minutes (CPU usage of all processes dropped to 0 and there was no more log output).

If I remove the "bacon ipsum" text from the worker message, it works (it might freeze after a while).
If I increase the message interval from 1ms to 10ms, it works (it might freeze after a while).
If I spawn only 4 workers, it works (it will probably freeze after 5-10 minutes).

If I execute it with 0.10.38, it works (for as long as I let it run).

So if you play with the timings, the message size, and/or the number of forks, you should be able to reproduce the problem.
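
For readers without the attachment, the test described above might look roughly like the sketch below. It is a reconstruction from the description only, not the contents of ipc_test_scripts.zip: the constants, the payload text, the `buildMessage` helper, and the `IPC_REPRO` environment variable guard are all assumptions.

```javascript
const cluster = require('cluster');

// All values below are assumptions chosen to mirror the description in this
// issue; they are not taken from the attached scripts.
const WORKER_COUNT = 5;      // number of forks
const SEND_INTERVAL_MS = 1;  // message interval per worker
const PAYLOAD = 'Bacon ipsum dolor amet '.repeat(50); // filler text

function buildMessage(workerId, sequence) {
  return { workerId, sequence, payload: PAYLOAD };
}

function runMaster() {
  let received = 0;
  for (let i = 0; i < WORKER_COUNT; i++) {
    cluster.fork().on('message', () => { received++; });
  }
  // If the process freezes, this line stops being printed.
  setInterval(() => console.log('messages received:', received), 1000);
}

function runWorker() {
  let sequence = 0;
  setInterval(() => {
    process.send(buildMessage(cluster.worker.id, sequence++));
  }, SEND_INTERVAL_MS);
}

// Guarded so that the file does nothing unless explicitly asked to run,
// e.g. with the (hypothetical) environment variable IPC_REPRO=1.
if (process.env.IPC_REPRO === '1') {
  if (cluster.isMaster) runMaster(); else runWorker();
}
```

Changing `WORKER_COUNT`, `SEND_INTERVAL_MS`, or the payload length corresponds to the three variations listed above.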

One thing I observed is that IPC messaging performance seems to have improved enormously from 0.10 to 6. If I run the test with 3 workers for 10 seconds with 0.10.38, the master handles only 1902 messages; with 6.3.0, in the same 10 seconds, the master handles 25514 messages.
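
To put those two counts side by side, the arithmetic can be worked out as below. The figures are the ones reported above; `messagesPerSecond` is just an illustrative helper name.

```javascript
// Messages handled by the master in a 10-second run with 3 workers,
// as reported in this issue.
function messagesPerSecond(count, seconds) {
  return count / seconds;
}

const oldRate = messagesPerSecond(1902, 10);   // Node.js 0.10.38
const newRate = messagesPerSecond(25514, 10);  // Node.js 6.3.0
const speedup = newRate / oldRate;

console.log(oldRate.toFixed(1)); // prints "190.2"
console.log(newRate.toFixed(1)); // prints "2551.4"
console.log(speedup.toFixed(1)); // prints "13.4"
```

So in this test the master on 6.3.0 handles roughly 13 times as many messages per second as on 0.10.38.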

I also tested it with 4.4.7: it freezes at startup with 5 forks and after 4 minutes with 4 forks.

My specs:
NodeJS Windows 6.3.0 64 bits (bug)
NodeJS Windows 6.2.0 64 bits (bug)
NodeJS Windows 4.4.7 64 bits (bug)
NodeJS Windows 0.10.38 64 bits (OK)

Labels

  • child_process: Issues and PRs related to the child_process subsystem.
  • libuv: Issues and PRs related to the libuv dependency or the uv binding.
  • windows: Issues and PRs related to the Windows platform.