This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

[Cluster] AssertionError: Resource leak detected. #9409

Closed
jshkurti opened this issue Mar 13, 2015 · 14 comments

@jshkurti

Hello guys.

I'm making heavy use of the cluster module and I still get this error occasionally (Node.js v0.12.0).

AssertionError: Resource leak detected.
  at removeWorker (cluster.js:346:9)
  at ChildProcess.<anonymous> (cluster.js:366:34)
  at ChildProcess.g (events.js:199:16)
  at ChildProcess.emit (events.js:110:17)
  at Process.ChildProcess._handle.onexit (child_process.js:1067:12)

I can see that it happens when there are still some handles left after all workers have been removed.
https://github.com/joyent/node/blob/master/lib/cluster.js#L347-L348
I also noticed that you call removeHandlesForWorker() only on the 'disconnect' event, not on 'exit'.
https://github.com/joyent/node/blob/master/lib/cluster.js#L382

Is it possible that the 'exit' event fires before 'disconnect', and that this is what causes the bug?

Is this a Node.js bug or am I misusing the cluster module at some point?
If so, what scenario could possibly trigger this bug? Could you show me some sample code that intentionally triggers this exception?

Thanks a lot.

@hertzg

hertzg commented Mar 17, 2015

I think I have the same issue:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory

assert.js:86
  throw new assert.AssertionError({
        ^
AssertionError: Resource leak detected.
  at removeWorker (cluster.js:346:9)
  at ChildProcess.<anonymous> (cluster.js:366:34)
  at ChildProcess.g (events.js:199:16)
  at ChildProcess.emit (events.js:110:17)
  at Process.ChildProcess._handle.onexit (child_process.js:1067:12)

@jshkurti
Author

@hertzg Do you, by any chance, have a .removeAllListeners('disconnect') or .removeListener('disconnect', ...) call in your code?

@hertzg

hertzg commented Mar 19, 2015

@jshkurti I'm using removeAllListeners and removeListener but not for disconnect events on cluster, process or any specific workers.

@jshkurti
Author

Hey @hertzg, I opened a pull request that fixes this (and it actually works): #9418

Until it's accepted and merged, I have found a workaround.
Every time I create a worker, I apply this function to it:

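  function workAround(worker) {
    // The master's handlers are registered with once(), so in v0.12
    // listeners()[0] is a wrapper function whose only enumerable key
    // ('listener') holds the original handler.
    var wrapper = worker.process.listeners('exit')[0];
    var exit = wrapper[Object.keys(wrapper)[0]];

    wrapper = worker.process.listeners('disconnect')[0];
    var disconnect = wrapper[Object.keys(wrapper)[0]];

    // Replace the 'exit' handler with one that runs the 'disconnect'
    // cleanup first if it has not happened yet.
    worker.process.removeListener('exit', exit);
    worker.process.once('exit', function(exitCode, signalCode) {
      if (worker.state !== 'disconnected')
        disconnect();
      exit(exitCode, signalCode);
    });
  }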

Works like a charm ;)
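
The idea behind the workaround (always run the 'disconnect' cleanup before the 'exit' cleanup) can be sketched without reaching into listener internals, which may be laid out differently in other Node.js versions. This is an illustration on a plain EventEmitter, not the code from the pull request; `ensureDisconnectBeforeExit` and `trace` are hypothetical helpers, and the two handlers are passed in explicitly rather than extracted from the worker's process.

```javascript
'use strict';
const EventEmitter = require('events');

// Hypothetical helper: guarantees that onDisconnect has run before onExit,
// even when 'exit' is emitted first. onDisconnect runs at most once.
function ensureDisconnectBeforeExit(proc, onDisconnect, onExit) {
  let disconnected = false;

  function runDisconnect() {
    if (disconnected) return;
    disconnected = true;
    onDisconnect();
  }

  proc.once('disconnect', runDisconnect);
  proc.once('exit', (exitCode, signalCode) => {
    runDisconnect();              // no-op if 'disconnect' already fired
    onExit(exitCode, signalCode);
  });
}

// Emits `order` on a fake process and returns the order the handlers ran in.
function trace(order) {
  const proc = new EventEmitter();
  const calls = [];
  ensureDisconnectBeforeExit(
    proc,
    () => calls.push('disconnect'),
    () => calls.push('exit')
  );
  order.forEach((event) => proc.emit(event, 0, null));
  return calls;
}
```

Both `trace(['exit', 'disconnect'])` and `trace(['disconnect', 'exit'])` return the handlers in the order disconnect, then exit, which is the invariant the workaround tries to enforce.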

@hertzg

hertzg commented Mar 21, 2015

@jshkurti I'm unable to reproduce this issue again but will keep that in mind. Thanks

@megastef

@jshkurti THX! - happened to me with a simple test script on ARM. It happens when a worker terminates (in my case there was an Error in the http request handler).

@Splaktar

We ran into this with one of our servers last night :(
I see that the PR was never merged.

Is this fixed in v4.0.0?

@herkyl

herkyl commented Sep 20, 2015

@Splaktar I had the same error today on a machine running v4.0.0

Splaktar added a commit to gdg-x/hub that referenced this issue Sep 28, 2015
Add initWorker() to server which should clean up worker resources on both disconnect and exit.
Reference nodejs/node-v0.x-archive#9409
Update JSHint for server and fix some issues including removing unused requires.
Add key entries for frisbee keys.
Remove some newRelic cruft.

Fixes #54.
@Smbc1

Smbc1 commented Sep 30, 2015

Does anybody know how to reproduce this bug "in vitro"? Is it still present in Node 4.1.1?
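
One way to probe this is to record the order in which the master sees 'disconnect' and 'exit' for workers that die abruptly while they hold a listening socket. The script below is only a sketch for observing event order; it is not confirmed to trigger the assertion. `watchWorker`, `runMaster`, `runWorker` and the `CLUSTER_ORDER_DEMO` environment variable are hypothetical names, and killing workers with SIGKILL is just one assumed way to simulate a crash.

```javascript
'use strict';
const cluster = require('cluster');
const http = require('http');

// Records the order in which the master sees a worker's lifecycle events.
function watchWorker(worker, log) {
  worker.on('disconnect', () => log.push(worker.id + ':disconnect'));
  worker.on('exit', () => log.push(worker.id + ':exit'));
}

// Forks `workerCount` workers, kills each one as soon as it is listening,
// and calls `done(log)` once all of them have exited.
function runMaster(workerCount, done) {
  const log = [];
  let alive = workerCount;
  for (let i = 0; i < workerCount; i++) {
    const worker = cluster.fork();
    watchWorker(worker, log);
    worker.on('listening', () => worker.process.kill('SIGKILL'));
    worker.on('exit', () => {
      if (--alive === 0) done(log);
    });
  }
}

// Each worker shares a listening socket through the master.
function runWorker() {
  http.createServer((req, res) => res.end('ok')).listen(0);
}

// Only runs when explicitly enabled, e.g. CLUSTER_ORDER_DEMO=1 node script.js
if (process.env.CLUSTER_ORDER_DEMO === '1') {
  if (cluster.isWorker) runWorker();
  else runMaster(4, (log) => console.log(log.join('\n')));
}
```

If an `<id>:exit` entry ever shows up before the matching `<id>:disconnect` entry, that would support the ordering theory from the original report.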

@nullivex

nullivex commented Oct 7, 2015

I can confirm this error still happens for me when spinning down my cluster.

Stopping mock cluster...

assert.js:89
  throw new assert.AssertionError({
  ^
AssertionError: Resource leak detected.
    at removeWorker (cluster.js:328:9)
    at ChildProcess.<anonymous> (cluster.js:348:34)
    at ChildProcess.g (events.js:260:16)
    at emitTwo (events.js:87:13)
    at ChildProcess.emit (events.js:172:7)
    at Process.ChildProcess._handle.onexit (internal/child_process.js:200:12)
      2) "after all" hook

I am going to try implementing this workaround in https://www.npmjs.com/package/infant and see if that works.

@nullivex

nullivex commented Oct 7, 2015

I implemented this fix in Infant 0.10.0. It seems to be working for me; I have yet to see this error appear again.

I will keep an eye on this issue and if this gets resolved upstream I will update accordingly.

@Smbc1

Smbc1 commented Oct 8, 2015

The workaround works, but in production it causes the cluster to get stuck.

@jshkurti
Author

It looks like nodejs/node#3510 fixes the issue.

@Splaktar

Great! Looking forward to it getting into a release.


9 participants