Cluster workers not sharing ports after reopening a listener #6693

Closed
@gpkeene

Description

  • Version: 6.0.0
  • Platform: Windows 7 64-bit

Based on the responses to this question, I'm trying to figure out why multiple workers can call server.listen() on the same port/address without any issues, while an existing worker that calls server.close() followed by server.listen() on the same port repeatedly gets the error EADDRINUSE.

It does not seem to be a case of the listener not closing correctly, as a close event is emitted, which is when I attempt to set up the new listener. While this worker is getting EADDRINUSE, newly spawned workers are able to call server.listen() with no issues.

Here is a simple test that will demonstrate the problem. As workers are forked every 100ms, they will establish a listener on port 16000. When worker 10 is forked, it will establish a timeout to tear down its listener after 1s. Once a close event is emitted, it will attempt to call server.listen() on port 16000 again and get the EADDRINUSE error. For consistency, this test explicitly provides the same address during binding to avoid any potential issues with core modules dealing with a null address.

This particular implementation will cause worker 10 to then take up all cycles once it hits the error during binding, thereby keeping the master process from forking new workers. If a delay is added before calling server.listen(), worker 10 will still continue to hit EADDRINUSE while the master continually forks new workers that are capable of establishing listeners.

var cluster = require('cluster');
var net     = require('net');

if (cluster.isMaster) {
    // Master: fork a new worker every 100ms.
    setInterval(function(){cluster.fork()},100);
} else {
    var workerID = cluster.worker.id;
    var server;
    // Create a fresh server and listen on 127.0.0.1:16000.
    var setup = function() {
        console.log('Worker ' + workerID + ' setting up listener');
        server = net.createServer(function(stream) {});
        server.on('error', function(err) {
            console.log('Error on worker ' + workerID, err);
            // Retry immediately; this is what makes worker 10 spin.
            teardown();
        });
        if (workerID === 10) {
            // Only worker 10 tears its listener down, 1s after it is established.
            server.listen(16000, '127.0.0.1', function() {
                console.log('Worker ' + workerID + ' listener established');
                setTimeout(teardown, 1000);
            });
        } else {
            server.listen(16000, '127.0.0.1', function() {
                console.log('Worker ' + workerID + ' listener established');
            });
        }
    };
    // Close the listener, then set up a new one once the close callback fires.
    var teardown = function() {
        console.log('Worker ' + workerID + ' closing listener');
        server.close(setup);
    };
    setup();
}

Initial output from this test case:

Worker 1 setting up listener
Worker 1 listener established
Worker 2 setting up listener
Worker 2 listener established
Worker 3 setting up listener
Worker 3 listener established
Worker 4 setting up listener
Worker 4 listener established
Worker 5 setting up listener
Worker 5 listener established
Worker 6 setting up listener
Worker 6 listener established
Worker 7 setting up listener
Worker 7 listener established
Worker 8 setting up listener
Worker 8 listener established
Worker 9 setting up listener
Worker 9 listener established
Worker 10 setting up listener
Worker 10 listener established
Worker 11 setting up listener
Worker 11 listener established
Worker 12 setting up listener
Worker 12 listener established
Worker 13 setting up listener
Worker 13 listener established
Worker 14 setting up listener
Worker 14 listener established
Worker 15 setting up listener
Worker 15 listener established
Worker 16 setting up listener
Worker 16 listener established
Worker 17 setting up listener
Worker 17 listener established
Worker 18 setting up listener
Worker 18 listener established
Worker 19 setting up listener
Worker 19 listener established
Worker 10 closing listener
Worker 10 setting up listener
Error on worker 10 { [Error: bind EADDRINUSE 127.0.0.1:16000]
  code: 'EADDRINUSE',
  errno: 'EADDRINUSE',
  syscall: 'bind',
  address: '127.0.0.1',
  port: 16000 }
Worker 10 closing listener
Worker 10 setting up listener
Error on worker 10 { [Error: bind EADDRINUSE 127.0.0.1:16000]
  code: 'EADDRINUSE',
  errno: 'EADDRINUSE',
  syscall: 'bind',
  address: '127.0.0.1',
  port: 16000 }
Worker 10 closing listener
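
The delayed variant mentioned earlier (waiting before calling server.listen() again) could be sketched roughly as below. This is a minimal sketch, not the exact code behind the output above: `retryDelay`, `listenWithDelayedRetry`, `delayMs`, and `maxAttempts` are hypothetical names introduced for illustration, and the cap on attempts is an added assumption so that the worker cannot loop forever.

```javascript
var net = require('net');

// Hypothetical helper: decide how long to wait before the next listen attempt.
// Returns null when no retry should be scheduled.
function retryDelay(err, attempt, opts) {
    if (!err || err.code !== 'EADDRINUSE') return null; // only retry bind conflicts
    if (attempt >= opts.maxAttempts) return null;       // assumed cap on retries
    return opts.delayMs;
}

// Hypothetical wiring: on EADDRINUSE, wait opts.delayMs and then listen again
// instead of retrying immediately.
function listenWithDelayedRetry(port, host, opts) {
    var attempt = 0;
    var server = net.createServer(function(stream) {});
    server.on('error', function(err) {
        var delay = retryDelay(err, attempt, opts);
        if (delay === null) {
            console.log('Giving up', err);
            return;
        }
        attempt += 1;
        setTimeout(function() {
            server.close();            // make sure the old handle is gone
            server.listen(port, host); // retry on the same port/address
        }, delay);
    });
    server.listen(port, host);
    return server;
}
```

In a worker this would be called as something like `listenWithDelayedRetry(16000, '127.0.0.1', { delayMs: 500, maxAttempts: 5 })`. As described above, the delay only keeps the worker from taking up all cycles; the retries themselves still hit EADDRINUSE.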

(This issue has all the same information as this StackOverflow post that I posted a couple days back.)

Metadata


Labels

  • cluster: Issues and PRs related to the cluster subsystem.
  • net: Issues and PRs related to the net subsystem.
