EADDRINUSE error when trying to bind to port after it was closed #53738

Open
@OliverJAsh

Version

20.12.2

Platform

Darwin olivers-mbp.lan 23.5.0 Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000 arm64 arm Darwin

Subsystem

No response

What steps will reproduce the bug?

The following script forks two cluster workers. Each worker does the following:

  1. Start server A on port 0 (so the OS assigns a random port).
  2. Close server A.
  3. Once server A has closed, start another server (B) on the port that server A was using.

import cluster from 'node:cluster';
import express from 'express';

if (cluster.isPrimary) {
  const numCPUs = 2;

  console.log(`Master process ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  const a = express();
  const b = express();

  const port = 0;

  console.log(`[${process.pid}] [A] call listen on port`, port);
  const serverA = a.listen(port, () => {
    const randomPort = serverA.address().port;
    console.log(`[${process.pid}] [A] listening on port`, randomPort);

    serverA.close((error) => {
      console.log(`[${process.pid}] [A] close`, error);

      console.log(`[${process.pid}] [B] call listen on port`, randomPort);
      const serverB = b.listen(randomPort, () => {
        console.log(`[${process.pid}] [B] listening on port`, randomPort);
      });
      serverB.on('error', (error) => {
        console.log(`[${process.pid}] [B] error`, error);
      });
    });
  });
}

How often does it reproduce? Is there a required condition?

No response

What is the expected behavior? Why is that the expected behavior?

No error.

What do you see instead?

Sometimes, but not always, we see an EADDRINUSE error. For example:

$ node test
Master process 16437 is running
[16438] [A] call listen on port 0
[16439] [A] call listen on port 0
[16438] [A] listening on port 58256
[16438] [A] close undefined
[16438] [B] call listen on port 58256
[16439] [A] listening on port 58256
[16439] [A] close undefined
[16439] [B] call listen on port 58256
[16439] [B] listening on port 58256
[16438] [B] error Error: bind EADDRINUSE null:58256
    at listenOnPrimaryHandle (node:net:1969:18)
    at rr (node:internal/cluster/child:163:12)
    at Worker.<anonymous> (node:internal/cluster/child:113:7)
    at process.onInternalMessage (node:internal/cluster/utils:49:5)
    at process.emit (node:events:530:35)
    at emit (node:internal/child_process:951:14)
    at process.processTicksAndRejections (node:internal/process/task_queues:83:21) {
  errno: -48,
  code: 'EADDRINUSE',
  syscall: 'bind',
  address: null,
  port: 58256
}

It seems to happen more frequently when the CPU is under pressure.

This is unexpected because, as far as I understand:

  • It should be possible to bind to the same port across cluster workers.
  • Server A has already closed by the time we try to bind server B. According to the documentation, the close callback is only called once the server has closed (i.e. the port has been released?).
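For comparison, here is a minimal single-process sketch of the same three steps without `node:cluster`. It uses `node:net` instead of express to stay dependency-free, and the `listen` and `close` helpers are illustrative wrappers, not part of the original script. It shows the baseline behaviour the second bullet assumes: once the close callback has fired, the same port can normally be bound again.

```javascript
import net from 'node:net';

// Illustrative promise wrappers around the callback APIs used in the report.
const listen = (server, port) =>
  new Promise((resolve, reject) => {
    server.once('error', reject);
    server.listen(port, () => resolve(server.address().port));
  });

const close = (server) =>
  new Promise((resolve, reject) =>
    server.close((error) => (error ? reject(error) : resolve())),
  );

// Step 1: start server A on port 0 so the OS picks a free port.
const serverA = net.createServer();
const randomPort = await listen(serverA, 0);

// Step 2: close server A and wait for the close callback.
await close(serverA);

// Step 3: start server B on the port that server A was using.
const serverB = net.createServer();
const portB = await listen(serverB, randomPort);
console.log('[B] listening on port', portB);

await close(serverB);
```

If this runs without an error while the clustered script still fails intermittently, that would be consistent with the problem lying in how the port is shared between workers rather than in `close` itself.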

Additional information

I have been unable to reproduce the problem with a single cluster worker, which suggests that it only occurs when there is contention between cluster workers.
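If the failure really is a transient race between workers, one possible stopgap is to retry the second `listen` when it fails with EADDRINUSE. The sketch below is only an assumption about what might help, not a fix for the underlying issue: `listenWithRetry`, `makeServer`, `retries`, and `delayMs` are hypothetical names, and I have not verified that retrying succeeds in the clustered case.

```javascript
import net from 'node:net';
import { setTimeout as delay } from 'node:timers/promises';

// Hypothetical helper: try to listen on `port`, retrying only on EADDRINUSE.
// `makeServer` is called once per attempt so that each try uses a fresh server.
async function listenWithRetry(makeServer, port, { retries = 5, delayMs = 50 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const server = makeServer();
    try {
      await new Promise((resolve, reject) => {
        server.once('error', reject);
        server.listen(port, () => {
          server.off('error', reject);
          resolve();
        });
      });
      return server;
    } catch (error) {
      // Give up on any other error, or once the retry budget is spent.
      if (error.code !== 'EADDRINUSE' || attempt >= retries) throw error;
      await delay(delayMs);
    }
  }
}

// Demo: a "blocker" server holds the port briefly, then releases it.
const blocker = net.createServer();
await new Promise((resolve) => blocker.listen(0, resolve));
const busyPort = blocker.address().port;
setTimeout(() => blocker.close(), 100);

const retried = await listenWithRetry(() => net.createServer(), busyPort, {
  retries: 40,
  delayMs: 25,
});
const retriedPort = retried.address().port;
console.log('listening after retry on port', retriedPort);
retried.close();
```

In the script from this report, `makeServer` could be something like `() => http.createServer(b)` (with `http` imported from `node:http`) in place of `b.listen(...)`.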

Labels

cluster: Issues and PRs related to the cluster subsystem.
