
feat: add cluster management #39

Open
phvalguima wants to merge 1 commit into main from add-cluster-management

Conversation

@phvalguima

This PR allows running several replicas of the server in parallel.

@github-actions

⚠️ Deprecation Warning: The deny-licenses option is deprecated for possible removal in the next major release. For more information, see issue 997.

Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

Snapshot Warnings

⚠️ No snapshots were found for the head SHA 79285ca.
Ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.

Scanned Files

None


Copilot AI left a comment


Pull request overview

Adds infrastructure intended to support running multiple server replicas (clustered workers) by improving upstream connection reuse, server listen configuration, and introducing a cluster startup entrypoint for local/Docker runs.

Changes:

  • Reuse a single shared undici Agent per worker to avoid per-request connection pools.
  • Make server startup more cluster-/ops-friendly (numeric port parsing, configurable TCP backlog, bind address, and graceful shutdown via server.close()).
  • Add start:cluster and switch Docker CMD to run dist/cluster.js.
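The cluster entrypoint itself (dist/cluster.js) is not shown in this diff; a minimal sketch of what such an entrypoint typically looks like with node:cluster follows. All names here (startCluster, CLUSTER_WORKERS) are assumptions, not taken from the PR.

```typescript
import cluster from "node:cluster";
import os from "node:os";

// Hypothetical sketch: fork one worker per CPU (or CLUSTER_WORKERS), and
// respawn workers that exit so a single crash doesn't shrink capacity.
function startCluster(workerEntrypoint: () => void): void {
  if (cluster.isPrimary) {
    const workers = Number(process.env.CLUSTER_WORKERS || os.cpus().length);
    for (let i = 0; i < workers; i++) cluster.fork();
    cluster.on("exit", (worker, code) => {
      console.log(`worker ${worker.process.pid} exited (code ${code}); restarting`);
      cluster.fork();
    });
  } else {
    workerEntrypoint(); // each worker runs the regular HTTP server entrypoint
  }
}
```

Because each worker is a separate process, per-process state such as the shared undici Agent below is duplicated once per worker, which is why its pool size matters for total upstream connections.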

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.

Summary per file:

src/routes/responses/handleOneTurn.ts: Introduces a shared undici dispatcher and uses it for OpenAI requests.
src/index.ts: Updates listen configuration (port parsing, backlog, bind host) and changes shutdown to server.close().
package.json: Adds start:cluster script targeting dist/cluster.js.
eslint.config.js: Adds additional globals to satisfy linting for newly introduced runtime symbols/types.
Dockerfile: Changes container entrypoint from dist/index.js to dist/cluster.js.


Comment on lines +15 to +23

  const port = parseInt(String(process.env.PORT || 3000), 10);
  const highWaterMark = parseInt(process.env.STREAM_HIGH_WATER_MARK || "65536", 10);
+ const backlog = parseInt(process.env.TCP_BACKLOG || "5000", 10);

  // Start server with configurable highWaterMark for SSE streaming backpressure
- createServer({ highWaterMark }, app).listen(port, () => {
-   logger.info({ port, highWaterMark }, "Server started");
+ // and configurable TCP backlog (SO_MAXCONN / listen queue depth)
+ const server = createServer({ highWaterMark }, app);
+ server.listen(port, "0.0.0.0", backlog, () => {
+   logger.info({ port, highWaterMark, backlog, pid: process.pid }, "Server started");

Copilot AI Apr 15, 2026


port/highWaterMark/backlog are parsed with parseInt but not validated. If PORT or TCP_BACKLOG are set to a non-numeric value, parseInt returns NaN and server.listen(...) will throw at startup. Consider validating (Number.isFinite / integer range) and either falling back to defaults or exiting with a clear configuration error message.
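A minimal sketch of the suggested validation (the helper name and error wording are illustrative, not from the PR):

```typescript
// Illustrative helper, not from the PR: parse an env var as a non-negative
// integer, exiting with a clear configuration error instead of letting
// server.listen(...) throw on NaN later.
function parsePortLikeEnv(name: string, fallback: number, raw = process.env[name]): number {
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    console.error(`Invalid ${name}=${JSON.stringify(raw)}; expected a non-negative integer`);
    process.exit(1);
  }
  return value;
}

const port = parsePortLikeEnv("PORT", 3000);
const backlog = parsePortLikeEnv("TCP_BACKLOG", 5000);
```

Failing fast with an explicit message makes the misconfiguration obvious in container logs rather than surfacing as an opaque listen() error.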

Copilot uses AI. Check for mistakes.
phvalguima (Author)


True, but that happens only once, at startup, and will trigger a crash-loop restart of the container.

Comment on lines +21 to +23

  const server = createServer({ highWaterMark }, app);
  server.listen(port, "0.0.0.0", backlog, () => {
    logger.info({ port, highWaterMark, backlog, pid: process.pid }, "Server started");

Copilot AI Apr 15, 2026


Binding explicitly to "0.0.0.0" forces IPv4-only listening; on IPv6-only hosts this can prevent the server from accepting connections. Consider making the bind address configurable (e.g., HOST env var) or omitting the host argument to let Node bind to the system default.
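One way to sketch the suggestion (HOST is an assumed env var name, not from the PR; port 0 picks an ephemeral port purely for demonstration):

```typescript
import http from "node:http";

const server = http.createServer((_req, res) => res.end("ok"));
const host = process.env.HOST; // assumed env var, e.g. "::" or "0.0.0.0"; unset -> Node default
const backlog = 511;

function onListening(): void {
  console.log("listening on", server.address());
  server.close(); // demo only: shut straight back down
}

// Omitting the host argument lets Node bind the system default
// (dual-stack IPv6 + IPv4 where the OS supports it).
if (host) server.listen(0, host, backlog, onListening);
else server.listen(0, backlog, onListening);
```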

Comment on lines 28 to 36

  process.on("SIGINT", () => {
-   logger.info("Server shutting down (SIGINT)");
-   process.exit(0);
+   logger.info({ pid: process.pid }, "Server shutting down (SIGINT)");
+   server.close(() => process.exit(0));
  });

  process.on("SIGTERM", () => {
-   logger.info("Server shutting down (SIGTERM)");
-   process.exit(0);
+   logger.info({ pid: process.pid }, "Server shutting down (SIGTERM)");
+   server.close(() => process.exit(0));
  });

Copilot AI Apr 15, 2026


server.close() waits for existing connections to finish; with long-lived SSE streams this callback may never run, causing SIGTERM/SIGINT shutdown to hang. Consider adding a shutdown timeout and using server.closeIdleConnections()/server.closeAllConnections() (Node >=20) or tracking sockets to force-close after the grace period.
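A bounded-grace-period shutdown along those lines could look like this sketch (SHUTDOWN_GRACE_MS and the logger stub are assumptions for illustration; closeIdleConnections/closeAllConnections are available on http.Server since Node 18.2):

```typescript
import http from "node:http";

// Stub logger so the sketch is self-contained; the real code uses its own logger.
const logger = { info: (ctx: object, msg: string) => console.log(msg, ctx) };
const server = http.createServer((_req, res) => res.end("ok"));
const GRACE_MS = Number(process.env.SHUTDOWN_GRACE_MS || 10_000); // assumed env var

function shutdown(signal: NodeJS.Signals): void {
  logger.info({ pid: process.pid, signal }, "Server shutting down");
  server.close(() => process.exit(0)); // stop accepting; exit once sockets drain
  server.closeIdleConnections();       // drop keep-alive sockets with no active request
  setTimeout(() => {
    server.closeAllConnections();      // force-close long-lived SSE streams
    process.exit(0);
  }, GRACE_MS).unref();                // unref so the timer never keeps the process alive
}

process.on("SIGINT", shutdown);
process.on("SIGTERM", shutdown);
```

Without the forced close, a single open SSE stream keeps server.close()'s callback from ever firing, so Kubernetes would eventually SIGKILL the pod instead of it exiting cleanly.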

Comment on lines +26 to +33

  // Shared undici Agent per worker process — avoids creating a new connection pool per request.
  // Configurable via UPSTREAM_MAX_CONNECTIONS (connections per origin) and UPSTREAM_KEEP_ALIVE_TIMEOUT_MS.
  const sharedDispatcher = new Agent({
    allowH2: true,
    connections: parseInt(process.env.UPSTREAM_MAX_CONNECTIONS || "128", 10),
    pipelining: 1,
    keepAliveTimeout: parseInt(process.env.UPSTREAM_KEEP_ALIVE_TIMEOUT_MS || "30000", 10),
    connectTimeout: parseInt(process.env.UPSTREAM_CONNECT_TIMEOUT_MS || "30000", 10),

Copilot AI Apr 15, 2026


The inline comment says the Agent is configurable via UPSTREAM_MAX_CONNECTIONS and UPSTREAM_KEEP_ALIVE_TIMEOUT_MS, but the code also reads UPSTREAM_CONNECT_TIMEOUT_MS. Update the comment to include the connect-timeout env var (or drop the env-var list) so the documentation matches behavior.

Comment on lines +26 to +33

Copilot AI Apr 15, 2026


The Agent options are derived via parseInt(...) without validating the result. If any of these env vars are set to a non-numeric value, parseInt will return NaN and undici may throw or behave unexpectedly at runtime. Consider coercing with Number(...) and falling back to defaults (or throwing a clear startup error) when the parsed value is not a finite integer.

Suggested change
- // Shared undici Agent per worker process — avoids creating a new connection pool per request.
- // Configurable via UPSTREAM_MAX_CONNECTIONS (connections per origin) and UPSTREAM_KEEP_ALIVE_TIMEOUT_MS.
- const sharedDispatcher = new Agent({
-   allowH2: true,
-   connections: parseInt(process.env.UPSTREAM_MAX_CONNECTIONS || "128", 10),
-   pipelining: 1,
-   keepAliveTimeout: parseInt(process.env.UPSTREAM_KEEP_ALIVE_TIMEOUT_MS || "30000", 10),
-   connectTimeout: parseInt(process.env.UPSTREAM_CONNECT_TIMEOUT_MS || "30000", 10),
+ function getFiniteIntegerEnv(name: string, fallback: number): number {
+   const rawValue = process.env[name];
+   if (rawValue === undefined) {
+     return fallback;
+   }
+   const parsedValue = Number(rawValue);
+   return Number.isFinite(parsedValue) && Number.isInteger(parsedValue) ? parsedValue : fallback;
+ }
+
+ // Shared undici Agent per worker process — avoids creating a new connection pool per request.
+ // Configurable via UPSTREAM_MAX_CONNECTIONS (connections per origin),
+ // UPSTREAM_KEEP_ALIVE_TIMEOUT_MS, and UPSTREAM_CONNECT_TIMEOUT_MS.
+ const sharedDispatcher = new Agent({
+   allowH2: true,
+   connections: getFiniteIntegerEnv("UPSTREAM_MAX_CONNECTIONS", 128),
+   pipelining: 1,
+   keepAliveTimeout: getFiniteIntegerEnv("UPSTREAM_KEEP_ALIVE_TIMEOUT_MS", 30000),
+   connectTimeout: getFiniteIntegerEnv("UPSTREAM_CONNECT_TIMEOUT_MS", 30000),
