Dependency Review
✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.
Snapshot Warnings: ensure that dependencies are being submitted on PR branches and consider enabling retry-on-snapshot-warnings. See the documentation for more information and troubleshooting advice.
Scanned Files: none
Pull request overview
Adds infrastructure intended to support running multiple server replicas (clustered workers) by improving upstream connection reuse and server listen configuration, and by introducing a cluster startup entrypoint for local/Docker runs.
Changes:
- Reuse a single shared undici `Agent` per worker to avoid per-request connection pools.
- Make server startup more cluster-/ops-friendly (numeric port parsing, configurable TCP backlog, bind address, and graceful shutdown via `server.close()`).
- Add `start:cluster` and switch the Docker CMD to run `dist/cluster.js`.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
Summary per file:
| File | Description |
|---|---|
| `src/routes/responses/handleOneTurn.ts` | Introduces a shared undici dispatcher and uses it for OpenAI requests. |
| `src/index.ts` | Updates listen configuration (port parsing, backlog, bind host) and changes shutdown to `server.close()`. |
| `package.json` | Adds a `start:cluster` script targeting `dist/cluster.js`. |
| `eslint.config.js` | Adds additional globals to satisfy linting for newly introduced runtime symbols/types. |
| `Dockerfile` | Changes the container entrypoint from `dist/index.js` to `dist/cluster.js`. |
```diff
 const port = parseInt(String(process.env.PORT || 3000), 10);
 const highWaterMark = parseInt(process.env.STREAM_HIGH_WATER_MARK || "65536", 10);
+const backlog = parseInt(process.env.TCP_BACKLOG || "5000", 10);

 // Start server with configurable highWaterMark for SSE streaming backpressure
-createServer({ highWaterMark }, app).listen(port, () => {
-  logger.info({ port, highWaterMark }, "Server started");
+// and configurable TCP backlog (SO_MAXCONN / listen queue depth)
+const server = createServer({ highWaterMark }, app);
+server.listen(port, "0.0.0.0", backlog, () => {
+  logger.info({ port, highWaterMark, backlog, pid: process.pid }, "Server started");
```
port/highWaterMark/backlog are parsed with parseInt but not validated. If PORT or TCP_BACKLOG are set to a non-numeric value, parseInt returns NaN and server.listen(...) will throw at startup. Consider validating (Number.isFinite / integer range) and either falling back to defaults or exiting with a clear configuration error message.
True, but that happens only once, at startup, and will trigger a crash-loop restart of the container.
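If the project preferred to fail with a clearer message instead of relying on the crash-loop, the validation could look roughly like this sketch; the helper name, fallback values, and exit-on-invalid behavior are illustrative, not part of the PR:

```ts
// Hypothetical helper: parse a numeric env var, falling back or exiting with a clear
// configuration error instead of letting a NaN reach server.listen().
function requireIntEnv(name: string, fallback: number): number {
  const raw = process.env[name];
  if (raw === undefined || raw === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    console.error(`Invalid value for ${name}: "${raw}" (expected a non-negative integer)`);
    process.exit(1);
  }
  return value;
}

const port = requireIntEnv("PORT", 3000);
const backlog = requireIntEnv("TCP_BACKLOG", 5000);
```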
```ts
const server = createServer({ highWaterMark }, app);
server.listen(port, "0.0.0.0", backlog, () => {
  logger.info({ port, highWaterMark, backlog, pid: process.pid }, "Server started");
```
Binding explicitly to "0.0.0.0" forces IPv4-only listening; on IPv6-only hosts this can prevent the server from accepting connections. Consider making the bind address configurable (e.g., HOST env var) or omitting the host argument to let Node bind to the system default.
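A minimal sketch of the configurable bind address this suggests; the HOST variable name and the options-object form of listen are assumptions here, not the PR's code:

```ts
// Hypothetical: let HOST override the bind address; leaving it unset lets Node
// listen on the system default (IPv6 included where available).
const host = process.env.HOST; // e.g. "0.0.0.0", "::", or unset

server.listen({ port, backlog, ...(host ? { host } : {}) }, () => {
  logger.info({ port, host: host ?? "default", backlog, pid: process.pid }, "Server started");
});
```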
| process.on("SIGINT", () => { | ||
| logger.info("Server shutting down (SIGINT)"); | ||
| process.exit(0); | ||
| logger.info({ pid: process.pid }, "Server shutting down (SIGINT)"); | ||
| server.close(() => process.exit(0)); | ||
| }); | ||
|
|
||
| process.on("SIGTERM", () => { | ||
| logger.info("Server shutting down (SIGTERM)"); | ||
| process.exit(0); | ||
| logger.info({ pid: process.pid }, "Server shutting down (SIGTERM)"); | ||
| server.close(() => process.exit(0)); | ||
| }); |
server.close() waits for existing connections to finish; with long-lived SSE streams this callback may never run, causing SIGTERM/SIGINT shutdown to hang. Consider adding a shutdown timeout and using server.closeIdleConnections()/server.closeAllConnections() (Node >=20) or tracking sockets to force-close after the grace period.
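One possible shape for that, sketched under the assumption of a 10-second grace period and Node's http.Server closeIdleConnections/closeAllConnections APIs:

```ts
// Hypothetical timed shutdown: stop accepting new connections, give in-flight
// requests a grace period, then force-close whatever is left (e.g. open SSE streams).
function shutdown(signal: string) {
  logger.info({ pid: process.pid, signal }, "Server shutting down");
  server.close(() => process.exit(0)); // fires once every connection has ended
  server.closeIdleConnections();       // drop keep-alive sockets with no request in flight

  setTimeout(() => {
    server.closeAllConnections();      // force-close long-lived connections after the grace period
    process.exit(0);
  }, 10_000).unref();                  // unref so the timer never keeps the process alive on its own
}

process.on("SIGINT", () => shutdown("SIGINT"));
process.on("SIGTERM", () => shutdown("SIGTERM"));
```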
```ts
// Shared undici Agent per worker process — avoids creating a new connection pool per request.
// Configurable via UPSTREAM_MAX_CONNECTIONS (connections per origin) and UPSTREAM_KEEP_ALIVE_TIMEOUT_MS.
const sharedDispatcher = new Agent({
  allowH2: true,
  connections: parseInt(process.env.UPSTREAM_MAX_CONNECTIONS || "128", 10),
  pipelining: 1,
  keepAliveTimeout: parseInt(process.env.UPSTREAM_KEEP_ALIVE_TIMEOUT_MS || "30000", 10),
  connectTimeout: parseInt(process.env.UPSTREAM_CONNECT_TIMEOUT_MS || "30000", 10),
```
The inline comment says the Agent is configurable via UPSTREAM_MAX_CONNECTIONS and UPSTREAM_KEEP_ALIVE_TIMEOUT_MS, but the code also reads UPSTREAM_CONNECT_TIMEOUT_MS. Update the comment to include the connect-timeout env var (or drop the env-var list) so the documentation matches behavior.
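For context, routing a request through the shared dispatcher with undici typically looks like the sketch below; the endpoint, headers, and body are placeholders rather than the PR's actual OpenAI call in handleOneTurn.ts:

```ts
import { Agent, request } from "undici";

const sharedDispatcher = new Agent({ allowH2: true, connections: 128, pipelining: 1 });

// Every call that passes the same dispatcher reuses its per-origin connection pool
// instead of opening a fresh pool per request.
const response = await request("https://api.openai.com/v1/responses", {
  method: "POST",
  dispatcher: sharedDispatcher,
  headers: {
    authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "content-type": "application/json",
  },
  body: JSON.stringify({ model: "gpt-4o-mini", input: "ping" }),
});
const data = await response.body.json();
```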
```ts
// Shared undici Agent per worker process — avoids creating a new connection pool per request.
// Configurable via UPSTREAM_MAX_CONNECTIONS (connections per origin) and UPSTREAM_KEEP_ALIVE_TIMEOUT_MS.
const sharedDispatcher = new Agent({
  allowH2: true,
  connections: parseInt(process.env.UPSTREAM_MAX_CONNECTIONS || "128", 10),
  pipelining: 1,
  keepAliveTimeout: parseInt(process.env.UPSTREAM_KEEP_ALIVE_TIMEOUT_MS || "30000", 10),
  connectTimeout: parseInt(process.env.UPSTREAM_CONNECT_TIMEOUT_MS || "30000", 10),
```
The Agent options are derived via parseInt(...) without validating the result. If any of these env vars are set to a non-numeric value, parseInt will return NaN and undici may throw or behave unexpectedly at runtime. Consider coercing with Number(...) and falling back to defaults (or throwing a clear startup error) when the parsed value is not a finite integer.
Suggested change:

```diff
+function getFiniteIntegerEnv(name: string, fallback: number): number {
+  const rawValue = process.env[name];
+  if (rawValue === undefined) {
+    return fallback;
+  }
+  const parsedValue = Number(rawValue);
+  return Number.isFinite(parsedValue) && Number.isInteger(parsedValue) ? parsedValue : fallback;
+}
+
 // Shared undici Agent per worker process — avoids creating a new connection pool per request.
 // Configurable via UPSTREAM_MAX_CONNECTIONS (connections per origin) and UPSTREAM_KEEP_ALIVE_TIMEOUT_MS.
 const sharedDispatcher = new Agent({
   allowH2: true,
-  connections: parseInt(process.env.UPSTREAM_MAX_CONNECTIONS || "128", 10),
+  connections: getFiniteIntegerEnv("UPSTREAM_MAX_CONNECTIONS", 128),
   pipelining: 1,
-  keepAliveTimeout: parseInt(process.env.UPSTREAM_KEEP_ALIVE_TIMEOUT_MS || "30000", 10),
-  connectTimeout: parseInt(process.env.UPSTREAM_CONNECT_TIMEOUT_MS || "30000", 10),
+  keepAliveTimeout: getFiniteIntegerEnv("UPSTREAM_KEEP_ALIVE_TIMEOUT_MS", 30000),
+  connectTimeout: getFiniteIntegerEnv("UPSTREAM_CONNECT_TIMEOUT_MS", 30000),
```
This PR allows running several replicas of the server in parallel.
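The new dist/cluster.js entrypoint itself is not shown in this diff; a cluster wrapper of this shape usually looks roughly like the following sketch, where the CLUSTER_WORKERS variable name and the respawn-on-exit policy are assumptions:

```ts
// cluster.ts: hypothetical sketch of a clustered entrypoint, not the PR's actual file.
import cluster from "node:cluster";
import os from "node:os";

const workerCount = parseInt(process.env.CLUSTER_WORKERS || "", 10) || os.availableParallelism();

if (cluster.isPrimary) {
  for (let i = 0; i < workerCount; i++) {
    cluster.fork();
  }
  // Replace workers that die unexpectedly so the replica count stays constant.
  cluster.on("exit", (worker, code, signal) => {
    console.warn(`worker ${worker.process.pid} exited (${signal ?? code}); forking a replacement`);
    cluster.fork();
  });
} else {
  // Each worker runs the regular server; the primary distributes incoming connections.
  await import("./index.js");
}
```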