mcp-stdio-guard

Catch stdout pollution and handshake failures in MCP stdio servers before clients do.

mcp-stdio-guard hero showing a clean MCP stdio pipeline

MCP stdio servers use stdout as their protocol channel. Debug text, banners, progress logs, console.log, Python print, or any other stray stdout output can corrupt the stream and make clients fail in confusing ways.

mcp-stdio-guard starts your server and performs a real MCP initialize handshake. It then probes the list methods for any advertised tools, resources, and prompts capabilities, optionally sends a real post-initialize MCP request such as tools/list, validates every stdout frame, and checks returned tool metadata. With --scan, it also scans source for risky stdout calls.

Why This Exists

The MCP docs say stdio servers must send only JSON-RPC messages on stdout, may log to stderr, and must complete the initialize then notifications/initialized lifecycle before normal operation.

That is easy to get wrong in real servers. This guard turns that fragile process boundary into a fast local check and a CI gate.
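To make the stdout rule concrete, here is a minimal sketch of the kind of per-line check a guard could apply. It is not the tool's actual implementation: the function name classifyStdoutLine is hypothetical, and the returned labels simply borrow issue codes listed in the JSON Contract section.

```javascript
// Hypothetical helper: classify one stdout line from an MCP stdio server.
// Labels mirror issue codes from this README; the logic is an illustration only.
function classifyStdoutLine(line) {
  if (line.trim() === "") return "stdout-empty-line";
  let frame;
  try {
    frame = JSON.parse(line);
  } catch {
    return "stdout-non-json"; // e.g. a startup banner
  }
  const isObject = frame !== null && typeof frame === "object" && !Array.isArray(frame);
  if (!isObject || frame.jsonrpc !== "2.0") return "stdout-invalid-json-rpc";
  return "ok";
}

console.log(classifyStdoutLine('{"jsonrpc":"2.0","id":1,"result":{}}')); // ok
console.log(classifyStdoutLine("server starting..."));                   // stdout-non-json
```

A single banner line is enough to produce a non-JSON frame, which is why the check has to cover every line rather than only the first response.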

Protocol flow tested by mcp-stdio-guard

Install

From npm:

npx mcp-stdio-guard -- node ./server.js

From this repo:

git clone https://github.com/1Utkarsh1/mcp-stdio-guard.git
cd mcp-stdio-guard
npm ci
npm test

Quickstart

Run your MCP server behind the guard:

mcp-stdio-guard -- node ./server.js

Use a deterministic profile for common workflows:

mcp-stdio-guard --profile registry --json -- node ./server.js

Use a config file for registry runs that need environment names, request lists, or explicitly safe tool calls:

mcp-stdio-guard --config mcp-stdio-guard.config.json

Exercise a real MCP operation after initialization:

mcp-stdio-guard --request tools/list -- node ./server.js

Scan source for obvious stdout writes too. Findings are warnings unless --fail-on-static is set:

mcp-stdio-guard --scan src --fail-on-static --request tools/list -- node ./server.js

JSON output for CI:

mcp-stdio-guard --json --request tools/list -- node ./server.js

Repeat the same guard to catch cold/warm startup behavior:

mcp-stdio-guard --repeat 2 --request tools/list -- node ./server.js

What It Catches

Passing and failing terminal output examples

| Problem | Runtime check | Static scan |
| --- | --- | --- |
| console.log("starting") before server startup | Yes | Yes |
| Dependency/import-time stdout pollution | Yes with --repeat | No |
| Python print("debug") in a stdio server | Yes | Yes |
| Late stdout logs after initialize | Yes | Partial |
| Invalid JSON-RPC frames | Yes | No |
| Server crash after notifications/initialized | Yes | No |
| Missing initialize or operation response | Yes | No |
| Duplicate tool names or invalid inputSchema.required | Yes with --request tools/list | No |
| Cold/warm protocol, capability, or tool-list drift | Warning with --repeat | No |
| stderr diagnostics | Allowed | Allowed |
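Since stderr diagnostics are allowed, the usual fix for most rows above is to route logging away from stdout. The following is a small sketch of that idea for a Node server; makeLogger is a hypothetical helper, not something the guard ships.

```javascript
// Hypothetical logger factory: diagnostics go to the given stream (stderr by default),
// leaving stdout free for JSON-RPC frames.
function makeLogger(stream = process.stderr) {
  return (...parts) => {
    stream.write(`[server] ${parts.join(" ")}\n`);
  };
}

const log = makeLogger();
log("starting"); // written to stderr, so the protocol stream stays clean
```

Passing the stream as a parameter keeps the helper easy to test and makes the stdout/stderr choice explicit at the call site.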

Live MCP Coverage

The test suite creates real servers with @modelcontextprotocol/sdk@1.29.0 and verifies:

| Scenario | Expected result |
| --- | --- |
| clean SDK stdio server through initialize and tools/list | Pass |
| SDK server with startup stdout pollution | Fail |
| SDK server with stderr diagnostics | Pass |
| SDK server with late stdout pollution after connection | Fail |
| hand-rolled server that ignores post-initialize requests | Fail |
| server that crashes after initialized notification | Fail |

Commands

mcp-stdio-guard [options] -- <command> [args...]

| Option | Description |
| --- | --- |
| --config <path> | read a JSON config file for registry runs and explicitly safe tool calls |
| --profile <name> | apply a deterministic guard profile: custom, smoke, registry, ci, or strict |
| --protocol <version> | MCP protocol version to send, default 2025-11-25 |
| --timeout <ms> | initialize and request timeout, default 5000 |
| --max-stdout-bytes <n> | total stdout byte limit, default 1048576 |
| --max-stdout-line-bytes <n> | single stdout line byte limit, default 262144 |
| --max-stderr-bytes <n> | retained stderr byte limit, default 1048576 |
| --repeat <count> | run the same guard multiple times to catch cold/warm startup behavior |
| --request <method> | send one MCP request after initialization, for example tools/list |
| --params <json> | JSON params for --request |
| --adversarial-probe <name> / --adversarial-probes <list> | opt into strict protocol probes: invalid-method, invalid-params, notification, malformed-json, all, or none |
| --scan <path> | scan source for risky stdout writes and visible startup-output risks |
| --fail-on-static | make static scan findings fail the command |
| --json | print machine-readable output |
| --cwd <path> | run the server command from a specific directory |
| --help | show help |

Profiles

Profiles are deterministic presets for common workflows. Existing CLI behavior remains the default custom profile, so current commands keep working unless --profile is provided.

| Profile | Behavior |
| --- | --- |
| custom | preserve explicit CLI flags and legacy defaults |
| smoke | initialize only unless --request is provided; skip advertised tools/list, resources/list, and prompts/list probes |
| registry | run advertised list probes and repeat twice by default for cold/warm consistency |
| ci | emit JSON output and make static scan findings fail when --scan is used |
| strict | combine CI-style output/static failures, registry-style repeat depth, and built-in adversarial protocol probes |

Explicit flags can still narrow or deepen a profile. For example, --profile registry --repeat 1 keeps registry capability probing but disables the repeat preset. Use --profile strict --adversarial-probes none if you want strict JSON/static behavior without adversarial inputs.
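The override rule can be pictured as a simple merge where explicit flags win over the preset. The sketch below is an illustration only: the PRESETS values are inferred from the profile table, the option names are hypothetical, and using "all" for the strict adversarial preset is an assumption.

```javascript
// Illustrative presets inferred from the profile table; the guard's real defaults may differ.
const PRESETS = {
  custom: {},
  smoke: { capabilityProbes: false },
  registry: { capabilityProbes: true, repeat: 2 },
  ci: { json: true, failOnStatic: true },
  // "all" is an assumption standing in for "built-in adversarial protocol probes".
  strict: { json: true, failOnStatic: true, repeat: 2, adversarialProbes: "all" },
};

// Explicit flags win over the preset, mirroring "--profile registry --repeat 1".
function resolveOptions(profile, flags = {}) {
  if (!Object.hasOwn(PRESETS, profile)) throw new Error(`unknown profile: ${profile}`);
  return { ...PRESETS[profile], ...flags };
}

console.log(resolveOptions("registry", { repeat: 1 }));
// { capabilityProbes: true, repeat: 1 }
```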

Config Files

Config files let registries run repeatable checks without hiding what was executed. The file is JSON, and CLI flags still override matching config defaults. Parsing happens before the server process starts, so invalid config does not launch the target command.

Supported fields:

| Field | Meaning |
| --- | --- |
| command | command as either ["node", "./server.js"] or "node" with args |
| args | arguments used only when command is a string |
| cwd | working directory, resolved relative to the config file |
| env | environment variables to pass; values are redacted in JSON output |
| profile, protocol, timeoutMs, repeat, json | same meaning as CLI options |
| maxStdoutBytes, maxStdoutLineBytes, maxStderrBytes | byte limits for untrusted child output |
| scan or scanPath | source scan path, resolved relative to the config file |
| failOnStatic | make static scan findings fail |
| request | one explicit post-initialize request: { "method": "tools/list" } |
| requests | list of explicit post-initialize requests |
| safeToolCalls | opt-in tools/call recipes; no tool is called unless listed here or explicitly requested |
| adversarialProbes | opt-in built-in probes as true, "all", "none", or a list of probe names |
| adversarialToolCalls | opt-in invalid-argument tools/call probes for configured safe tools |

Example:

{
  "profile": "registry",
  "command": ["node", "./server.js"],
  "cwd": ".",
  "json": true,
  "env": {
    "API_TOKEN": "set-in-runner"
  },
  "requests": [
    { "method": "tools/list" }
  ],
  "adversarialProbes": ["invalid-method", "notification"],
  "safeToolCalls": [
    { "name": "echo", "arguments": { "text": "hello" } }
  ],
  "adversarialToolCalls": [
    { "name": "echo", "arguments": { "unexpected": true } }
  ]
}

The guard does not discover and call arbitrary tools from tools/list. Tool execution only happens through an explicit safeToolCalls entry or an explicit tools/call request you provide.
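Because config parsing happens before the server starts, a malformed command can be rejected up front. The following sketch shows one way the two documented command shapes could be normalized into a single argv array; normalizeCommand is a hypothetical name and the error messages are made up for illustration.

```javascript
// Hypothetical normalizer for the `command` / `args` config fields.
function normalizeCommand(config) {
  const { command, args = [] } = config;
  if (Array.isArray(command)) {
    if (command.length === 0) throw new Error("command array must not be empty");
    return command.map(String); // args is ignored when command is already an array
  }
  if (typeof command === "string" && command !== "") {
    return [command, ...args.map(String)];
  }
  throw new Error("command must be a non-empty string or array");
}

console.log(normalizeCommand({ command: "node", args: ["./server.js"] }));
// [ 'node', './server.js' ]
```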

Adversarial probes are off by default because they intentionally send unusual inputs. Built-in probes check that unknown methods return structured errors, invalid params return structured errors, notifications do not receive responses, and malformed JSON does not crash the process. adversarialToolCalls is separate because it calls a named tool with intentionally invalid arguments; only use it for tools you control and consider safe/idempotent.

JSON Contract

--json is intended for CI, registries, and badge ingestion. The current contract is schemaVersion: 1; new fields may be added, but these fields are stable for consumers:

| Field | Meaning |
| --- | --- |
| schemaVersion | JSON contract version, currently 1 |
| ok | true when no error-severity issue was found |
| config | config file metadata and checks used, or { "enabled": false, ... } |
| profile | selected guard profile, for example custom, smoke, registry, ci, or strict |
| command | command and arguments that were validated |
| protocol | MCP protocol version sent by the guard |
| negotiatedProtocol | protocol version returned by the server, when available |
| initialized | whether the server completed the initialize handshake |
| operation | post-initialize request result, or null when --request was not used |
| operations | all explicit post-initialize requests, including config requests and safe tool calls |
| adversarial | opt-in adversarial probe results, including status, risk text, and per-probe issue codes |
| toolSchema | summary of tools/list metadata validation when that operation was requested or probed from an advertised tools capability |
| capabilityProbes | whether advertised capability list probes were enabled for this run |
| capabilityKeys | sorted capability keys returned by initialize for a single run; repeat mode exposes this inside each runs entry |
| capabilityChecks | advertised capability probes observed during a single run; repeat mode exposes this inside each runs entry |
| drift | repeat-run comparison summary for negotiated protocol, advertised capabilities, tool names/counts, and resource/prompt list counts |
| process | startup, timeout, exit code, signal, and guard-termination metadata for a single run; repeat mode exposes this inside each runs entry |
| checks | badge-friendly per-class statuses |
| issueClasses | registry-friendly summary grouped by installRuntime, stdioTransport, and mcpProtocol |
| summary | badge-friendly aggregate status, primary issue, issue counts, and display guidance |
| fingerprint | redacted reproducibility metadata for debugging registry and CI runs |
| issues | machine-readable diagnostics with class, severity, code, and message; repeat mode also adds run |
| staticScan | whether source scanning was enabled and whether findings fail the command |
| staticFindings | source scan findings with language, file, line, reason, and message |
| runs | per-run results when --repeat is used |

process.output records observed stdout/stderr byte counts, configured output limits, whether retained stderr was truncated, and the issue code that stopped the run when a limit was exceeded.

summary is the preferred badge and listing entry point. It is additive to ok, checks, and issueClasses; consumers that already read those fields can keep doing so.

| Summary field | Shape |
| --- | --- |
| summary.status | pass, warning, needs_inspection, or fail |
| summary.issueClass | "", installRuntime, stdioTransport, mcpProtocol, or mixed |
| summary.primaryIssueCode | first display-worthy issue code, or "" when none |
| summary.primaryIssueClass | class for the primary issue, or "" when none |
| summary.issueCounts | { "error": number, "warning": number, "total": number } |
| summary.badge | { "label": "mcp stdio", "message": string, "color": string } |
| summary.display | short human display guidance for registry listings |

Registry summary display guidance:

| Status | Meaning | Suggested display |
| --- | --- | --- |
| pass | no issues were found | verified |
| warning | warning-only result, such as static stdout risk or metadata quality advisory | verified with warnings |
| needs_inspection | error-severity issue is limited to install/runtime health | runtime or install issue; inspect before presenting as protocol failure |
| fail | stdio transport or MCP protocol error was observed | stdio transport or MCP protocol failure |
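A registry consuming --json output might map summary.status to a listing label along these lines. This is a consumer-side sketch, not part of the guard: displayFor is a hypothetical name, the labels are shortened from the suggested display column, and the fallback text for unknown statuses is an assumption.

```javascript
// Sketch of a registry-side mapping from summary.status to a listing label.
const DISPLAY_BY_STATUS = new Map([
  ["pass", "verified"],
  ["warning", "verified with warnings"],
  ["needs_inspection", "runtime or install issue"],
  ["fail", "stdio transport or MCP protocol failure"],
]);

function displayFor(result) {
  const status = result?.summary?.status;
  // Assumption: anything outside the documented enum is surfaced for manual review.
  return DISPLAY_BY_STATUS.get(status) ?? "unknown status; inspect raw result";
}

console.log(displayFor({ summary: { status: "needs_inspection" } }));
// runtime or install issue
```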

Check statuses are pass, fail, warning, or skipped. The checks object separates the signal into initialize, stdout, jsonRpc, operation, capabilities, toolSchema, adversarial, process, pythonBuffering, staticScan, and repeat, each with stable status and issueCodes fields. When --repeat is used, checks.repeat also includes runs, passedRuns, and failedRuns; each entry in runs is a normal schema-versioned result for that individual guard run.

issueClasses is additive to checks. It groups issue codes by the kind of problem a registry or client should display:

| Issue class | Meaning | Display guidance |
| --- | --- | --- |
| installRuntime | the command could not start, timed out, exited, crashed, exceeded stderr limits, or hit a runtime advisory | show as "needs inspection" or "runtime/install issue"; do not present it as an MCP protocol violation |
| stdioTransport | stdout was not a clean newline-delimited JSON-RPC channel, exceeded output limits, or source scan found risky stdout writes | show as stdio hygiene failure; ask maintainers to keep diagnostics on stderr |
| mcpProtocol | the server emitted invalid JSON-RPC/MCP responses, mismatched request ids, or returned initialize/operation errors | show as MCP/JSON-RPC conformance issue |

Current issue-code mapping:

| Issue class | Issue codes |
| --- | --- |
| installRuntime | initialize-timeout, operation-missing-response, operation-timeout, python-buffered-stdio, server-crashed, server-exited, spawn-failed, stderr-output-limit-exceeded |
| stdioTransport | static-stdout-write, stdout-content-length-framing, stdout-empty-line, stdout-line-too-large, stdout-non-json, stdout-output-limit-exceeded, stdout-without-newline |
| mcpProtocol | adversarial-invalid-method-result, adversarial-invalid-params-result, adversarial-malformed-json-result, adversarial-notification-response, adversarial-probe-crash, adversarial-probe-invalid-stdout, adversarial-probe-timeout, adversarial-tool-call-result, capability-list-error, capability-list-missing-response, capability-list-timeout, capability-list-unsupported, initialize-error, initialize-invalid-capabilities, initialize-invalid-protocol-version, initialize-invalid-result, initialize-invalid-server-info, initialize-missing-capabilities, initialize-missing-protocol-version, initialize-missing-server-info, notification-response, operation-error, repeat-capability-drift, repeat-list-shape-drift, repeat-protocol-drift, repeat-tool-drift, response-id-mismatch, response-id-type-mismatch, stdout-invalid-json-rpc, stdout-unexpected-request-id, tool-description-missing, tool-input-schema-invalid, tool-input-schema-required-missing, tool-name-duplicate, tool-name-invalid, tools-list-invalid-result |

Initialize lifecycle checks are part of the MCP protocol class. Missing or invalid protocolVersion and capabilities fail the run before the guard sends notifications/initialized or any normal request. Missing or invalid serverInfo is warning-level so registries can surface incomplete metadata without confusing it with a broken transport.

JSON-RPC invariant checks distinguish wrong response ids from id type round-trip problems and fail servers that respond to notifications/initialized. JSON-RPC error frames must be structured with numeric code and string message fields.
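The difference between a wrong id and an id whose type changed in transit can be sketched as follows. Both function names are hypothetical and the logic is an approximation of the checks described above, not the guard's code.

```javascript
// Hypothetical check: compare the id the guard sent with the id the server echoed.
function checkResponseId(sentId, receivedId) {
  if (sentId === receivedId) return null; // same value and same type
  if (String(sentId) === String(receivedId)) return "response-id-type-mismatch"; // 1 vs "1"
  return "response-id-mismatch";
}

// A JSON-RPC error object needs a numeric code and a string message.
function isStructuredError(error) {
  return error !== null && typeof error === "object" &&
    typeof error.code === "number" && typeof error.message === "string";
}

console.log(checkResponseId(1, "1")); // response-id-type-mismatch
console.log(isStructuredError({ code: -32601, message: "Method not found" })); // true
```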

Tool schema checks run when tools/list receives a successful result, either from --request tools/list or from the advertised tools capability probe. Duplicate or invalid tool names, missing inputSchema, invalid schema shapes, and required entries that are absent from properties are MCP protocol failures. Missing tool descriptions are warning-level so registries can show quality guidance without marking the server broken.
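As a rough illustration of these metadata rules, the sketch below validates a tools array and returns issue codes named in this README. checkTools is a hypothetical function, and treating any non-empty string as a valid tool name is a simplifying assumption; the guard's real naming rules may be stricter.

```javascript
// Illustrative tools/list validation; returns sorted, de-duplicated issue codes.
function checkTools(tools) {
  const issues = new Set();
  const seen = new Set();
  for (const tool of tools) {
    // Assumption: a valid name is any non-empty string.
    if (typeof tool.name !== "string" || tool.name === "") issues.add("tool-name-invalid");
    else if (seen.has(tool.name)) issues.add("tool-name-duplicate");
    else seen.add(tool.name);

    if (!tool.description) issues.add("tool-description-missing"); // warning-level per the text above

    const schema = tool.inputSchema;
    if (schema === null || typeof schema !== "object" || Array.isArray(schema)) {
      issues.add("tool-input-schema-invalid");
      continue;
    }
    const properties = schema.properties ?? {};
    const required = Array.isArray(schema.required) ? schema.required : [];
    for (const key of required) {
      // Every required entry must also be declared under properties.
      if (!Object.hasOwn(properties, key)) issues.add("tool-input-schema-required-missing");
    }
  }
  return [...issues].sort();
}
```

For example, a tool whose inputSchema lists "path" under required but omits it from properties would yield tool-input-schema-required-missing.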

Capability honesty checks are additive. If initialize advertises capabilities.tools, capabilities.resources, or capabilities.prompts, the guard probes the matching tools/list, resources/list, or prompts/list method after notifications/initialized. Unadvertised capabilities are skipped, not failed. capability-list-unsupported means an advertised list method returned method-not-found; capability-list-error, capability-list-timeout, and capability-list-missing-response mean the advertised list method existed in the contract but failed at runtime.
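The mapping from advertised capabilities to probed list methods is small enough to sketch directly. probesFor is a hypothetical name; the capability keys and method names come from the paragraph above.

```javascript
// Sketch: decide which list methods to probe from an initialize result's capabilities.
const LIST_METHODS = { tools: "tools/list", resources: "resources/list", prompts: "prompts/list" };

function probesFor(capabilities) {
  const caps = capabilities ?? {};
  return Object.entries(LIST_METHODS)
    .filter(([key]) => caps[key] !== undefined && caps[key] !== null)
    .map(([, method]) => method); // unadvertised capabilities are skipped, not failed
}

console.log(probesFor({ tools: {} })); // [ 'tools/list' ]
```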

Adversarial probes are additive and opt-in. Their failures are classified as mcpProtocol, not install/runtime failures, so registries can distinguish "the package cannot start" from "the server started but mishandled strict JSON-RPC/MCP inputs." malformed-json accepts either a structured parse error or silence after a short observation window; a crash is a protocol failure. notification expects no response.

Repeat drift checks compare successful initialized runs against the first initialized run. Negotiated protocol changes, advertised capability key changes, added or removed tool names, tool count changes, and resource/prompt list count changes are warning-level repeat-* issues. Tool order is normalized before comparison, so order-only changes do not warn.
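An order-insensitive comparison of tool names can be sketched with sets. toolDrift and its return shape are hypothetical and do not reproduce the guard's drift object; the sketch also ignores duplicate names, which the tool schema checks cover separately.

```javascript
// Illustrative drift comparison: order-insensitive diff of tool names between two runs.
function toolDrift(baselineNames, runNames) {
  const baseline = new Set(baselineNames);
  const current = new Set(runNames);
  const added = [...current].filter((name) => !baseline.has(name)).sort();
  const removed = [...baseline].filter((name) => !current.has(name)).sort();
  return { changed: added.length > 0 || removed.length > 0, added, removed };
}

console.log(toolDrift(["read_file", "search"], ["search", "read_file"]).changed); // false
```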

The repeat drift object has stable status, issueCodes, baselineRun, and comparedRuns fields. Its nested negotiatedProtocol, capabilities, tools, lists.resources, and lists.prompts sections include changedRuns so registries can show exactly what changed between cold and warm starts.

Runtime issue codes remain backward-compatible. For finer registry display, runtime issues may also include a stable detailCode:

| Existing issue code | Detail codes |
| --- | --- |
| spawn-failed | spawn-failed-before-startup |
| server-exited | clean-exit-before-initialize, nonzero-exit-before-initialize, signal-exit-before-initialize |
| initialize-timeout | startup-timeout |
| operation-timeout | request-timeout |
| operation-missing-response | clean-exit-during-operation, nonzero-exit-during-operation, signal-exit-during-operation |
| server-crashed | nonzero-exit-after-initialize, signal-exit-after-initialize |

Schema version policy:

| Change type | Versioning |
| --- | --- |
| Add a field, check, issue code, detail code, or optional metadata | keep schemaVersion: 1 |
| Rename, remove, or change the type or meaning of a stable field | bump schemaVersion |
| Change the summary.status enum or the meaning of an existing status | bump schemaVersion |
| Add a new issue code under an existing issue class | keep schemaVersion: 1; consumers should fall back to class, severity, and summary.status |
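On the consumer side, the policy suggests two defensive habits: reject unknown schema versions and degrade gracefully on unknown issue codes. The sketch below assumes a consumer keeps its own set of recognized codes; describeIssue, KNOWN_CODES, and the fallback string format are all hypothetical.

```javascript
// Hypothetical consumer-side handling of the JSON contract.
// Sample subset only; a real consumer would list every code it renders specially.
const KNOWN_CODES = new Set(["stdout-non-json", "server-crashed", "initialize-error"]);

function describeIssue(result, issue) {
  if (result.schemaVersion !== 1) {
    throw new Error(`unsupported schemaVersion: ${result.schemaVersion}`);
  }
  if (KNOWN_CODES.has(issue.code)) return issue.code;
  // Unknown code: fall back to class, severity, and summary.status.
  return `${issue.class}/${issue.severity} (${result.summary.status})`;
}
```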

Migration notes from the first schemaVersion: 1 contract:

| Existing consumer behavior | Recommended migration |
| --- | --- |
| Read ok for pass/fail | keep reading ok; use summary.status when a registry needs to distinguish runtime inspection from protocol failure |
| Read checks.*.status and checks.*.issueCodes | keep reading checks; use summary.badge for badges and compact listing UI |
| Read issueClasses for display buckets | keep reading issueClasses; use summary.issueClass and summary.primaryIssueCode for the first-line message |
| Treat every ok: false as a broken MCP server | prefer summary.status; needs_inspection means install/runtime health failed, not stdio or MCP protocol conformance |

process records the observed lifecycle even when the run passes. outcome is one of starting, running, exited, timeout, spawn-failed, or guard-terminated; starting is the transient initial value while the child is being created, not an expected terminal outcome. phase is startup, initialize, operation, adversarial, or post-initialize. exitCode and signal are included when the process exits before the guard finishes; timeout runs include timedOut, timeoutCode, timeoutMs, and guard kill metadata. spawnError is either null or an object with code and message; the matching spawn-failed issue also exposes spawnErrorCode.

Spawn failure shape:

| Field | Shape |
| --- | --- |
| process.spawnError | null or { "code": "ENOENT", "message": "spawn missing-command ENOENT" } |
| issues[].spawnErrorCode | short platform error code such as ENOENT, or "" when unavailable |

fingerprint helps explain why a result reproduced in one runner but not another. It includes the guard version, redacted command argv, cwd details, protocol, timeout, repeat count, requested operation, platform/arch, relevant runtime versions, package metadata when detectable, static-scan context, and startup/total duration. Environment variable values are always emitted as <redacted> and only explicitly provided env names are listed.
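The redaction rule can be illustrated with a tiny helper that keeps names and discards values. redactEnv is a hypothetical name, sorting the names is an assumption, and the sketch covers only the names and values parts of the fingerprint.env shape.

```javascript
// Sketch of env redaction: keep names, never emit values.
function redactEnv(env) {
  const names = Object.keys(env).sort();
  const values = Object.fromEntries(names.map((name) => [name, "<redacted>"]));
  return { names, values };
}

console.log(redactEnv({ API_TOKEN: "set-in-runner" }));
// { names: [ 'API_TOKEN' ], values: { API_TOKEN: '<redacted>' } }
```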

Registry display flow:

| Step | Use |
| --- | --- |
| 1 | Use summary.status, summary.badge, and summary.display for compact badges and listing rows |
| 2 | Use fingerprint.command, fingerprint.cwd, and fingerprint.package to show what was actually run |
| 3 | Show issueClasses details so install/runtime, stdio transport, and MCP protocol failures stay distinct |
| 4 | Show checks.capabilities as advertised MCP surface honesty |
| 5 | Show checks.toolSchema as tool metadata quality, separate from startup and stdio transport health |
| 6 | Show drift warnings as stability advisories, not hard failures, unless another check failed |
| 7 | Compare fingerprint.system, fingerprint.runtimes, and fingerprint.timings before marking a package broken |
| 8 | Show fingerprint.env.names only when debugging; never ask users to paste secret values |

Example:

{
  "schemaVersion": 1,
  "ok": true,
  "config": {
    "enabled": false,
    "path": "",
    "resolvedPath": "",
    "checks": {
      "command": false,
      "cwd": false,
      "envNames": [],
      "requests": [],
      "safeToolCalls": [],
      "adversarialProbes": [],
      "adversarialToolCalls": []
    }
  },
  "profile": "custom",
  "fingerprint": {
    "guard": { "name": "mcp-stdio-guard", "version": "1.0.0" },
    "command": {
      "executable": "node",
      "args": ["./server.js"],
      "argv": ["node", "./server.js"]
    },
    "cwd": {
      "requested": "/repo/server",
      "resolved": "/repo/server",
      "exists": true
    },
    "protocol": "2025-11-25",
    "config": {
      "enabled": false,
      "path": "",
      "resolvedPath": "",
      "checks": {
        "command": false,
        "cwd": false,
        "envNames": [],
        "requests": [],
        "safeToolCalls": [],
        "adversarialProbes": [],
        "adversarialToolCalls": []
      }
    },
    "profile": "custom",
    "timeoutMs": 5000,
    "repeat": 1,
    "capabilityProbes": true,
    "adversarialProbes": [],
    "operation": { "method": "tools/list", "hasParams": false, "source": "cli-request", "safeToolCallName": "" },
    "operations": [{ "method": "tools/list", "hasParams": false, "source": "cli-request", "safeToolCallName": "" }],
    "system": { "platform": "darwin", "arch": "arm64", "osRelease": "25.0.0" },
    "runtimes": {
      "node": { "version": "v24.0.0", "role": "guard-and-target" }
    },
    "package": null,
    "env": {
      "inherited": true,
      "names": ["API_TOKEN"],
      "values": { "API_TOKEN": "<redacted>" }
    },
    "staticScan": { "enabled": false, "path": "", "failOnFindings": false },
    "timings": { "startupMs": 42, "totalMs": 96 }
  },
  "process": {
    "started": true,
    "pid": 12345,
    "outcome": "guard-terminated",
    "phase": "post-initialize",
    "exitCode": null,
    "signal": null,
    "timedOut": false,
    "timeoutCode": "",
    "timeoutMs": 5000,
    "killedByGuard": true,
    "killSignal": "SIGTERM",
    "killReason": "guard-finished",
    "spawnError": null
  },
  "capabilityProbes": true,
  "adversarial": { "enabled": false, "probes": [] },
  "capabilityKeys": ["tools"],
  "capabilityChecks": {
    "tools": { "advertised": true, "method": "tools/list", "responded": true, "itemCount": 2, "error": null },
    "resources": { "advertised": false, "method": "resources/list", "responded": false, "itemCount": null, "error": null },
    "prompts": { "advertised": false, "method": "prompts/list", "responded": false, "itemCount": null, "error": null }
  },
  "issueClasses": {
    "installRuntime": { "status": "pass", "issueCodes": [] },
    "stdioTransport": { "status": "pass", "issueCodes": [] },
    "mcpProtocol": { "status": "pass", "issueCodes": [] }
  },
  "summary": {
    "status": "pass",
    "issueClass": "",
    "primaryIssueCode": "",
    "primaryIssueClass": "",
    "issueCounts": { "error": 0, "warning": 0, "total": 0 },
    "badge": { "label": "mcp stdio", "message": "pass", "color": "brightgreen" },
    "display": "verified"
  },
  "toolSchema": {
    "checked": true,
    "toolCount": 2,
    "toolNames": ["read_file", "search"],
    "validToolCount": 2,
    "warningCount": 0,
    "errorCount": 0,
    "duplicateNames": []
  },
  "checks": {
    "initialize": { "status": "pass", "issueCodes": [] },
    "stdout": { "status": "pass", "issueCodes": [] },
    "jsonRpc": { "status": "pass", "issueCodes": [] },
    "operation": { "status": "pass", "issueCodes": [] },
    "capabilities": {
      "status": "pass",
      "issueCodes": [],
      "tools": { "status": "pass", "issueCodes": [], "advertised": true, "method": "tools/list", "responded": true, "itemCount": 2 },
      "resources": { "status": "skipped", "issueCodes": [], "advertised": false, "method": "resources/list", "responded": false, "itemCount": null },
      "prompts": { "status": "skipped", "issueCodes": [], "advertised": false, "method": "prompts/list", "responded": false, "itemCount": null }
    },
    "toolSchema": { "status": "pass", "issueCodes": [] },
    "adversarial": { "status": "skipped", "issueCodes": [] },
    "process": { "status": "pass", "issueCodes": [] },
    "pythonBuffering": { "status": "pass", "issueCodes": [] },
    "staticScan": { "status": "skipped", "issueCodes": [] },
    "repeat": { "status": "skipped", "issueCodes": [] }
  }
}

The guard is registry-agnostic. It does not care whether an install command came from Smithery, Glama, GitHub, or a private catalog; it validates the command, working directory, optional source path, and observed stdio behavior.

CI

- run: npm ci
- run: npx mcp-stdio-guard --profile ci --scan src --request tools/list -- node ./server.js

Output

Passing server:

PASS MCP stdio guard
initialize: ok
frames: 2 stdout / 0 invalid
stderr: 0 lines
protocol: 2025-11-25
request: tools/list responded
tool schemas: 2/2 valid

Polluted stdout:

FAIL MCP stdio guard
initialize: ok
frames: 2 stdout / 1 invalid
stderr: 0 lines
protocol: 2025-11-25
request: tools/list responded
[error] stdout-non-json: stdout line 1 is not JSON-RPC: "server starting..."

Design

  • Runtime dependencies: zero.
  • Default behavior: validate the real process boundary.
  • Optional static scan: intentionally simple and conservative; catches common JavaScript and Python stdout writes, stdout logging handlers, and visible startup-output risks.
  • CI posture: fail on protocol corruption, crashes, and missing responses.
  • Promotion promise: no fake stars, no spam, just a tool that catches a real MCP failure mode.

License

MIT
