Please do not report security vulnerabilities through public GitHub issues.
Use GitHub's private vulnerability reporting
to submit a report. If that is unavailable, email git@jonathanchun.com with
the subject line [ShellGuard Security].
Please include:
- A description of the vulnerability and its potential impact
- Steps to reproduce or a proof-of-concept
- The version or commit hash you tested against
You should receive an acknowledgment within 72 hours. Once the issue is confirmed, a fix will be developed privately and released as soon as practical. Credit will be given unless you prefer to remain anonymous.
Please practice responsible disclosure: allow a reasonable window for a fix before publishing details publicly.
Only the latest release on the main branch receives security updates.
There is no backport policy for older versions.
ShellGuard is designed for a specific context: it sits between an LLM (via MCP) and a remote host where the LLM has been prompted to operate in read-only, diagnostic mode. The LLM is already instructed not to take destructive actions. ShellGuard exists as a fail-safe -- a guardrail that catches mistakes, not a fortress designed to withstand a dedicated adversary with direct access to the tool.
In normal operation, the LLM cooperates with the security policy. It will not
attempt to rm -rf / or exfiltrate data because its system prompt tells it not
to. ShellGuard's job is to enforce that contract mechanically so that prompt
injection, model misbehavior, or accidental misuse cannot silently escalate into
destructive actions. Think of it as a seatbelt, not an armored vehicle.
What ShellGuard is designed to stop:
- An LLM that has been prompt-injected into running destructive commands
- Accidental execution of dangerous operations during legitimate diagnostics
- Shell injection through malformed or adversarial input strings
- Escalation from read-only diagnostics to write operations
What ShellGuard does not claim to stop:
- A skilled human attacker with direct access to the MCP tool who is deliberately crafting bypass attempts. The allowlist and parser are thorough but not formally verified.
- Information disclosure through allowed read-only commands (this is by design -- diagnostics require reading system state)
- Attacks that operate entirely within the bounds of allowed commands (e.g.,
reading sensitive files with
cat, querying metadata endpoints withcurl)
The security model is defense-in-depth: multiple independent layers each reduce risk, but no single layer is assumed to be impenetrable.
ShellGuard gates shell command execution on remote hosts through a four-stage defense-in-depth pipeline. Every command must pass all four stages sequentially; a failure at any stage blocks execution entirely.
Input ─► Parse ─► Validate ─► Reconstruct ─► Execute
The parser (parser/) converts raw shell input into a structured AST using
mvdan.cc/sh/v3, a proper bash parser -- not
regex. It enforces hard limits (64 KB input, 32 pipe segments, 1024 args per
segment) and rejects dangerous shell constructs at the syntax level:
- Command substitution (
$(...),`...`) - Variable expansion (
$HOME,${VAR}) - Process substitution (
<(...),>(...)) - Arithmetic expansion (
$((...))) - Redirections (
>,>>,<,2>&1) - Background execution (
&) - Control flow (
if,while,for,case) - Function definitions, subshells, brace groups
- Multi-statement input (
;as statement separator) - Extended globs, brace expansion
Only simple commands connected by |, &&, or || are accepted.
The validator (validator/) checks every parsed pipeline segment against a
YAML-based command registry using default-deny semantics. A command must be
explicitly allowlisted to execute. Currently 84 commands are allowed and 95 are
explicitly denied with human-readable reasons.
Key behaviors:
- Individual flags are validated per command. Denied flags include a reason.
sudois restricted tosudo [-u user] <command>with recursive validation of the inner command. All other sudo flags are rejected.xargsrequires a pipe and recursively validates its inner command.- Subcommands (e.g.,
docker ps,kubectl get,aws ec2 describe-instances) are resolved to composite manifest keys. - SQL passed to
psql -cis validated: onlySELECT,EXPLAIN,SHOW,WITH(read-only CTEs), and psql backslash commands are allowed. - Path arguments are normalized and checked against per-command restricted paths.
- Glob patterns in positional arguments are rejected unless explicitly allowed.
The reconstructor (ssh/) rebuilds the validated pipeline into a safe shell
command string. ShellQuote is the critical security boundary: every token that
is not strictly alphanumeric plus a small safe character set is wrapped in
single quotes with proper escaping. This neutralizes any shell metacharacters
that might have passed through as literal argument values.
Pipeline operators (|, &&, ||) are inserted verbatim between segments.
This is a deliberate trust boundary -- reconstruction trusts the parser to have
produced only safe operators.
The reconstructed command is sent to the remote host over SSH. Output is truncated at 64 KB (75% head / 25% tail split) to prevent unbounded responses.
curl is allowed with a GET-only policy (write methods like -X POST, -d,
--data, -T are denied). However, there is no URL validation. An LLM client
could be manipulated into constructing requests to internal endpoints:
- Cloud metadata services (e.g.,
http://169.254.169.254/) - Internal services on
localhostor private IP ranges
The mitigation is that command substitution is blocked, so URLs cannot be dynamically constructed from host data. The risk of LLM-directed SSRF remains.
printenv and cat /proc/self/environ are allowed by design for diagnostics.
These can expose database credentials, API keys, and other secrets stored in
environment variables.
The Operator field in pipeline segments is inserted into reconstructed
commands without quoting. Reconstruction trusts the parser to produce only |,
&&, or ||. If the Pipeline struct were mutated between validation and
reconstruction, arbitrary operators could be injected. No immutability
enforcement exists on the struct (TOCTOU gap), but in practice the struct is
created and consumed within a single synchronous call path.
The parser strips quotes and normalizes tokens to their semantic values. If a
double-quoted string containing $(cmd) reaches the parser, it is preserved as
a literal string (the parser does not reject it in all code paths for
double-quoted content). The reconstructor's single-quoting neutralizes this, but
it represents a defense-in-depth dependency rather than independent rejection.
docker inspect --format accepts arbitrary Go templates, which can call
built-in template functions. This is constrained to read-only inspection but
could be used to extract information in unexpected ways.
ShellGuard supports three host key verification modes:
| Mode | Behavior |
|---|---|
accept-new (default) |
Trust-on-first-use. Accept unknown hosts on first connection, reject changed keys on subsequent connections. |
strict |
Require the host key to already exist in known_hosts. Reject unknown hosts. |
off |
Disable host key verification entirely. Not recommended. |
The default accept-new mode provides TOFU (Trust-On-First-Use) security,
which protects against MITM attacks after the initial connection but not during
the first connection to an unknown host. For environments where host keys can be
pre-distributed, strict mode is recommended.
Known host keys are persisted to a known_hosts file and checked on subsequent
connections. A mismatch between a stored key and the key presented by the server
will reject the connection with a HostKeyError.
ShellGuard maintains dedicated security test suites across all pipeline stages:
| Test file | Scope |
|---|---|
parser/security_test.go |
40 attack vector categories against the parser |
validator/attack_vectors_test.go |
20 attack vector categories against the validator |
ssh/reconstruct_security_test.go |
20 attack vector categories against ShellQuote and reconstruction |
security_pipeline_test.go |
19 cross-layer test groups exercising the full pipeline end-to-end |
ssh/hostkey_security_test.go |
Host key verification security scenarios |
Attack vectors tested include: shell injection, command substitution bypass, variable expansion, null byte injection, Unicode homoglyphs and control characters, sudo escalation, SQL injection through psql, subcommand confusion, flag smuggling, path traversal, TOCTOU operator injection, and resource exhaustion.
Fuzz tests (parser/fuzz_test.go, ssh/fuzz_test.go) run with rich seed
corpora covering normal inputs, attack payloads, edge cases, and Unicode. Fuzz
invariants include non-empty segments, valid operators, balanced quoting,
idempotent ShellQuote, and round-trip consistency.