Staff Engineer Mode

Production-engineering judgment for AI coding agents.

Staff Engineer Mode packages public production-engineering practices as decision guidance for AI coding agents. As agents write material amounts of production code, they need to reason about what happens when that code runs at 3am. The router and specialist files add reliability, security, operability, compatibility, and rollout checks before code ships.

Sources

The guidance draws on public engineering practices. Representative source families include AWS Builders' Library, Google SRE and Software Engineering at Google, Meta Engineering, Microsoft SDL and DevOps guidance, Apple security and privacy docs, Netflix resilience work, and standards or guidance from NIST, CISA, OWASP, OpenSSF, IETF, and W3C. See the source index for the full reference set. Staff Engineer Mode is independent and is not endorsed by or affiliated with those organizations.

How It Works

Ask a normal engineering question. Hand the agent a task, design, diff, incident, rollout, or maintenance problem. The router picks one specialist (occasionally one secondary), reads that file, and returns concrete decisions, risks, checks, owners, supporting details, and next steps. You never name a specialist.

A supported tool should list only one native skill: the staff-engineer-mode router. Specialist files live under specialists/ and load only after routing.

For commits and amends, Staff Engineer Mode calls agent-pr-review against the exact staged diff. For releases, tags, version bumps, packages, artifacts, and promotions, it calls release-build-reproducibility and production-readiness-review together.
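The fixed routing rule above can be pictured as a small lookup. This is only an illustrative sketch: the real routing happens inside the agent when it reads the router file, not in a shell script. The function name `specialists_for_action`, the action labels, and the fallback text are assumptions made for the example; only the specialist names come from this README.

```shell
# Hypothetical sketch of the commit/release rule described above.
# specialists_for_action and its action labels are illustrative, not part of the project.
specialists_for_action() {
  case "$1" in
    commit|amend)
      # Commits and amends are reviewed against the staged diff.
      echo "agent-pr-review" ;;
    release|tag|version-bump|package|artifact|promotion)
      # Release-type changes get both specialists together.
      echo "release-build-reproducibility production-readiness-review" ;;
    *)
      # Anything else is left to normal routing (placeholder text).
      echo "(router decides)" ;;
  esac
}

specialists_for_action commit   # prints: agent-pr-review
specialists_for_action tag      # prints: release-build-reproducibility production-readiness-review
```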

Installation

Examples labeled "Terminal" run in your shell. Examples labeled "Agent chat" are typed inside that tool's interactive agent session.

Claude Code

Terminal:

claude plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git
claude plugin install staff-engineer-mode@staff-engineer-mode

Agent chat:

/plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git
/plugin install staff-engineer-mode@staff-engineer-mode

Codex

Terminal:

codex plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git
codex plugin add staff-engineer-mode@staff-engineer-mode

Cursor

Terminal:

git clone https://github.com/sirmarkz/staff-engineer-mode.git ~/.cursor/staff-engineer-mode-src
mkdir -p ~/.cursor/plugins
ln -s ~/.cursor/staff-engineer-mode-src ~/.cursor/plugins/staff-engineer-mode
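After linking, you may want to confirm that the symlink resolves to a directory. A minimal sketch, assuming the paths used above; `check_plugin_link` is a hypothetical helper name. It checks only the filesystem link and says nothing about whether Cursor has loaded the plugin.

```shell
# Hypothetical helper: report whether a path is a symlink that resolves to a directory.
check_plugin_link() {
  link="$1"
  # -L tests that the path itself is a symlink; -d follows it to the target.
  if [ -L "$link" ] && [ -d "$link" ]; then
    echo "ok: $link -> $(readlink "$link")"
  else
    echo "missing or broken: $link"
    return 1
  fi
}

# Path taken from the Cursor steps above; "|| true" keeps the script going if the check fails.
check_plugin_link "$HOME/.cursor/plugins/staff-engineer-mode" || true
```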

OpenCode

Terminal:

opencode plugin 'staff-engineer-mode@git+https://github.com/sirmarkz/staff-engineer-mode.git'

GitHub Copilot CLI

Terminal:

copilot plugin marketplace add https://github.com/sirmarkz/staff-engineer-mode.git

Install the plugin:

copilot plugin install staff-engineer-mode@staff-engineer-mode

Gemini CLI

Terminal:

gemini extensions install https://github.com/sirmarkz/staff-engineer-mode

Verify

Start a fresh session inside any open repo and ask one of:

  • "Before implementing partner webhooks, design delivery retries, replay, and dead-letter handling."
  • "For a new inventory dependency call, decide timeout, retry, and fallback."
  • "Review my last commit."

The agent should load the router, choose one specialist, and respond with concrete decisions, risks, checks, owners, supporting details, and next steps.

What's Inside

One native router skill: staff-engineer-mode. It routes to 54 specialist files under specialists/; those files are not installed or listed as separate native skills.
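To compare a local clone against that count, something like the following could work. It is a rough sketch that assumes one file per specialist and no other files under specialists/; `count_specialists` is a hypothetical helper name.

```shell
# Hypothetical helper: count regular files under <repo>/specialists.
# Assumes each specialist is exactly one file and nothing else lives in that directory.
count_specialists() {
  # tr strips the padding some wc implementations add around the number.
  find "$1/specialists" -type f | wc -l | tr -d ' '
}

# Run from the root of a clone; does nothing if specialists/ is absent.
if [ -d "./specialists" ]; then
  count_specialists .
fi
```

If the assumption holds, the printed number should match the count stated above.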

Examples by surface:

| Surface | Example specialist files |
| --- | --- |
| Architecture and interfaces | architecture-decisions, api-design-and-compatibility, data-contracts, state-machine-correctness |
| Reliability and resilience | slo-and-error-budgets, high-availability-design, dependency-resilience, backup-and-recovery, resilience-experiments, performance-and-capacity |
| Delivery and change safety | progressive-delivery, feature-flag-lifecycle, release-build-reproducibility, testing-and-quality-gates, test-data-engineering, dev-environment-parity, migration-and-deprecation, code-readability-for-agents, dependency-and-code-hygiene, configuration-and-automation-safety, fleet-upgrades |
| Operations and observability | observability-and-alerting |
| Security and privacy | secure-sdlc-and-threat-modeling, identity-and-secrets, cryptography-and-key-lifecycle, software-supply-chain-security, vulnerability-management, tenant-isolation, privacy-and-data-lifecycle |
| Data and workflow systems | distributed-data-and-consistency, database-operations, event-workflows, data-pipeline-reliability, caching-and-derived-data |
| Platform and edge | infrastructure-and-policy-as-code, internal-service-networking, edge-traffic-and-ddos-defense, cost-aware-reliability |
| Client, ML/AI, and experimentation | web-release-gates, mobile-release-engineering, accessibility-gates, llm-application-security, llm-evaluation, llm-serving-cost-and-latency, ml-reliability-and-evaluation, experimentation-and-metric-guardrails |
| Engineering workflow, readiness, and controls | agent-pr-review, ai-coding-governance, documentation-lifecycle, engineering-control-evidence, production-readiness-review, incident-response-and-postmortems, oncall-health, platform-golden-paths |

Contributing

Patches welcome, especially practices from authoritative sources: first-party engineering publications, official documentation, standards bodies, peer-reviewed papers, or widely cited practitioner references.

New specialist files must be technology-agnostic, cite source-index references, and avoid vendor endorsement. Read CONTRIBUTING.md before opening a PR. The voice is enforced.

License

MIT