n-n-code · n-n-code · Apr 13, 2026
diff --git a/.agents/skills/backend-guidance/SKILL.md b/.agents/skills/backend-guidance/SKILL.md
@@ -5,10 +5,15 @@ description: Overlay for server-side networked code — HTTP handlers, gRPC serv
 
 # Backend Guidance
 
-Read `AGENTS.md` first. This is a composable overlay, not a standalone workflow.
+This is a composable overlay, not a standalone workflow.
 Use alongside the repo's implementation skill (e.g. **coding-guidance-cpp**, **project-core-dev**)
 when the change touches backend code.
 
+Use this as the thin default backend overlay for ordinary backend work.
+If the task includes service-boundary refactors, repository or transaction work,
+queue or webhook reliability, stronger testing expectations, or explicit
+trust-boundary hardening, prefer `backend-systems-guidance`.
+
 ## When to use
 
 The repo has server-side networked code: HTTP route handlers, gRPC service

diff --git a/.agents/skills/backend-systems-guidance/SKILL.md b/.agents/skills/backend-systems-guidance/SKILL.md
@@ -0,0 +1,131 @@
+---
+name: backend-systems-guidance
+description: Canonical overlay for server-side networked code that needs stronger architecture, testing, reliability, and security discipline. Use alongside the repo's implementation skill when implementing or reviewing backend services, APIs, middleware, queues, repositories, or backend refactors.
+---
+
+# Backend Systems Guidance
+
+This is a composable overlay, not a standalone workflow.
+Use alongside the repo's implementation skill when the change touches backend
+code.
+
+This is the canonical backend overlay in this repo.
+It extends the thin baseline `backend-guidance` overlay with stronger guidance
+for non-trivial service boundaries, repositories, reliability, and trust
+boundaries.
+
+Prefer it over `backend-guidance` when the task includes new endpoints or
+consumers, multi-layer refactors, repository or transaction work, auth or
+trust-boundary logic, or backend review that needs explicit testing and
+reliability checks.
+
+Use the bundled references only when needed:
+
+- [references/trigger-evals.md](references/trigger-evals.md) for lightweight
+  prompt checks when revising the trigger or scope
+
+## When to use
+
+- the repo has server-side networked code such as HTTP handlers, gRPC methods,
+  webhooks, queue consumers, or message producers
+- the task adds or reshapes routes, controllers, services, repositories,
+  middleware, or request-processing boundaries
+- the task changes auth, authorization, validation, idempotency, retries,
+  external requests, caching, or observability
+- the task needs backend review beyond basic handler thinness, especially for
+  security, data access, or missing tests
+
+## Not for
+
+- HTTP client code, CLI tooling, offline batch scripts, or data pipelines with
+  no request or consumer boundary
+- frontend-only work
+- threat modeling or security audit work where `security` should be the primary
+  workflow skill
+
+Do not fold infrastructure deployment workflows, outbound-client-only guidance,
+or full security-audit checklists into this skill. Keep this overlay centered
+on backend request and consumer systems plus their immediate reliability and
+trust boundaries.
+
+## Core workflow
+
+1. Read the touched backend files and map the request or consumer path end to
+   end: boundary, service logic, data access, external calls, and state
+   changes.
+2. Pick the mode before changing code:
+   - baseline backend change when the work is mostly a thin handler or small
+     service fix
+   - service-boundary change when responsibilities, data access, or dependency
+     direction may need to move
+   - reliability-hardening or review mode when the main risk is missing tests,
+     auth gaps, retries, observability, or unsafe failure handling
+3. Keep the boundary thin: decode input, authenticate and authorize, validate,
+   call a service, map transport errors, and serialize output. Business
+   decisions belong in service code that can run without the transport layer.
+4. Place persistence and external integrations deliberately:
+   - repositories or data adapters own query shape, batching, and transaction
+     details when that improves clarity or testing
+   - services coordinate business rules, idempotency, retries, and side-effect
+     ordering
+   - handlers and controllers do not reach directly into ORM or network clients
+     unless the change is truly trivial and stays trivial
+5. Harden cross-cutting concerns at the edge:
+   - validate external input once at the boundary
+   - enforce auth and authorization before business actions
+   - set timeouts, retry rules, and destination allowlists for outbound calls
+   - use structured logging, correlation identifiers, and explicit error
+     mapping for observable failure paths
+6. Choose the smallest test set that proves the change:
+   - unit tests for service logic and decision branches
+   - integration tests for handlers, consumers, repositories, and transaction
+     behavior
+   - auth and permission tests for protected flows
+   - contract or schema tests when the change alters external API or event
+     shapes
+   - load or concurrency tests only for changed hotspots, queue throughput, or
+     latency-sensitive paths
+7. Review the result for boundary leaks, unsafe defaults, data-access
+   inefficiency, and missing verification before finishing.
+
+## Decision rules
+
+- Start with `backend-guidance` for ordinary backend edits. Use this overlay
+  when the task needs stronger design pressure, harder review, or explicit
+  backend quality gates.
+- Keep handlers thin in responsibility, not by literal line count. If a
+  handler or consumer owns business decisions, retries, transaction branching,
+  or query orchestration, extract inward.
+- Keep business logic transport-free. If testing a rule requires booting HTTP,
+  gRPC, or queue infrastructure, the logic is in the wrong place.
+- Add a repository or data-access interface when it reduces duplication,
+  isolates non-trivial queries, helps transaction composition, or makes tests
+  materially simpler. Do not add one for single-call trivial CRUD.
+- Prefer one validation pass at the outer edge plus typed internal data. Avoid
+  repeated validation in every layer unless a trust boundary changes.
+- Treat retries as a design choice, not a default. Only retry idempotent or
+  explicitly deduplicated work, and pair retries with deadlines or backoff.
+- Use idempotency keys or duplicate-detection for retried creates, webhook
+  handlers, and queue consumers that can be re-delivered.
+- Every outbound request needs a timeout and failure policy. For user-controlled
+  destinations, apply allowlists or equivalent SSRF protections.
+- Keep error handling explicit: domain code returns or throws domain-level
+  failures; boundary code maps them to HTTP, gRPC, queue, or job semantics.
+- Measure before adding caching. Cache only stable read paths with clear
+  invalidation or bounded staleness.
+
+## Validation
+
+A backend change is done when, in addition to the base implementation skill's
+validation:
+
+- handlers or consumers stay as boundary glue and delegate business decisions to
+  testable service code
+- data access and external I/O live behind clear seams when the change is
+  non-trivial
+- external input, auth, and transport-specific error mapping stay at the edge
+- retries, idempotency, timeouts, and failure handling are explicit where the
+  change can duplicate work or call remote systems
+- tests cover the changed behavior at the correct level, including integration
+  coverage for boundary behavior and permission or failure cases when relevant
+- new high-risk paths emit enough evidence to debug production behavior
diff --git a/.agents/skills/backend-systems-guidance/references/trigger-evals.md b/.agents/skills/backend-systems-guidance/references/trigger-evals.md
@@ -0,0 +1,44 @@
+# Backend Systems Guidance Trigger Evals
+
+Use these prompts to spot over-triggering and boundary drift between
+`backend-guidance` and `backend-systems-guidance`.
+
+## Expected `backend-guidance`
+
+- "Fix this HTTP handler so it stops doing auth checks inline and just calls the service."
+- "Clean up this gRPC method and move input parsing to the boundary."
+- "Review this small message consumer and make sure transport concerns stay at the edge."
+
+## Expected `backend-systems-guidance`
+
+- "Add idempotent webhook handling with retries, timeout rules, and integration tests."
+- "Refactor this endpoint stack into controller, service, and repository layers and check for N+1 queries."
+- "Review this backend change for auth gaps, transaction boundaries, outbound request safety, and missing tests."
+
+## Expected Neither
+
+- "Tune this HTTP client retry policy for outbound SDK calls."
+- "Threat-model our auth system and list likely attack paths."
+- "Set up Kubernetes deployment manifests and production dashboards."
+
+## Manual Simulation Summary
+
+Prompts used:
+
+- positive-obvious: "Refactor this endpoint stack into controller, service, and repository layers and check for N+1 queries."
+- positive-paraphrased: "This backend change touches retries, auth, and webhook deduplication. Use the stronger backend overlay."
+- negative-adjacent: "Tune this HTTP client retry policy for outbound SDK calls."
+
+What passed:
+
+- obvious multi-layer backend work points clearly to `backend-systems-guidance`
+- paraphrased reliability and trust-boundary work still triggers the stronger
+  overlay
+- outbound-client-only work stays out of scope
+
+Residual risk:
+
+- backend review requests may still need judgment when they are narrow enough to
+  fit `backend-guidance`
+- if the stronger overlay grows more review-specific, re-check whether a
+  companion review overlay would be cleaner