#431 Add deep-review-ci agent (actionlint static + LLM semantic pass) #444

Merged
hubertgajewski merged 4 commits into main from feature/431 on May 1, 2026

@hubertgajewski hubertgajewski commented Apr 30, 2026

Summary

Adds .claude/agents/deep-review-ci.md — a CI / GitHub Actions specialist agent for the /deep-review-next orchestrator under epic #436. The agent runs actionlint (which embeds shellcheck) first as a zero-LLM-token static pass, and only escalates to an LLM semantic pass when the workflow shows non-trivial markers (if: conditions, needs: graphs / matrices, ref fetches, pull_request_target / workflow_run triggers, secret writes, concurrency: / schedule: / workflow_dispatch.inputs:). The LLM checklist's first item explicitly catches the head_sha availability bug from PR #205 (actions/checkout@vN with fetch-depth: 0 does not fetch refs from non-checked-out branches) at HIGH severity, citing OWASP-T10 A08, CWE 1395.
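The trivial-vs-non-trivial gate described above can be sketched roughly as follows. This is a minimal illustration, not the agent's actual logic: the agent is a prompt file, and the regexes, marker names, and function names here are hypothetical. Only a subset of the markers is covered (ref fetches, secret writes, and workflow_dispatch.inputs: are omitted).

```python
import re

# Hypothetical patterns for a subset of the non-trivial markers named in the
# summary. The authoritative list lives in the agent's "## Non-trivial markers".
NON_TRIVIAL_MARKERS = {
    "if-condition": re.compile(r"^\s*(?:-\s+)?if:", re.MULTILINE),
    "needs-graph": re.compile(r"^\s*needs:", re.MULTILINE),
    "matrix": re.compile(r"^\s*matrix:", re.MULTILINE),
    "privileged-trigger": re.compile(r"\b(?:pull_request_target|workflow_run)\b"),
    "concurrency": re.compile(r"^\s*concurrency:", re.MULTILINE),
    "schedule": re.compile(r"^\s*schedule:", re.MULTILINE),
}


def non_trivial_markers(workflow_text: str) -> list[str]:
    """Return the names of all markers found in the workflow source."""
    return [
        name
        for name, pattern in NON_TRIVIAL_MARKERS.items()
        if pattern.search(workflow_text)
    ]


def needs_llm_pass(workflow_text: str) -> bool:
    """Escalate to the LLM semantic pass only when a marker is present."""
    return bool(non_trivial_markers(workflow_text))
```

A workflow with a single lint job and no conditions yields an empty marker list, so under this sketch only the static pass would run.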

The agent's frontmatter is the only one in the roster that whitelists Bash: it adds Bash(actionlint *) and Bash(shellcheck *) because the static-tool pass needs to spawn those two analyzers. No other shell access. The orchestrator's Read, Grep, Glob-only generalization is updated to call out this exception.

Per sibling-PR precedent (#426 / #427 / #428 / #429 each bundled "agent + orchestrator wiring"), this PR also wires deep-review-ci into /deep-review-next: adds the row to the current roster table, removes the row from the Pending table, adds the conditional dispatch in Step 1 (only when .github/workflows/**.yml / .yaml or action.yml / action.yaml is in scope), adds the ### deep-review-ci section to the Step 2 aggregate output, and extends the total: line and status: rule to include ci HIGH / ci MEDIUM / ci LOW.
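As a rough illustration of the conditional dispatch in Step 1, the path check could look like the sketch below. The function names are hypothetical, and it assumes that `.github/workflows/**.yml` means any `.yml` / `.yaml` file at any depth under `.github/workflows/`, and that an `action.yml` / `action.yaml` anywhere in the tree counts; the orchestrator's real matching rules may differ.

```python
from pathlib import PurePosixPath

WORKFLOW_SUFFIXES = {".yml", ".yaml"}
ACTION_FILENAMES = {"action.yml", "action.yaml"}


def is_ci_path(path: str) -> bool:
    """True for workflow files and composite-action manifests (assumed rules)."""
    p = PurePosixPath(path)
    if p.name in ACTION_FILENAMES:
        return True
    parts = p.parts
    return (
        len(parts) >= 3
        and parts[0] == ".github"
        and parts[1] == "workflows"
        and p.suffix in WORKFLOW_SUFFIXES
    )


def should_dispatch_ci(changed_paths: list[str]) -> bool:
    """Dispatch deep-review-ci only when at least one CI file is in scope."""
    return any(is_ci_path(p) for p in changed_paths)
```

With this check, a diff touching only `README.md` and the agent file would not trigger the CI agent, while a diff touching `.github/workflows/lint.yml` would.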

README.md Prerequisites adds an actionlint row (with the brew install actionlint shellcheck install line) and the optional-tools sentence is updated to include it.

Closes #431
Contributes to #436

DoD coverage

| # | DoD item | Where |
|---|----------|-------|
| 1 | .claude/agents/deep-review-ci.md with restricted tools frontmatter | .claude/agents/deep-review-ci.md lines 1–6 |
| 2 | Static-tool pass first; LLM pass only when non-trivial | ## How to run steps 2–4 + ## Non-trivial markers |
| 3 | PR #205 added to S7 benchmark corpus and recall verified | Out of scope (see Scope notes) |
| 4 | README Prerequisites mentions actionlint | README.md line 109 |
| 5 | Sources cited via REFERENCES.md short IDs | Agent's ## Sources block uses real OWASP-T10, OWASP-ASVS, CWE-T25, CWE IDs from .claude/skills/deep-review-next/REFERENCES.md (#425 already merged) |
| 6 | Estimate = 2 on Project #1 | Pending: set on the issue pre-merge (Project #1 hours/estimate live on issues, not PRs); blocked in this session by gh token scope (only gist, read:org, repo granted; project scope needed) |
| 7 | Actual hours on Project #1 | Pending: set on the issue post-merge; same scope block |

Citation mapping note

The issue's AC names "GitHub Actions security hardening docs" and "OWASP CI/CD Top 10" as the LLM-pass sources. Neither short ID exists in the project's REFERENCES.md bibliography. The agent maps each CI concern to the closest in-scope IDs that DO exist in REFERENCES.md:

| CI concern | Cited as |
|------------|----------|
| head_sha availability / pull_request_target checkout-and-execute / workflow_run provenance | OWASP-T10 A08, CWE 1395 |
| Token scoping / missing permissions: | OWASP-T10 A01, OWASP-T10 A05, OWASP-ASVS V14 |
| Secret in ${{ secrets.* }} shell expression | OWASP-T10 A03, CWE-T25 78 |
| Secrets in outputs / artifacts / logs | OWASP-T10 A02, CWE-T25 200 |
| Third-party action pinned to movable tag | OWASP-T10 A06, CWE 1357 |
| Missing concurrency on push-back workflows | OWASP-ASVS V14 |
| Missing job timeout | OWASP-T10 A05 |

If REFERENCES.md later adds a GH-HARDEN or OWASP-CICD entry, the agent prose can be retrofitted in a one-line sweep.
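One way to keep the mapping above honest is to check that every cited short ID uses a prefix the bibliography knows. The sketch below assumes a citation has the form `<prefix> <entry>` (for example `OWASP-T10 A08`) and hard-codes the four prefixes named in this PR; how REFERENCES.md actually stores its IDs is not shown here, so both the function name and the prefix set are illustrative.

```python
# Prefixes taken from this PR's description; hard-coded as an assumption
# rather than parsed from REFERENCES.md.
KNOWN_PREFIXES = {"OWASP-T10", "OWASP-ASVS", "CWE-T25", "CWE"}


def unresolved_citations(cited, known_prefixes=KNOWN_PREFIXES):
    """Return citations whose prefix is not in the known bibliography set."""
    unresolved = []
    for citation in cited:
        # "OWASP-T10 A08" -> prefix "OWASP-T10"; "CWE 1395" -> prefix "CWE"
        prefix = citation.strip().rsplit(" ", 1)[0]
        if prefix not in known_prefixes:
            unresolved.append(citation)
    return unresolved
```

A hypothetical `GH-HARDEN 1` citation would be reported as unresolved until a matching entry is added to the bibliography.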

Scope notes

Test plan

  • git diff origin/main --stat shows three files: .claude/agents/deep-review-ci.md (new, 117 lines), .claude/skills/deep-review-next/SKILL.md (+14/-3, orchestrator wiring), README.md (+2/-1, Prerequisites row).
  • Frontmatter tools: line restricts to Read, Grep, Glob, Bash(actionlint *), Bash(shellcheck *) per the issue spec.
  • Frontmatter model: sonnet is set.
  • LLM checklist's first item targets the head_sha regression from PR #205 ("#203 Checkout default branch so self-healing script is always up to date") at HIGH severity, citing OWASP-T10 A08, CWE 1395.
  • Sources block lists only short IDs that resolve in .claude/skills/deep-review-next/REFERENCES.md.
  • Output schema matches the sibling pipe-delimited convention (<severity> | <category> | <file>:<line> | <description> | <recommended fix> plus summary: line).
  • Orchestrator wiring: agent appears in the current roster table; conditional dispatch fires only when a .github/workflows/**.yml / .yaml or action.yml / action.yaml path is in the diff; ### deep-review-ci section + ci HIGH/MEDIUM/LOW counters added to Step 2 aggregate.
  • README Prerequisites table renders with five rows (Node.js / Bruno / Docker / act / actionlint); the actionlint row and optional-tools sentence reference /deep-review-next (matches the orchestrator's actual name).
  • /security-review (manual against this branch's diff): clean — no executable code, no secrets, no auth surface.
  • /deep-review checklist: 5 pass / 13 N/A / 0 fail.
  • Reviewer inline comment on the tools: * glob — replied with the rationale (Claude Code tool-frontmatter syntax; <file> placeholder is not interpreted).
  • CI lint workflow (playwright-typescript-lint.yml) is N/A for this PR (no playwright/typescript/** changes); GitHub will skip it.
  • /deep-review-next end-to-end smoke against a PR that touches .github/workflows/*.yml (verifies dispatch-by-glob and that deep-review-ci returns the expected schema). Sequence with the other agent rollouts; not tracked here.
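For the smoke check of the output schema, a small parser for the pipe-delimited finding lines listed in the test plan might look like this. It is a sketch under assumptions: the severity vocabulary is taken to be HIGH / MEDIUM / LOW, descriptions are assumed not to contain a literal `|`, and the rendered totals string is illustrative rather than the orchestrator's exact total: line format.

```python
from collections import Counter
from typing import NamedTuple

SEVERITIES = ("HIGH", "MEDIUM", "LOW")


class Finding(NamedTuple):
    severity: str
    category: str
    location: str  # "<file>:<line>"
    description: str
    fix: str


def parse_findings(output: str) -> list[Finding]:
    """Parse '<severity> | <category> | <file>:<line> | <description> | <fix>' lines."""
    findings = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        # Lines such as the trailing "summary:" line do not have five fields
        # with a known severity, so they are skipped.
        if len(fields) == 5 and fields[0] in SEVERITIES:
            findings.append(Finding(*fields))
    return findings


def ci_totals(findings: list[Finding]) -> str:
    """Render per-severity counters in the 'ci HIGH / ci MEDIUM / ci LOW' spirit."""
    counts = Counter(f.severity for f in findings)
    return ", ".join(f"ci {sev}: {counts[sev]}" for sev in SEVERITIES)
```

Any line that fails to parse is silently dropped in this sketch; a stricter check would report such lines so that schema drift in the agent's output is caught early.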

Comment thread .claude/agents/deep-review-ci.md
@claude claude Bot left a comment

Review

PR #444 adds a new deep-review-ci.md agent and updates README.md.

deep-review-ci.md (new)

Frontmatter — name, description, tools (Read, Grep, Glob, Bash(actionlint *), Bash(shellcheck *)), model: sonnet — all correct. Tool restriction is sound: Read/Grep/Glob plus two named Bash invocations prevent shell escape.

Static-tool pass — step 2 runs actionlint which internally invokes shellcheck. Step 3 calls shellcheck directly against an extracted script when inline output is ambiguous.

Non-trivial markers — 7 triggers: if: conditions on event metadata/ref/actor, multi-job needs:/matrix, ref fetches, pull_request_target/workflow_run/repository_dispatch triggers, secret writes, concurrency:/schedule:/workflow_dispatch.inputs:.

LLM semantic checklist — 9 items. head_sha availability for non-checked-out branches flagged at HIGH citing PR #205. pull_request_target injection, workflow_run fork-originated triggers, token scoping, concurrency on push-back, action pinning, secret output handling, job timeout, secret-in-expression all covered. Citations use short IDs.

Output format — consistent with existing review-agents style.

DoD items 1, 2, 4, 5 confirmed in diff. Item 3 (PR #205 recall benchmark) correctly excluded in scope notes.

README.md

actionlint row correctly added to prerequisites table. Optional-tools sentence updated to include actionlint.

Finding

The agent is well-scoped, tool restriction is sound, LLM checklist is comprehensive, README update is accurate. One non-blocking inline comment at line 4. No blocking issues. APPROVE.

Reviewed by proxy/@preset/minimax-minimax-m2-7-no-thinking

@claude claude Bot left a comment

Summary

PR #444 (for issue #431) adds .claude/agents/deep-review-ci.md, a CI/GitHub Actions specialist agent wired into /deep-review-next, plus orchestrator wiring changes in SKILL.md and a README.md prerequisites row.

What was reviewed

  • .claude/agents/deep-review-ci.md (new, ~117 lines): frontmatter (tools: Read, Grep, Glob, Bash(actionlint *), Bash(shellcheck *), model: sonnet), Sources section citing OWASP-T10, OWASP-ASVS, CWE-T25, CWE IDs resolved against REFERENCES.md, Inputs section describing the stop rule, ## How to run procedure (static-tool pass → trivial-vs-non-trivial gate → LLM semantic pass), non-trivial markers list, LLM semantic checklist with 9 items (head_sha availability, pull_request_target injection, workflow_run fork-origin guard, token scoping, secret in shell, artifact/secret output, action pinning, concurrency, job timeout), categories, confidence threshold, severity levels, and the pipe-delimited output schema.
  • .claude/skills/deep-review-next/SKILL.md: adds deep-review-ci row to the current roster table, removes it from the roadmap table, updates the orchestrator dispatch comment to note the Bash exception, adds the conditional dispatch block, adds ### deep-review-ci to Step 2 aggregate output, and extends total: / status: to cover ci HIGH/MEDIUM/LOW.
  • README.md: adds actionlint row to Prerequisites table (with install instructions) and updates the optional-tools sentence.

Findings

No executable code is introduced. All content is documentation or orchestrator configuration. The agent's tools: frontmatter correctly restricts Bash to the two named analyzers. The orchestrator wiring correctly gates the dispatch on workflow-file paths. The README.md prerequisites entry references /deep-review-next by its actual name. The sources map CI concerns to short IDs that exist in REFERENCES.md. No blocking issues.

Reviewed by proxy/@preset/minimax-minimax-m2-7-no-thinking

@hubertgajewski hubertgajewski merged commit e6819d5 into main May 1, 2026
28 of 29 checks passed
@hubertgajewski hubertgajewski deleted the feature/431 branch May 1, 2026 06:26

Development

Successfully merging this pull request may close these issues.

[tooling] CI / GitHub Actions reviewer agent for /deep-review