
feat(skills): rewrite code-review with sharded step-file architecture#2007

Merged
alexeyv merged 4 commits into main from feat/code-review-rewrite
Mar 15, 2026

Conversation

@alexeyv alexeyv (Collaborator) commented Mar 15, 2026

Summary

  • Rewrites the monolithic XML-step code-review workflow into 4 sharded step files following the same architecture as bmad-quick-dev-new-preview
  • Adds three parallel review layers (Blind Hunter, Edge Case Hunter, Acceptance Auditor) with structured triage that classifies findings into intent_gap/bad_spec/patch/defer/reject
  • Works with or without a spec file; loopbacks are recommendations, not automated re-runs
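The five-bucket taxonomy above can be sketched as a small enum; this is an illustrative model only (bucket names come from the summary, but the enum itself and the `SPEC_DEPENDENT` grouping are assumptions, not code that ships with the skill):

```python
from enum import Enum

class TriageBucket(Enum):
    """Five-way classification applied to each triaged finding."""
    INTENT_GAP = "intent_gap"  # code diverges from what the spec intended
    BAD_SPEC = "bad_spec"      # the spec itself is wrong or contradictory
    PATCH = "patch"            # concrete, fixable code issue
    DEFER = "defer"            # real issue, but out of scope for this review
    REJECT = "reject"          # false positive; dropped before presentation

# Per the summary, spec-dependent buckets are unavailable when no spec file is given:
SPEC_DEPENDENT = {TriageBucket.INTENT_GAP, TriageBucket.BAD_SPEC}
```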

Test plan

  • Verify npm run quality passes (all checks green in this branch)
  • Run the skill on a real diff with a spec file — confirm 3 parallel reviews launch and triage produces categorized output
  • Run the skill on a bare diff (no spec) — confirm Acceptance Auditor is skipped and intent_gap/bad_spec categories are unavailable
  • Test the fallback path (no subagents) — confirm prompt files are generated with correct review_mode gating

🤖 Generated with Claude Code

…ecture

Replace monolithic XML-step workflow with 4 sharded step files:
- step-01: gather context (diff source, spec, chunking)
- step-02: parallel review (blind hunter, edge case, acceptance auditor)
- step-03: triage (normalize, deduplicate, classify into 5 buckets)
- step-04: present (categorized findings with recommendations)

Key changes:
- Three parallel review layers via subagent invocation
- Structured triage with intent_gap/bad_spec/patch/defer/reject categories
- Works with or without a spec file
- Loopbacks are recommendations, not automated re-runs
- Remove checklist.md (superseded by triage step)
- Remove discover-inputs.md (no longer referenced)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

augmentcode bot commented Mar 15, 2026

🤖 Augment PR Summary

Summary: Refactors bmad-code-review from a single monolithic workflow definition into a sharded, step-file workflow aligned with the newer “Quick Flow” execution style.

Changes:

  • Removes the legacy checklist.md and discover-inputs.md protocols in favor of an explicit 4-step sequence.
  • Adds step files for: context gathering, parallel review execution, findings triage, and final presentation.
  • Introduces three review layers (Blind Hunter, Edge Case Hunter, Acceptance Auditor) and describes what context each receives.
  • Adds structured triage rules to normalize outputs, deduplicate overlapping findings, and classify issues into intent_gap/bad_spec/patch/defer/reject.
  • Supports running with or without a spec/story file; in “no-spec” mode spec-dependent categories are disabled/reclassified.
  • Defines a fallback mode that generates prompt files when subagents aren’t available.
  • Updates the workflow entrypoint (workflow.md) to describe step-file execution rules and point to steps/step-01-gather-context.md.

Technical Notes: The new flow centers around constructing a unified diff ({diff_output}), optionally loading spec/context docs, running parallel reviewer skills, then producing a categorized report rather than an auto-fix loop.



@augmentcode augmentcode bot left a comment


Review completed. 2 suggestions posted.



## NEXT

Read fully and follow `./steps/step-02-review.md`

The next-step link here uses ./steps/..., but this file is already inside steps/, so that path will resolve to steps/steps/... and break sequential execution (see also: step-02-review.md and step-03-triage.md). This likely wants to be a sibling reference like ./step-02-review.md per the other sharded workflows.
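The breakage is easy to demonstrate; a minimal sketch with hypothetical paths mirroring the layout described:

```python
from pathlib import PurePosixPath

step_file = PurePosixPath("bmad-code-review/steps/step-01-gather-context.md")

# A "./steps/..." link resolves relative to the linking file's own directory:
broken = step_file.parent / "steps/step-02-review.md"
# The sibling form the comment recommends:
fixed = step_file.parent / "step-02-review.md"

print(broken)  # bmad-code-review/steps/steps/step-02-review.md (does not exist)
print(fixed)   # bmad-code-review/steps/step-02-review.md
```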

Severity: high

Other Locations
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md:40
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md:49


name: Gather Context
description: 'Determine what to review, construct the diff, and load any spec/context documents.'
diff_output: '' # set at runtime
spec_file: '' # set at runtime (path or empty)

spec_file is described as a “path or empty”, but later instructions treat {spec_file} as loaded content (parse its frontmatter context, and pass “{spec_file} content” to the Acceptance Auditor). Consider clarifying whether this variable is meant to store the path or the file contents to avoid the auditor/context-loading steps running with the wrong input.
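One way to remove the ambiguity is to keep `{spec_file}` strictly as a path and derive the loaded content in a separate variable; a sketch under that assumption (the path/content split is hypothetical, not what the step files currently define):

```python
from pathlib import Path

def load_spec(spec_file: str) -> tuple[str, str]:
    """Return (spec_path, spec_content); two empty strings mean no-spec mode."""
    if not spec_file:
        return "", ""
    path = Path(spec_file)
    if not path.is_file():
        raise FileNotFoundError(f"spec_file points at a missing file: {spec_file}")
    # Downstream steps (frontmatter parsing, the Acceptance Auditor payload)
    # would consume spec_content, never the raw path.
    return str(path), path.read_text(encoding="utf-8")
```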

Severity: medium

Other Locations
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md:23



coderabbitai bot commented Mar 15, 2026

📝 Walkthrough

Walkthrough

Replaces legacy checklist and discovery docs with a four-step, modular BMAD code-review workflow: Gather Context, parallel Review (three subagents), Triage, and Present; workflow.md restructured to an architectural, step-driven format.

Changes

  • Removed documentation (src/bmm/workflows/4-implementation/bmad-code-review/checklist.md, src/bmm/workflows/4-implementation/bmad-code-review/discover-inputs.md) -- Deleted Senior Developer Validation Checklist and Discover Inputs Protocol docs.
  • Workflow entry (src/bmm/workflows/4-implementation/bmad-code-review/workflow.md) -- Rewrote workflow.md from a procedural, git-aware narrative to a high-level workflow architecture and initialization sequence.
  • Step: Gather Context (src/bmm/workflows/4-implementation/bmad-code-review/steps/step-01-gather-context.md) -- Added step defining review scope prompts, diff-construction options (uncommitted/staged/branch/commit-range/file list), spec_file handling, large-diff chunking rules, and a checkpoint confirmation.
  • Step: Review (parallel) (src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md) -- Added step launching parallel subagents (Blind Hunter, Edge Case Hunter, Acceptance Auditor), payload rules, failure handling, and fallback reviewer prompt generation.
  • Step: Triage (src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md) -- Added normalization schema for findings, deduplication rules, conservative classification into buckets (intent_gap, bad_spec, patch, defer, reject), and handling for no-spec mode.
  • Step: Present (src/bmm/workflows/4-implementation/bmad-code-review/steps/step-04-present.md) -- Added presentation rules grouping findings by category with required formats, summary counts, and completion guidance.

Sequence Diagram(s)

sequenceDiagram
  participant User as User
  participant Engine as Workflow Engine
  participant Repo as Repository/Docs
  participant Subagents as Parallel Subagents
  participant Triage as Triage Processor
  participant Presenter as Presenter

  User->>Engine: Start review (choose scope & mode)
  Engine->>Repo: Load diffs & optional spec_file
  Engine->>Subagents: Dispatch review payloads (Blind, Edge, Acceptance)
  note right of Subagents: Reviews run in parallel\n(no shared conversation history)
  Subagents-->>Engine: Findings (or failure statuses)
  Engine->>Triage: Normalize, dedupe, classify findings
  Triage-->>Engine: Triaged report (buckets + counts)
  Engine->>Presenter: Format presentation per rules
  Presenter-->>User: Present grouped findings & summary

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Suggested reviewers

  • bmadcode
  • pbean
  • cecil-the-coder
🚥 Pre-merge checks: ✅ 3 of 3 passed
  • Title check ✅ Passed -- The PR title clearly describes the main change: rewriting the code-review workflow into a sharded step-file architecture with parallel review layers.
  • Description check ✅ Passed -- The PR description is directly related to the changeset, detailing the rewrite of the monolithic workflow into sharded step files, the three parallel review layers, and triage classification.
  • Docstring Coverage ✅ Passed -- No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 9

🧹 Nitpick comments (1)
src/bmm/workflows/4-implementation/bmad-code-review/steps/step-04-present.md (1)

35-39: Recommendation logic can emit conflicting next steps without precedence rules.

A run with both patch and intent_gap currently yields two different directives, but no explicit priority order. Add precedence so users get one unambiguous primary action first.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-04-present.md`
around lines 35 - 39, The next-steps text in step-04-present.md can produce
multiple conflicting directives when multiple finding types are present; enforce
a single precedence order (e.g., give `patch` highest priority, then
`intent_gap`/`bad_spec`, then `defer`) and change the logic that emits the
bullet items so it selects and shows only the first applicable message instead
of emitting all matching messages; update the messaging for the chosen branch
(use the existing strings for `patch`, `intent_gap`/`bad_spec`, and `defer`) and
ensure the code/path that builds the presentation checks findings in that
precedence order and returns one primary action.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-01-gather-context.md`:
- Around line 28-29: The current handling only HALTs when `{diff_output}` is
empty but misses an explicit stop-and-retry when a provided diff is unparseable;
update the logic that validates the "provided diff" parseability to perform an
immediate HALT with a clear user-facing message and a retry instruction if
parsing fails (i.e., treat parse failure as a terminal error separate from empty
content), and ensure the step that constructs `{diff_output}` distinguishes
between empty output and parse errors so callers of the parse validation receive
the explicit HALT/ retry flow.
- Line 48: The link in step-01-gather-context.md incorrectly uses a path
relative to the workflow root; change the reference Read fully and follow
`./steps/step-02-review.md` to use a relative path within the same steps
directory: `./step-02-review.md` so that step-01-gather-context.md correctly
resolves the step file; update the text in step-01-gather-context.md to replace
`./steps/step-02-review.md` with `./step-02-review.md`.
- Around line 18-24: The "Provided diff or file list" option is not implemented
in the construction rules; update the step text and logic to explicitly support
a file list path: add a distinct option label "Provided diff or file list
(path)" and implement prompt flow that, when selected, asks for a file-list path
(or pasted list), validates existence/format, and constructs the review input
from those file paths; update all occurrences referenced in the step (the
options block around "Provided diff or file list" and the construction rules
later in the document) to document the expected input format and handling so
file-list inputs are processed the same as diffs/commit ranges.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md`:
- Line 40: Update the intra-step link in step-02-review.md: replace the
incorrect relative link string "./steps/step-03-triage.md" with
"./step-03-triage.md" so the reference resolves relative to the current steps
directory; search for the literal "./steps/step-03-triage.md" in the file and
change it to "./step-03-triage.md" (verify any other links in step-02-review.md
that use the "./steps/..." prefix and adjust them similarly).
- Line 26: Update the subagent failure handling text so that empty findings are
not treated as a failure: change the rule in the "Subagent failure handling"
step to only mark a layer as failed when a subagent times out, throws an error,
or returns unparseable output, and treat a legitimately empty findings array as
a successful (but empty) response to be included in aggregations and user
reporting; reference the terms "subagent", "layer", "findings", "timeout" and
"unparseable" when making the clarification so reviewers and implementers know
exactly which conditions constitute failure.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md`:
- Line 29: The "Deduplicate" rule currently drops cross-source evidence when
keeping a single finding; change the logic that implements the Deduplicate step
so it merges evidence fields from duplicate findings instead of discarding them
— locate the code or logic that implements "Deduplicate" / the rule handling for
findings (the part that selects one surviving finding) and update it to create
one canonical finding by merging evidence arrays/fields (e.g., "references",
"guard_snippets", "adversarial_context", "locations") from all duplicates,
de-duplicating entries within those fields, and annotating the merged record
with mergedSources/merged_ids metadata.
- Line 49: The reference in step-03-triage.md to "Read fully and follow
`./steps/step-04-present.md`" uses the wrong relative path for a file already
inside the steps/ directory; update the link to `./step-04-present.md` so
step-03-triage.md correctly resolves to the sibling file (edit the string in
step-03-triage.md where the Read fully and follow line appears).
- Line 44: Update the incorrect step reference in step-03-triage.md: change the
phrase that reads “noted in step 2” to correctly reference the normalization
flow where failure/format tracking is introduced (e.g., “noted in step 1” or
“noted in the normalization flow (step 1)”) so the triage instruction aligns
with the normalization logic; edit the sentence in the step-03-triage.md content
where that phrase appears to ensure operators are directed to the normalization
flow for failure details.

In
`@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-04-present.md`:
- Around line 25-27: The "**Patch**" section ("These are fixable code issues:")
currently allows entries without a location; change the spec and validation so
that every patch entry must include a concrete location (line/file range) or an
explicit reproduction pointer. Update the template text under the "**Patch**"
heading to state "Location (file:line or reproduction pointer) is required" and
add validation logic in the patch builder/validator to reject or flag any patch
missing a location for the "patch" type.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9fd75a17-eeab-4d62-bf97-9599546db0da

📥 Commits

Reviewing files that changed from the base of the PR and between 42b1d0f and 5de8a78.

📒 Files selected for processing (7)
  • src/bmm/workflows/4-implementation/bmad-code-review/checklist.md
  • src/bmm/workflows/4-implementation/bmad-code-review/discover-inputs.md
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-01-gather-context.md
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-04-present.md
  • src/bmm/workflows/4-implementation/bmad-code-review/workflow.md
💤 Files with no reviewable changes (2)
  • src/bmm/workflows/4-implementation/bmad-code-review/discover-inputs.md
  • src/bmm/workflows/4-implementation/bmad-code-review/checklist.md

- **Acceptance Auditor** (only if `{review_mode}` = `"full"`) -- A subagent that receives `{diff_output}`, `{spec_file}` content, and any loaded context docs. Its prompt:
> You are an Acceptance Auditor. Review this diff against the spec and context docs. Check for: violations of acceptance criteria, deviations from spec intent, missing implementation of specified behavior, contradictions between spec constraints and actual code. Output findings as a markdown list. Each finding: one-line title, which AC/constraint it violates, and evidence from the diff.

2. **Subagent failure handling**: If any subagent fails, times out, or returns empty results, note the failed layer and proceed with findings from the remaining layers. Report the failure to the user in the next step.

⚠️ Potential issue | 🟠 Major

Failure handling incorrectly treats empty findings as an error.

A subagent can legitimately return zero findings. Mark only timeout/error/unparseable output as failure.
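In code terms the distinction looks like this; a sketch with hypothetical result fields, since the workflow itself is prose:

```python
def classify_layer_result(result: dict) -> str:
    """Classify one subagent layer's outcome; empty findings are success, not failure."""
    if result.get("timed_out") or result.get("error"):
        return "failed"
    if result.get("findings") is None:  # output could not be parsed into a findings list
        return "failed"
    return "ok"  # includes the legitimate zero-findings case
```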


- `detail` -- full description
- `location` -- file and line reference (if available)

2. **Deduplicate.** If two findings describe the same issue, keep the one with more specificity (prefer edge-case JSON with location over adversarial prose). Note merged sources on the surviving finding.

⚠️ Potential issue | 🟠 Major

Deduplication drops cross-source evidence instead of merging it.

Keeping only one record can discard AC references, guard snippets, or stronger evidence from other layers. Keep one canonical finding but merge evidence fields from duplicates.
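A sketch of what evidence-preserving dedup could look like; field names beyond the schema's title/detail/location are assumptions for illustration:

```python
def merge_duplicates(findings: list[dict]) -> dict:
    """Collapse findings that describe the same issue into one canonical record,
    keeping the most specific as the base and unioning evidence from the rest."""
    base = max(findings, key=lambda f: len(f.get("location", "")))  # prefer located findings
    merged = dict(base)
    merged["merged_sources"] = sorted({f["source"] for f in findings})
    for key in ("ac_refs", "locations", "guard_snippets"):  # hypothetical evidence fields
        values: list = []
        for f in findings:
            for v in f.get(key, []):
                if v not in values:
                    values.append(v)
        if values:
            merged[key] = values
    return merged
```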



5. If zero findings remain after dropping rejects, note clean review.

6. If any review layer failed or returned empty (noted in step 2), report this to the user now.

⚠️ Potential issue | 🟡 Minor

Step reference is incorrect and will mislead execution.

Line 44 says failures were “noted in step 2”, but failure/format tracking is introduced in normalization flow (step 1). This creates operator confusion during handoff.


Comment on lines +25 to +27
- **Patch**: "These are fixable code issues:"
- List each with title + detail + location (if available).


⚠️ Potential issue | 🟠 Major

Patch findings are allowed without a location, which makes them non-actionable.

For patch, location should be mandatory (or explicitly require a concrete reproduction pointer when line-level location is unavailable). Optional location will produce “fix me somewhere” output.

Proposed tightening
-   - **Patch**: "These are fixable code issues:"
-     - List each with title + detail + location (if available).
+   - **Patch**: "These are fixable code issues:"
+     - List each with title + detail + location.
+     - If exact line is unavailable, require at least file + symbol/function/block reference.
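The mandatory-location rule could be enforced by a small validator; a sketch only (the workflow is prose, and the `reproduction_pointer` field is an assumption):

```python
def validate_patch_finding(finding: dict) -> list[str]:
    """Return validation errors for a 'patch' finding.
    A location (file:line) or an explicit reproduction pointer is required."""
    if finding.get("category") != "patch":
        return []
    has_location = bool(finding.get("location"))           # e.g. "src/app.py:42"
    has_repro = bool(finding.get("reproduction_pointer"))  # e.g. "parse() fails on empty input"
    if has_location or has_repro:
        return []
    return [f"patch finding '{finding.get('title', '?')}' needs a location or reproduction pointer"]
```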

@alexeyv alexeyv marked this pull request as draft March 15, 2026 15:31
Comment on lines +16 to +18
## INSTRUCTIONS

1. Ask the user: **What do you want to review?** Present these options:
Collaborator


should try to detect intent and autorun the proper option - so if someone does /code-review staged changes only it will not stop here and ask


2. **Subagent failure handling**: If any subagent fails, times out, or returns empty results, note the failed layer and proceed with findings from the remaining layers. Report the failure to the user in the next step.

3. **Fallback** (if subagents are not available): Generate prompt files in `{implementation_artifacts}` -- one per active reviewer:
Collaborator


should these just be 3 separate skills if this is the fallback?

what i would suggest when we later rework the skill is to have arguments and paths internally to run the different review types - or we can create a bunch of different review skills. we can design this better later - for now though, hmm, not sure if this is good UX vs just accepting that it's a crap tool and trying to run them 1 at a time...

alexeyv and others added 2 commits March 15, 2026 15:08
Address findings from adversarial review: clarify spec_file as path
variable (F4), add file list construction rule (F8), HALT on unparseable
diffs (F9), differentiate empty findings handling (F5), merge evidence
during dedup instead of dropping (F6), and fix NEXT step paths (F1).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexeyv alexeyv marked this pull request as ready for review March 15, 2026 21:10
@github-actions
Contributor

@coderabbitai review


coderabbitai bot commented Mar 15, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


@augmentcode augmentcode bot left a comment


Review completed. 2 suggestions posted.


- **Specific commit range** (ask for the range)
- **Provided diff or file list** (user pastes or provides a path)

2. Construct `{diff_output}` from the chosen source.

This step offers Uncommitted changes and Staged changes only, but the {diff_output} construction rules only spell out branch/commit/provided-diff/file-list. Consider explicitly defining how to construct {diff_output} for those two options (e.g., which git diff invocations and whether untracked files are in scope) so execution doesn’t diverge between agents.

Severity: medium

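Pinning those two options to concrete git invocations could look like the table below; these specific commands (and the untracked-files caveat) are plausible choices, not what the step file mandates:

```python
import subprocess

# Candidate git invocations per review mode named in step-01:
DIFF_COMMANDS = {
    # Working tree + index vs HEAD; untracked files are NOT included --
    # run `git add -N .` first if they should be in scope.
    "uncommitted": ["git", "diff", "HEAD"],
    "staged": ["git", "diff", "--staged"],
    # Three-dot range: changes on this branch since its merge base with {base}.
    "branch": ["git", "diff", "{base}...HEAD"],
}

def construct_diff(mode: str, base_branch: str = "main") -> str:
    cmd = [part.replace("{base}", base_branch) for part in DIFF_COMMANDS[mode]]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```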

- **Staged changes only**
- **Branch diff** vs a base branch (ask which base branch)
- **Specific commit range** (ask for the range)
- **Provided diff or file list** (user pastes or provides a path)

“Provided diff or file list (user pastes or provides a path)” is a bit ambiguous: if the user provides a path to a diff file, step 2 only describes validating pasted diff content. Consider clarifying whether a path here means “read that file and treat its contents as the unified diff”, vs only supporting pasted diffs and separate path-based file lists.

Severity: medium


Skip the interactive "What do you want to review?" menu when the user
already stated review mode in the invocation (e.g., "review staged
changes"). Adds keyword matching, sprint-tracking fallback, and
graceful fall-through to the existing interactive flow.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

♻️ Duplicate comments (3)
src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md (1)

26-26: ⚠️ Potential issue | 🟠 Major

Empty findings are still treated as subagent failure.

Line 26 classifies “returns empty results” as failure, which is a valid review outcome and should not be marked failed.

Proposed fix
-2. **Subagent failure handling**: If any subagent fails, times out, or returns empty results, note the failed layer and proceed with findings from the remaining layers. Report the failure to the user in the next step.
+2. **Subagent failure handling**: Mark a layer as failed only if the subagent errors, times out, or returns unparseable output. Treat an empty findings result as a successful empty response and continue aggregation.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md`
at line 26, The guidance in "Subagent failure handling" incorrectly treats
"returns empty results" as a failure; update the text in the step titled
"Subagent failure handling" so that only actual errors or timeouts are
classified as subagent failures while empty/zero findings are treated as valid
(non-failure) outcomes to be recorded and forwarded; specifically replace the
phrase "returns empty results" with wording that distinguishes empty results
from failures and add a note that empty findings should be included in the
aggregated findings and reported to the user (but not marked as a failed layer).
src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md (2)

47-47: ⚠️ Potential issue | 🟡 Minor

Step reference is incorrect and operationally confusing.

Line 47 says “noted in step 2,” but failure/empty tracking context does not originate from this file’s Step 2. This misroutes troubleshooting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md`
at line 47, The sentence "6. If any review layer failed or returned empty (noted
in step 2)" in step-03-triage.md is pointing to the wrong place; update that
wording so it references the correct origin (either the actual step where
failures are recorded or a generic descriptor). Locate the exact line "If any
review layer failed or returned empty (noted in step 2)" and replace "noted in
step 2" with a correct reference such as "noted in step 02 of the review
summary" or "noted earlier in this workflow" (or point to the specific step
name/ID that tracks failures) to remove ambiguity and ensure operational
clarity.

29-33: ⚠️ Potential issue | 🟠 Major

Dedup merge strategy discards structured evidence fidelity.

Appending everything into `detail` collapses typed fields (`trigger_condition`, `guard_snippet`, AC refs, locations) into prose, which weakens downstream actionability and auditing.

Proposed fix
-  - Append any unique detail, reasoning, or location references from the other finding(s) into the surviving `detail` field.
+  - Preserve and merge structured evidence fields when available (e.g., `locations[]`, `ac_refs[]`, `guard_snippets[]`, `trigger_conditions[]`), and keep `detail` as human-readable synthesis only.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md`
around lines 29 - 33, The current "Deduplicate" step instructs merging findings
by appending everything into the surviving detail field, which loses typed
fields; change the merge behavior so structured evidence fields
(trigger_condition, guard_snippet, AC refs, locations) are preserved and merged
into typed containers rather than collapsed into detail: when merging findings,
prefer the most specific finding as the base, set source to the merged sources
(e.g., blind+edge), and merge each structured field deterministically (e.g.,
combine trigger_condition values into an array or pick the most specific, dedupe
guard_snippet entries into a guard_snippets array, merge AC refs and locations
into arrays of unique entries) while concatenating only non-structured narrative
into detail.
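
A sketch of the typed-field merge being proposed. The field names (`locations`, `ac_refs`, `guard_snippets`, `trigger_conditions`) follow the review's examples and are assumptions about the finding schema, not the workflow's actual format.

```python
TYPED_FIELDS = ("locations", "ac_refs", "guard_snippets", "trigger_conditions")

def merge_findings(base, other):
    """Merge `other` into `base`: union typed fields, concatenate only narrative."""
    merged = dict(base)
    merged["source"] = "+".join(sorted({base["source"], other["source"]}))
    for field in TYPED_FIELDS:
        combined = list(base.get(field, [])) + list(other.get(field, []))
        merged[field] = sorted(set(combined))  # dedupe deterministically
    extra = other.get("detail", "")
    if extra and extra not in base.get("detail", ""):
        merged["detail"] = (base.get("detail", "") + " " + extra).strip()
    return merged
```

The point of the sketch: structured evidence survives as arrays, while `detail` stays a human-readable synthesis.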
🧹 Nitpick comments (1)
src/bmm/workflows/4-implementation/bmad-code-review/steps/step-01-gather-context.md (1)

38-39: Replace fixed “~3000 lines” gating with adaptive guidance.

Line 38 introduces a hard numeric threshold, which is brittle across repos and model/runtime contexts.

Based on learnings: “In BMAD-METHOD workflow step files under src/bmm/workflows/**/steps/*.md, avoid hard numeric thresholds in agent directives; use soft guidance-level language.”

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-01-gather-context.md`
around lines 38 - 39, Replace the hard numeric gating ("~3000 lines") in the
"Sanity check" step with adaptive, guidance-level language: change the directive
that checks `{diff_output}` to use a soft heuristic (e.g., "if `{diff_output}`
appears very large or spans many files/sections, warn the user and offer to
chunk the review by file group") and instruct the agent to consider repository
size, file count, and model context when suggesting chunking; keep the same
behavior (offer chunking, agree first group, list remaining groups) but remove
the fixed line-count threshold and frame it as a recommendation rather than a
strict rule.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-01-gather-context.md`:
- Around line 36-40: The current "full" review flow that loads all documents
listed in the frontmatter of {spec_file} is unbounded; limit context loading by
enforcing caps on number of files and cumulative size (e.g., max files and max
total lines/tokens), skip or truncate files that would exceed the cap and log
which files were skipped/truncated, and surface a clear warning to the user;
additionally, when {diff_output} is large (≈3000 lines) integrate this same
capping logic and offer chunking: implement a chunking prompt that groups
files/changes (keep the first agreed group and list remaining groups), and add
deterministic prioritization (e.g., follow explicit frontmatter order, then
most-recent/most-relevant) when selecting which context files to include; update
the code paths that perform "load each referenced document" and the sanity check
around {diff_output} to apply these limits and user prompts.
- Around line 29-30: The file-list diff baseline is hardcoded to "HEAD" when
constructing {diff_output}; change the git invocation to accept a configurable
baseline instead of always using HEAD (e.g., replace `git diff HEAD -- <paths>`
with `git diff <baseline> -- <paths>`), add logic to obtain/validate the
baseline source (target branch, explicit commit/range, or user-provided
baseline) and fall back to HEAD only as a default with a clear notice, and
ensure the existing verification that {diff_output} is non-empty now considers
the chosen baseline and prompts the user to pick a different baseline or review
full file contents if the diff is empty.
- Line 29: When the "file list" branch finds an empty git diff, clarify and
implement a deterministic fallback for constructing {diff_output}: if the user
opts to "review the full file contents", build {diff_output} by concatenating
the full working-tree contents of each path with a clear per-file header (e.g.,
"Full contents of <path>:"), using the current working-tree content (cat <path>
or git show :<path>) so the review step that consumes {diff_output} receives
explicit, reproducible file data; update the text around the existing "git diff
HEAD -- <path1> <path2> ..." instruction to document this fallback and reference
{diff_output} and the "file list" branch explicitly.
- Around line 33-37: The path intake currently accepts arbitrary values for
{spec_file} and referenced context docs (frontmatter "context") and must be
constrained to the repository/workspace to avoid exfiltration: when you set
{spec_file} and when loading each referenced context doc (only in the "full"
{review_mode} branch), resolve the candidate path against the
repository/workspace root, reject absolute paths and any paths that traverse
above the root (e.g., normalize and ensure the resolved path starts with
workspaceRoot), follow/validate symlinks to ensure they point inside the
workspace, and only load files that exist and are readable inside that root; for
any rejected or unreadable files log/warn and skip them rather than loading.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md`:
- Line 35: Update the step that currently reads "Collect all findings" (the
action at Line 35) to explicitly require persisting raw per-layer outputs before
any aggregation or triage; instruct the workflow to store each layer's payload
(e.g., raw_layer_output or per-layer JSON) into a designated immutable store or
archive with metadata (layer name, timestamp, source) so that aggregation
functions like "collect all findings" consume copies, not originals, preserving
forensic traceability for reclassification or drops.
- Around line 17-24: The parallel subagent orchestration (Blind Hunter invoking
bmad-review-adversarial-general, Edge Case Hunter invoking
bmad-review-edge-case-hunter, and Acceptance Auditor when {review_mode} ==
"full" using {diff_output} and {spec_file}) lacks any timeout or cancellation
policy and can hang indefinitely; update the step to enforce a per-subagent
timeout and an overall parallel-budget timeout, specify cancellation semantics
(e.g., cancel remaining subagents when the overall timeout elapses or when
quorum of required results is reached), and define fallback behavior (e.g., mark
missing results as timed-out, continue with available outputs, and emit a clear
timeout error). Ensure you reference the subagent names and skills in the text
and add configurable timeout parameters (per_subagent_timeout,
parallel_budget_timeout) and a short note on how to handle partial/failed
responses in the Acceptance Auditor aggregation logic.

In `@src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md`:
- Around line 45-47: The docs allow reporting a "clean review" in step 5 even
when step 2 indicated failed/empty review layers, which is contradictory; update
the logic/text so "clean review" is only reported when zero findings remain AND
no review layers failed or returned empty (i.e., ensure step 5 checks the step 2
status), otherwise emit a distinct "partial coverage / clean after rejects"
message that notes which layers failed or were empty; reference step 2
(failed/empty layers), step 5 ("clean review") and step 6 (report failed/empty
layers) when making this change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9b2f8f36-15fc-4dba-bb9f-e48ebc912c6c

📥 Commits

Reviewing files that changed from the base of the PR and between 5de8a78 and f12e38e.

📒 Files selected for processing (3)
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-01-gather-context.md
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-02-review.md
  • src/bmm/workflows/4-implementation/bmad-code-review/steps/step-03-triage.md

Comment on lines +29 to +30
- For **file list**: validate each path exists in the working tree. Construct `{diff_output}` by running `git diff HEAD -- <path1> <path2> ...`. If the diff is empty (files have no uncommitted changes), ask the user whether to review the full file contents or to specify a different baseline.
- After constructing `{diff_output}`, verify it is non-empty regardless of source type. If empty, HALT and tell the user there is nothing to review.

⚠️ Potential issue | 🟠 Major

File-list diff baseline is hardcoded to HEAD, which can silently miss intended review scope.

Using only git diff HEAD -- <paths> ignores other valid review baselines (e.g., target branch, explicit commit range), so the selected source can produce misleadingly empty output.

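
A minimal sketch of the configurable baseline, with HEAD kept only as a noticed default. The helper names are illustrative; `git rev-parse --verify` is used so a bad ref fails loudly instead of silently producing an empty diff.

```python
import subprocess

def resolve_baseline(baseline=None):
    """Fall back to HEAD only as a default, with a clear notice."""
    if baseline is None:
        print("Notice: no baseline given; defaulting to HEAD.")
        return "HEAD"
    return baseline

def build_diff_output(paths, baseline=None):
    ref = resolve_baseline(baseline)
    # Validate the ref before diffing so a typo is an error, not an empty diff.
    subprocess.run(["git", "rev-parse", "--verify", ref],
                   check=True, capture_output=True)
    diff = subprocess.run(["git", "diff", ref, "--", *paths],
                          check=True, capture_output=True, text=True)
    return diff.stdout
```

A target branch, explicit commit, or range would all flow through `baseline` unchanged.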

- For **branch diff**: verify the base branch exists before running `git diff`. If it does not exist, HALT and ask the user for a valid branch.
- For **commit range**: verify the range resolves. If it does not, HALT and ask the user for a valid range.
- For **provided diff**: validate the content is non-empty and parseable as a unified diff. If it is not parseable, HALT and ask the user to provide a valid diff.
- For **file list**: validate each path exists in the working tree. Construct `{diff_output}` by running `git diff HEAD -- <path1> <path2> ...`. If the diff is empty (files have no uncommitted changes), ask the user whether to review the full file contents or to specify a different baseline.

⚠️ Potential issue | 🟠 Major

File-list fallback is underspecified and cannot be executed deterministically.

Line 29 says to “review the full file contents” when the diff is empty, but does not define how to construct {diff_output} in that branch. This creates a dead-end path in Step 1.

Proposed fix
-  - For **file list**: validate each path exists in the working tree. Construct `{diff_output}` by running `git diff HEAD -- <path1> <path2> ...`. If the diff is empty (files have no uncommitted changes), ask the user whether to review the full file contents or to specify a different baseline.
+  - For **file list**: validate each path exists in the working tree. Construct `{diff_output}` by running `git diff HEAD -- <path1> <path2> ...`.
+  - If that diff is empty, ask user to choose one fallback:
+    1) **Full file review mode**: construct synthetic review input containing file headers + current file contents for each path.
+    2) **Alternate baseline**: ask for `<baseline_ref>` and run `git diff <baseline_ref> -- <path1> <path2> ...`.
+  - HALT if neither fallback yields non-empty review input.
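
The "full file review mode" fallback can be sketched as synthetic review input built from current working-tree contents with a per-file header (function name is illustrative):

```python
from pathlib import Path

def full_file_review_input(paths):
    """Build deterministic review input: per-file header + current contents."""
    sections = []
    for p in paths:
        sections.append(f"Full contents of {p}:\n{Path(p).read_text()}")
    return "\n\n".join(sections)
```

This gives the review step explicit, reproducible data instead of an undefined branch.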

Comment on lines +33 to +37
- If yes: set `{spec_file}` to the path provided, verify the file exists and is readable, then set `{review_mode}` = `"full"`.
- If no: set `{review_mode}` = `"no-spec"`.

4. If `{review_mode}` = `"full"` and the file at `{spec_file}` has a `context` field in its frontmatter listing additional docs, load each referenced document. Warn the user about any docs that cannot be found.


⚠️ Potential issue | 🟠 Major

Path intake lacks scope guard and can exfiltrate unrelated local files.

Line 33 and Line 36 accept arbitrary paths for {spec_file} and context docs without constraining them to the repository/workspace boundary. That risks accidental inclusion of sensitive local files.

Proposed fix
-  - If yes: set `{spec_file}` to the path provided, verify the file exists and is readable, then set `{review_mode}` = `"full"`.
+  - If yes: set `{spec_file}` to the path provided, verify the file exists, is readable, and resolves inside the current workspace/repository root; otherwise HALT and request a valid in-repo path. Then set `{review_mode}` = `"full"`.

-4. If `{review_mode}` = `"full"` and the file at `{spec_file}` has a `context` field in its frontmatter listing additional docs, load each referenced document. Warn the user about any docs that cannot be found.
+4. If `{review_mode}` = `"full"` and the file at `{spec_file}` has a `context` field in its frontmatter listing additional docs, load each referenced document only if it resolves inside the workspace/repository root. Warn on missing or out-of-scope paths.
🧰 Tools
🪛 LanguageTool

[style] ~36-~36: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... set {review_mode} = "no-spec". 4. If {review_mode} = "full" and the file...

(ENGLISH_WORD_REPEAT_BEGINNING_RULE)

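
The scope guard amounts to resolving the candidate against the workspace root (following symlinks) and requiring the result to stay under it. A sketch, assuming a `workspace_root` is known:

```python
from pathlib import Path

def is_inside_workspace(candidate, workspace_root):
    root = Path(workspace_root).resolve()
    # Resolve relative to the root and follow symlinks before checking.
    resolved = (root / candidate).resolve()
    try:
        resolved.relative_to(root)
        return True
    except ValueError:
        return False
```

An absolute path overrides the join, so `/etc/passwd` and `../` traversals both resolve outside the root and are rejected.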

Comment on lines +36 to +40
4. If `{review_mode}` = `"full"` and the file at `{spec_file}` has a `context` field in its frontmatter listing additional docs, load each referenced document. Warn the user about any docs that cannot be found.

5. Sanity check: if `{diff_output}` exceeds approximately 3000 lines, warn the user and offer to chunk the review by file group.
- If the user opts to chunk: agree on the first group, narrow `{diff_output}` accordingly, and list the remaining groups for the user to note for follow-up runs.
- If the user declines: proceed as-is with the full diff.

⚠️ Potential issue | 🟠 Major

Context loading is unbounded and risks context-window collapse on large doc sets.

The workflow can load arbitrary numbers/sizes of context files, then also process large diffs. This is a reliability hazard and will degrade triage quality in exactly the cases where signal matters most.


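
One way to bound context loading is a greedy budget over the frontmatter's priority order. The caps below are placeholders, not values from the workflow:

```python
def select_context_docs(docs, max_files=5, max_total_lines=2000):
    """docs: (path, line_count) pairs in frontmatter priority order.
    Greedy sketch: take docs in order while they fit the budget."""
    selected, skipped, total = [], [], 0
    for path, lines in docs:
        if len(selected) < max_files and total + lines <= max_total_lines:
            selected.append(path)
            total += lines
        else:
            skipped.append(path)
    return selected, skipped
```

Skipped docs would then be surfaced to the user as a warning rather than silently dropped.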

Comment on lines +17 to +24
1. Launch parallel subagents. Each subagent gets NO conversation history from this session:

- **Blind Hunter** -- Invoke the `bmad-review-adversarial-general` skill in a subagent. Pass `content` = `{diff_output}` only. No spec, no project access.

- **Edge Case Hunter** -- Invoke the `bmad-review-edge-case-hunter` skill in a subagent. Pass `content` = `{diff_output}`. This subagent has read access to the project.

- **Acceptance Auditor** (only if `{review_mode}` = `"full"`) -- A subagent that receives `{diff_output}`, the content of the file at `{spec_file}`, and any loaded context docs. Its prompt:
> You are an Acceptance Auditor. Review this diff against the spec and context docs. Check for: violations of acceptance criteria, deviations from spec intent, missing implementation of specified behavior, contradictions between spec constraints and actual code. Output findings as a markdown list. Each finding: one-line title, which AC/constraint it violates, and evidence from the diff.

⚠️ Potential issue | 🟠 Major

Parallel execution has no timeout budget or cancellation rule.

This step can hang indefinitely waiting for a slow subagent, blocking the pipeline without a defined recovery path.

🧰 Tools
🪛 LanguageTool

[uncategorized] ~24-~24: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...s and actual code. Output findings as a markdown list. Each finding: one-line title, whi...

(MARKDOWN_NNP)

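
A sketch of the timeout budget using `concurrent.futures`. The timeout values and the `layers` shape (`{name: zero-arg callable}`) are illustrative, not part of the workflow:

```python
import concurrent.futures as cf
import time

def run_layers(layers, per_subagent_timeout=60.0, parallel_budget_timeout=120.0):
    """Timed-out layers are marked as such; remaining results are kept."""
    results = {}
    deadline = time.monotonic() + parallel_budget_timeout
    with cf.ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn) for name, fn in layers.items()}
        for name, future in futures.items():
            # Each layer gets its own timeout, capped by the overall budget.
            budget = min(per_subagent_timeout, deadline - time.monotonic())
            try:
                results[name] = {"ok": future.result(timeout=max(budget, 0))}
            except cf.TimeoutError:
                future.cancel()  # cancel anything still pending past the budget
                results[name] = {"timed_out": True}
    return results
```

The pipeline then proceeds with whatever completed, reporting timed-out layers alongside the findings.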


HALT. Tell the user to run each prompt in a separate session and paste back findings. When findings are pasted, resume from this point and proceed to step 3.

4. Collect all findings from the completed layers.

⚠️ Potential issue | 🟠 Major

Raw layer outputs are not explicitly preserved before aggregation.

Line 35 jumps to “collect all findings” but does not require preserving raw per-layer payloads. That removes forensic traceability when triage reclassifies or drops findings later.

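
The preservation step can be sketched as archiving each raw payload with metadata before handing a copy to aggregation. The directory name and record fields are illustrative:

```python
import copy, datetime, json
from pathlib import Path

def archive_layer_output(layer_name, payload, archive_dir="review-archive"):
    """Persist the raw payload with metadata, then hand a copy to triage."""
    Path(archive_dir).mkdir(exist_ok=True)
    record = {
        "layer": layer_name,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "payload": payload,
    }
    (Path(archive_dir) / f"{layer_name}.json").write_text(json.dumps(record, indent=2))
    return copy.deepcopy(payload)  # aggregation mutates the copy, never the original
```

If triage later drops or reclassifies a finding, the archived original remains available for audit.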

Comment on lines +45 to +47
5. If zero findings remain after dropping rejects, note clean review.

6. If any review layer failed or returned empty (noted in step 2), report this to the user now.

⚠️ Potential issue | 🟠 Major

“Clean review” can be reported even when coverage was partial.

Line 45 allows a clean result after rejects are dropped, but Line 47 may also report failed/empty layers. That combination is contradictory and can be misread as a full-pass review.

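
The gate being asked for reduces to one function: "clean" only when there are no surviving findings AND no failed or empty-by-error layers. A sketch with illustrative status strings:

```python
def review_status(remaining_findings, failed_layers):
    """Only report a clean review when coverage was complete."""
    if remaining_findings:
        return "findings"
    if failed_layers:
        return ("partial coverage / clean after rejects (failed: "
                + ", ".join(failed_layers) + ")")
    return "clean review"
```

This keeps step 5 and step 6 from contradicting each other: a run with failed layers can never be presented as a full pass.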

@alexeyv alexeyv merged commit 09bca11 into main Mar 15, 2026
5 checks passed
@alexeyv alexeyv deleted the feat/code-review-rewrite branch March 15, 2026 21:17
alexeyv added a commit that referenced this pull request Mar 15, 2026
- Remove name/description from step frontmatter (STEP-06)
- Add explicit HALT before user menu in step-01 (STEP-04)
- Escape story-id as template placeholder (REF-01)
- Handle untracked files in file-list diff mode (P-6)
- Formalize failed_layers variable handoff between steps (P-5)
- Add Acceptance Auditor skip note for no-spec mode (P-4)
- Guard false-clean when review layer failed (P-7)
- Reword patch recommendation to avoid auto-fix solicitation (P-3)
- Update SKILL.md description to reflect new capabilities (P-2)
bmadcode pushed a commit that referenced this pull request Mar 15, 2026
- Remove name/description from step frontmatter (STEP-06)
- Add explicit HALT before user menu in step-01 (STEP-04)
- Escape story-id as template placeholder (REF-01)
- Handle untracked files in file-list diff mode (P-6)
- Formalize failed_layers variable handoff between steps (P-5)
- Add Acceptance Auditor skip note for no-spec mode (P-4)
- Guard false-clean when review layer failed (P-7)
- Reword patch recommendation to avoid auto-fix solicitation (P-3)
- Update SKILL.md description to reflect new capabilities (P-2)
alexeyv added a commit that referenced this pull request Mar 18, 2026
…hoices

The March 15 rewrite (PR #2007) removed the ability to auto-fix patches,
create action items in story files, and handle deferred/spec findings.
This restores interactive post-review actions:

- Deferred findings: auto-written to deferred-work.md and checked off in story
- Intent gap/bad spec: conversation with downgrade-to-patch, patch-spec,
  reset-to-ready-for-dev, or dismiss options
- Patch findings: fix automatically, create action items, or show details

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
alexeyv added a commit that referenced this pull request Mar 20, 2026
…hoices (#2055)

* fix(code-review): restore actionable review output with interactive choices

The March 15 rewrite (PR #2007) removed the ability to auto-fix patches,
create action items in story files, and handle deferred/spec findings.
This restores interactive post-review actions:

- Deferred findings: auto-written to deferred-work.md and checked off in story
- Intent gap/bad spec: conversation with downgrade-to-patch, patch-spec,
  reset-to-ready-for-dev, or dismiss options
- Patch findings: fix automatically, create action items, or show details

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* refactor(code-review): simplify triage to decision-needed/patch/defer/dismiss

Replace 5-bucket classification (intent_gap, bad_spec, patch, defer, reject)
with 4 pragmatic buckets. Findings always written to story file first.
Decision-needed findings gate patch handling — resolve ambiguity before fixing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(code-review): address PR review findings in step-04-present

Replace undefined curly-brace placeholders with angle-bracket syntax,
add HALT guard before patch menu, guard spec_file references for
no-spec mode, and backtick category names for consistency.

* feat(code-review): add HALT guards, batch option, defer reason, final summary

Add strong HALT guards after decision-needed and patch menus to prevent
auto-progression. Add batch-apply option 0 for >3 patch findings. Prompt
for defer reason and append to story file and deferred-work.md. Show
boxed final summary with counts. Polish clean-review shortcut in triage.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>