fix(workflows): remove ambiguous with-argument from help task chaining #1739

alexeyv wants to merge 13 commits into bmad-code-org:main
Conversation
Unified 5-step workflow with full BMM sharded workflow plumbing and minimal step prompts. Steps: clarify/route, plan, implement, review, present. 29 lines of instructions, 145 lines of plumbing.
Add QD2 trigger to agent definition and module-help.csv so the installer picks it up across all supported platforms.
Reference specs from the design sessions: full redesign plan, session state with all design decisions, and implementation roadmap.
18 tasks: skeleton test/eval, then per-step tighten/test/eval cycles, plus end-to-end eval. Each task file is a self-contained intent expression that can be fed to QD2.
The COMPOUND_TRIGGER_PATTERN and description bracket regex only
accepted [A-Z]{1,3}, rejecting shortcuts like QD2 that include
digits. Update both regexes to accept [A-Z][A-Z0-9]{0,2} so
alphanumeric shortcuts work while still requiring a leading letter.
Add test fixture covering the alphanumeric shortcut case.
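As a quick illustration of the widened character class (a sketch with hypothetical constant names, not the repo's actual regex definitions):

```javascript
// Hypothetical names for illustration; the repo's actual constants may differ.
const oldShortcut = /^\[([A-Z]{1,3})\]/;          // letters only, rejects "QD2"
const newShortcut = /^\[([A-Z][A-Z0-9]{0,2})\]/;  // leading letter, then letters or digits

console.log(oldShortcut.test("[QD2] Quick Dev v2"));  // false
console.log(newShortcut.test("[QD2] Quick Dev v2"));  // true
console.log(newShortcut.test("[2QD] leading digit")); // false: must start with a letter
```

Both patterns still cap the shortcut at three characters and require it to begin with an uppercase letter, so existing all-letter shortcuts keep validating unchanged.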
…Grok spec and initial experiment run logs.
…SLint ignore lists.
Second QD2 skeleton test run (add plan-review step to task-01) with raw JSONL log and analysis. Add CAPTURE-RUN.md as a repeatable prompt for post-run log capture and summarization.
The "with argument" clause in 7 workflow completion steps caused LLMs to interpret "Read fully and follow: help.md with argument X" as a skill/function invocation rather than a file-read instruction. Drop the clause entirely — help.md already infers the completed workflow from the preceding "[Workflow] complete." text in conversation context. Closes bmad-code-org#1637
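Concretely, the change in each of the seven completion steps looks roughly like the following (wording varies per workflow; the argument value shown here is illustrative):

```diff
-Workflow complete. Read fully and follow: {project-root}/_bmad/core/tasks/help.md with argument Create Brief
+Workflow complete. Read fully and follow: {project-root}/_bmad/core/tasks/help.md
```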
🤖 Augment PR Summary

Summary: This PR reduces ambiguity in workflow completion guidance by removing the "with argument …" clause when chaining into the shared help task.

Technical Notes: The "help" task already infers the completed workflow from conversation/artifact context, so removing the argument avoids tool-invocation confusion without changing `help.md`.
> ### CHECKPOINT 1
>
> Present summary. `[A] Approve [E] Edit [F] Full BMM`. HALT.
Using [A] for Approve is likely to collide with the established [a] Advanced Elicitation shortcut used elsewhere in BMAD workflows, which could cause the agent to take the wrong branch when the user replies “A”. Consider whether the checkpoint menu letters should avoid reserved shortcuts.
Severity: medium
Other Locations
src/bmm/workflows/bmad-quick-flow/quick-dev2/steps/step-05-present.md:19
> ## INSTRUCTIONS
>
> 1. Diff from `{baseline_commit}`. Review in context-free subagents: intent audit (skip for one-shot) + adversarial code review via `{adversarial_review_task}`.
Step 4 assumes {baseline_commit} is available, but Step 3 doesn’t explicitly state that it must be stored in that variable (or persisted for resume). If it’s missing on a later re-entry, the review diff could be generated from the wrong baseline.
Severity: medium
> Present classified findings. `[A] Approve [E] Edit [R] Reject`. HALT.
>
> - **A**: Commit patches. Print push command. Wait. Create PR.
📝 Walkthrough

Introduces a new unified quick-dev2 workflow with clarify, plan, implement, review, and present steps. Adds extensive experiment planning documentation including redesign specifications and task roadmaps. Updates seven existing workflow completion steps to remove "with argument" clauses from help task chaining. Tightens trigger pattern validation and updates configuration ignores.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant Step01 as Step 1:<br/>Clarify & Route
    participant Step02 as Step 2:<br/>Plan
    participant Step03 as Step 3:<br/>Implement
    participant Step04 as Step 4:<br/>Review
    participant Step05 as Step 5:<br/>Present
    User->>Step01: Provide task intent
    Step01->>Step01: Clarify intent,<br/>backfill conventions,<br/>determine route
    alt One-shot
        Step01-->>Step03: execution_mode=one-shot
        Step03->>Step03: Execute & implement
        Step03->>Step04: Auto-advance
    else Plan-code-review
        Step01-->>Step02: Default routing
        Step02->>Step02: Investigate, generate<br/>tech-spec, self-review
        Step02-->>User: [Checkpoint 1]<br/>Approve/Edit/Full-BMM
        User-->>Step03: Continue
        Step03->>Step03: Baseline, shard,<br/>execute tasks seq.
        Step03-->>User: Halt if needed
    end
    Step03->>Step04: Auto-advance
    Step04->>Step04: Context-free<br/>adversarial review
    Step04->>Step04: Classify findings,<br/>amend spec (≤5 loops)
    Step04->>Step04: Auto-fix patches,<br/>commit
    Step04->>Step05: Auto-advance
    Step05->>Step05: Prioritize findings,<br/>generate PR
    Step05-->>User: [Checkpoint 2]<br/>Approve/Edit/Reject
    User-->>Step05: Confirm
    Step05-->>User: Present results
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~30 minutes
🚥 Pre-merge checks: 4 passed ✅, 1 failed ❌ (warning)
Actionable comments posted: 1
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (5)
src/bmm/workflows/3-solutioning/check-implementation-readiness/steps/step-06-final-assessment.md (1)
**39-43:** ⚠️ Potential issue | 🟡 Minor: Pre-existing semantic error — 🚫 emoji misused for a required action.

The EXECUTION PROTOCOLS block at line 43 reads:

> 🚫 Complete and present final report

Throughout these workflow files, 🚫 signals a forbidden action (e.g., `🚫 FORBIDDEN to approve incomplete coverage`). Using it on a step that must be executed is a contradiction that could cause an LLM to skip this action entirely. Since the PR is already touching this file, this should be corrected now.

🔧 Proposed fix

```diff
-🚫 Complete and present final report
+🎯 Complete and present final report
```

src/bmm/workflows/2-plan-workflows/create-ux-design/steps/step-14-complete.md (1)
**83-89:** ⚠️ Potential issue | 🟡 Minor: Section 5 (Final Completion Confirmation) is unreachable — the `help.md` redirect in Section 3 transfers control before Section 5 executes.

The WORKFLOW COMPLETION SEQUENCE instructs the agent to read the complete file first, then execute sections in order. When it reaches Section 3 (lines 83–85) and follows `help.md`, it exits this step's execution context. Section 5 "Final Completion Confirmation" (lines 87–89) — the congratulations message — is never delivered to the user.

Every other file in this PR places the `help.md` redirect as the very last instruction in the final section. In this file alone, it sits in an intermediate section with content following it.

Additionally:

- Section 4 is absent from the WORKFLOW COMPLETION SEQUENCE (the numbering jumps 3 → 5). This was presumably Section 4 at some point and its removal/renaming was not cleaned up.
- Line 89 contains garbled wording: "Congratulate the user on the completion you both completed together of the UX." — even if this section were reachable, the phrasing should be corrected.

🔧 Suggested fix

Move the congratulations into Section 3 (before the redirect) and renumber:

```diff
 ### 3. Suggest Next Steps

+Congratulate the user on completing the UX Design.
+
 UX Design complete. Read fully and follow: `{project-root}/_bmad/core/tasks/help.md`

-### 5. Final Completion Confirmation
-
-Congratulate the user on the completion you both completed together of the UX.
+### 4. Final Completion Confirmation
+
+(Handled by help.md guidance above.)
```

src/bmm/workflows/2-plan-workflows/create-prd/steps-v/step-v-13-report-complete.md (1)
**196-202:** ⚠️ Potential issue | 🟡 Minor: Users selecting non-Exit paths (R, E, F) bypass help.md guidance, and the Edit workflow's completion step is missing it entirely.

The help.md reference at line 200 is nested inside the `IF X (Exit)` conditional block, which means users who select R (Review), E (Edit Workflow), or F (Fix Simpler Items) will never receive the next-step guidance from help.md. More critically, the Edit workflow completion file (step-e-04-complete.md) offers users the ability to run full validation but contains zero reference to help.md, creating a gap where Edit → Validation completions lack guidance continuity.

Separately, "PRD Validation complete." already follows the correct naming pattern (artifact type + validation status), consistent with other workflow completions like "Architecture complete." or "UX Design complete."

To fix: move the help.md line to a top-level step that all menu branches (R, E, F, X) converge on, or add explicit help.md routing to each branch endpoint. Additionally, audit step-e-04-complete.md to ensure it includes help.md guidance when the Edit workflow concludes.

tools/schema/agent.js (1)
**93-100:** ⚠️ Potential issue | 🟠 Major: Early `return` in `superRefine` exits the entire callback, so subsequent menu items are never validated.

When `parseCompoundTrigger` returns invalid on menu item N, the `return` statement exits the whole `superRefine` function, not just the loop iteration. Menu items N+1, N+2, … are silently skipped. The same problem applies to the description-shortcut mismatch check (lines 104–111 and 115–121). Callers get at most one error per schema validation run, forcing tedious fix-and-re-validate cycles.

Replace `return` with `continue`-equivalent logic (collect errors without short-circuiting the loop).

🐛 Proposed fix: accumulate errors without bailing out

```diff
   if (!result.valid) {
     ctx.addIssue({
       code: 'custom',
       path: ['agent', 'menu', index, 'trigger'],
       message: `agent.menu[].trigger compound format error: ${result.error}`,
     });
-    return;
+    index += 1;
+    continue;
   }
   const descriptionMatch = item.description?.match(/^\[([A-Z][A-Z0-9]{0,2})\]/);
   if (!descriptionMatch) {
     ctx.addIssue({
       code: 'custom',
       path: ['agent', 'menu', index, 'description'],
       message: `agent.menu[].description must start with [SHORTCUT] where SHORTCUT matches the trigger shortcut "${result.shortcut}"`,
     });
-    return;
+    index += 1;
+    continue;
   }
```

Note: the loop uses a manual `index` variable rather than `entries()`. A cleaner refactor would switch to `for (const [index, item] of value.agent.menu.entries())` so `continue` works directly.
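A standalone sketch of the `entries()`-based variant suggested above, with `parseCompoundTrigger` and the zod `ctx` stubbed out for illustration (names and shapes here are assumptions, not the repo's actual code):

```javascript
// Stub of the trigger parser: "QD2: do thing" -> shortcut "QD2".
function parseCompoundTrigger(trigger) {
  const m = /^([A-Z][A-Z0-9]{0,2}):/.exec(trigger);
  return m ? { valid: true, shortcut: m[1] } : { valid: false, error: "bad shortcut" };
}

// Stand-in for zod's superRefine body: collect every issue, never bail early.
function validateMenu(menu) {
  const issues = [];
  const ctx = { addIssue: (issue) => issues.push(issue) };
  for (const [index, item] of menu.entries()) {
    const result = parseCompoundTrigger(item.trigger);
    if (!result.valid) {
      ctx.addIssue({
        code: "custom",
        path: ["agent", "menu", index, "trigger"],
        message: `trigger compound format error: ${result.error}`,
      });
      continue; // keep validating the remaining items
    }
    const descriptionMatch = item.description?.match(/^\[([A-Z][A-Z0-9]{0,2})\]/);
    if (!descriptionMatch || descriptionMatch[1] !== result.shortcut) {
      ctx.addIssue({
        code: "custom",
        path: ["agent", "menu", index, "description"],
        message: `description must start with [${result.shortcut}]`,
      });
      continue;
    }
  }
  return issues;
}

// Two bad items now yield two issues instead of stopping at the first.
const issues = validateMenu([
  { trigger: "2qd: bad shortcut", description: "[2QD] bad" },
  { trigger: "QD2: ok", description: "[XX] mismatched shortcut" },
]);
console.log(issues.length); // 2
```

With the original early `return`, the same input would surface only the first issue per validation run.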
_experiment/planning/roadmap/task-03-tighten-step-01-routing.md (1)
**1-21:** ⚠️ Potential issue | 🟡 Minor: Missing Output section — deliverable artifact undefined.

Every peer task in this roadmap (task-04, task-05, task-13, task-14, task-15, task-17, task-18) has an `## Output` section naming the concrete deliverable. This "tighten" task has none. Readers cannot tell what artifact constitutes completion — presumably the updated `step-01-clarify-and-route.md`, but that is never stated. Without an explicit output, there is no way to verify the task is done.

📋 Suggested addition

```diff
 ## Constraint

 Keep prompts thin. Add the minimum words needed to close the gap. If the LLM was already doing it right from training, don't add instructions for it.
+
+## Output
+
+Updated `src/bmm/workflows/bmad-quick-flow/quick-dev2/step-01-clarify-and-route.md`. Change noted in session log.
```
```bash
# Find the most recent conversation log
ls -lt ~/.claude/projects/-Users-alex-src-bmad-quick-flow-redesign/*.jsonl | head -1

# Copy it with the naming convention: YYYY-MM-DD-<short-slug>.jsonl
cp <path-from-above> _experiment/runs/YYYY-MM-DD-<short-slug>.jsonl
```
Hardcoded personal machine path blocks all other users.
Line 13 hard-codes /Users/alex/src/bmad-quick-flow-redesign/ as the Claude project directory. This is not portable — it will silently produce no output on any machine other than the author's. A guide in a shared repository must not contain a personal filesystem path.
🛠 Proposed fix — use `fd` or `find` to locate the most recent JSONL

```diff
-# Find the most recent conversation log
-ls -lt ~/.claude/projects/-Users-alex-src-bmad-quick-flow-redesign/*.jsonl | head -1
-
-# Copy it with the naming convention: YYYY-MM-DD-<short-slug>.jsonl
-cp <path-from-above> _experiment/runs/YYYY-MM-DD-<short-slug>.jsonl
+# Find the most recent conversation log across all Claude projects
+LATEST=$(find ~/.claude/projects -name '*.jsonl' -printf '%T@ %p\n' 2>/dev/null | sort -rn | head -1 | cut -d' ' -f2-)
+echo "Latest log: $LATEST"
+
+# Copy it with the naming convention: YYYY-MM-DD-<short-slug>.jsonl
+cp "$LATEST" _experiment/runs/YYYY-MM-DD-<short-slug>.jsonl
```
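One portability caveat with the proposed fix: `find -printf` is a GNU extension and is not available in BSD/macOS `find`. A sketch of a portable alternative using `ls -t`, demonstrated here against a scratch directory since the real `~/.claude/projects` path is machine-specific:

```shell
# Demo setup: a scratch "projects" tree with an older and a newer log.
PROJECTS=$(mktemp -d)
mkdir -p "$PROJECTS/demo"
touch "$PROJECTS/demo/old.jsonl"
sleep 1
touch "$PROJECTS/demo/new.jsonl"

# Portable: ls -t sorts newest first on both GNU and BSD userlands.
LATEST=$(ls -t "$PROJECTS"/*/*.jsonl 2>/dev/null | head -1)
echo "Latest log: $LATEST"
```

The same `LATEST` variable then feeds the `cp` step of the guide unchanged; note that `ls`-based pipelines assume log filenames contain no newlines, which holds for Claude's UUID-named JSONL files.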
Strip task sharding, rigid approval menu, and VC backfill. Add intent_gap-first classification cascade to review. Use slug-based spec naming in plan step.
Cross-references 3 QD2 test runs against the redesign plan. Classifies 24 observations as Works/Gap/Execution Gap/Plan Gap/Plumbing. Key findings: one-shot path bypasses step architecture, terse step files cause execution gaps, plan-code-review path works end-to-end.
Metrics from JSONL log mining: 56% fewer human turns (18→8), 32% fewer API turns (72→49), token cost at parity (1.14x). One-shot path identified as broken (P0 fix priority).
Captures lessons from the skeleton eval: workflow boundary scoping, human turn filtering, token deduplication, context normalization, and five common pitfalls to avoid.
Closing: this PR accidentally included 12 unrelated commits from the experiment branch. Reopening as a clean single-commit PR.
Summary
- Remove the `with argument` clause from 7 workflow completion steps that chain to `help.md`
- LLMs interpreted `Read fully and follow: help.md with argument X` as a skill/function invocation instead of a file-read instruction, triggering "unknown skill" errors
- `help.md` already infers the completed workflow from the preceding "[Workflow] complete." text in conversation context — no changes to `help.md` needed

Also fixes:
- One step (`create-product-brief/step-06-complete.md`) had the wrong argument (`Validate PRD` instead of `Create Brief`)
- Another (`check-implementation-readiness/step-06-final-assessment.md`) had a casing mismatch (`implementation readiness` vs `Check Implementation Readiness`)

Both are moot now that arguments are removed entirely.
Closes #1637
Test plan
- Search `src/` for `with argument` — should return zero matches
- Verify `help.md` loads as a file read, not a skill invocation
- Verify `help.md` correctly infers the completed workflow from conversation context
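The first check can be scripted; a sketch run against a scratch tree here, since the real check would simply point `grep` at the repo's `src/`:

```shell
# Scratch stand-in for the repo's src/ after this PR's change.
SRC=$(mktemp -d)
printf 'Workflow complete. Read fully and follow: help.md\n' > "$SRC/step.md"

# Zero matches means the clause was fully removed.
if grep -rq "with argument" "$SRC"; then
  echo "matches found"
else
  echo "clean"
fi
```

In CI this could gate merges: `grep -r "with argument" src/` exits non-zero when there are no matches, so `! grep -rq "with argument" src/` works directly as a pass/fail step.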