fix(create-story): restore validate-workflow task and harden checklist execution for conditional contexts#1652
Conversation
… checklist handling
📝 WalkthroughWalkthroughAdds a deterministic variable-resolution step to the create-story workflow, removes conditional gating for intelligence outputs, and introduces a new comprehensive validation task (src/core/tasks/validate-workflow.xml) that enforces checklist-driven validation and mandatory evidence-backed report generation. Changes
Sequence DiagramsequenceDiagram
participant Client as Client
participant WF as Workflow Context
participant CK as Checklist Loader
participant DOC as Document Loader
participant VAL as Validator
participant REP as Report Generator
Client->>WF: Invoke validate-workflow (workflow, checklist?, document?)
WF->>WF: Load workflow YAML\nResolve variables (config_source files → keys, system paths, defaults)
WF-->>Client: Prompt if required vars unresolved
WF->>CK: Load and parse checklist
CK->>CK: Extract sections, items, IDs, critical flags
WF->>DOC: Load target document (default_output_file or explicit)
DOC->>DOC: Chunk/read content, extract metadata
VAL->>CK: Iterate items in order
VAL->>DOC: Evaluate evidence for each item → PASS/PARTIAL/FAIL/N/A
VAL->>VAL: Record remediation suggestions and counters
VAL->>REP: Generate report (path, timestamp, metrics, evidence, must-fix list)
REP-->>Client: Return decision summary + report path
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/bmm/workflows/4-implementation/create-story/checklist.md (1)
36-39:⚠️ Potential issue | 🟠 MajorFix undefined variable reference:
{story_file_path}does not exist in workflow.yamlThe checklist references
{story_file_path}on lines 38 and 65, but this variable is never defined increate-story/workflow.yaml. The workflow instead definesdefault_output_file: "{implementation_artifacts}/{{story_key}}.md"(line 29), which is whatvalidate-workflow.xmlexpects (lines 38-39).When validate-workflow.xml runs, it will not find
{story_file_path}and will fall back to fuzzy discovery or prompt the user for manual input, defeating the stated purpose of deterministic variable resolution in Step 1 (lines 67-72).Align this reference to
{default_output_file}to match the workflow configuration, or addstory_file_pathas an explicit variable inworkflow.yamlif it serves a different purpose thandefault_output_file.
🤖 Fix all issues with AI agents
In `@src/bmm/workflows/4-implementation/create-story/checklist.md`:
- Around line 67-72: The checklist's "Resolve variables deterministically" step
is ambiguous about which variables are considered required; update
src/bmm/workflows/4-implementation/create-story/checklist.md to explicitly
enumerate the required variables (for example, epics_file, architecture_file,
implementation_artifacts, story_file_path) and reference validate-workflow.xml's
inputs (workflow required=true, checklist/document/report optional) so agents
only halt for unresolved required keys; ensure the text clarifies optional vs
required resolution behavior and points to the {config_source}:key and system
path resolution rules already described.
In `@src/bmm/workflows/4-implementation/create-story/instructions.xml`:
- Around line 283-288: The latest_tech_information block is still wrapped in
<check if="web research completed"> so it can be omitted silently; make it
consistent with previous_story_intelligence and git_intelligence_summary by
removing the conditional check and adding an unconditional <action> that sets
latest_tech_information to an explicit "N/A" note with the reason when web
research wasn't performed, and include a <template-output
file="{default_output_file}">latest_tech_information</template-output> so an
auditable N/A is always emitted.
- Line 310: The invoke-task element that calls
_bmad/core/tasks/validate-workflow.xml must include the required workflow input;
modify the invoke-task on the given line to pass
workflow="{installed_path}/workflow.yaml" in addition to the existing checklist
and target file inputs, and also normalize the task path usage so it matches the
rest of the project (choose either the relative form
"_bmad/core/tasks/validate-workflow.xml" or the prefixed
"{project-root}/_bmad/core/tasks/validate-workflow.xml" and use that same form
consistently in this invoke-task and in checklist.md references) so
validate-workflow.xml receives its required workflow parameter
deterministically.
In `@src/core/tasks/validate-workflow.xml`:
- Around line 45-52: The "Load Checklist and Target Document" step currently
silently skips metadata extraction ("Extract story metadata when available"),
which can cause conditional checks in "Parse checklist into sections and atomic
validation items" to be unevaluable (e.g., conditions like "If story_num > 1");
update this step so that after attempting to extract epic_num, story_num,
story_id you detect whether any parsed checklist items reference those metadata
fields and, if extraction failed while referenced, either HALT with a clear
error asking for the missing metadata or pause and PROMPT the user to supply the
values; ensure the logic that checks references runs immediately after "Extract
story metadata..." and before evaluating conditional items so the workflow
either obtains required metadata or stops with a descriptive message.
- Around line 67-72: Add explicit initialization and accumulation of the report
counters: before or during Step 3 (the item evaluation loop) initialize totals
(total_count, applicable_count, pass_count, partial_count, fail_count, na_count,
critical_fail_count, critical_partial_count) and per-section maps (e.g.,
section_counts, section_pass_counts, section_partial_counts,
section_fail_counts); in the evaluation action for each item (the code that
assigns verdicts) increment the appropriate global counters and the item’s
section counters, mark critical items to increment critical_fail_count or
critical_partial_count as needed, and after the loop (between Step 3 and Step 4)
compute derived metrics like pass_percent = (pass_count / applicable_count)*100
and include these variables so the report step can reference pass_count,
applicable_count, pass_percent, critical_fail_count, critical_partial_count, and
per-section counts reliably.
- Around line 54-65: The workflow lacks a deterministic rule for how PARTIAL
verdicts affect the final PASS|FAIL gate emitted by the report summary: update
the validate workflow (the Step with n="3" title="Validate Every Checklist Item"
and the report summary logic that currently emits a binary PASS|FAIL) to include
an explicit gate-decision rule; for example add a clear rule such as "Gate =
FAIL if any critical item is FAIL or PARTIAL; otherwise PASS" (or your chosen
deterministic policy) and implement it where the summary is computed so PARTIALs
are handled consistently across runs.
- Around line 68-71: The report filename currently uses an undefined {timestamp}
token in the "Set report path" action; update that logic to emit a
deterministic, sortable timestamp format (e.g., YYYYMMDD-HHmmss) when {report}
is not provided so the fallback filename becomes
validation-report-{YYYYMMDD-HHmmss}.md; locate the "Set report path" action and
the place that constructs validation-report-{timestamp}.md and replace the
ambiguous {timestamp} with a formatted timestamp string (use process/templating
code that generates YYYYMMDD-HHmmss) so filenames sort lexicographically and are
consistent across agents.
- Around line 14-20: The rule "Always read COMPLETE files; do not sample with
offsets." in the <llm> block is infeasible for very large files; update that <i>
element to include a clear fallback: if a file exceeds the agent's context
limits, load and process the file in ordered sequential chunks (e.g.,
page/section-sized reads) ensuring each chunk is fully processed before fetching
the next, detect and log truncation/errors, and if any required path still
cannot be resolved after chunked processing, stop and prompt the user for
explicit input or a smaller excerpt; reference the <llm> block and the specific
<i> instruction to add this fallback language and error-handling guidance.
- Around line 41-42: The "Try fuzzy discovery..." action in the
validate-workflow.xml task violates the deterministic resolution rule; replace
it by either removing the fuzzy branch entirely so the flow falls through to the
user prompt ("Ask user: 'Which document should I validate?' and WAIT") or make
the discovery deterministic by changing the action to explicitly defined
disambiguation criteria (e.g., "Select the most recent .md in
implementation_artifacts whose filename contains {story_key} and log 'inferred
document: <filename>' before proceeding"); update the <action if="document path
unresolved"> node accordingly and ensure any inference is explicitly stated in
the action text.
🧹 Nitpick comments (3)
src/core/tasks/validate-workflow.xml (2)
49-50: Keyword-based criticality marking is fragile and prone to false positives/negatives.Line 50 marks items as critical if they contain "critical", "must", "required", or "blocking." This will flag items like "Must Improve" section headings (line 101 of the report format itself says "Must Fix"), "required inputs" descriptions, or any item that mentions these words in a non-critical context. Conversely, genuinely critical items phrased as "essential," "mandatory," or "do not proceed without" will be missed.
Consider either:
- Using an explicit marker in the checklist (e.g.,
[CRITICAL]prefix) rather than keyword sniffing, or- Expanding and documenting the keyword list with a note about false-positive risk.
122-127: Halt condition "HALT if any checklist section is skipped" is unenforceable.This halt condition relies on the LLM self-detecting that it skipped a section—the very failure mode it's trying to prevent. There is no external mechanism to verify completeness. An agent that skips a section is unlikely to also correctly report that it skipped a section.
A more robust approach: in step 3, require the agent to emit the section name before processing each section's items, and in step 4, cross-check that every parsed section from step 2 appears in the results. This at least makes skipping structurally detectable.
src/bmm/workflows/4-implementation/create-story/instructions.xml (1)
283-284: N/A fallback action is positioned after the conditional load block, but it doesn't reference the check result.Line 283 says "If previous story learnings are unavailable…set previous_story_intelligence to an explicit N/A note." The conditional load is in the
<check if="story_num > 1">block ending at line 200. However, line 283 is a standalone<action>that doesn't structurally connect to that check—it relies on the agent remembering whether the earlier check was entered. For a fresh-context executor (the exact scenario this PR is hardening), an agent may not correctly link "story_num > 1 was false" to "learnings are unavailable."A more robust pattern would be an explicit
<else>or<check if="story_num == 1">guard around the N/A assignment, rather than relying on the agent's inference across ~80 lines of intervening XML.
src/bmm/workflows/4-implementation/create-story/instructions.xml
Outdated
Show resolved
Hide resolved
src/bmm/workflows/4-implementation/create-story/instructions.xml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
This pull request restores the missing validate-workflow.xml task file and hardens checklist execution for conditional contexts, addressing issues #1530 and #1625. The PR reintroduces a dedicated validation framework that was accidentally deleted in a previous cleanup, and improves handling of conditional checklist items and explicit N/A reporting when story intelligence is unavailable.
Changes:
- Adds
validate-workflow.xmltask with deterministic variable resolution, conditional item handling, and validation report generation - Updates
create-storyinstructions to always outputprevious_story_intelligenceandgit_intelligence_summarysections with explicit N/A notes when unavailable, and clarifies the validation task invocation with explicit document parameter - Enhances checklist documentation with detailed variable resolution instructions for fresh-context execution
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/core/tasks/validate-workflow.xml | New task file providing checklist validation with evidence-backed reporting, conditional item support, and deterministic variable resolution |
| src/bmm/workflows/4-implementation/create-story/instructions.xml | Modified to always emit intelligence sections with explicit N/A handling and updated invoke-task to specify target document |
| src/bmm/workflows/4-implementation/create-story/checklist.md | Enhanced with deterministic variable resolution steps and clarified validation framework capabilities |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
src/bmm/workflows/4-implementation/create-story/instructions.xml
Outdated
Show resolved
Hide resolved
|
Addressed all bot review findings from 2026-02-14 and resolved the corresponding review threads. Implemented/finalized in commits: Highlights:
Ready for maintainer re-review. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@coderabbitai help |
ChatThere are 3 ways to chat with CodeRabbit:
CodeRabbit commands
Other keywords and placeholders
Status, support, documentation and community
|
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In `@src/bmm/workflows/4-implementation/create-story/checklist.md`:
- Around line 38-40: The checklist line that says "Load the newly created story
file (`{default_output_file}`)" assumes the file exists at validation time but
validation currently runs before saving per instructions.xml; either fix the
save-order in instructions.xml so the save occurs before validation, or change
this checklist step to conditional wording (e.g., "If present, load the story
file `{default_output_file}`") and mention loading workflow variables from
`{installed_path}/workflow.yaml` as before; update the checklist entry in
src/bmm/workflows/4-implementation/create-story/checklist.md accordingly and
ensure consistency with the save/validate order in instructions.xml.
- Around line 65-73: The checklist's document resolution order conflicts with
validate-workflow.xml: change the checklist so {story_file} is treated as the
primary document identifier and {default_output_file} is the fallback (update
the numbered steps and the "Required for this checklist flow" list accordingly)
to match validate-workflow.xml's resolution (which tries {story_file} before
{default_output_file}); keep a note that invoke-task in instructions.xml
explicitly passing document={default_output_file} overrides this only when
provided.
- Around line 67-70: The checklist in create-story/checklist.md omits the
prerequisite step to load the config source file before resolving
`{config_source}:key` references; update the deterministic resolution sequence
to include "load config_source file if present" as the first step so that
parsing workflow.yaml key/value pairs and resolving `{config_source}:key` (per
validate-workflow.xml's step 1 behavior) occur only after the referenced config
file has been loaded.
In `@src/bmm/workflows/4-implementation/create-story/instructions.xml`:
- Around line 309-310: The validate step is running against
{default_output_file} before the document is saved, causing load/validation to
use a missing or stale file; move the "Save story document unconditionally"
action so it executes before the <invoke-task> that runs
{project-root}/_bmad/core/tasks/validate-workflow.xml (i.e., ensure the
save/write of {default_output_file} completes and is flushed to disk before
invoking validate-workflow.xml with workflow={installed_path}/workflow.yaml
checklist={installed_path}/checklist.md document={default_output_file}); keep
the same parameters but swap the two steps so validation always reads the
freshly saved document.
In `@src/core/tasks/validate-workflow.xml`:
- Line 52: The "critical section labels" phrase is ambiguous; update the
validate-workflow.xml action text (the <action> element containing "Determine
critical checks...") to enumerate accepted section-level markers so agents use a
closed set: accept headings containing the token "CRITICAL" (case-insensitive),
headings containing "MUST FIX" or "MUST-FIX", headings with "Must Fix Before
Proceeding" or any exact phrase listed, and XML sections annotated with
critical="true" or critical="yes"; explicitly reject inference from generic
keywords alone. Modify the <action> text to list these exact patterns (e.g.,
"headings containing 'CRITICAL', 'MUST FIX', 'MUST-FIX', 'Must Fix Before
Proceeding', or sections with critical=\"true\" or critical=\"yes\"") so
labeling is deterministic.
- Around line 55-69: There is a logical contradiction between the "Ask user to
provide missing metadata fields (epic_num, story_num, story_id/story_key) and
WAIT" action and the "if item condition depends on missing metadata, mark
PARTIAL" fallback in the "Validate Every Checklist Item" step; resolve it by
choosing one behavior and updating the text: either remove the fallback sentence
in the "Validate Every Checklist Item" action so the metadata-halt action (the
one that prompts for epic_num, story_num, story_id/story_key) guarantees all
referenced metadata is present before evaluation, or expand the metadata-halt
action to list every metadata key that can be referenced (e.g., include
story_title and any other fields used by conditional items) so that the halt
covers all cases; update the relevant action text accordingly (the "Ask user to
provide missing metadata fields..." action or the "for each item" action under
"Validate Every Checklist Item") so they no longer contradict each other.
- Around line 128-132: The "Must Fix Before Proceeding" section currently lists
only "{all critical FAIL items}" but the gate logic treats critical PARTIALs
(variable critical_partial_count) as FAIL; update the template so critical
PARTIALs are included in the must-fix guidance — e.g., change that bullet to
"{all critical FAIL and PARTIAL items}" or otherwise include "{critical PARTIAL
items}" so remediation matches the gate, and adjust the "Should Improve" bullet
to only cover non-critical FAIL and PARTIAL items.
- Around line 79-86: The gate logic currently treats applicable_count==0 as pass
(pass_percent=0 and no fails), which yields a misleading "0/0 (0%) — PASS";
update the validation flow where applicable_count, pass_percent, and gate
decision are computed so that if applicable_count == 0 you do not auto-PASS:
instead set a distinct state (e.g., emit a warning flag, set gate decision to
NEEDS_REVIEW or REQUIRE_CONFIRMATION), and ensure this state is surfaced in
reports/notifications; adjust the decision block that checks
critical_fail_count, critical_partial_count, and fail_count to first check
applicable_count==0 and handle it before the existing FAIL/PASS branches.
🧹 Nitpick comments (2)
src/core/tasks/validate-workflow.xml (1)
141-146: Step 5 halts unconditionally — no path for automated pipelines.Line 145 says "HALT and wait for user direction" regardless of gate outcome. If the gate is PASS, an automated pipeline would still stall waiting for human input. Consider making the halt conditional on failures, or providing an
--auto-proceedinput that skips the halt on PASS.src/bmm/workflows/4-implementation/create-story/instructions.xml (1)
283-292: N/A fallback notes lack a deterministic format.Lines 283, 287, and 291 each say "set X to an explicit N/A note with reason," but no structured format is specified. One agent might write
N/A — this is the first story in the epic, another might write**Not Applicable**: No previous stories exist.The validator checklist likely checks for the presence of these sections, and inconsistent formatting may cause partial matches or false failures during validation.Define a canonical format, e.g.:
N/A: {one-line reason}.
src/bmm/workflows/4-implementation/create-story/instructions.xml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@src/bmm/workflows/4-implementation/create-story/checklist.md`:
- Around line 67-74: The checklist in create-story/checklist.md omits the fourth
sub-step from validate-workflow.xml that fills system-generated values (e.g.,
`{date}`); update the Resolve variables deterministically section to include
resolving system-generated values after resolving config files,
`{config_source}:key` references, and system path variables so tokens like
`{date}` are deterministically populated (mirror the behavior in
validate-workflow.xml step 1/lines 27-32).
In `@src/bmm/workflows/4-implementation/create-story/instructions.xml`:
- Line 309: The validate-workflow HALT is currently unconditional and blocks the
create-story flow; modify the invocation or the validate-workflow.xml so HALT
only executes on validation FAIL. Specifically, update validate-workflow.xml
(the HALT in step 5) to check the validation result variable (e.g.,
${validationStatus} or the task exit code) and only run HALT when status ==
"FAIL" (or alter the invoke-task call to pass a haltOnFail flag and have
validate-workflow honor it); ensure the invoke-task line calling
validate-workflow.xml continues automatically to sprint-status update when
validation passes.
In `@src/core/tasks/validate-workflow.xml`:
- Around line 47-56: Add a sanity check after the "Parse checklist into ordered
sections and atomic validation items; assign each item a stable id
(section_index.item_index)" step: compute an expected_item_count by scanning the
raw checklist for checkbox/bullet markers (e.g., "- [ ]", "- [x]", or list
bullets) and compare it to parsed_item_count; if parsed_item_count is
significantly lower than expected_item_count (configurable threshold, e.g., >10%
or absolute delta) then HALT or WARN and surface the list of unparsed line
ranges so the user can fix the input instead of silently losing items; modify
the existing "HALT with error: 'Checklist is empty or unparsable'" behavior to
also trigger for large divergences and add a clear error message referencing
parsed_item_count vs expected_item_count.
- Around line 149-154: Add a halt condition for report write failure by updating
the <halt-conditions> block to include a condition like "HALT if validation
report cannot be written"; then modify the task step that writes/saves the
validation report (the save/write-report action referenced by the critical-rules
that require "Always save a validation report file") to return/propagate an
error on failure and trigger the halt path so the workflow stops and logs the
failed save instead of continuing.
- Around line 141-146: Step 5 ("Return Decision and Halt") currently always
executes the "HALT and wait for user direction" action; change it so HALT is
conditional: only emit the "HALT and wait for user direction" action when the
gate result is FAIL or when the existing conditional action ("State clearly that
workflow should not proceed until fixes are applied" / action if="critical
failures exist") is true; otherwise return the concise summary, report path and
the gate decision and do not halt so the calling create-story workflow can
proceed automatically. Target the step with n="5" and modify the HALT action
control flow to depend on the gate outcome / critical failures flag rather than
executing unconditionally.
🧹 Nitpick comments (1)
src/core/tasks/validate-workflow.xml (1)
1-3: Taskiduses installed path but source lives at a different location.The task declares
id="_bmad/core/tasks/validate-workflow.xml"but the source file is atsrc/core/tasks/validate-workflow.xml. Theinstructions.xmlinvocation uses{project-root}/_bmad/core/tasks/validate-workflow.xml. This works post-installation (whensrc/core/is installed to_bmad/core/), but developers reading the source tree will see a path in theidattribute that doesn't match the file's actual location, which could cause confusion during development and debugging.Not a functional issue, but worth a comment in the file or the PR description clarifying the install-time path mapping.
src/bmm/workflows/4-implementation/create-story/instructions.xml
Outdated
Show resolved
Hide resolved
|
Addressed the latest CodeRabbit full-review findings and resolved all newly opened threads. New fixes pushed:
Key updates in this round:
Ready for maintainer re-review. |
|
Thank you for this suggestion @pure-maple ! The create story and validate workflow are being redone and will be merged to main soon and released after 6.0 release in the new workflow format used elsewhere, moving away from the current format. |
Upstream PR Draft (bmad-code-org/BMAD-METHOD)
Title
fix(create-story): restore validate-workflow task and harden checklist execution for conditional contexts
Problem
create-storyinvokesvalidate-workflow.xml, but the task file has been missing in current main, causing dangling references and inconsistent validation behavior. In addition, fresh-context checklist execution lacks deterministic variable resolution guidance, and first-story cases can be misjudged when prior-story context is not applicable.Scope
Added
src/core/tasks/validate-workflow.xmlstory_num > 1false => N/A with reason)Updated
src/bmm/workflows/4-implementation/create-story/instructions.xml{default_output_file})previous_story_intelligenceandgit_intelligence_summaryare always emitted; when unavailable they must output explicit N/A with reasonsrc/bmm/workflows/4-implementation/create-story/checklist.mdWhy this design
checklist.md) vs mechanism (validate-workflow.xml)Validation
story_num=1)Related
Risks
Rollback
Revert:
src/core/tasks/validate-workflow.xmlsrc/bmm/workflows/4-implementation/create-story/instructions.xmlsrc/bmm/workflows/4-implementation/create-story/checklist.md