feat: cross-file reference validator for BMAD source files#1494
feat: cross-file reference validator for BMAD source files#1494bmadcode merged 13 commits intobmad-code-org:mainfrom
Conversation
📝 WalkthroughWalkthroughAdds a new CI validation step that checks for broken cross-file references in BMAD source content (agents, workflows, tasks, steps). Introduces a standalone CLI tool that scans source files, extracts references, resolves them to actual file paths, and reports unresolved or absolute path leaks. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@tools/validate-file-refs.js`:
- Around line 338-340: The loop iterating over files uses fs.readFileSync
without error handling, so wrap the read operation in a try-catch inside the for
(const filePath of files) loop (where relativePath and PROJECT_ROOT are
computed) to catch any errors from fs.readFileSync; on error, log a warning that
includes relativePath (or filePath) and the error, then continue to the next
file instead of letting the exception propagate. Ensure only the read and
subsequent processing for that file are skipped, preserving the rest of the
loop.
- Around line 101-122: The getSourceFiles function currently calls
fs.readdirSync inside walk without error handling; wrap the fs.readdirSync call
in a try-catch inside walk (or validate dir before calling walk) so if
readdirSync throws (e.g., SRC_DIR missing or inaccessible) you catch the error
and throw or log a clear, user-friendly message that includes the directory path
and the original error (preserve error.stack/message). Update references in
getSourceFiles/walk so the function either returns an empty array or rethrows a
wrapped Error with context instead of letting the raw Node error bubble up.
🧹 Nitpick comments (2)
tools/validate-file-refs.js (2)
126-128: Potential regex catastrophic backtracking.The
[\s\S]*?pattern instripCodeBlockscan cause performance issues on very large files with many code blocks. While unlikely to be exploitable here, consider a more efficient approach if performance becomes a concern.
362-372: Silent skip for extensionless paths may hide valid broken references.When a resolved path has no extension, the code silently continues without reporting it. This could mask legitimate broken references to extensionless files. Consider logging in verbose mode or tracking these as a separate category.
Proposed improvement for verbose mode
if (!hasExt) { // Could be a directory reference — skip if not clearly a file + if (VERBOSE) { + console.log(` [SKIP] ${ref.raw} (no extension, may be directory)`); + } continue; }
Add tools/validate-file-refs.js that validates cross-file references
in BMAD source files (agents, workflows, tasks, steps). Catches broken
file paths, missing referenced files, wrong extensions, and absolute
path leaks before they reach users.
Addresses broken-file-ref and path-handling bug classes which account
for 25% of all historical bugs (59 closed issues, 129+ comments).
- Scans src/ for YAML, markdown, and XML files
- Validates {project-root}/_bmad/ references against source tree
- Checks relative path references, exec attributes, invoke-task tags
- Detects absolute path leaks (/Users/, /home/, C:\)
- Adds validate:refs npm script and CI step in quality.yaml
Add stripJsonExampleBlocks() to the markdown reference extractor so bare JSON example/template blocks (braces on their own lines) are removed before pattern matching. This prevents paths inside example data from being flagged as broken references.
…tput - Add utility/ to direct path mapping (was incorrectly falling through to src/modules/utility/) - Show line numbers for broken references in markdown files - Show YAML key path for broken references in YAML files - Print file headers in verbose mode for all files with refs
Broken refs no longer print [OK] before [BROKEN] in --verbose mode. Code block stripping now preserves newlines so offsetToLine() reports accurate line numbers when code blocks precede broken references.
0d0ffc1 to
773da28
Compare
|
Responding to the two nitpick observations from the review: Re: regex catastrophic backtracking in Tested Re: silent skip for extensionless paths (line 362–372) The skip is intentional. These are directory references like |
|
Nice job. This will cut the problems in half. |
|
Thank you. I keep meaning to roll something like this out for validating workflows the same way we are validating agents. Reviewing... |
PR Review: #1494Title: feat: cross-file reference validator for BMAD source files Findings1.
|
|
@arcaven are you on BMAD Discord by any chance? |
Address alexeyv's review findings on PR #1494: - Fix exec-attr prefix handling for {_bmad}/ and bare _bmad/ paths - Fix mapInstalledToSource fallback (remove phantom src/modules/ mapping) - Switch extractYamlRefs to parseDocument() for YAML line numbers Add CI integration (stories 2-1, 2-2): - Emit ::warning annotations for broken refs and abs-path leaks - Write markdown table to $GITHUB_STEP_SUMMARY - Guard both behind environment variable checks Harden CI output: - escapeAnnotation() encodes %, \r, \n per GitHub Actions spec - escapeTableCell() escapes pipe chars in step summary table
|
@alexeyv Thank you for the detailed review and for taking the time — really appreciate the consideration and the opportunity to contribute to the project. All four findings addressed in Finding 1:
|
|
@alexeyv as we discussed (thank you) on discord, I've proposed a more obvious PR Comment approach to surfacing the issues, but you can see here, already, what this will do with existing issues it's detected, in two places. It's easy to see these in the logs, for anyone that looks. Naturally, you can block with failing the CI rather than passing it. These will show in submitted code in pull/files view, and they'll show as warnings a click away in the checks / Github Actions view, if you go looking. But a further option is for PR Comments to be posted inline.
However, that is more difficult to maintain, and requires GHA permissions. I believe it's a good improvement, but I'm already at 484 lines in this feature. I want to see it benefit the project before I try to put chrome buff on anything. |
…-org#1494) * feat: add cross-file reference validator for CI Add tools/validate-file-refs.js that validates cross-file references in BMAD source files (agents, workflows, tasks, steps). Catches broken file paths, missing referenced files, wrong extensions, and absolute path leaks before they reach users. Addresses broken-file-ref and path-handling bug classes which account for 25% of all historical bugs (59 closed issues, 129+ comments). - Scans src/ for YAML, markdown, and XML files - Validates {project-root}/_bmad/ references against source tree - Checks relative path references, exec attributes, invoke-task tags - Detects absolute path leaks (/Users/, /home/, C:\) - Adds validate:refs npm script and CI step in quality.yaml * feat: strip JSON example blocks to reduce false-positive broken refs Add stripJsonExampleBlocks() to the markdown reference extractor so bare JSON example/template blocks (braces on their own lines) are removed before pattern matching. This prevents paths inside example data from being flagged as broken references. * feat: add line numbers, fix utility/ path mapping, improve verbose output - Add utility/ to direct path mapping (was incorrectly falling through to src/modules/utility/) - Show line numbers for broken references in markdown files - Show YAML key path for broken references in YAML files - Print file headers in verbose mode for all files with refs * fix: correct verbose [OK]/[BROKEN] overlap and line number drift Broken refs no longer print [OK] before [BROKEN] in --verbose mode. Code block stripping now preserves newlines so offsetToLine() reports accurate line numbers when code blocks precede broken references. * fix: address review feedback, add CI annotations and step summary Address alexeyv's review findings on PR bmad-code-org#1494: - Fix exec-attr prefix handling for {_bmad}/ and bare _bmad/ paths - Fix mapInstalledToSource fallback (remove phantom src/modules/ mapping) - Switch extractYamlRefs to parseDocument() for YAML line numbers Add CI integration (stories 2-1, 2-2): - Emit ::warning annotations for broken refs and abs-path leaks - Write markdown table to $GITHUB_STEP_SUMMARY - Guard both behind environment variable checks Harden CI output: - escapeAnnotation() encodes %, \r, \n per GitHub Actions spec - escapeTableCell() escapes pipe chars in step summary table --------- Co-authored-by: Alex Verkhovsky <alexey.verkhovsky@gmail.com> Co-authored-by: Brian <bmadcode@gmail.com>


Why
Broken file references and path handling account for 59 closed bugs (25% of all bugs) and 129+ comments. These categories have zero automated prevention today. When files are renamed or have extensions changed, stale references are only found by users post-release.
We replayed the validator against all 26 v6 release tags. The data tells a clear story:
The alpha.17
src/modules/migration alone introduced 137 broken refs that were fixed piecemeal across 5 subsequent releases. With this validator, all 137 would have appeared in the build log on that single PR.Fixes #1493
Related to bmad-code-org/bmad-builder#7 and #1529
What
Add
tools/validate-file-refs.js— a cross-file reference validator that checks whether file paths referenced across BMAD source files actually point to existing files. It also detects absolute path leaks (/Users/,/home/,C:\).src/YAML, markdown, and XML files (~217 files, ~483 references on currentmain){project-root}/_bmad/paths,{_bmad}/shorthand, relative paths,execattributes,<invoke-task>targets, step metadata, and Load directives{{mustache}},{installed_path},{output_folder}, etc.) and install-generated paths (_config/,config.yaml)node:fs,node:path, andyaml(existing dep)validate:refsnpm script and a step in the existingvalidateCI jobWhy is this safe to adopt
Non-blocking by design
The validator runs in warning mode by default (exit 0). Broken references appear in the build log for visibility, but build results are unaffected. No existing CI checks, pre-commit hooks, or npm scripts are modified in behavior. The validator is purely additive.
Every existing CI check continues to enforce exactly as before:
When ready to enforce, one change in
package.json:Try it
Or grab just the file without checking out the branch:
The validator only reads files. It makes no changes to disk.
Historic impact
Specific issues this would have preempted:
workflow.xmltask file contains an incorrect file extension reference for the party-mode workflow #1212 (party-mode.yamlvs.md): introduced alpha.17, reported by user a month later — validator catches it the day it's committedvalidate-workflow.xml): introduced Beta.0 — caught in the same CI runBroken references on current
mainsteps-c/step-11-polish.md:8./data/prd-purpose.md→ should be../data/steps-e/step-e-04-complete.md:7./steps-v/step-v-01-discovery.md→ should be../steps-v/create-story/checklist.md:36,67core/tasks/validate-workflow.xml→ doesn't existcore/tasks/workflow.xml:84core/workflows/party-mode/workflow.yaml→ file is.mdHow
flowchart TD A[Discover source files in src/] --> B[Read file content] B --> C{YAML or Markdown?} C -->|YAML| D[Parse YAML & walk values] C -->|Markdown/XML| E[Strip code blocks\n& JSON examples] E --> F[Extract refs via regex patterns] D --> G[Collect references] F --> G G --> H{Reference type?} H -->|project-root / _bmad| I[Map installed path → source path\ne.g. _bmad/core/ → src/core/] H -->|Relative path| J[Resolve from containing file dir] H -->|exec / invoke-task| K[Extract path & resolve] I --> L{fs.existsSync?} J --> L K --> L L -->|Missing| M["Report [BROKEN] with file:line"] L -->|Found| N["Report [OK] in verbose mode"] B --> O{Absolute path leak?} O -->|/Users/ /home/ C:\| P["Report [ABS-PATH]"] M --> Q[Summary & exit code] N --> Q P --> Q Q --> R{--strict flag?} R -->|Yes + issues| S[Exit 1] R -->|No or clean| T[Exit 0]Reference types validated
{project-root}/_bmad/paths{project-root}/_bmad/core/tasks/workflow.xml{_bmad}/shorthand{_bmad}/bmm/workflows/dev-story/workflow.yaml./step-02-analysis.md,../data/template.mdexecattributesexec="core/tasks/validate-workflow.xml"<invoke-task>targets<invoke-task>_bmad/core/tasks/workflow.xml</invoke-task>nextStepFile: './step-03.md'Load: `./checklist.md`/Users/dev/project/...Testing
npm run validate:refsvalidates ~483 references across ~217 source filesmainincluding Missing validate-workflow.xml referenced in create-story checklist #1455/[BUG] Missing _bmad/core/tasks/validate-workflow.xml breaks create-story validation (latest update of V6.) #1324 and Theworkflow.xmltask file contains an incorrect file extension reference for the party-mode workflow #1212, plus 2 previously unreported broken relative paths (fix: broken relative path in create-prd step-11-polish.md #1495, fix: broken relative path in create-prd step-e-04-complete.md #1496 — found by this tool)Near-term improvements
If this validator is useful, a few low-effort follow-ups could improve visibility of findings in CI:
::warning::annotations — Emit::warning file={file},line={line}::{message}so broken references appear as inline annotations on the PR "Files changed" tab. ~10 lines of change in the report function, zero additional permissions or workflow changes. This is how most Actions-based linters surface findings.$GITHUB_STEP_SUMMARY— Write a markdown summary table to the job summary page so findings are visible without expanding log output.--strictpromotion — Once existing broken refs onmainare fixed (Theworkflow.xmltask file contains an incorrect file extension reference for the party-mode workflow #1212, Missing validate-workflow.xml referenced in create-story checklist #1455/[BUG] Missing _bmad/core/tasks/validate-workflow.xml breaks create-story validation (latest update of V6.) #1324, fix: broken relative path in create-prd step-11-polish.md #1495, fix: broken relative path in create-prd step-e-04-complete.md #1496), flip to--strictso new broken references fail CI.PR comments and Check Run annotations were considered but require
pull-requests: writeorchecks: writepermissions, which introduce security considerations for fork-based PRs.