
feat: workflow schema validator (Zod)#1529

Closed
arcaven wants to merge 9 commits into bmad-code-org:main from arcaven:feat/validate-workflow-schema

Conversation

@arcaven
Contributor

@arcaven arcaven commented Feb 4, 2026

Overview

This PR adds Layer 2 (Schema Validation) to the BMAD-METHOD structural validation pipeline — a layered CI system where each layer catches structural failures that lower layers cannot.

The layers are ordered by the value each provides to the project. Layer 0 prevents the most bugs; in my simulation, Layer 1 accounted for 25% of all reported issues; each subsequent layer accounts for a smaller share but catches a narrower class of failures that the layers below cannot.

                ┌─────────────┐
                │   Layer 3   │  Graph Validation
                │             │  Step transitions, reachability
                │             │  (planned)
            ┌───┴─────────────┴───┐
            │      Layer 2        │  Schema Validation    ◄── THIS PR
            │                     │  YAML field types, required fields, enums
            │                     │
        ┌───┴─────────────────────┴───┐
        │          Layer 1            │  File Reference Validation
        │                             │  Cross-file refs, path resolution
        │                             │  ✅ MERGED (PR #1494)
    ┌───┴─────────────────────────────┴───┐
    │              Layer 0                │  Formatting & Linting
    │                                     │  Prettier, ESLint, markdownlint
    │                                     │  ✅ EXISTING
    └─────────────────────────────────────┘

All layers are deterministic with zero false positives. Each runs as an independent CI job with its own --strict flag and npm script. The pyramid is a conceptual model for coverage, not an execution dependency chain.

Summary

Adds a Zod-based schema validator for workflow.yaml files, following the exact pattern of the existing validate-agent-schema.js (PR #1494).

  • Schema (tools/schema/workflow.js): Validates 5 required fields (name, description, author, standalone, web_bundle), template polymorphism (string | false), input_file_patterns with load_strategy enum (FULL_LOAD | SELECTIVE_LOAD | INDEX_GUIDED), execution_hints, tags, required_tools, variables, and instructions file extension (.md or .xml). Uses .passthrough() for the many varying optional string config fields.
  • CLI (tools/validate-workflow-schema.js): Scans src/{core,bmm}/workflows/**/workflow.yaml, reports errors with ::warning annotations in GitHub Actions, supports --strict flag for CI enforcement.
  • Tests: 27 fixtures (11 valid + 16 invalid) with a test runner that validates exact Zod error shapes (code, path, message, expected/received).
  • CI integration: npm run test:schemas and npm run validate:schemas now include workflow schema validation.
  • Removed standalone and web_bundle from the workflow schema (tools/schema/workflow.js); see below. If I've misinterpreted and these should be optional instead, let me know and I'll restore them as optional fields.
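
For orientation, the rules in the list above could be approximated without any dependency roughly as follows. This is only a sketch: the PR's schema is written with Zod and is not reproduced in this thread, the function name `checkWorkflow` and the issue shape are hypothetical, and the required fields assume the updated set (name, description, author) after standalone and web_bundle were dropped.

```javascript
// Dependency-free approximation of the rules listed in the Summary.
// The real schema in tools/schema/workflow.js uses Zod; all names here are hypothetical.
const LOAD_STRATEGIES = ['FULL_LOAD', 'SELECTIVE_LOAD', 'INDEX_GUIDED'];

function checkWorkflow(workflow) {
  const issues = [];
  const isObject = (v) => typeof v === 'object' && v !== null && !Array.isArray(v);
  if (!isObject(workflow)) {
    return [{ path: '', message: 'expected an object' }];
  }

  // Required non-empty strings.
  for (const field of ['name', 'description', 'author']) {
    const value = workflow[field];
    if (typeof value !== 'string' || value.length === 0) {
      issues.push({ path: field, message: 'expected a non-empty string' });
    }
  }

  // template polymorphism: a string path or the literal false.
  if ('template' in workflow && typeof workflow.template !== 'string' && workflow.template !== false) {
    issues.push({ path: 'template', message: 'expected string or false' });
  }

  // instructions must point at a .md or .xml file.
  if ('instructions' in workflow && !/\.(md|xml)$/.test(String(workflow.instructions))) {
    issues.push({ path: 'instructions', message: 'expected a .md or .xml file' });
  }

  // Each input_file_patterns entry needs a load_strategy from the enum.
  const patterns = workflow.input_file_patterns;
  if (isObject(patterns)) {
    for (const [key, entry] of Object.entries(patterns)) {
      if (!isObject(entry) || !LOAD_STRATEGIES.includes(entry.load_strategy)) {
        issues.push({ path: `input_file_patterns.${key}.load_strategy`, message: 'invalid load_strategy' });
      }
    }
  }

  // Unknown top-level keys are deliberately ignored (passthrough behaviour).
  return issues;
}
```

An empty array means the file passes; unknown config keys never produce an issue, which is the property the passthrough choice is meant to preserve.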

Forensic validation

Design decisions

  • Passthrough on top-level object: Workflow files have 20+ varying optional string fields (config refs, path refs, etc.). Using .strict() would break when new fields are added. .passthrough() validates structural fields while allowing any additional string config.
  • YAML-only scope: Only validates workflow.yaml files (13 files). Markdown workflow frontmatter (12 files with only 3 trivial fields) is deferred.
  • Default mode is warning-only: --strict flag required for hard enforcement, matching the contributor-friendly approach.
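
To illustrate the warning-only default, here is a minimal sketch of how a CLI like this might format a GitHub Actions annotation and choose its exit code. The helper names are assumptions for illustration and are not taken from tools/validate-workflow-schema.js; the `::warning file=...::message` form and its escaping follow the GitHub Actions workflow command syntax.

```javascript
// Illustrative helpers only; the function names are assumptions, not the PR's code.

// Escape message data the way GitHub Actions workflow commands expect.
function escapeData(text) {
  return String(text).replace(/%/g, '%25').replace(/\r/g, '%0D').replace(/\n/g, '%0A');
}

// Property values (such as file=) additionally escape ':' and ','.
function escapeProperty(text) {
  return escapeData(text).replace(/:/g, '%3A').replace(/,/g, '%2C');
}

// Build one annotation line for a schema error in a given file.
function formatWarning(file, message) {
  return `::warning file=${escapeProperty(file)}::${escapeData(message)}`;
}

// Warning-only by default: errors fail the job only when --strict is passed.
function exitCodeFor(errorCount, argv) {
  return argv.includes('--strict') && errorCount > 0 ? 1 : 0;
}
```

With this shape, contributors see annotations on their PR either way, and only a CI job that opts in with --strict turns them into a failing check.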

Why here and What's next

BMAD-METHOD is:

  • where the formats are defined,
  • the primary install source, and
  • where most community contributions land as direct PRs.

Errors introduced here bypass any protections bmad-builder might provide for the subset of the community who are using it to generate content. CI validators in this repo catch those errors at the PR stage, before they reach users. That's why the validation pipeline starts here.

Layer 3 (workflow graph validation) — step transitions, reachability, dangling reference detection — will follow as a separate PR.

I'm planning to offer these validators in bmad-builder for both CI protection of contributions and validation of generated output (see bmad-builder#7). I could shift to adding Layers 1–2 in bmad-builder before Layer 3 here, or complete the sequence, as you prefer.

sequenceDiagram
    participant BM as BMAD-METHOD
    participant BMB as bmad-builder

    Note over BM: ✅ Layer 0 — Formatting & Linting<br/>Existing

    Note over BM: ✅ Layer 1 — File Reference<br/>Validator PR 1494 merged

    rect rgba(0, 128, 255, 0.1)
    BM->>BM: Layer 2 — Schema Validator<br/>PR 1529 this PR
    end

    BM->>BM: Layer 3 — Graph Validator (planned, separate PR)

    Note over BM,BMB: ── community checkpoint ──

    BM-->>BMB: Extend validators with --installed mode<br/>CI validation of BMB source files

    BMB->>BMB: Post-generation validation<br/>Run Layers 1-3 on builder output

Thank you for allowing me to offer this. I'm glad to change scope, direction, design, or sequencing as you prefer.

Test plan

  • 27/27 test fixtures passing (RED → GREEN)
  • 52/52 agent schema tests still passing (no regression)
  • 12/12 installation component tests passing
  • 10/10 agent files pass agent schema validation
  • 12/13 workflow files pass workflow schema validation (1 pre-existing defect)
  • ESLint, Prettier, markdownlint all clean
  • Verify CI runs both validators via npm run validate:schemas

…dator

27 test fixtures (11 valid + 16 invalid) and a test runner that
exercises the forthcoming Zod schema for workflow.yaml files.
Includes a stub schema that fails all validation, confirming
26 failing / 1 passing (yaml-parse-error) — TDD red state.

Follows the validate-agent-schema pattern (tools/schema/agent.js,
test/test-agent-schema.js).

Ref: MSSCI-12749
Implement the Zod schema for workflow.yaml files and the CLI
validator tool, completing the GREEN phase of TDD.

- tools/schema/workflow.js: Zod schema with 5 required fields,
  template polymorphism (string|false), input_file_patterns with
  load_strategy enum, execution_hints, and instructions extension
  validation. Passthrough for varying optional string fields.
- tools/validate-workflow-schema.js: CLI tool mirroring
  validate-agent-schema.js with --strict flag and GitHub Actions
  ::warning annotations.
- package.json: Wire test:schemas and validate:schemas to include
  workflow schema validation.

27/27 tests passing. 12/13 real workflow files pass validation
(qa/automate missing standalone — pre-existing upstream defect).

Ref: MSSCI-12749
@coderabbitai

coderabbitai bot commented Feb 4, 2026

📝 Walkthrough

Walkthrough

A new workflow YAML schema validation system is introduced, including Zod-based schema definitions, a test fixture framework, a CLI validation tool, and comprehensive test coverage for both valid and invalid workflow configurations across 40+ fixture files.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Package Configuration (package.json) | Added test-workflow-schema.js and validate-workflow-schema.js to test:schemas and validate:schemas npm scripts respectively. |
| Schema Definitions (tools/schema/workflow.js) | Defined Zod-based validation schema for workflow.yaml with required fields (name, description, author, standalone, web_bundle), optional fields (template, input_file_patterns, execution_hints, tags, required_tools, variables, instructions), load strategy enum, and custom validation enforcing .md or .xml extension for instruction files. Exported validateWorkflowFile function for schema validation. |
| Validation CLI Tool (tools/validate-workflow-schema.js) | Created command-line tool for scanning and validating workflow YAML files, supporting --strict mode, YAML parsing error handling, detailed per-file error reporting, and GitHub Actions CI annotations for warnings. |
| Test Runner (test/test-workflow-schema.js) | Implemented comprehensive test runner discovering workflow fixture files, extracting per-file test metadata from YAML comments, executing validation against expected outcomes (PASS/FAIL with optional error details), comparing actual vs expected errors, and producing categorized, colored output with failure summaries. |
| Invalid Workflow Fixtures (test/fixtures/workflow-schema/invalid/*) | Added 16 test fixture files covering validation failures: bad execution hints type, invalid instructions extension, bad load strategy enum, incorrect tags/template/standalone/web_bundle types, empty/missing name/author/description/standalone/web_bundle fields, missing load strategy, and YAML parse errors. |
| Valid Workflow Fixtures (test/fixtures/workflow-schema/valid/*) | Added 11 test fixture files covering valid workflows: minimal required fields, full configuration, implementation-style, Excalidraw-style, with template (false/path variants), execution hints, input patterns, required tools, tags, and variables. |

Sequence Diagram

sequenceDiagram
    participant FileSystem as File<br/>System
    participant TestRunner as Test<br/>Runner
    participant YAMLParser as YAML<br/>Parser
    participant Validator as Schema<br/>Validator
    participant Output as Test<br/>Output

    loop For each workflow fixture
        FileSystem->>TestRunner: Discover fixture file
        TestRunner->>FileSystem: Read metadata (comments)
        TestRunner->>FileSystem: Read YAML content
        
        alt YAML Parses Successfully
            FileSystem->>YAMLParser: Raw YAML content
            YAMLParser->>YAMLParser: Parse YAML
            YAMLParser->>Validator: Parsed workflow object
            Validator->>Validator: Validate against schema
            
            alt Validation Success
                Validator-->>TestRunner: Valid result
                TestRunner->>TestRunner: Compare to expected (PASS)
            else Validation Fails
                Validator-->>TestRunner: Error details
                TestRunner->>TestRunner: Compare to expected<br/>(error code, path, type)
            end
        else YAML Parse Fails
            YAMLParser-->>TestRunner: Parse error
            TestRunner->>TestRunner: Compare to expected<br/>(expect FAIL flag)
        end
        
        TestRunner->>Output: Record test result<br/>(pass/fail)
    end
    
    Output->>Output: Categorize and<br/>aggregate results
    Output-->>FileSystem: Print summary<br/>(counts, failures)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • alexeyv
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: workflow schema validator (Zod)' clearly and concisely describes the main change: adding a workflow schema validator using Zod, which is the primary objective of this PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | The PR description comprehensively explains the addition of Layer 2 (Schema Validation) to a validation pipeline, including detailed technical implementation details, design decisions, and test results. |



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@test/test-workflow-schema.js`:
- Around line 33-35: parseTestMetadata currently calls fs.readFileSync without
handling IO errors, which can bubble up before runTest's try/catch and abort the
whole suite; update either parseTestMetadata to catch and return a safe default
(or throw a controlled error) or move the metadata read so it happens inside
runTest's existing try/catch. Specifically, wrap the fs.readFileSync and content
parsing in a try/catch inside parseTestMetadata (or call parseTestMetadata from
within runTest's try block) and ensure on error you return null/empty metadata
or rethrow a test-scoped error so only the individual test fails instead of the
entire suite; reference parseTestMetadata and runTest to locate where to apply
the change.

In `@tools/schema/workflow.js`:
- Around line 1-2: The project requires the Zod runtime (see the require('zod')
import and use of z in tools/schema/workflow.js), but zod is not declared in
package.json; add "zod" to the "dependencies" section of package.json (pick a
compatible semver, e.g., latest stable) and run npm install / yarn install so CI
and clean installs include the module.

In `@tools/validate-workflow-schema.js`:
- Around line 50-52: The annotation paths are computed using
path.relative(process.cwd(), filePath) which misaligns when a custom project
root is used; update the computation in the workflowFiles loop to base
relativePath on the configured root (e.g., path.relative(root, filePath) or
projectRoot) so GitHub Actions annotations reference the repository root
correctly — adjust any variable names (relativePath, root/projectRoot,
workflowFiles) accordingly and ensure downstream uses of relativePath continue
to work.
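
As a rough illustration of the first item above (keeping a fixture read error scoped to a single test), one option is to have the read return a result object instead of throwing. The function name `readFixtureSafely` and the result shape are hypothetical; the actual parseTestMetadata and runTest are not shown in this thread.

```javascript
const fs = require('fs');

// Hypothetical variant of the metadata read: IO failures are captured in the
// result so the caller can fail one test instead of aborting the whole run.
function readFixtureSafely(filePath) {
  try {
    return { content: fs.readFileSync(filePath, 'utf8'), error: null };
  } catch (error) {
    return { content: null, error };
  }
}
```

A caller would check `result.error` first and record that single fixture as failed before moving on to the next one.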
🧹 Nitpick comments (1)
test/fixtures/workflow-schema/invalid/missing-load-strategy.workflow.yaml (1)

1-14: Minor metadata inconsistency compared to other fixtures.

This fixture tests a missing required field (load_strategy), which Zod reports as invalid_type with received: undefined. For consistency with other invalid_type fixtures (like wrong-type-name.workflow.yaml), consider adding the Error expected and Error received metadata lines:

# Expected: FAIL
# Error code: invalid_type
# Error path: input_file_patterns.epics.load_strategy
# Error expected: 'FULL_LOAD' | 'SELECTIVE_LOAD' | 'INDEX_GUIDED'
# Error received: undefined
# Tests: AC18 — input_file_patterns entry missing load_strategy produces error

This is a minor documentation consistency improvement and does not affect test correctness.
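
For context, a fixture header like the one above could be read with a small parser along these lines. This is a sketch under assumptions: the name `parseFixtureHeader` and the returned object shape are hypothetical, and the PR's real parseTestMetadata may differ.

```javascript
// Hypothetical parser for the leading "# Key: value" comment lines of a fixture.
function parseFixtureHeader(text) {
  const metadata = {};
  for (const line of text.split('\n')) {
    const match = /^#\s*([^:]+):\s*(.*)$/.exec(line.trim());
    if (!match) break; // the comment header ends at the first non-matching line
    metadata[match[1].trim()] = match[2].trim();
  }
  return metadata;
}
```

Keys that a fixture omits, such as Error expected, are simply absent from the result, so a runner can treat them as optional.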

Comment on lines +1 to +2
// Zod schema definition for workflow.yaml files
const { z } = require('zod');

@coderabbitai coderabbitai bot Feb 4, 2026


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Locate package.json files
find . -name "package.json" -type f | head -20

# Search for zod in package.json
if [ -f "package.json" ]; then
  echo "=== package.json content (relevant sections) ==="
  cat package.json | grep -A 5 -B 5 '"zod"' || echo "zod not found in main package.json"
fi

Repository: bmad-code-org/BMAD-METHOD

Length of output: 167


🏁 Script executed:

#!/bin/bash
# Check if zod is used in the workflow.js file
cat tools/schema/workflow.js | head -20
rg -n "z\." tools/schema/workflow.js | head -5

Repository: bmad-code-org/BMAD-METHOD

Length of output: 991


Add zod to package.json dependencies

The module requires zod at runtime (line 2 and throughout) but it's not declared in package.json. This will cause failures in clean installs and CI environments. Add zod to the dependencies section in package.json.

🤖 Prompt for AI Agents
In `@tools/schema/workflow.js` around lines 1 - 2, The project requires the Zod
runtime (see the require('zod') import and use of z in
tools/schema/workflow.js), but zod is not declared in package.json; add "zod" to
the "dependencies" section of package.json (pick a compatible semver, e.g.,
latest stable) and run npm install / yarn install so CI and clean installs
include the module.

Contributor Author


I believe this may be a false positive — zod is already available as a transitive dependency through the existing packages, and the agent schema validator (tools/schema/agent.js) uses the same require('zod') pattern without declaring it directly in package.json. This PR intentionally mirrors that established pattern.

Happy to add an explicit dependency if that's preferred, but wanted to flag the existing precedent first. @alexeyv — would you like zod declared directly, or is the current transitive approach acceptable?


Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +50 to +52
for (const filePath of workflowFiles.sort()) {
  const relativePath = path.relative(process.cwd(), filePath);


@coderabbitai coderabbitai bot Feb 4, 2026


⚠️ Potential issue | 🟡 Minor

Align annotation paths with the scanned root

relativePath is derived from process.cwd(), so running with a non-default projectRoot (or from a different working directory) can emit paths that don’t map to the repo root in GitHub Actions annotations. Consider basing the relative path on root to keep warnings accurate.

🤖 Prompt for AI Agents
In `@tools/validate-workflow-schema.js` around lines 50 - 52, The annotation paths
are computed using path.relative(process.cwd(), filePath) which misaligns when a
custom project root is used; update the computation in the workflowFiles loop to
base relativePath on the configured root (e.g., path.relative(root, filePath) or
projectRoot) so GitHub Actions annotations reference the repository root
correctly — adjust any variable names (relativePath, root/projectRoot,
workflowFiles) accordingly and ensure downstream uses of relativePath continue
to work.

Contributor Author


This matches the existing agent schema validator, which uses the same path.relative(process.cwd(), filePath) pattern at tools/validate-agent-schema.js line 48.

In practice, the tool is always invoked via npm run validate:schemas (which sets cwd to the package root) or through GitHub Actions (which runs from the repo checkout root). The CLI doesn't accept a --root flag, so there's no supported invocation path where process.cwd() would differ from the project root.

If the project wants to align both validators to use an explicit root, happy to contribute that as a follow-up across both tools.
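
To make the difference concrete, here is a small sketch of how the two bases behave. The directory and file names are hypothetical, and path.posix is used only so the output is the same on every platform.

```javascript
const path = require('path');

// Hypothetical locations, chosen for illustration.
const root = '/repo';
const filePath = '/repo/src/bmm/workflows/example/workflow.yaml';

// Invoked from the repo root (the npm script and CI case): the path lines up.
const fromRoot = path.posix.relative('/repo', filePath);

// Invoked from a subdirectory: a cwd-relative path gains a "../" segment.
const fromTools = path.posix.relative('/repo/tools', filePath);

// Anchoring on an explicit root is independent of the working directory.
const anchored = path.posix.relative(root, filePath);

console.log(fromRoot);  // src/bmm/workflows/example/workflow.yaml
console.log(fromTools); // ../src/bmm/workflows/example/workflow.yaml
console.log(anchored);  // src/bmm/workflows/example/workflow.yaml
```

As long as every supported invocation runs from the package root, the cwd-based and root-based results coincide, which is the point made in the reply above.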


🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the validate-agent-schema.js file around line 48 to verify the user's claim
cat tools/validate-agent-schema.js | head -60 | tail -20

Length of output: 688


@arcaven You're absolutely right—I've confirmed that validate-agent-schema.js uses the same path.relative(process.cwd(), filePath) pattern on line 48. Given that both validators follow the same convention and the practical usage through npm run validate:schemas and CI always runs from the repo root, the current approach is consistent and works correctly.

Thanks for the context and for offering to contribute a follow-up if the project decides to align both validators to use an explicit root in the future. That sounds like a reasonable enhancement if it's ever needed, but the current implementation is fine.

@arcaven arcaven marked this pull request as draft February 4, 2026 19:41
@arcaven
Contributor Author

arcaven commented Feb 4, 2026

Re: the nitpick on missing-load-strategy.workflow.yaml missing Error expected / Error received metadata lines —

Not a consistency issue — it's a deliberate choice. The fixture tests what it needs to test (AC18). Adding more metadata would increase maintenance surface for zero value.

The test runner treats expected and received as optional metadata — when present it validates them, when absent it skips those checks. This fixture omits them because the expected value for a missing enum field would be the full union string ('FULL_LOAD' | 'SELECTIVE_LOAD' | 'INDEX_GUIDED'), which would make the fixture brittle if enum values change. The fixture already validates the right things: error code (invalid_type) and error path (input_file_patterns.epics.load_strategy), which is sufficient for AC18 coverage.
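
The optional-metadata behaviour described above might look roughly like this. It is a sketch only: `matchesExpectation` and the `meta` field names are hypothetical, and it assumes the issue object carries a `path` array, as Zod issues do.

```javascript
// Hypothetical comparison: code and path are always checked; expected and
// received are checked only when the fixture declares them.
function matchesExpectation(issue, meta) {
  if (issue.code !== meta.code) return false;
  if (issue.path.join('.') !== meta.path) return false;
  if (meta.expected !== undefined && issue.expected !== meta.expected) return false;
  if (meta.received !== undefined && issue.received !== meta.received) return false;
  return true;
}
```

Under this scheme a fixture that pins only the code and path keeps passing if the enum's members change, while a fixture that also pins expected would need updating.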

Adds test coverage for the case where a YAML number value appears
in the variables record (e.g., `count: 5`). YAML parses unquoted
numbers as integers, which the schema correctly rejects as
invalid_type (expected string, received number). This is a likely
contributor mistake worth covering.

28/28 fixtures now passing.
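
As an illustration of the case this commit covers, a plain-JavaScript stand-in for a record-of-strings check could look like the following. The function name is hypothetical and the real check is done by the Zod schema; the input objects represent what a YAML loader would produce for `count: 5` versus `count: "5"`.

```javascript
// Hypothetical stand-in for a record-of-strings check on `variables`.
function nonStringVariables(variables) {
  return Object.entries(variables)
    .filter(([, value]) => typeof value !== 'string')
    .map(([key, value]) => ({ path: `variables.${key}`, expected: 'string', received: typeof value }));
}

// Unquoted `count: 5` loads as a number; quoted `count: "5"` loads as a string.
console.log(nonStringVariables({ count: 5 }));   // one issue for variables.count
console.log(nonStringVariables({ count: '5' })); // no issues
```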
@arcaven arcaven marked this pull request as ready for review February 4, 2026 20:41
The maintainer (Brian Madison) removed these fields from all workflow
files in two commits:

- f699a36: removed `standalone: true` from all workflow.yaml files
- 731bee2: removed `web_bundle: false` from all workflow.yaml files

These fields are no longer part of the workflow.yaml format. Update
the schema to match upstream and remove related test fixtures.

24/24 tests passing. 9/9 workflow files valid.
@arcaven
Contributor Author

arcaven commented Feb 5, 2026

Updated schema to match upstream changes.

The standalone and web_bundle fields have been removed from the workflow schema after the maintainer (Brian Madison) removed them from all workflow files in two commits after this feature was drafted:

  • f699a368: removed standalone: true from all workflow.yaml files
  • 731bee26: removed web_bundle: false from all workflow.yaml files

Schema and test fixtures updated accordingly. All 9 workflow files now pass validation. 24/24 workflow schema tests passing.

@bmadcode
Collaborator

bmadcode commented Feb 6, 2026

I just realized this was all for checking workflow.yaml files - these are deprecated, no more are being added, and all that exist are being replaced. There is no need to add this as they will be gone within the next week - and many of the properties are not used anymore.

@bmadcode bmadcode closed this Feb 6, 2026
@arcaven
Contributor Author

arcaven commented Feb 6, 2026

Understood — closing this PR given the workflow.yaml deprecation.

We'll adapt the validator approach for the new workflow*.md frontmatter format once the migration is complete. The core infrastructure (Zod schema validation, CI annotations, fixture-based testing) can be reused with a new schema definition targeting the markdown-based workflows.

Thanks for the heads-up on the direction.

bmadcode added a commit that referenced this pull request Feb 8, 2026
…erences (#1573)

* feat: extend validate-file-refs.js to scan CSV workflow-file references

Add CSV file reference extraction to the Layer 1 validation pipeline,
preventing broken _bmad/ workflow-file paths in module-help.csv files.
Closes the gap identified after PR #1529 where CSV references were
unvalidated despite being a source of repeat community issues.

Refs: #1519

* fix: include test:refs in aggregate test script

Add CSV file-ref extraction tests to the aggregate `npm test` pipeline,
matching the existing pattern for test:schemas and test:install.

Thanks to CodeRabbit for catching the omission.

* fix: address review feedback on CSV validator extension

- Surface CSV parse errors visibly instead of silently swallowing
  (no Layer 2c schema validator exists yet to catch these)
- Add explanatory comments for the !VERBOSE logging pattern
  (non-verbose prints file headers only when issues found)
- Add verbose-mode diagnostics for extensionless path handling
  ([SKIP] when nothing exists, [OK-DIR] for valid directories)

* refactor: collect-then-print to eliminate confusing !VERBOSE pattern

Replace the split header-printing logic (print early in verbose mode,
print late in non-verbose mode with a !VERBOSE guard) with a simpler
collect-then-print approach. Refs are now classified into ok[] and
broken[] arrays first, then printed in a single location with one
straightforward if/else if decision.

Addresses alexeyv's review feedback about the counterintuitive
"if not verbose, log" pattern.

* feat: promote extensionless unresolved paths from silent skip to [UNRESOLVED]

Paths without file extensions that don't exist as files or directories
are now flagged as [UNRESOLVED] — a distinct tag from [BROKEN] (which
means a file with a known extension wasn't found). Both count toward
the broken reference total and appear in CI annotations.

This catches real bugs like wrong directory names in installed_path
metadata and dead invoke-workflow references to removed workflows.
Extensionless paths that DO exist as directories are still [OK-DIR].
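
A minimal sketch of the tagging rule described in the item above, assuming a plain 'OK' tag for references that resolve to files (that tag name and the function name `classifyRef` are assumptions; the real logic lives in validate-file-refs.js and is not shown here):

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical classifier mirroring the tags named in the commit message.
function classifyRef(refPath) {
  const hasExtension = path.extname(refPath) !== '';
  let stats = null;
  try {
    stats = fs.statSync(refPath);
  } catch {
    // The path does not exist; stats stays null.
  }
  if (stats && stats.isDirectory() && !hasExtension) return 'OK-DIR';
  if (stats) return 'OK';
  // Missing: a known extension means a broken file ref, otherwise unresolved.
  return hasExtension ? 'BROKEN' : 'UNRESOLVED';
}
```

Both BROKEN and UNRESOLVED would count toward the broken reference total; only the label shown to the contributor differs.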

---------

Co-authored-by: Alex Verkhovsky <alexey.verkhovsky@gmail.com>
Co-authored-by: Brian <bmadcode@gmail.com>

Development

Successfully merging this pull request may close these issues.

[BUG] Edited workflows requires schema update: standalone/web_bundle fields

2 participants