ci: enforce per-package coverage threshold via vitest [closes #218] #276

Open
EmersonBraun wants to merge 1 commit into main from foundation/coverage-gate

Summary

Adds a vitest coverage gate to every package via a shared config helper. Closes #218. Pairs with #275 (size-limit) to formalize quality bars.

Architecture

  • vitest.shared.ts at root — createTestConfig({ linesThreshold, environment?, setupFiles? }) helper with v8 provider
  • Every package's vitest.config.ts now imports + extends the helper (one-line config per package)
  • New test:coverage script per package runs vitest run --coverage
  • New top-level pnpm test:coverage via turbo
  • New .github/workflows/coverage.yml runs on every PR + push to main; blocks merge on regression below threshold
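As a rough illustration of the helper described above, the sketch below builds the config object such a helper might return. Only the option names (linesThreshold, environment, setupFiles), the v8 provider, the four reporters, and the index.ts exclusion come from this description; the types, defaults, and file paths are assumptions, and the real vitest.shared.ts may differ. In an actual module the helper would be exported and its result passed to defineConfig from 'vitest/config'.

```typescript
// Hypothetical sketch of vitest.shared.ts. Not the PR's actual file.
interface TestConfigOptions {
  linesThreshold: number
  environment?: 'node' | 'happy-dom' | 'jsdom'
  setupFiles?: string[]
}

// Returns a plain object in the shape Vitest expects under `test`.
// (A real helper would be exported and wrapped in defineConfig.)
function createTestConfig(opts: TestConfigOptions) {
  return {
    test: {
      environment: opts.environment ?? 'node', // default is an assumption
      setupFiles: opts.setupFiles ?? [],
      coverage: {
        provider: 'v8' as const,
        // All metrics are reported, but only `lines` has a threshold.
        reporter: ['text', 'html', 'lcov', 'json-summary'],
        // Re-export-only entry points stay out of the numbers.
        // A real config may also need to keep Vitest's default excludes.
        exclude: ['**/index.ts'],
        thresholds: { lines: opts.linesThreshold },
      },
    },
  }
}

// What a per-package config (e.g. for @agentskit/react) might pass in.
// The setup file path is hypothetical.
const reactConfig = createTestConfig({
  linesThreshold: 80,
  environment: 'happy-dom',
  setupFiles: ['./tests/setup.ts'],
})

console.log(JSON.stringify(reactConfig, null, 2))
```

Keeping the threshold as the only required option means each package config stays a one-liner, while packages like react can still opt into a DOM environment.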

Per-package thresholds (current coverage minus a buffer of roughly 5 points)

  Package                     Actual    Threshold  Headroom
  -------                     ------    ---------  --------
  @agentskit/rag              100%      95         +5
  @agentskit/skills           100%      95         +5
  @agentskit/eval             100%      95         +5
  @agentskit/templates        100%      95         +5
  @agentskit/runtime          94.33%    90         +4
  @agentskit/react            86.60%    80         +7
  @agentskit/memory           84.02%    80         +4
  @agentskit/core             79.25%    75         +4   (sacred; target 90%)
  @agentskit/tools            76.00%    70         +6
  @agentskit/adapters         63.83%    60         +4
  @agentskit/observability    57.69%    55         +3
  @agentskit/cli              36.70%    30         +7
  @agentskit/sandbox          35.38%    30         +5
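The headroom column is the actual line coverage minus the threshold, rounded to whole points. A minimal sketch of that arithmetic, using the Actual and Threshold values from the table above (the data structure and function name are illustrative, not part of the PR):

```typescript
// Actual line coverage (%) and gate threshold per package, from the table above.
const packages = [
  { name: 'rag', actual: 100, threshold: 95 },
  { name: 'skills', actual: 100, threshold: 95 },
  { name: 'eval', actual: 100, threshold: 95 },
  { name: 'templates', actual: 100, threshold: 95 },
  { name: 'runtime', actual: 94.33, threshold: 90 },
  { name: 'react', actual: 86.6, threshold: 80 },
  { name: 'memory', actual: 84.02, threshold: 80 },
  { name: 'core', actual: 79.25, threshold: 75 },
  { name: 'tools', actual: 76, threshold: 70 },
  { name: 'adapters', actual: 63.83, threshold: 60 },
  { name: 'observability', actual: 57.69, threshold: 55 },
  { name: 'cli', actual: 36.7, threshold: 30 },
  { name: 'sandbox', actual: 35.38, threshold: 30 },
]

// Headroom in whole points: how far line coverage can drop before the gate fails.
function headroom(actual: number, threshold: number): number {
  return Math.round(actual - threshold)
}

for (const p of packages) {
  console.log(`${p.name}: +${headroom(p.actual, p.threshold)}`)
}
```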

Why lines-only as the gate metric

Simplest signal, most stable across test styles. Branches/functions/statements still reported (text + html + lcov + json-summary) but not blocking. Avoids the failure mode where 100% line coverage with one branch missed produces a noisy red.
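A small illustration of how line and branch coverage can diverge (the function is made up for this example and does not come from the codebase): a single call executes the function's only line, so line coverage is complete, yet only one arm of the ternary ever runs. How a given provider counts that branch may vary.

```typescript
// One executable line, two branches.
function describeSign(n: number): string {
  return n < 0 ? 'negative' : 'non-negative'
}

// This single call touches every line of describeSign,
// but the 'non-negative' arm is never taken.
console.log(describeSign(-1))
```

With a lines-only gate this package would pass; with a branch gate the same test suite could fail even though no line is unexecuted.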

Aspirational targets (next 2 sprints)

  • Every package ≥ 80% lines
  • @agentskit/core ≥ 90% lines (Manifesto principle 1: sacred)

These are not enforced; they are the goalposts. The thresholds in this PR are the floor that prevents regression.

Trade-offs

  • Headroom intentionally generous (3-7 points). Tightens as tests accumulate; relaxing requires conscious PR.
  • index.ts files excluded from coverage (re-exports only). Source files stay in scope.
  • Coverage reports uploaded as GH Actions artifact (7-day retention) — sets stage for Codecov/CodeClimate badge in a follow-up PR.

⚠️ Depends on #265 (ink test fix)

@agentskit/ink tests were broken on main before #265 (ink@7 / ink-testing-library@4 incompat). This PR's CI will fail on the ink package until #265 merges. Other 13 packages pass coverage threshold locally.

Recommended order: merge #265 → rebase this PR → CI green.

Test plan

  • All 13 non-ink packages pass coverage threshold locally (pnpm --filter <pkg> test:coverage)
  • Aggregate: pnpm test:coverage runs all in parallel via turbo
  • React preserves happy-dom env + setupFiles via helper opts
  • Workflow YAML validates
  • (Post-#265 merge) CI on this PR shows the coverage check passing
  • (Future PR test) deliberately remove tests so a package drops below its threshold, confirm CI blocks merge
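The last item of the test plan amounts to checking that a drop below the floor turns the gate red. Vitest performs this comparison itself when thresholds are configured; the sketch below only mirrors the logic against the shape written by the json-summary reporter (coverage-summary.json). The function names and the covered/total counts are hypothetical, chosen so the first case matches the 79.25% reported for @agentskit/core.

```typescript
// One metric entry as found in a json-summary coverage report.
interface Metric {
  total: number
  covered: number
  skipped: number
  pct: number
}

interface CoverageSummary {
  total: { lines: Metric; statements: Metric; functions: Metric; branches: Metric }
}

// Build a metric from raw counts (illustrative helper).
function metric(covered: number, total: number): Metric {
  return { total, covered, skipped: 0, pct: (covered * 100) / total }
}

// Lines-only gate: other metrics are present in the summary but never block.
function linesGate(summary: CoverageSummary, threshold: number): boolean {
  return summary.total.lines.pct >= threshold
}

// Hypothetical counts: 317 of 400 lines covered is 79.25%.
const before: CoverageSummary = {
  total: {
    lines: metric(317, 400),
    statements: metric(317, 400),
    functions: metric(40, 60),
    branches: metric(50, 100),
  },
}

// Same package after deliberately deleting tests: 280 of 400 lines is 70%.
const after: CoverageSummary = {
  total: { ...before.total, lines: metric(280, 400) },
}

console.log('before:', linesGate(before, 75) ? 'pass' : 'fail')
console.log('after:', linesGate(after, 75) ? 'pass' : 'fail')
```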

Follow-ups

  • Add coverage badge to README (Codecov or CodeClimate, free for OSS)
  • Tighten thresholds after 1-2 sprints of stable measurement
  • Add per-package coverage badge in each package README
  • Address sandbox & cli (lowest coverage) as Phase 0 wraps

Closes #218
Refs #211

Adds @vitest/coverage-v8 and a shared test config helper enforcing a
'lines' coverage threshold per package on every PR.

Architecture:
- vitest.shared.ts at root: createTestConfig(opts) helper with v8 provider,
  per-package linesThreshold, optional environment + setupFiles
- Each package's vitest.config.ts now imports + extends the helper
- New 'test:coverage' script per package runs vitest with --coverage
- New top-level 'pnpm test:coverage' via turbo
- New .github/workflows/coverage.yml runs on every PR + push to main

Per-package thresholds (current minus a buffer of roughly 5 points; tighten as coverage grows):

  Package          Actual    Threshold  Headroom
  -------          ------    ---------  --------
  rag              100%      95         +5
  skills           100%      95         +5
  eval             100%      95         +5
  templates        100%      95         +5
  runtime          94.33%    90         +4
  react            86.60%    80         +7
  memory           84.02%    80         +4
  core             79.25%    75         +4   (sacred — target 90%)
  tools            76.00%    70         +6
  adapters         63.83%    60         +4
  observability    57.69%    55         +3
  cli              36.70%    30         +7
  sandbox          35.38%    30         +5

Why lines-only as the gate metric: simplest signal, most stable across
test styles. Branches/functions/statements still reported but not blocking.

Aspirational targets for next 2 sprints:
- Every package ≥ 80% lines
- @agentskit/core ≥ 90% lines (Manifesto principle 1)

Closes #218

Development

Successfully merging this pull request may close these issues.

[P0.7] Coverage gate in CI (thresholds per package)
