Skip to content

Add QA happy-path replay tests and smoke suite scaffolding#276

Merged
jeremy merged 2 commits intomainfrom
map-territory
Mar 12, 2026
Merged

Add QA happy-path replay tests and smoke suite scaffolding#276
jeremy merged 2 commits intomainfrom
map-territory

Conversation

@jeremy
Copy link
Copy Markdown
Member

@jeremy jeremy commented Mar 12, 2026

Summary

  • VCR-style HTTP recording proxy (e2e/recorder/) captures API interactions as cassettes for deterministic offline replay
  • 5 happy-path tests in qa_happypath.bats gate CI via make test-e2e, catching response parsing regressions before merge
  • Committed cassettes enable replay with no credentials, no server, no setup
  • Pre-release smoke suite scaffolding (e2e/smoke/) for live-server verification (not yet operational)

Context

David found "files list was entirely busted" — the existing 270 e2e tests only exercise error paths with zero happy-path coverage. These tests replay recorded API interactions through a local proxy, verifying the CLI correctly parses responses and produces well-formed JSON output.

Recording proxy

~400 lines of stdlib Go. Two modes:

Mode Behavior
record Forward to live API, save cassettes
replay Serve from committed cassettes

Request matching uses canonical keys (method + path + sorted_query + body_hash) with per-key cursors for repeated calls. Link headers are rewritten for pagination through the proxy.

Re-recording

make record-cassettes \
  BASECAMP_RECORD_TOKEN=<token> \
  BASECAMP_RECORD_TARGET=http://3.basecampapi.localhost:4001 \
  BASECAMP_RECORD_ACCOUNT=181900405 \
  BASECAMP_RECORD_PROJECT=2085958494

Test plan

  • make check passes (275 e2e tests including 5 new happypath replay tests)
  • bats e2e/qa_happypath.bats passes in replay mode (no env vars needed)
  • Cassettes recorded from local BC3 dev server
  • Verify replay works in CI (no special setup required)

Summary by cubic

Adds a VCR-style HTTP recording proxy and five happy-path replay tests to catch JSON parsing regressions offline. Also scaffolds a pre-release smoke suite with trace capture and a QA report for live-server checks.

  • New Features

    • Recording proxy (e2e/recorder/) with record/replay, canonical request keys, and Link header rewrite; committed cassettes enable deterministic tests with no creds or server.
    • Five happy-path BATS tests (e2e/qa_happypath.bats) run via make test-e2e, covering projects, todos, files, recordings, and messages; e2e/run.sh skips if cassettes are missing.
    • Smoke suite scaffolding (e2e/smoke/) with parallel read-only and mutation tests orchestrated by run_smoke.sh; captures traces.jsonl and summarizes gaps via make qa-report (supports .qa-allowlist).
    • Make targets: record-cassettes, smoke, and qa-report.
  • Migration

    • Re-record cassettes: run make record-cassettes with BASECAMP_RECORD_TOKEN, BASECAMP_RECORD_TARGET, BASECAMP_RECORD_ACCOUNT, and BASECAMP_RECORD_PROJECT; output to e2e/cassettes/happypath/.
    • Run smoke suite: set BASECAMP_TOKEN and run make smoke; generate a summary with make qa-report (smoke is pre-release scaffolding).

Written for commit 1a6f9bb. Summary will update on new commits.

Copilot AI review requested due to automatic review settings March 12, 2026 00:08
@jeremy jeremy requested a review from a team as a code owner March 12, 2026 00:08
@github-actions github-actions bot added tests Tests (unit and e2e) enhancement New feature or request labels Mar 12, 2026
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

12 issues found across 29 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="e2e/smoke/smoke_cards_read.bats">

<violation number="1" location="e2e/smoke/smoke_cards_read.bats:12">
P2: `ensure_cardtable || return 0` silently passes the test when the resource can't be discovered, instead of marking it as skipped. If `ensure_cardtable` returns 1 without reaching `mark_unverifiable` (e.g., `ensure_project` fails inside it), the QA trace will show a green pass instead of a yellow gap. Use the same pattern as `smoke_cards_write.bats`: call `ensure_cardtable` in `setup_file()` and guard the test body with `mark_unverifiable`.</violation>
</file>

<file name="e2e/recorder/record.go">

<violation number="1" location="e2e/recorder/record.go:121">
P2: Data race: `save()` reads `rp.interactions` without holding `rp.mu`, but `ServeHTTP` appends to it under the lock from a different goroutine. The signal handler in `main.go` calls `save()` while the HTTP server may still be serving requests. Snapshot the slice under the lock to avoid a race.</violation>
</file>

<file name="e2e/smoke/smoke_lifecycle.bats">

<violation number="1" location="e2e/smoke/smoke_lifecycle.bats:7">
P2: `ensure_token` in `setup_file` will abort the entire file when `BASECAMP_TOKEN` is unset, but every test here just calls `mark_out_of_scope` (which `skip`s). Remove the `setup_file` block so the tests can skip cleanly without credentials.</violation>
</file>

<file name="e2e/smoke/smoke_todos_write.bats">

<violation number="1" location="e2e/smoke/smoke_todos_write.bats:20">
P1: Bats runs each `@test` in a separate subshell, so `export SMOKE_TODO_ID` here is not visible to subsequent tests. The "complete" and "reopen" tests will always skip via `mark_unverifiable`.

Persist the ID to a file (e.g., `$BATS_FILE_TMPDIR/todo_id`) and read it back in dependent tests, similar to how `smoke_comments.bats` keeps state self-contained.</violation>
</file>

<file name="e2e/recorder/cassette.go">

<violation number="1" location="e2e/recorder/cassette.go:58">
P2: Empty-valued query parameters are silently dropped from the canonical key, so requests like `/path?flag=` and `/path` produce identical keys and can collide during replay. If the intent is to strip truly absent values, compare against the raw query string instead; otherwise include the key even when the value is empty.</violation>
</file>

<file name="e2e/smoke/smoke_misc_read.bats">

<violation number="1" location="e2e/smoke/smoke_misc_read.bats:8">
P2: `ensure_project` in `setup_file()` will skip all 5 tests if project discovery fails, but only the `schedule show` test actually uses `$QA_PROJECT`. Move `ensure_project` into that specific test (or a per-test `setup()` guard) so the other 4 tests aren't unnecessarily blocked.</violation>

<violation number="2" location="e2e/smoke/smoke_misc_read.bats:11">
P2: "people list returns people" is duplicated from `smoke_account.bats`, which already has the same test with a stronger assertion (`assert_json_not_null '.data[0].id'`). Remove the duplicate here to avoid confusion when one passes and the other doesn't.</violation>
</file>

<file name="e2e/test_helper.bash">

<violation number="1" location="e2e/test_helper.bash:296">
P2: `stop_proxy` treats normal SIGTERM shutdown as an error. `kill` sends SIGTERM → `wait` returns 143 → the `$rc -ne 0` check fires every time, dumping the log to stderr and returning non-zero. Accept 143 (SIGTERM) as a clean exit.</violation>
</file>

<file name="e2e/smoke/smoke_files_read.bats">

<violation number="1" location="e2e/smoke/smoke_files_read.bats:7">
P2: If `ensure_token` fails, its non-zero return is silently ignored and `setup_file` falls through to `ensure_project`, which will fail with a less helpful error. Guard with `|| return 1` so the clear "BASECAMP_TOKEN required" message is the one the developer sees.</violation>
</file>

<file name="e2e/smoke/run_smoke.sh">

<violation number="1" location="e2e/smoke/run_smoke.sh:90">
P1: Script exits 0 (success) when `traces.jsonl` is missing, silently masking total test failure. If bats crashes, the trace hook isn't configured, or no test files exist yet, every `bats … || true` succeeds and the final `if [[ -f … ]]` block is skipped entirely. Add an `else` branch (or a post-check) that fails when the trace file is absent after tests were expected to run.</violation>

<violation number="2" location="e2e/smoke/run_smoke.sh:101">
P2: `grep -qF` matches substrings, so an allowlist entry like `smoke_auth` would also suppress `smoke_auth_extended`. Use `grep -qxF` to require a whole-line match.</violation>
</file>

<file name="e2e/smoke/smoke_helper.bash">

<violation number="1" location="e2e/smoke/smoke_helper.bash:116">
P2: `ensure_project` (and `ensure_account`) don't call `ensure_token` before running authenticated `basecamp` commands. Every other `ensure_*` calls its prerequisite, but the chain breaks here. If the token isn't set, the failure message will be "Cannot list projects" instead of surfacing the actual root cause.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a VCR-style HTTP recording/replay proxy and introduces deterministic “happy-path” QA replay tests to catch response-parsing regressions (plus initial scaffolding for a live-server smoke suite and related Makefile targets).

Changes:

  • Add e2e/recorder/ Go proxy supporting record and replay modes for cassette-driven HTTP interactions.
  • Add e2e/qa_happypath.bats replay tests backed by committed cassettes, wired into make test-e2e.
  • Add pre-release smoke suite scaffolding under e2e/smoke/ plus make smoke / make qa-report.

Reviewed changes

Copilot reviewed 26 out of 29 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
e2e/test_helper.bash Adds setup_extra hook and proxy start/stop helpers for recording/replay.
e2e/run.sh Skips happy-path replay tests when no committed cassettes are present.
e2e/qa_happypath.bats New deterministic happy-path tests that run via replay proxy (or record mode).
e2e/recorder/main.go Recorder/replayer entrypoint: flag parsing, listener setup, mode selection.
e2e/recorder/cassette.go Cassette schema + canonical request key + Link header rewriting helper.
e2e/recorder/record.go Recording proxy implementation that forwards to upstream and saves cassettes.
e2e/recorder/replay.go Replay server that matches incoming requests to recorded interactions.
e2e/cassettes/happypath/cassette.json Committed recorded interactions used for offline replay tests.
e2e/cassettes/happypath/.gitkeep Keeps cassette directory present in git when empty.
e2e/smoke/smoke_helper.bash Helper library for live smoke suite: ensure_* discovery, tracing, skip semantics.
e2e/smoke/run_smoke.sh Orchestrates smoke suite execution (parallel/serial phases) and aggregates results.
e2e/smoke/smoke_core.bats Smoke tests for core commands (version/status/auth/config/doctor).
e2e/smoke/smoke_projects.bats Smoke tests for accounts/projects read operations.
e2e/smoke/smoke_todos_read.bats Smoke tests for todo read operations (todosets/todolists/todos).
e2e/smoke/smoke_todos_write.bats Smoke tests for todo write operations (create/complete/reopen).
e2e/smoke/smoke_files_read.bats Smoke tests for files/vaults/docs/uploads read operations.
e2e/smoke/smoke_files_write.bats Smoke tests for creating folders in Docs & Files.
e2e/smoke/smoke_messages_read.bats Smoke tests for messages/messageboards read operations.
e2e/smoke/smoke_messages_write.bats Smoke tests for creating a message.
e2e/smoke/smoke_cards_read.bats Smoke tests for cards read operations.
e2e/smoke/smoke_cards_write.bats Smoke tests for card creation.
e2e/smoke/smoke_comments.bats Smoke tests for creating a comment (via creating a todo first).
e2e/smoke/smoke_misc_read.bats Smoke tests for misc read ops (people/me/schedule/recordings/search).
e2e/smoke/smoke_account.bats Smoke tests for account-scoped operations (people/templates).
e2e/smoke/smoke_lifecycle.bats Smoke suite placeholders for lifecycle/auth flows marked out-of-scope.
e2e/smoke/RUBRIC.md Defines QA rubric dimensions for smoke trace evaluation.
e2e/smoke/.qa-allowlist Placeholder allowlist file for unverifiable tests that shouldn’t block release.
Makefile Adds record-cassettes, smoke, and qa-report targets; updates help output.
.gitignore Ignores the built recorder binary (e2e/recorder/recorder).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d466ceb1eb

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

jeremy added 2 commits March 11, 2026 17:50
HTTP recording proxy (e2e/recorder/) captures API interactions as
cassettes for deterministic offline replay. Five tests in
qa_happypath.bats gate CI via make test-e2e, catching response
parsing regressions before merge.

Recording requires a live API target with all four env vars
(TOKEN, TARGET, ACCOUNT, PROJECT) via make record-cassettes.
Replay uses committed cassettes — no credentials, no server needed.
Test scaffolding for comprehensive live-server verification via
make smoke. Bats exit codes gate pass/fail; traces track coverage
gaps (unverifiable, out-of-scope) for the QA report.

Discovery avoids pre-running tested commands to prevent masking
regressions. Cross-test state uses file persistence for BATS
subshell isolation. Not yet validated against a live account.
@jeremy
Copy link
Copy Markdown
Member Author

jeremy commented Mar 12, 2026

All 21 review findings addressed in the force-pushed commits:

Tier 1 (5b0b99c):

  • Data race in save() — snapshot under lock
  • Empty query params now included in canonical key
  • stop_proxy accepts SIGTERM exit 143 as clean
  • -account flag passed through to recorder
  • Response headers filtered through allowlist (no Set-Cookie/PII)
  • Cassette re-recorded with clean headers

Tier 2 (1a6f9bb):

  • run_smoke no longer writes pass/fail traces — bats exit codes are the pass/fail signal, traces are gap-tracking only
  • Discovery avoids pre-running tested commands (ensure_account/ensure_project prefer env vars)
  • smoke_projects.bats tests accounts/projects list directly, discovers project ID via file persistence
  • Cross-test state uses $BATS_FILE_TMPDIR for BATS subshell isolation
  • ensure_cardtable reads from project dock, not cards list
  • All ensure_* calls have || return 1 guards
  • Removed duplicate people list test
  • smoke_lifecycle.bats has no setup_file (tests skip via mark_out_of_scope)
  • grep -qxF for whole-line allowlist matching
  • bats availability checked upfront in orchestrator

@jeremy jeremy merged commit d5744c7 into main Mar 12, 2026
25 checks passed
@jeremy jeremy deleted the map-territory branch March 12, 2026 01:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request tests Tests (unit and e2e)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants