Merged
60 changes: 60 additions & 0 deletions .claude/skills/qa-critic.md
@@ -0,0 +1,60 @@
---
description: Grade smoke test traces against the QA rubric and produce prioritized findings
user_invocable: true
---

# QA Critic

Analyze smoke test traces and grade CLI output quality against the rubric.

## Prerequisites

Smoke traces must exist. Run `make smoke` first to generate them:

```bash
BASECAMP_PROFILE=dev make smoke
# or: BASECAMP_TOKEN=<token> make smoke
```

Traces land in `tmp/qa-traces/traces.jsonl` (or `$QA_TRACE_DIR`).

## Steps

1. **Read the rubric**: Read `e2e/smoke/RUBRIC.md` for the grading dimensions.

2. **Read results**: Two sources, each authoritative for different things:
- **BATS TAP output** (stdout from `make smoke`): Parse TAP lines to count pass (`ok ...`), fail (`not ok ...`), and skip (`ok ... # skip ...`). Skip lines also begin with `ok`, so exclude them from the pass count. TAP output is the ground truth for pass/fail.
- **Trace file** (`tmp/qa-traces/traces.jsonl`): Each line is a JSON object with fields: `test`, `command`, `exit_code`, `status`, `reason`. Traces record only gap/exclusion metadata — `unverifiable` (test could not verify due to missing data) and `out-of-scope` (intentionally excluded). Traces say nothing about pass/fail; use them only for coverage-gap analysis.
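
The TAP counting in step 2 can be sketched as below. This is a minimal illustration, assuming the stdout of `make smoke` is piped in or was saved to a file; `count_tap` is a hypothetical helper, not something the smoke suite provides.

```shell
#!/usr/bin/env bash
# count_tap: read TAP on stdin and print "pass=N fail=N skip=N".
# Hypothetical helper for illustration only.
count_tap() {
  awk '
    /^not ok /                        { fail++; next }   # failures
    /^ok / && tolower($0) ~ /# *skip/ { skip++; next }   # skips also start with "ok"
    /^ok /                            { pass++ }         # everything else that is ok
    END { printf "pass=%d fail=%d skip=%d\n", pass, fail, skip }
  '
}
```

Usage might look like `count_tap < smoke.tap`, where `smoke.tap` is an assumed file name. The skip rule is checked before the pass rule so that skipped tests are not counted twice.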

3. **Identify coverage gaps**: List all commands from the `.surface` file (lines starting with `CMD`). Cross-reference against the BATS test inventory (grep `@test` lines across `e2e/smoke/*.bats` and match the command name in each `run_smoke basecamp <command>` call). A command is covered if at least one `@test` exercises it. Traces are not useful here — passing tests leave no trace entry, so a pure-pass command group would be misclassified as uncovered.
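
A rough sketch of the cross-reference in step 3 follows. It assumes surface lines have the form `CMD basecamp <command> ...`, which is a guess since the `.surface` layout is not shown here, and it compares only the first word after `basecamp`. `uncovered_commands` is a hypothetical name.

```shell
#!/usr/bin/env bash
# uncovered_commands SURFACE_FILE BATS_DIR
# Prints commands named in the surface file that no run_smoke call exercises.
# ASSUMPTION: surface lines look like "CMD basecamp <command> ..."; adjust the
# awk fields if the real .surface layout differs.
uncovered_commands() (
  surface=$1
  bats_dir=$2
  tmp=$(mktemp -d) || exit 1

  # Commands the CLI exposes, one per line.
  awk '$1 == "CMD" && $2 == "basecamp" && NF >= 3 { print $3 }' "$surface" \
    | sort -u > "$tmp/surface"

  # First word after "run_smoke basecamp" in any test file.
  cat "$bats_dir"/*.bats \
    | sed -n 's/.*run_smoke basecamp  *\([^ ]*\).*/\1/p' \
    | sort -u > "$tmp/tested"

  # Lines that appear only in the first list: exposed but untested.
  comm -23 "$tmp/surface" "$tmp/tested"
  rm -rf "$tmp"
)
```

Running `uncovered_commands .surface e2e/smoke` would then list the command names with no matching `run_smoke` call. Matching at subcommand level (for example `todos list` versus `todos show`) would need a second captured word.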

4. **Run sample commands**: For each covered command group, run 2-3 representative commands both with and without `--json` to capture machine and human output. Evaluate the results against both the v0 and v1 rubric dimensions.

5. **Grade v0 (automatable)**: For each command tested:
- **Functional**: Did it exit 0 with `ok: true`?
- **Non-empty**: Is `.data` present and non-null?
- **Correct types**: Are IDs numbers, names strings?
- **Summary present**: Is `.summary` a non-empty string?
- **Scriptable**: Does `--json` parse cleanly? Does `--ids-only` work where applicable?
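
Most of the v0 checks can be expressed as `jq` filters. The sketch below assumes the envelope fields named above (`.ok`, `.data`, `.summary`); the function name `grade_v0` and its output format are hypothetical. It does not see the command's exit code, so that part of "Functional" has to be checked separately.

```shell
#!/usr/bin/env bash
# grade_v0: read one command's --json output on stdin and print one verdict
# per v0 dimension. Hypothetical helper; the verdict labels are made up here.
grade_v0() {
  jq -r '
    def verdict(cond): if cond then "pass" else "fail" end;
    "functional=\(verdict(.ok == true))",
    "non_empty=\(verdict(.data != null))",
    # Every "id" found anywhere under .data must be a JSON number.
    "ids_numeric=\(verdict([.data | .. | objects | select(has("id")) | .id | type] | all(. == "number")))",
    "summary=\(verdict((.summary | strings | length > 0) // false))"
  ' 2>/dev/null || echo "scriptable=fail"   # jq could not parse or process the input
}
```

For example, `basecamp projects list --json | grade_v0` would print four lines, or `scriptable=fail` if the output is not valid JSON. `--ids-only` is not covered by this sketch.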

6. **Grade v1 (critic-evaluated)**: For each command tested:
- **Readable**: Is the human output scannable, not a wall of text?
- **Discoverable**: Do breadcrumbs suggest logical next actions?
- **Consistent**: Do similar commands (e.g., all `list` commands) produce similar output shapes?
- **Helpful errors**: Run with bad input — does the error explain what's wrong and how to fix it?
- **Complete**: Are all relevant API fields surfaced?

7. **Produce findings**: Output a prioritized list of issues, grouped by severity:
- **Critical**: Command exits non-zero, crashes, or returns malformed JSON
- **High**: Missing `.summary`, empty `.data` when data exists, no breadcrumbs
- **Medium**: Inconsistent output shapes, fields missing relative to the API, poor error messages
- **Low**: Style/readability nits, missing `--ids-only` support

Format each finding as:
```
[SEVERITY] command: description
Evidence: <what you observed>
Expected: <what the rubric requires>
```

8. **Summary table**: End with a coverage matrix showing each command group, its test count, and a letter grade (A-F) based on v0+v1 scores.
26 changes: 26 additions & 0 deletions e2e/smoke/.qa-allowlist
@@ -3,3 +3,29 @@
# Entries expire each release cycle — review and close gaps.
#
# Format: <test name> # <reason> — <owner>

# Dock tools may be disabled on the test project
test_todolistgroups_show_returns_group_detail # no todolist groups — smoke
test_tools_show_returns_a_tool # depends on messageboard dock tool — smoke
test_schedule_settings_updates_schedule_settings # schedule dock tool disabled — smoke

# Message types / inbox may not exist in all environments
test_messagetypes_list_returns_message_types # not_found on some servers — smoke
test_messagetypes_show_returns_message_type_detail # depends on messagetypes list — smoke
test_forwards_inbox_shows_project_inbox # inbox dock tool disabled — smoke
test_forwards_list_returns_forwards # inbox dock tool disabled — smoke
test_forwards_show_returns_forward_detail # inbox dock tool disabled — smoke

# Features not available in all accounts
test_search_metadata_returns_metadata # requires search projects — smoke
test_reports_schedule_returns_schedule_entries # 400 on some environments — smoke
test_timesheet_report_returns_timesheet_data # timesheets not enabled — smoke
test_timesheet_project_returns_project_timesheet # timesheets not enabled — smoke

# Lineup API may not exist on all environments / returns 204 No Content
test_lineup_create_creates_a_lineup_marker # API not available on some servers — smoke
test_lineup_update_updates_a_lineup_marker # create returns no ID — smoke
test_lineup_delete_removes_a_lineup_marker # create returns no ID — smoke

# SDK bug: TodolistOrGroup0 union type expects wrapped JSON, API returns flat
test_todolists_show_returns_todolist_detail # SDK deserialization mismatch — smoke
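
Because each entry carries an inline `# <reason>` comment, a lookup has to strip the comment before comparing names. The following is a minimal sketch of such a lookup, not the code used by `run_smoke.sh`; `is_allowlisted` is a hypothetical name.

```shell
#!/usr/bin/env bash
# is_allowlisted TEST_NAME ALLOWLIST_FILE
# Returns 0 if TEST_NAME is an entry in the file, ignoring "# reason" comments,
# blank lines, and surrounding whitespace. Hypothetical helper.
is_allowlisted() {
  [ -n "$1" ] || return 1
  awk -v name="$1" '
    { sub(/[ \t]*#.*$/, ""); sub(/^[ \t]+/, "") }   # drop comment and indentation
    $0 == name { found = 1; exit }                   # whole-line match only
    END { exit !found }
  ' "$2"
}
```

A caller could write `is_allowlisted "$test_name" e2e/smoke/.qa-allowlist || echo "not allowlisted"`. Doing the strip and the match in a single `awk` process keeps the check out of a multi-stage pipeline, which may be simpler to reason about under `set -o pipefail`.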
37 changes: 29 additions & 8 deletions e2e/smoke/run_smoke.sh
@@ -2,6 +2,7 @@
# run_smoke.sh - Orchestrator for the pre-release smoke suite.
#
# Usage:
# BASECAMP_PROFILE=dev ./e2e/smoke/run_smoke.sh
# BASECAMP_TOKEN=<token> ./e2e/smoke/run_smoke.sh
#
# Runs Level 0 (read-only) tests in parallel, then Level 1+ serially.
@@ -13,10 +14,11 @@ set -euo pipefail
SMOKE_DIR="$(cd "$(dirname "$0")" && pwd)"
ROOT_DIR="$(cd "$SMOKE_DIR/../.." && pwd)"

# Require token
if [[ -z "${BASECAMP_TOKEN:-}" ]]; then
echo "Error: BASECAMP_TOKEN must be set" >&2
echo "Usage: BASECAMP_TOKEN=<token> $0" >&2
# Require auth: either a profile (carries token + base_url + account) or a bare token
if [[ -z "${BASECAMP_PROFILE:-}" && -z "${BASECAMP_TOKEN:-}" ]]; then
echo "Error: BASECAMP_PROFILE or BASECAMP_TOKEN must be set" >&2
echo "Usage: BASECAMP_PROFILE=dev $0" >&2
echo " BASECAMP_TOKEN=<token> $0" >&2
exit 1
fi

@@ -32,11 +34,17 @@ rm -rf "$QA_TRACE_DIR"
mkdir -p "$QA_TRACE_DIR"

export BASECAMP_NO_KEYRING=1
export BASECAMP_TOKEN
[[ -n "${BASECAMP_PROFILE:-}" ]] && export BASECAMP_PROFILE
[[ -n "${BASECAMP_TOKEN:-}" ]] && export BASECAMP_TOKEN
[[ -n "${BASECAMP_LAUNCHPAD_URL:-}" ]] && export BASECAMP_LAUNCHPAD_URL
export PATH="$ROOT_DIR/bin:$PATH"

# Detect parallelism
# Detect parallelism — bats -j requires GNU parallel, not moreutils parallel.
# Fall back to serial if GNU parallel isn't available.
jobs=$(nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1)
if ! parallel --will-cite true ::: true 2>/dev/null; then
jobs=1
fi

echo "=== Smoke Suite ==="
echo "Traces: $QA_TRACE_DIR"
@@ -50,10 +58,16 @@ level0=(
"$SMOKE_DIR"/smoke_core.bats
"$SMOKE_DIR"/smoke_projects.bats
"$SMOKE_DIR"/smoke_todos_read.bats
"$SMOKE_DIR"/smoke_todolistgroups.bats
"$SMOKE_DIR"/smoke_files_read.bats
"$SMOKE_DIR"/smoke_messages_read.bats
"$SMOKE_DIR"/smoke_cards_read.bats
"$SMOKE_DIR"/smoke_misc_read.bats
"$SMOKE_DIR"/smoke_reports.bats
"$SMOKE_DIR"/smoke_communication.bats
"$SMOKE_DIR"/smoke_checkins.bats
"$SMOKE_DIR"/smoke_schedule.bats
"$SMOKE_DIR"/smoke_tools.bats
)
level0_exist=()
for f in "${level0[@]}"; do
@@ -72,6 +86,12 @@ level1=(
"$SMOKE_DIR"/smoke_files_write.bats
"$SMOKE_DIR"/smoke_cards_write.bats
"$SMOKE_DIR"/smoke_comments.bats
"$SMOKE_DIR"/smoke_campfire.bats
"$SMOKE_DIR"/smoke_webhooks.bats
"$SMOKE_DIR"/smoke_assign.bats
"$SMOKE_DIR"/smoke_lineup.bats
"$SMOKE_DIR"/smoke_communication_write.bats
"$SMOKE_DIR"/smoke_misc_write.bats
)
level1_exist=()
for f in "${level1[@]}"; do
@@ -85,6 +105,7 @@ fi
echo ""
echo "--- Level 2+: Account-scoped tests (serial) ---"
level2=(
"$SMOKE_DIR"/smoke_projects_write.bats
"$SMOKE_DIR"/smoke_account.bats
"$SMOKE_DIR"/smoke_lifecycle.bats
)
@@ -101,11 +122,11 @@ if [[ -f "$QA_TRACE_DIR/traces.jsonl" ]]; then
if [[ "$unverified" -gt 0 ]]; then
echo "Coverage gaps: $unverified unverifiable"

# Check allowlist
# Check allowlist (strip inline comments and blank lines before matching)
allowlist="$SMOKE_DIR/.qa-allowlist"
blocking_unverified=0
while IFS= read -r test_name; do
if ! grep -qxF "$test_name" "$allowlist" 2>/dev/null; then
if ! sed 's/ *#.*//' "$allowlist" 2>/dev/null | grep -v '^$' | grep -qxF "$test_name"; then
blocking_unverified=$((blocking_unverified + 1))
echo " - $test_name (not allowlisted)"
fi
32 changes: 32 additions & 0 deletions e2e/smoke/smoke_account.bats
@@ -20,3 +20,35 @@ setup_file() {
assert_success
assert_json_value '.ok' 'true'
}

@test "people show returns person detail" {
run_smoke basecamp people show me --json
assert_success
assert_json_value '.ok' 'true'
assert_json_not_null '.data.id'
}

@test "templates show returns template detail" {
local out
out=$(basecamp templates list --json 2>/dev/null) || mark_unverifiable "Cannot list templates"
local tmpl_id
tmpl_id=$(echo "$out" | jq -r '.data[0].id // empty')
[[ -n "$tmpl_id" ]] || mark_unverifiable "No templates found"

run_smoke basecamp templates show "$tmpl_id" --json
assert_success
assert_json_value '.ok' 'true'
assert_json_not_null '.data.id'
}

@test "people pingable returns pingable people" {
run_smoke basecamp people pingable --json
assert_success
assert_json_value '.ok' 'true'
}

@test "auth token shows current token" {
run_smoke basecamp auth token --json
assert_success
assert_json_value '.ok' 'true'
}
39 changes: 39 additions & 0 deletions e2e/smoke/smoke_assign.bats
@@ -0,0 +1,39 @@
#!/usr/bin/env bats
# smoke_assign.bats - Level 1: Assign and unassign operations

load smoke_helper

setup_file() {
ensure_token || return 1
ensure_project || return 1
ensure_todolist || return 1
}

@test "assign assigns a person to a todo" {
# Create a fresh todo for assignment
local todo_out
todo_out=$(basecamp todo "Assign target $(date +%s)" --list "$QA_TODOLIST" -p "$QA_PROJECT" --json 2>/dev/null) || {
mark_unverifiable "Cannot create todo for assign test"
return
}
local todo_id
todo_id=$(echo "$todo_out" | jq -r '.data.id // empty')
[[ -n "$todo_id" ]] || mark_unverifiable "No todo ID returned"

echo "$todo_id" > "$BATS_FILE_TMPDIR/assign_todo_id"

run_smoke basecamp assign "$todo_id" --to me -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
}

@test "unassign removes a person from a todo" {
local id_file="$BATS_FILE_TMPDIR/assign_todo_id"
[[ -f "$id_file" ]] || mark_unverifiable "No todo created in prior test"
local todo_id
todo_id=$(<"$id_file")

run_smoke basecamp unassign "$todo_id" --from me -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
}
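
The two tests above hand a todo ID from one `@test` to the next through a file in `$BATS_FILE_TMPDIR`. A minimal sketch of that hand-off as reusable helpers is shown here; `save_id`, `load_id`, and `state_dir` are names introduced for illustration and are not assumed to exist in `smoke_helper`.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for passing an ID between tests in the same file.
# BATS_FILE_TMPDIR is provided by bats; the mktemp fallback only exists so
# the sketch also runs outside a bats session.
state_dir="${BATS_FILE_TMPDIR:-$(mktemp -d)}"

save_id() {   # save_id KEY VALUE
  printf '%s\n' "$2" > "$state_dir/$1"
}

load_id() {   # load_id KEY: prints the value, returns 1 if nothing was saved
  [ -s "$state_dir/$1" ] || return 1
  cat "$state_dir/$1"
}
```

A later test could then call `todo_id=$(load_id assign_todo_id) || mark_unverifiable "No todo created in prior test"`. The pattern relies on the tests in a file running in order, which matches the serial Level 1 run described in `run_smoke.sh`.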
45 changes: 45 additions & 0 deletions e2e/smoke/smoke_campfire.bats
@@ -0,0 +1,45 @@
#!/usr/bin/env bats
# smoke_campfire.bats - Level 0/1: Campfire (chat) operations

load smoke_helper

setup_file() {
ensure_token || return 1
ensure_project || return 1
ensure_campfire || return 1
}

@test "campfire list returns campfires" {
run_smoke basecamp campfire list -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
}

@test "campfire messages returns lines" {
run_smoke basecamp campfire messages --chat "$QA_CAMPFIRE" -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
}

@test "campfire post creates a message" {
run_smoke basecamp campfire post "Smoke test $(date +%s)" \
--chat "$QA_CAMPFIRE" -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
assert_json_not_null '.data.id'

echo "$output" | jq -r '.data.id' > "$BATS_FILE_TMPDIR/campfire_line_id"
}

@test "campfire line shows a message" {
local id_file="$BATS_FILE_TMPDIR/campfire_line_id"
[[ -f "$id_file" ]] || mark_unverifiable "No campfire line created in prior test"
local line_id
line_id=$(<"$id_file")

run_smoke basecamp campfire line "$line_id" \
--chat "$QA_CAMPFIRE" -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
assert_json_not_null '.data.id'
}
22 changes: 21 additions & 1 deletion e2e/smoke/smoke_cards_read.bats
@@ -10,7 +10,27 @@ setup_file() {
}

@test "cards list returns cards" {
run_smoke basecamp cards list -p "$QA_PROJECT" --json
run_smoke basecamp cards list --card-table "$QA_CARDTABLE" -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
}

@test "cards columns returns columns" {
run_smoke basecamp cards columns --card-table "$QA_CARDTABLE" -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
}

@test "cards show returns card detail" {
# Discover a card from the list
local out
out=$(basecamp cards list --card-table "$QA_CARDTABLE" -p "$QA_PROJECT" --json 2>/dev/null) || mark_unverifiable "Cannot list cards"
local card_id
card_id=$(echo "$out" | jq -r '.data[0].id // empty')
[[ -n "$card_id" ]] || mark_unverifiable "No cards in project"

run_smoke basecamp cards show "$card_id" -p "$QA_PROJECT" --json
assert_success
assert_json_value '.ok' 'true'
assert_json_not_null '.data.id'
}