Skip to content

test(e2e): fix broken Playwright tests on main (#1235 example)#1236

Open
JMBattista wants to merge 1 commit into
asheshgoplani:mainfrom
JMBattista:fix/playwright-broken-e2e
Open

test(e2e): fix broken Playwright tests on main (#1235 example)#1236
JMBattista wants to merge 1 commit into
asheshgoplani:mainfrom
JMBattista:fix/playwright-broken-e2e

Conversation

@JMBattista
Copy link
Copy Markdown
Contributor

@JMBattista JMBattista commented May 31, 2026

Summary

The playwright (e2e + screenshots) job has been failing for a while on main. It surfaces on #1235 (run), but the failures are pre-existing — not introduced by #1235; #1235 was just the discovery vector. Reproduced against clean main (f6bc0f5) before touching anything.

8 failing tests, four distinct root causes — one product bug, one test-fixture bug, and two stale test/baseline expectations. No test was skipped/.fixme'd; no assertion was weakened.

Failures, root causes & fixes

Product bug — Skills tab wired to a stub (3 tests)

  • skills.spec.js:79 UI: Skills tab lists the seeded attachment and catalog
  • skills.spec.js:92 UI: clicking Attach moves a skill from catalog to attached
  • skills.spec.js:102 UI: clicking Detach removes a skill from the attached list

Root cause: PR #1124 (c709400a), while wiring McpPane in place of the MCP stub, also clobbered the SkillsPane wiring that #1122 had added — reverting the Skills tab to a StubPane. The fully-functional SkillsPane component (panes/SkillsPane.js, with all the data-testid hooks the tests expect) was left orphaned and never mounted. gotoSkills() timed out waiting for [data-testid="skills-pane"], which the stub never renders. This is a real user-facing regression — the web Skills tab showed a "managed in the TUI today" placeholder instead of the working pane.

Fix (product, AppShell.js):

-    ${tab === 'skills'    && html`<${StubPane} title="Skills"
-                              message="Skill attachments are managed in the TUI today. The web API does not expose skill management yet."
-                              hotkey="s"/>`}
+    ${tab === 'skills'    && html`<${SkillsPane}/>`}

(plus restoring the import { SkillsPane }.)

Test-fixture bug — MCP state leaked across tests (2 tests)

  • mcps.spec.js:38 GET /api/sessions/:id/mcps returns empty arrays for fresh fixture
  • mcps.spec.js:123 PATCH on un-attached MCP returns 404

Root cause: /__fixture/reset reseeded the session store but never reset the in-memory fixtureMCPManager — its Reset() method was dead code. The MCP describe runs mode: serial, so attachments from earlier cases leaked into later ones (and, on Playwright retries, the whole serial block re-runs against already-polluted state). By the time PATCH on un-attached MCP ran, exa was still attached from the pooled:true case → got 200, not 404. On retry, empty arrays for fresh fixture then saw the leaked attachments too. Directly reproduced: mcps.spec.js:123 fails on a clean first pass.

Fix (fixture, web-fixture/main.go): hold the MCP manager on the store and call its existing Reset() from the reset handler.

-		s.seed()
+		s.seed()
+		if s.mcpMgr != nil {
+			s.mcpMgr.Reset()
+		}

Stale tests — StopClose rename (2 tests)

  • keyboard-parity.spec.js:49 ? opens the keyboard shortcuts overlay
  • keyboard-parity.spec.js:140 Shift+D opens the stop-session confirm dialog

Root cause: PR #1129 (5b0dae2a, "non-destructive Close + Undo Delete") reworked Shift+D from "Stop" into a non-destructive "Close session" (kills the process, keeps metadata) and reworded both the overlay binding label and the confirm-dialog copy. The specs still asserted the old "Stop" wording, so the assertions timed out (~6s) even though the overlay/dialog opened fine. Test bug — handler is correct; specs were stale.

-    await expect(overlay).toContainText('Stop focused session')
+    await expect(overlay).toContainText('Close focused session')
...
-    await expect(dialog).toContainText(/stop session/i)
+    await expect(dialog).toContainText(/close session/i)

Stale baseline — shortcuts overlay screenshot (1 test)

  • keyboard-parity.spec.js:191 shortcuts overlay renders consistently

Root cause: the committed baseline predated #1129 — it shows Shift+D → "Stop focused session" and lacks the Ctrl+Z → Undo last delete row that #1129 added. The diff was a clean structural shift (unchanged rows aligned as exact offset copies — i.e. font rendering matches the baseline's environment, not an env drift). Refreshed the desktop + tablet baselines only after confirming the overlay renders correctly (the ? test passes with the current labels). The tablet baseline was stale for the same reason.

Verification

Reproduced all failures on clean main, then verified the fixes locally (Linux chromium, desktop + tablet):

  • All 8 target tests pass. Focused run of keyboard-parity + mcps + skills on desktop+tablet: 66 passed.
  • Full suite: 380 passed.

Note: 9 chromium-phone failures (children-panel / close-undo / mcps-UI / skills-UI) are pre-existing on clean main (identical count with my changes stashed), are absent from the #1235 desktop-only failure set (CI's proper --with-deps browser install renders the 393px responsive layout fine; my locally hand-extracted libs don't), and are out of scope / untouched here.

Non-goals respected

No new tests/coverage; no skips/.fixme; no baseline refreshed on broken UI; no other CI gate widened; the perf-suite untouched. The only Go touched is the test fixture (tests/web/fixtures/…), and the only product change is the one-line SkillsPane re-wire fixing a real regression.

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Skills management interface is now functional and available in the app.
  • UI/UX Updates

    • Updated session action terminology from "stop" to "close" across dialogs and keyboard shortcuts overlay.

…scovery)

Stabilizes 8 pre-existing Playwright e2e failures surfaced (not caused) by
PR asheshgoplani#1235's `playwright (e2e + screenshots)` job. Mix of one product
regression, one test-fixture bug, and stale test/baseline expectations.

Product bug — Skills tab wired to a stub (skills.spec.js UI tests):
PR asheshgoplani#1124 (c709400), while wiring McpPane in place of the MCP stub, also
clobbered the SkillsPane wiring that asheshgoplani#1122 had added, reverting the Skills
tab to a StubPane. The fully-functional SkillsPane component was orphaned.
Restored its import + mount in AppShell.js. This was a real user-facing
regression (web Skills tab showed a placeholder instead of the working pane).

Fixture bug — MCP state leaked across tests (mcps.spec.js asheshgoplani#38, asheshgoplani#123):
/__fixture/reset reseeded the session store but never reset the in-memory
fixtureMCPManager, whose Reset() method was dead code. Attachments leaked
across the serially-run MCP cases (and across Playwright retries, which
re-run the whole serial block), so "fresh fixture → empty arrays" and
"un-attached → 404" saw stale attachments. Wired Reset() into the handler.

Stale tests — Stop→Close rename (keyboard-parity.spec.js asheshgoplani#49, asheshgoplani#140):
asheshgoplani#1129 (5b0dae2) reworked Shift+D into a non-destructive "Close session"
(kill process, keep metadata) and reworded the overlay label + confirm
dialog accordingly. The specs still asserted the old "Stop" copy. Updated
the two assertions to match, with comments citing the commit.

Stale baseline — shortcuts overlay screenshot (keyboard-parity.spec.js asheshgoplani#191):
The committed baseline predated asheshgoplani#1129: it shows "Stop focused session" and
lacks the "Ctrl+Z Undo last delete" row. Refreshed desktop + tablet
baselines only after confirming the overlay renders correctly (the `?` test
passes with current labels). Tablet baseline was stale for the same reason.

Verified locally (chromium desktop + tablet): all 8 target tests pass; full
suite 380 passed. (9 chromium-phone failures are pre-existing on clean main,
absent from the CI asheshgoplani#1235 desktop-only failure set, and untouched here.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 31, 2026 17:51
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 31, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 97c91d7d-79a2-45d0-b38c-29a5e63047f8

📥 Commits

Reviewing files that changed from the base of the PR and between 23887fc and d765770.

⛔ Files ignored due to path filters (2)
  • tests/web/screenshots/keyboard-parity.spec.js/shortcuts-overlay-chromium-desktop.png is excluded by !**/*.png
  • tests/web/screenshots/keyboard-parity.spec.js/shortcuts-overlay-chromium-tablet.png is excluded by !**/*.png
📒 Files selected for processing (3)
  • internal/web/static/app/AppShell.js
  • tests/web/e2e/keyboard-parity.spec.js
  • tests/web/fixtures/cmd/web-fixture/main.go

📝 Walkthrough

Walkthrough

The PR replaces a SkillsPane placeholder in the app shell, introduces persistent MCP manager state in the test fixture with reset support, and updates keyboard-parity test assertions to match new session close-action wording.

Changes

UI component, fixture state management, and test assertion updates

Layer / File(s) Summary
SkillsPane component integration in AppShell
internal/web/static/app/AppShell.js
AppShell imports SkillsPane and updates the skills tab to render it instead of the StubPane placeholder.
Persistent MCP manager in test fixture with reset support
tests/web/fixtures/cmd/web-fixture/main.go
fixtureStore gains an mcpMgr field; main() creates and stores a single MCP manager instance and passes it to the HTTP server; /__fixture/reset endpoint now resets the MCP manager state alongside other fixture state.
Keyboard-parity test assertions for close-session wording
tests/web/e2e/keyboard-parity.spec.js
Test assertions for keyboard shortcuts and confirmation dialogs are updated to expect "Close focused session" and "close session" copy instead of the former "Stop focused session" wording.

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly Related PRs

  • asheshgoplani/agent-deck#1122: Both PRs modify internal/web/static/app/AppShell.js to replace the StubPane placeholder with the real SkillsPane when activeTab is 'skills'.
  • asheshgoplani/agent-deck#1090: Updates to keyboard-parity test assertions for "close focused session" wording correspond directly to copy changes introduced in this PR.
  • asheshgoplani/agent-deck#1124: Test fixture MCP manager state management changes directly support the MCP management endpoints and UI introduced in that PR.
🚥 Pre-merge checks | ✅ 6 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 70.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (6 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title follows the Conventional Commits format with the 'test' type prefix, clearly indicating the changes relate to test improvements.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Remote_parity ✅ Passed PR changes only affect internal/web/ (web UI) and test files, not TUI behavior in internal/ui/ or cmd/agent-deck/, so remote_parity check does not apply.
Test_coverage_per_surface ✅ Passed PR fixes a regression by restoring SkillsPane (which already had tests from PR #1122), not introducing new features. StubPane removal aligns with the check's intent.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant