fix: browser capture end-to-end pipeline #15

Open
abrichr wants to merge 4 commits into main from fix/browser-capture-e2e

abrichr commented Mar 8, 2026

Summary

  • Fix 3 bugs that prevented browser events from being captured and parsed end-to-end
  • background.js: content script sends USER_EVENT but background only relayed DOM_EVENT — events silently dropped
  • background.js: handleSetMode only read message.payload?.mode but recorder sends flat {mode: "record"} — mode never set, listeners never attached
  • browser_events.py: the enum used a "browser.click" prefix format, but the content script sends raw DOM names such as "click". The prefix was an artificial convention introduced during the port and never tested end-to-end; the enum values now match legacy OpenAdapt and the actual content script
  • Add BrowserMouseMoveEvent type, CaptureSession.browser_events() API, --browser-events CLI flag, and 15 e2e tests
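The two background.js fixes come down to tolerant message handling. The sketch below models that logic in Python only to keep the examples in this thread in one language; the real handler is JavaScript. The `type` key, the `RELAYED_TYPES` set, and both function names are assumptions for illustration, not code from this PR.

```python
from typing import Any, Mapping, Optional

# Assumption: messages carry their kind under a "type" key.
RELAYED_TYPES = {"DOM_EVENT", "USER_EVENT"}


def should_relay(message: Mapping[str, Any]) -> bool:
    """Return True for message kinds that must be forwarded to the recorder."""
    return message.get("type") in RELAYED_TYPES


def extract_mode(message: Mapping[str, Any]) -> Optional[str]:
    """Read the mode from {"payload": {"mode": ...}} or from flat {"mode": ...}."""
    payload = message.get("payload")
    if isinstance(payload, Mapping) and payload.get("mode") is not None:
        return payload["mode"]
    # Fall back to the flat shape the recorder sends.
    return message.get("mode")
```

Accepting both shapes means a recorder that sends the flat form and any caller that still nests the mode under `payload` are both handled.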

Test plan

  • 15 e2e tests pass (both canonical DB format and raw content-script format)
  • 147/147 full test suite passes
  • Live recording verified: 84/84 events captured and parsed from Chrome extension on Hacker News (clicks, key presses, scrolls, mouse moves)
  • CI passes

🤖 Generated with Claude Code

abrichr and others added 4 commits March 8, 2026 17:38
Three bugs prevented browser events from being captured and parsed:

1. background.js only relayed DOM_EVENT messages but the content script
   sends USER_EVENT — events were silently dropped.

2. background.js handleSetMode only read message.payload?.mode but the
   recorder sends flat {mode: "record"} — mode was never set to "record"
   so the content script never attached record listeners.

3. The BrowserEventType enum used "browser.click" prefix format but the
   content script sends raw DOM event names ("click", "keydown", etc.).
   This was an artificial convention introduced during the port from
   legacy OpenAdapt that was never tested end-to-end. Legacy used raw
   names throughout.
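A minimal sketch of what bug 3 implies for the enum, assuming it is a string-valued `Enum`. Only `"click"` and `"keydown"` are named above; the `"scroll"` and `"mousemove"` members and the `parse_event_type` helper are hypothetical and may not match browser_events.py.

```python
from enum import Enum
from typing import Optional


class BrowserEventType(str, Enum):
    # Values are raw DOM event names, with no "browser." prefix.
    CLICK = "click"
    KEYDOWN = "keydown"
    SCROLL = "scroll"        # assumed member
    MOUSEMOVE = "mousemove"  # assumed member


def parse_event_type(raw_name: str) -> Optional[BrowserEventType]:
    """Map a content-script event name to the enum, or None if unknown."""
    try:
        return BrowserEventType(raw_name)
    except ValueError:
        return None
```

With raw names as values, `parse_event_type("click")` resolves directly, while the old prefixed form `"browser.click"` no longer matches any member.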

Changes:
- background.js: add USER_EVENT relay, fix SET_MODE format handling
- browser_events.py: change enum values to raw DOM names matching the
  content script and legacy OpenAdapt, add BrowserMouseMoveEvent
- capture.py: add _parse_element_ref() and rewrite _convert_browser_event()
  to handle actual content-script message format including the recorder's
  {"message": <raw>} wrapper, add browser_events() and browser_event_count
  to CaptureSession
- cli.py: add --browser-events flag to record, show browser event breakdown
  in info command
- tests: add 15 e2e tests covering both DB roundtrip and raw content-script
  format parsing
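The wrapper handling described for capture.py might look roughly like the sketch below. It assumes a stored row is either the raw content-script message or that message nested under a "message" key. The `eventType` and `timestamp` field names and both function names are hypothetical, chosen only to make the example runnable.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)


def unwrap_message(stored: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the raw message, peeling off a {"message": <raw>} wrapper if present."""
    inner = stored.get("message")
    return inner if isinstance(inner, Mapping) else stored


def convert_browser_event(stored: Mapping[str, Any]) -> Optional[dict]:
    """Convert one stored row to a small dict, or None if it cannot be parsed."""
    try:
        raw = unwrap_message(stored)
        return {
            "event_type": raw["eventType"],        # assumed field name
            "timestamp": float(raw["timestamp"]),  # assumed field name
        }
    except (KeyError, TypeError, ValueError) as exc:
        # Log at debug level instead of swallowing the error silently.
        logger.debug("Skipping unparseable browser event: %r", exc)
        return None
```

Returning `None` for unparseable rows lets a caller skip them while still leaving a debug trace of what was dropped.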

Verified with live recording: 84/84 events captured and parsed from
Chrome extension on Hacker News.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace bare except with debug logging in _convert_browser_event
- Move lazy imports to module level (BoundingBox, ElementState, etc.)
- Remove unused imports (pytest, Recording) from test file
- Rename test classes to describe the structure under test rather than the removed format
- Fix stale docstring in _parse_element_ref

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>