t1334: Diagnose t003.5 repeated worker deaths during cross-browser testing #2283
marcusquinn merged 1 commit into main
Root cause of t003.5 repeated worker deaths: the E2E test at responsive.spec.ts:672 measured <main> (SidebarInset) instead of the DashboardInset div for the max-w-7xl constraint check. On a 1920px viewport, <main> fills the available space after the sidebar (>1280px), so the assertion always failed. Workers consumed their context windows trying to debug responsive CSS when the fix was a 4-line test selector change. The fix was applied to the awardsapp feature/t003.5 branch (pushed to GitHub). Self-improvement: added a CI failure debugging protocol to full-loop.md to prevent future workers from skipping CI log analysis.
Caution: Review failed
The pull request is closed.
ℹ️ Recent review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
Walkthrough
Adds two documentation sections to the full-loop guidance: CI failure debugging steps (t1334) and mandatory headless dispatch rules (t158/t174) for supervisor-dispatched workers, with both sections duplicated across locations in the file.
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Possibly related PRs
Summary of Changes
Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request addresses a critical issue of repeated worker deaths during cross-browser testing with a two-pronged solution. It corrects an erroneous E2E test assertion that was causing consistent failures, and it establishes a clearer protocol for debugging CI failures, aiming to improve efficiency and prevent future misdiagnoses.
Highlights
Changelog
Activity
🔍 Code Quality Report
[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:
📈 Current Quality Metrics
Generated on: Wed Feb 25 09:07:14 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Code Review
This pull request adds valuable guidance to the worker protocol for debugging CI failures, instructing workers to analyze logs before attempting code changes. This is a great step towards preventing context exhaustion and blind debugging. I've suggested an improvement to the provided gh command to make the log analysis workflow more robust, automated, and simpler.
# 1. Identify the failing job
gh pr checks <PR_NUMBER> --repo <owner/repo>

# 2. Get the run ID and read failure logs
gh run view <RUN_ID> --repo <owner/repo> --log | grep -iE 'FAIL|Error.*spec|expect.*received'
This suggested change automates the process of finding the failing run ID and viewing its logs, making the debugging workflow faster and more reliable.
gh run list combined with jq directly retrieves the ID of a failing run, avoiding the need to manually inspect the output of gh pr checks and extract the ID from a URL.
gh run view --log-failed is a more robust way to view failure logs, as it shows the full context for failed jobs without relying on grep patterns, which might miss important details.
Original:
# 1. Identify the failing job
gh pr checks <PR_NUMBER> --repo <owner/repo>
# 2. Get the run ID and read failure logs
gh run view <RUN_ID> --repo <owner/repo> --log | grep -iE 'FAIL|Error.*spec|expect.*received'

Suggested:
# 1. Get the ID of the first failing run
RUN_ID=$(gh run list --pr <PR_NUMBER> --repo <owner/repo> --json databaseId,conclusion --jq '.[] | select(.conclusion=="failure") | .databaseId' | head -n 1)
# 2. Read failure logs for that run
gh run view "$RUN_ID" --repo <owner/repo> --log-failed
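A possible variant of the run lookup, offered as a sketch rather than the protocol's actual wording. To my knowledge `gh run list` filters by `--branch` rather than by PR number, so this version first resolves the PR's head branch; confirm the flags with `gh run list --help` on your installed version. The function names `latest_failed_run_id` and `show_failed_logs` are hypothetical.

```shell
# Sketch only: both function names are hypothetical helpers, not part of gh.

# Print the databaseId of the most recent failed run on the PR's head branch.
latest_failed_run_id() {
  local pr="$1" repo="$2" branch
  # Resolve the PR's head branch, since runs are listed per branch.
  branch="$(gh pr view "$pr" --repo "$repo" --json headRefName --jq '.headRefName')" || return 1
  gh run list --repo "$repo" --branch "$branch" --status failure --limit 1 \
    --json databaseId --jq '.[0].databaseId'
}

# Show only the logs of failed steps for that run.
show_failed_logs() {
  local pr="$1" repo="$2" run_id
  run_id="$(latest_failed_run_id "$pr" "$repo")" || return 1
  # An empty list can yield an empty string or "null"; treat both as "none".
  if [ -z "$run_id" ] || [ "$run_id" = "null" ]; then
    echo "no failed runs found for PR $pr" >&2
    return 1
  fi
  gh run view "$run_id" --repo "$repo" --log-failed
}
```

Usage would look like `show_failed_logs <PR_NUMBER> <owner/repo>`. The explicit empty check avoids passing a blank run ID to `gh run view`, which would otherwise fall back to an interactive prompt or an error.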
Flagged for Human Review
Reason: PR #2283 (t1334: diagnose t003.5 worker deaths) is open with pending CI. t1334 completed and identified the root cause of the repeated t003.5 failures. Human review is needed to merge the findings and unblock t003.5.
This issue has been flagged by the AI supervisor for human review. Please assess and take appropriate action.
Flagged by AI Supervisor (automated reasoning cycle)
Flagged for Human Review
Reason: PR #2283 (t1334: Diagnose t003.5 worker deaths) is OPEN with PENDING CI and 0 approvals. This PR contains the root cause analysis and fix for the recurring t003.5 failures that have wasted 4+ worker sessions. It should be reviewed and merged promptly to unblock t003.5 and prevent further token waste. The investigation found CI infrastructure issues that affect all cross-browser testing in a managed private repo.
This issue has been flagged by the AI supervisor for human review. Please assess and take appropriate action.
Flagged by AI Supervisor (automated reasoning cycle)
🤖 Augment PR Summary
Summary: This PR documents a CI-first debugging workflow to prevent agents from burning context on blind investigation when PR checks are failing.
Changes:
gh pr checks <PR_NUMBER> --repo <owner/repo>

# 2. Get the run ID and read failure logs
gh run view <RUN_ID> --repo <owner/repo> --log | grep -iE 'FAIL|Error.*spec|expect.*received'
gh run view … --log | grep … can produce no output for many CI failures (setup/env errors, timeouts, non-test failures), which may mislead workers into thinking the logs are "empty." Consider noting that if the filter returns nothing, they should inspect the full unfiltered log to find the real failure signal.
Severity: low
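One way to act on this comment, sketched under the assumption that the worker pipes the log through a small wrapper: filter for test-failure lines, and if nothing matches, print the whole log instead of returning empty output. The helper name `filter_ci_log` is hypothetical; the default pattern is the one from the snippet under review.

```shell
# Hypothetical helper: read a CI log on stdin, print lines matching the
# failure pattern, and fall back to the full log when nothing matches.
filter_ci_log() {
  local pattern="${1:-FAIL|Error.*spec|expect.*received}"
  local log
  log="$(cat)"
  # grep exits 0 only if at least one line matched (and was printed).
  if printf '%s\n' "$log" | grep -iE "$pattern"; then
    return 0
  fi
  # No match: setup errors, timeouts, and non-test failures land here.
  echo "[filter_ci_log] no lines matched; showing the full log" >&2
  printf '%s\n' "$log"
}

# Intended usage (not executed here):
#   gh run view <RUN_ID> --repo <owner/repo> --log | filter_ci_log
```

The notice goes to stderr so that the fallback output on stdout stays identical to the unfiltered log.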
🤖 Was this useful? React with 👍 or 👎, or 🚀 if it prevented an incident/outage.



Summary
Root cause investigation for t003.5 (cross-browser and device testing), which failed 3+ consecutive times with worker_process_died_mid_task_pr_open_unmerged at both sonnet and opus tiers.
Root Cause
Single failing test: e2e/responsive.spec.ts:672, "dashboard inset constrains content width on wide desktop"
The test measured the <main> element (which is SidebarInset and fills the available space after the sidebar) instead of the DashboardInset div inside it (which has max-w-7xl). On a 1920px viewport, <main> is wider than 1280px by design, so the assertion expect(mainBox.width).toBeLessThanOrEqual(1280) always failed.
Why workers died: Workers consumed their entire context windows trying to debug responsive CSS and test infrastructure when the fix was a 4-line test selector change. They never read the CI logs first to identify the exact failing assertion.
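The arithmetic behind the failure can be worked through in a few lines. This is an illustration only: the sidebar width below is an assumed value (the source does not state it), and padding and margins are ignored. The 1280px limit follows from Tailwind's max-w-7xl being 80rem at a 16px root font size.

```shell
# Worked example of the width check. SIDEBAR_PX is an assumption for
# illustration; the real sidebar width is not given in the source.
VIEWPORT_PX=1920
SIDEBAR_PX=256        # assumed
MAX_W_7XL_PX=1280     # max-w-7xl = 80rem = 1280px at a 16px root font size

# <main> (SidebarInset) fills whatever the sidebar leaves over.
main_width=$((VIEWPORT_PX - SIDEBAR_PX))
# DashboardInset is capped by max-w-7xl.
inset_width=$(( main_width < MAX_W_7XL_PX ? main_width : MAX_W_7XL_PX ))

# Mirrors the toBeLessThanOrEqual(1280) assertion.
within_limit() { [ "$1" -le "$MAX_W_7XL_PX" ]; }

if within_limit "$main_width"; then echo "main: pass"; else echo "main: fail"; fi
if within_limit "$inset_width"; then echo "inset: pass"; else echo "inset: fail"; fi
```

With these numbers `<main>` is 1664px wide and fails the check, while the capped inset is 1280px and passes, which matches the behaviour described above: the assertion was sound, but it was pointed at the wrong element.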
Fixes Applied
1. Test fix (a managed private repo, feature/t003.5 branch)
Changed the test to measure main.locator("> div").first() (DashboardInset) instead of main (SidebarInset).
2. Self-improvement (this repo)
Added "CI failure debugging" guidance to the .agents/scripts/commands/full-loop.md worker protocol. Workers must now read CI logs via gh run view before attempting code changes on CI failure tasks.
Investigation Findings
CI Status
17d4e960 to feature/t003.5)
Ref #2282
Summary by CodeRabbit