Shard Electron E2E workflow under CI time cap #2070

Merged: nwparker merged 1 commit into main from nwparker/e2e-test-time on May 16, 2026

nwparker commented on May 16, 2026

Summary

This PR brings the Electron E2E workflow back under the CI wall-clock cap by sharding the existing Playwright suite across five GitHub-hosted Linux runners while preserving the current per-runner worker count. It also includes the startup/reliability fixes found during review so the sharded Linux jobs can reuse a single build artifact instead of doing five competing Electron builds.

The release-blocking problem was elapsed CI time. The referenced release run entered "Run E2E tests" at 2026-05-16T04:01:33Z and was still running past the 30-minute cap. Increasing Playwright workers on one runner is risky for this suite because each worker launches isolated Electron app instances. This PR adds horizontal CI parallelism instead.

What Changed

CI workflow

  • Added a build e2e app prerequisite job in .github/workflows/e2e.yml.
  • The build job mirrors the Linux native-toolchain and node-gyp setup from the existing jobs, installs dependencies, runs npx electron-vite build --mode e2e, and uploads out/ as the short-lived e2e-build-out artifact.
  • Converted the E2E job to a 5-way Playwright shard matrix:
    • 1/5 (e2e 1-of-5)
    • 2/5 (e2e 2-of-5)
    • 3/5 (e2e 3-of-5)
    • 4/5 (e2e 4-of-5)
    • 5/5 (e2e 5-of-5)
  • Each shard downloads the shared build artifact and runs with SKIP_BUILD=1, so Playwright global setup reuses the prebuilt app instead of launching five concurrent electron-vite builds.
  • Kept the existing Playwright workers: 4 setting. The PR adds runner-level parallelism without increasing per-runner Electron concurrency.
  • Kept fail-fast: false so one shard failure does not hide failures from the other shards.
  • Added shard-specific trace artifact names such as playwright-traces-3-of-5.
  • Set build timeout to 10 minutes and shard timeout to 20 minutes.
  • Shards now run under xvfb-run with ORCA_E2E_FORWARD_APP_LOGS=1, so Electron stdout/stderr is visible when the app launches but fails before firstWindow().
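
As a rough illustration of the shard matrix described above, the sketch below derives the Playwright `--shard` argument, job name, and trace artifact name for each shard. `ShardJob` and `buildShardJobs` are hypothetical names used only for this example; the actual matrix is declared in .github/workflows/e2e.yml and may be structured differently.

```typescript
// Hypothetical helper for illustration; not taken from the PR's workflow file.
interface ShardJob {
  shard: string;         // value for Playwright's --shard flag, e.g. "3/5"
  jobName: string;       // e.g. "e2e 3-of-5"
  traceArtifact: string; // e.g. "playwright-traces-3-of-5"
  cliArgs: string[];     // extra arguments passed to the Playwright CLI
}

function buildShardJobs(total: number): ShardJob[] {
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError(`total must be a positive integer, got ${total}`);
  }
  return Array.from({ length: total }, (_, i) => {
    const index = i + 1; // Playwright shard indices are 1-based
    const label = `${index}-of-${total}`;
    return {
      shard: `${index}/${total}`,
      jobName: `e2e ${label}`,
      traceArtifact: `playwright-traces-${label}`,
      // The per-runner worker count stays in the Playwright config, so only --shard is added here.
      cliArgs: [`--shard=${index}/${total}`],
    };
  });
}

const shardJobs = buildShardJobs(5);
```

Because the artifact name is derived from the shard label, parallel shards that fail never try to upload traces under the same name.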

E2E startup hardening

  • Added src/main/electron-updater-loader.ts and moved electron-updater behind the packaged-update guards.
  • Passed the loaded updater instance into registerAutoUpdaterHandlers() so src/main/updater-events.ts no longer imports electron-updater at module load.
  • This fixes the CI startup failure where direct Electron E2E launches reported app version 0.0; electron-updater validates app.getVersion() while loading, before the prior dev-mode guard could return.
  • Added regression coverage proving dev/E2E setup does not configure or load the updater path.
  • Disabled the Electron GPU process only for Linux E2E launches, keyed off the existing ORCA_E2E_USER_DATA_DIR signal. Ubuntu/Xvfb runners were hitting "GPU process isn't usable" errors, and E2E coverage does not depend on GPU compositing.
  • Added unit coverage for the Linux E2E GPU branch.
  • Removed the attempted CI ORCA_E2E_FORCE_HEADFUL=1 override after logs showed it was not the root fix and could expose extra Xvfb/GPU instability.
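
The lazy-loading idea above can be sketched as follows. This is a minimal illustration and not the contents of src/main/electron-updater-loader.ts: `UpdaterLike`, `UpdaterLoaderDeps`, and `loadUpdaterIfPackaged` are hypothetical names, and the use of a single `isPackaged` flag as the guard is an assumption. The point is only that the guard runs before the module is evaluated, so a dev or E2E launch never triggers the updater's load-time version check.

```typescript
// Hypothetical sketch; names and the exact guard condition are assumptions.
interface UpdaterLike {
  checkForUpdates(): Promise<unknown>;
}

interface UpdaterLoaderDeps {
  isPackaged: boolean;               // stand-in for the packaged-update guard
  requireUpdater: () => UpdaterLike; // stand-in for a deferred load of electron-updater
}

function loadUpdaterIfPackaged(deps: UpdaterLoaderDeps): UpdaterLike | null {
  // Guard first: dev and direct E2E launches return before the module is loaded.
  if (!deps.isPackaged) return null;
  return deps.requireUpdater();
}

// Usage: count how often the (fake) updater module gets loaded.
let updaterLoads = 0;
const fakeRequireUpdater = (): UpdaterLike => {
  updaterLoads += 1;
  return { checkForUpdates: async () => undefined };
};

const devUpdater = loadUpdaterIfPackaged({ isPackaged: false, requireUpdater: fakeRequireUpdater });
const loadsAfterDevLaunch = updaterLoads; // still 0: the loader was never invoked
const packagedUpdater = loadUpdaterIfPackaged({ isPackaged: true, requireUpdater: fakeRequireUpdater });
```

Injecting the loader as a function also makes a regression test of the "updater is never loaded in dev/E2E" property straightforward, since the test can assert on the call count.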

E2E reliability fixes

  • Added tests/e2e/helpers/terminal-title-log.ts to observe transient terminal title changes from the renderer.
  • Updated Droid notification and terminal attention specs to assert against the observed title log instead of only polling the current active tab title. This removes a race where the expected state appears briefly and transitions again before polling samples it.
  • Updated the Source Control create-PR test to target the Commit message textbox by accessible role/name instead of a brittle ambiguous selector.
  • Updated the usage overview test to open Stats & Usage through the visible Settings navigation item, matching the exposed UI path.
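
To illustrate the race that the title log removes, here is a small sketch of the idea. It is not the code in tests/e2e/helpers/terminal-title-log.ts: `TitleLog` is a hypothetical class, and the example titles are made up. Polling only ever sees the current title, while a log of every observed change can still answer "did this state ever appear?" after the title has moved on.

```typescript
// Hypothetical sketch of an observed-title log; the real helper may differ.
class TitleLog {
  private readonly entries: string[] = [];

  // Called for every title change observed in the renderer.
  record(title: string): void {
    this.entries.push(title);
  }

  // True if any recorded title matched, even if it was later replaced.
  sawMatching(pattern: RegExp): boolean {
    return this.entries.some((title) => pattern.test(title));
  }

  // What a poll of the active tab would see right now.
  get current(): string | undefined {
    return this.entries[this.entries.length - 1];
  }
}

const titleLog = new TitleLog();
// Made-up sequence: an attention state appears briefly, then clears.
for (const title of ["zsh", "[!] zsh needs attention", "zsh"]) {
  titleLog.record(title);
}

const polledTitle = titleLog.current;                          // "zsh"
const attentionSeen = titleLog.sawMatching(/needs attention/); // true
```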

Why This Approach

  • It targets the CI timeout directly by splitting the 91-test Electron suite across five runners.
  • It avoids increasing Playwright workers on a single runner, where Electron app concurrency can exhaust memory.
  • It avoids the first naive shard design's hidden cost: each shard would have run Playwright global setup and performed its own Electron build. Building once and downloading out/ keeps setup deterministic and avoids five parallel builds competing for CPU/RAM.
  • It preserves full test coverage. No specs are skipped or removed.
  • It keeps debugging practical because each shard uploads its own trace bundle on failure and app-process startup failures now show stderr in the job log.
  • It keeps E2E-only startup hardening scoped behind existing E2E signals, so packaged user launches keep normal updater and GPU behavior.
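
The scoping in the last point can be expressed as a small pure function, sketched below under stated assumptions. `shouldDisableGpuForE2E` and `EnvVars` are hypothetical names, and the PR does not say which Electron call applies the decision (for example `app.disableHardwareAcceleration()` or a `disable-gpu` command-line switch), so that step is left out.

```typescript
// Hypothetical decision function; the PR's actual implementation may differ.
type EnvVars = Record<string, string | undefined>;

function shouldDisableGpuForE2E(platform: string, env: EnvVars): boolean {
  // ORCA_E2E_USER_DATA_DIR is the existing E2E signal named in the PR description.
  const isE2ELaunch = Boolean(env["ORCA_E2E_USER_DATA_DIR"]);
  return platform === "linux" && isE2ELaunch;
}

// Only the Linux E2E combination opts out of the GPU process.
const linuxE2E = shouldDisableGpuForE2E("linux", { ORCA_E2E_USER_DATA_DIR: "/tmp/orca-e2e" });
const macE2E = shouldDisableGpuForE2E("darwin", { ORCA_E2E_USER_DATA_DIR: "/tmp/orca-e2e" });
const linuxUser = shouldDisableGpuForE2E("linux", {});
```

Keeping the decision separate from the Electron call is one way to make the Linux E2E branch unit-testable without launching the app.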

Validation

Local validation on the final commit:

  • pnpm exec oxlint src/main/updater.ts src/main/updater-events.ts src/main/electron-updater-loader.ts src/main/updater.test.ts src/main/updater.check-failure.test.ts src/main/updater.mac-install.test.ts src/main/startup/configure-process.ts src/main/startup/configure-process.test.ts tests/e2e/global-setup.ts tests/e2e/helpers/orca-app.ts
  • pnpm exec vitest run src/main/updater.test.ts src/main/updater.check-failure.test.ts src/main/updater.mac-install.test.ts src/main/startup/configure-process.test.ts passed: 58 tests.
  • pnpm typecheck
  • npx electron-vite build --mode e2e
  • git diff --check
  • .github/workflows/e2e.yml parsed successfully as YAML.
  • CI=1 SKIP_BUILD=1 ORCA_E2E_FORWARD_APP_LOGS=1 pnpm run test:e2e -- tests/e2e/usage-overview.spec.ts --retries=0 passed: 1 Electron smoke test.

Earlier local shard validation for the same 5-way split:

Shard   Local result
1/5     19 passed in 30s
2/5     17 passed, 1 skipped in 25s
3/5     20 passed in 43s
4/5     16 passed in 53s
5/5     18 passed in 25s
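
As a quick cross-check of the table, the snippet below sums the local shard results. The numbers are copied from the table; `ShardResult` and the variable names are illustrative only. The durations are local test-run times, so they ignore build, artifact download, and runner startup overhead in CI.

```typescript
interface ShardResult {
  shard: string;
  passed: number;
  skipped: number;
  seconds: number;
}

// Figures copied from the local shard validation table.
const localShardResults: ShardResult[] = [
  { shard: "1/5", passed: 19, skipped: 0, seconds: 30 },
  { shard: "2/5", passed: 17, skipped: 1, seconds: 25 },
  { shard: "3/5", passed: 20, skipped: 0, seconds: 43 },
  { shard: "4/5", passed: 16, skipped: 0, seconds: 53 },
  { shard: "5/5", passed: 18, skipped: 0, seconds: 25 },
];

// Passed plus skipped tests across all shards: 91, matching the suite size.
const totalShardTests = localShardResults.reduce((n, r) => n + r.passed + r.skipped, 0);

// Time if the shards ran back to back: 176 seconds.
const summedShardSeconds = localShardResults.reduce((n, r) => n + r.seconds, 0);

// Time if the shards run in parallel is bounded by the slowest shard: 53 seconds.
const slowestShardSeconds = Math.max(...localShardResults.map((r) => r.seconds));
```

The gap between the summed and slowest durations is the wall-clock saving that parallel shards buy, at the cost of more total runner time.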

Targeted affected-spec validation also passed: 5 tests passed in 16.0s.

CI Review Notes

  • The initial sharding-only version passed verify, but each shard built Electron separately, which defeated the wall-clock goal.
  • The shared-build version confirmed shards downloaded e2e-build-out and skipped local builds, but Electron startup timed out before firstWindow().
  • App-log forwarding exposed two Linux CI blockers: electron-updater validating direct-launch version 0.0, and Ubuntu/Xvfb GPU-process crashes.
  • The final pushed version addresses those blockers by lazy-loading electron-updater only after packaged-update guards and disabling GPU only for Linux E2E runs.
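
The build-reuse behavior described in these notes can be sketched as a small decision in global setup. This is an assumption-laden illustration, not the code in tests/e2e/global-setup.ts: `ensureE2EBuild` and `BuildSetupDeps` are hypothetical, and whether the real setup verifies that out/ exists before skipping the build is not stated in the PR.

```typescript
// Hypothetical sketch of SKIP_BUILD handling; the real global setup may differ.
type SetupEnv = Record<string, string | undefined>;

interface BuildSetupDeps {
  env: SetupEnv;
  outDirExists: () => boolean; // stand-in for checking that out/ is present
  runBuild: () => void;        // stand-in for running `npx electron-vite build --mode e2e`
}

function ensureE2EBuild(deps: BuildSetupDeps): "reused" | "built" {
  if (deps.env["SKIP_BUILD"] === "1") {
    // Fail early with a clear message instead of a later Electron launch timeout.
    if (!deps.outDirExists()) {
      throw new Error("SKIP_BUILD=1 but out/ is missing; download the e2e-build-out artifact first");
    }
    return "reused";
  }
  deps.runBuild();
  return "built";
}

// Usage: a shard with the artifact downloaded never invokes the build.
let e2eBuildRuns = 0;
const shardSetupOutcome = ensureE2EBuild({
  env: { SKIP_BUILD: "1" },
  outDirExists: () => true,
  runBuild: () => { e2eBuildRuns += 1; },
});
```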

Risk and Follow-Up Notes

  • This increases total GitHub Actions runner usage because shards run in parallel, but it reduces wall-clock time and keeps each runner at the existing worker cap.
  • The workflow now depends on the e2e-build-out artifact between jobs. if-no-files-found: error makes missing build output fail in the build job instead of surfacing later as confusing shard failures.
  • The shard layout uses Playwright's built-in sharding, so future spec additions are automatically distributed without maintaining manual test lists.

nwparker force-pushed the nwparker/e2e-test-time branch 4 times, most recently from 2f03073 to 2526665 (May 16, 2026 06:26)
nwparker force-pushed the nwparker/e2e-test-time branch from 2526665 to 17af0d4 (May 16, 2026 06:45)
nwparker merged commit 6eae8fa into main on May 16, 2026
7 checks passed