test(e2e): add cli-detach-recovery, exec-attach, volume-readonly cases by G4614 · Pull Request #710 · boxlite-ai/boxlite

G4614 · 2026-06-09T13:34:19Z

add 5 e2e cases pinning REST contracts not covered today

Coverage gaps

Three classes of behaviour run through the SDK → API → Runner → VM
chain but had no e2e pin on the chain itself; bugs in any one of them
would silently regress through make test:integration:* (which only
exercises local-FFI):

1. CLI detach lifecycle — `boxlite run -d` returns, CLI process exits,
   does a fresh CLI process still see / exec the box?
   FFI side  src/boxlite/tests/detach.rs, recovery.rs  ✓
   E2E side  scripts/test/e2e/cases/                    ✗
2. Execution.attach / reattach contract — bogus exec_id should be a
   typed client error; reattach to a completed exec should return a
   usable handle.
   Runner side  apps/runner/.../boxlite_exec_attach_test.go  ✓
   SDK <-> API  end-to-end                                    ✗
3. host bind mount via REST — the cloud runtime intentionally dropped
   host bind mounts (#639); REST callers passing host paths must
   silently no-op, no /mnt/<x> in the guest.
   FFI surface  src/boxlite/tests/mount_security.rs  ✓
   REST contract                                      ✗

Cases shipped

scripts/test/e2e/cases/test_cli_detach_recovery.py   (2 cases)
scripts/test/e2e/cases/test_exec_attach.py           (2 cases)
scripts/test/e2e/cases/test_volume_readonly.py       (1 case)

5 cases total. Author also dropped 6 cases from this branch's earlier
incarnation that were already committed alongside their respective
fix PRs (#686 / #688 / #689 / #691 / #692 / #696), so the diff is
strictly net-new coverage.

Test plan — run against current main

Stack: local e2e, runner unchanged, no source edits.

Case	Result	Notes
`test_detached_box_exec_propagates_exit_code_on_fresh_cli`	✅ PASS	exit-code passthrough across CLI processes
`test_detached_box_survives_cli_exit_and_is_reusable`	⚠️ XFAIL (strict)	reaches step 3 (`boxlite exec <id> echo still-alive`) then hits the stdout-drop race that #563 fixes — marker drops when #563 lands
`test_attach_with_bogus_id_is_typed_error`	✅ PASS	bogus exec_id → typed `Exception` (not 5xx, not silent)
`test_reattach_after_original_completes`	⚠️ XFAIL (strict)	same stdout-drop race (#563) on the original exec's `out=='first-output'` assertion
`test_host_bind_mount_via_rest_is_silently_ignored`	✅ PASS	the box created with `volumes=[(host_dir, "/mnt/ro", True)]` reports `MOUNT_LINE=<none>` from `/proc/mounts` and the host marker file is untouched — REST silently dropped the host path

The 2 XFAILs are tied to #563 (fix(go-sdk): fold stream drain into Execution.Wait). Once #563 merges, both xfails flip xpass-strict, the markers come off, and the suite is 5/5 green. No additional fix work is needed in this PR.

🤖 Generated with Claude Code

Summary by CodeRabbit

Tests
- Added end-to-end tests verifying CLI detach survival and box reusability across fresh CLI processes, including exit code propagation tests.
- Added end-to-end tests for SDK reattach functionality, validating session state after execution completion.
- Added end-to-end test confirming proper host bind mount handling during REST execution.

…ayout Adds e2e regression tests for the surfaces that the new fix PRs (boxlite-ai#686–boxlite-ai#696) close, all under `scripts/test/e2e/cases/` so they slot into the current main layout (independent of the layout reorg in PR boxlite-ai#682). New cases: - test_files_io.py copy_in/copy_out text+binary roundtrip, include_parent + overwrite=False - test_snapshot_clone.py snapshot.create/list/restore + clone + export → import roundtrip - test_images_pull_list.py runtime.images.list (and pull-unreachable typed-error contract) - test_exec_user.py box.exec(user=) numeric uid, uid:gid, and invalid-user 4xx mapping - test_volume_readonly.py volumes=[(host, guest, RO)] mount + write rejection inside the guest - test_exec_attach.py attach to finished exec returns typed err / reattach hands back a usable handle - test_box_metrics.py box.metrics + runtime.metrics shape and counter increments - test_network_allow_net.py NetworkSpec(mode/allow_net) enforces the gvproxy filter - test_cli_detach_recovery.py boxlite run -d → fresh CLI process can ls / inspect / exec the same box id Each case targets a specific production seam exercised by one of the fix PRs; PR descriptions pin the matching test path.

test_exec_user.py → boxlite-ai#686, test_network_allow_net.py → boxlite-ai#687, test_files_io.py → boxlite-ai#688, test_box_metrics.py → boxlite-ai#689, test_snapshot_clone.py → boxlite-ai#694, test_images_pull_list.py → boxlite-ai#696. Remaining 3 cases (test_exec_attach.py, test_volume_readonly.py, test_cli_detach_recovery.py) stay here because they pin REST-path gaps that don't have a matching fix PR in this session — they document the contract for future work to land against.

coderabbitai · 2026-06-09T13:34:28Z

📝 Walkthrough

Walkthrough

This PR adds three E2E test modules covering distinct subsystems: CLI detach survival across fresh process invocations, Python SDK execution reattach with error handling, and volume mount isolation verification. All tests use subprocess spawning, async patterns, and fixtures to isolate real system behavior.

Changes

E2E Test Suite Extensions

Layer / File(s)	Summary
CLI detach recovery and reusability `scripts/test/e2e/cases/test_cli_detach_recovery.py`	Module docstring, imports, constants, and `cli` fixture establish subprocess-based testing. The `test_detached_box_survives_cli_exit_and_is_reusable` test (xfail) runs `boxlite run -d`, captures the UUID, and verifies via fresh CLI calls that `boxlite ls` and `boxlite exec` work while checking the runner journal. `test_detached_box_exec_propagates_exit_code_on_fresh_cli` confirms exit codes propagate correctly through fresh `boxlite exec`. Both clean up with `boxlite rm -f`.
SDK reattach and error handling `scripts/test/e2e/cases/test_exec_attach.py`	Module docstring and imports introduce async/UUID dependencies. `test_reattach_after_original_completes` (xfail) runs an exec, collects stdout, reattaches by execution ID, and verifies exit code matching, treating "session gone" as a soft pass. `test_attach_with_bogus_id_is_typed_error` generates a random ID, asserts the error is typed (no "500"/"Internal" messages), and forces interaction to surface deferred errors.
Host bind mount isolation verification `scripts/test/e2e/cases/test_volume_readonly.py`	Test creates a host directory with a marker file, starts a box via REST with `BoxOptions(volumes=[(host_dir, "/mnt/ro", True)])`, asserts `/mnt/ro` does not appear in guest `/proc/mounts`, verifies the marker file is unchanged, and cleans up the box.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

boxlite-ai/boxlite#622: Implemented CLI exec/run exit code propagation (avoiding hard process exit), which is directly tested by the new exit code assertion cases in this PR.

Suggested reviewers

DorianZheng

Poem

🐰 Three tests hop in to verify the way,
Detached boxes dance, then reattach to play,
SDK reattaches with grace and care,
Volumes stay hidden, none leaking there,
Fresh processes spawn—the bugs won't dare! 🎪

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title accurately summarizes the main change: adding three new E2E test case files (cli-detach-recovery, exec-attach, volume-readonly) to the test suite.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

After running the suite end-to-end: - test_detached_box_survives_cli_exit_and_is_reusable used `boxlite info <id>` for per-box state. The CLI's `info` is system- wide (version / runtime stats) — there's no per-box info command — so the call was always `unexpected argument` exit 2. Replaced step 3 with `boxlite ls`-row parsing + a state-keyword check. Marked xfail(strict=True) because step 3's exec hits the stdout-drop race that boxlite-ai#563 fixes (same as the reattach case below). - test_reattach_after_original_completes already broken by the same stdout race — short `echo X && exit 0` returns out=''. Marked xfail(strict=True) pointing at boxlite-ai#563. - test_readonly_volume_mount_flag_and_write_reject expected a host bind mount to surface inside the guest. The cloud runtime deliberately dropped host bind mounts (boxlite-ai#639 "remove host bind mounts; only managed volumes allowed"), so this case was testing a feature that doesn't exist by design. Flipped to the negative contract — REST callers passing host paths get them silently dropped, no /mnt/<x> ever surfaces. Pin: boxlite-ai#639's product direction. Net: 3 pass + 2 xfail(strict). When boxlite-ai#563 lands, both xfails flip xpass-strict and become regular passes (markers should be dropped at that point). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai

🧹 Nitpick comments (3)

scripts/test/e2e/cases/test_cli_detach_recovery.py (2)
100-107: ⚖️ Poor tradeoff

Consider the fragility of parsing Unicode table output.

The test parses boxlite ls Unicode table output by searching for state keywords in the row. This will break if the CLI table format changes or if state names evolve. Since this is an E2E test that intentionally exercises the real CLI output, this approach is acceptable, but be aware that CLI output format changes will require test updates.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/test/e2e/cases/test_cli_detach_recovery.py` around lines 100 - 107,
The current assertion that checks ls_row for specific state keywords is fragile;
update the check in test_cli_detach_recovery.py to robustly detect the state by
using a regex search (e.g., re.search(r"\b(Running|Started|Ready|Configured)\b",
ls_row)) instead of simple substring checks on ls_row, and on failure include
the full r_ls.stdout/r_ls output in the assertion message to aid debugging;
reference variables: ls_row, r_ls, box_id.
33-36: ⚡ Quick win

Simplify the path calculation.

The current approach navigates up four parent directories to the repository root, then rebuilds the path to lib. Since __file__ is already in scripts/test/e2e/cases/, you can directly reference the sibling lib directory.
♻️ Simplified path calculation
-sys.path.insert(
-    0,
-    str(Path(__file__).resolve().parents[4] / "scripts" / "test" / "e2e" / "lib"),
-)
+sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "lib"))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/test/e2e/cases/test_cli_detach_recovery.py` around lines 33 - 36, The
sys.path.insert call over-traverses to repo root and then rebuilds the path;
replace that with a direct sibling reference by computing the lib path from the
current file's parent directories: use Path(__file__).resolve().parent.parent /
"lib" (converted to str) and insert that into sys.path instead of using
Path(__file__).resolve().parents[4] / "scripts" / "test" / "e2e" / "lib"; update
the sys.path.insert line that references Path(__file__) accordingly.
scripts/test/e2e/cases/test_exec_attach.py (1)
61-61: ⚖️ Poor tradeoff

Hardcoded sleep for test settling.

The 0.5-second sleep allows runner-side session bookkeeping to flush before reattaching. While this pattern is common in E2E tests and the duration is reasonable, be aware it could introduce flakiness in heavily loaded CI environments. If flakiness occurs, consider polling for a stable condition or increasing the timeout.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/test/e2e/cases/test_exec_attach.py` at line 61, The test currently
uses a hardcoded await asyncio.sleep(0.5) which can cause flaky CI; replace this
fixed sleep in the test_exec_attach.py case with a polling wait that checks the
runner/session readiness (e.g., loop with a small interval and a timeout) rather
than sleeping a fixed 0.5s, or extract the timeout into a variable like
attach_wait_timeout and retry until the expected condition (session bookkeeping
flushed / attachable state) is true before proceeding; locate the await
asyncio.sleep(0.5) call and implement the polling/retry around the runner's
API/state used by the test.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@scripts/test/e2e/cases/test_cli_detach_recovery.py`:
- Around line 100-107: The current assertion that checks ls_row for specific
state keywords is fragile; update the check in test_cli_detach_recovery.py to
robustly detect the state by using a regex search (e.g.,
re.search(r"\b(Running|Started|Ready|Configured)\b", ls_row)) instead of simple
substring checks on ls_row, and on failure include the full r_ls.stdout/r_ls
output in the assertion message to aid debugging; reference variables: ls_row,
r_ls, box_id.
- Around line 33-36: The sys.path.insert call over-traverses to repo root and
then rebuilds the path; replace that with a direct sibling reference by
computing the lib path from the current file's parent directories: use
Path(__file__).resolve().parent.parent / "lib" (converted to str) and insert
that into sys.path instead of using Path(__file__).resolve().parents[4] /
"scripts" / "test" / "e2e" / "lib"; update the sys.path.insert line that
references Path(__file__) accordingly.

In `@scripts/test/e2e/cases/test_exec_attach.py`:
- Line 61: The test currently uses a hardcoded await asyncio.sleep(0.5) which
can cause flaky CI; replace this fixed sleep in the test_exec_attach.py case
with a polling wait that checks the runner/session readiness (e.g., loop with a
small interval and a timeout) rather than sleeping a fixed 0.5s, or extract the
timeout into a variable like attach_wait_timeout and retry until the expected
condition (session bookkeeping flushed / attachable state) is true before
proceeding; locate the await asyncio.sleep(0.5) call and implement the
polling/retry around the runner's API/state used by the test.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 8b5d6e8f-f674-49a7-b098-cf2b255848f0

📥 Commits

Reviewing files that changed from the base of the PR and between 3b0675e and ded1d23.

📒 Files selected for processing (3)

scripts/test/e2e/cases/test_cli_detach_recovery.py
scripts/test/e2e/cases/test_exec_attach.py
scripts/test/e2e/cases/test_volume_readonly.py

G4614 added 2 commits June 9, 2026 07:05

G4614 mentioned this pull request Jun 10, 2026

fix(ffi): drain stdout/stderr pumps before Wait terminal event (re-open of #685/#705) #714

Closed

G4614 marked this pull request as ready for review June 10, 2026 04:44

G4614 requested a review from a team as a code owner June 10, 2026 04:44

G4614 enabled auto-merge June 10, 2026 04:45

coderabbitai Bot reviewed Jun 10, 2026

View reviewed changes

DorianZheng approved these changes Jun 10, 2026

View reviewed changes

DorianZheng disabled auto-merge June 10, 2026 08:41

DorianZheng merged commit 04a7f8b into boxlite-ai:main Jun 10, 2026
22 of 23 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(e2e): add cli-detach-recovery, exec-attach, volume-readonly cases#710

test(e2e): add cli-detach-recovery, exec-attach, volume-readonly cases#710
DorianZheng merged 3 commits into
boxlite-ai:mainfrom
G4614:test/e2e-coverage-on-main

G4614 commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

G4614 commented Jun 9, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Coverage gaps

Cases shipped

Test plan — run against current main

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

G4614 commented Jun 9, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 9, 2026 •

edited

Loading