test(e2e): add cli-detach-recovery, exec-attach, volume-readonly cases#710
Conversation
…ayout Adds e2e regression tests for the surfaces that the new fix PRs (boxlite-ai#686–boxlite-ai#696) close, all under `scripts/test/e2e/cases/` so they slot into the current main layout (independent of the layout reorg in PR boxlite-ai#682). New cases: - test_files_io.py copy_in/copy_out text+binary roundtrip, include_parent + overwrite=False - test_snapshot_clone.py snapshot.create/list/restore + clone + export → import roundtrip - test_images_pull_list.py runtime.images.list (and pull-unreachable typed-error contract) - test_exec_user.py box.exec(user=) numeric uid, uid:gid, and invalid-user 4xx mapping - test_volume_readonly.py volumes=[(host, guest, RO)] mount + write rejection inside the guest - test_exec_attach.py attach to finished exec returns typed err / reattach hands back a usable handle - test_box_metrics.py box.metrics + runtime.metrics shape and counter increments - test_network_allow_net.py NetworkSpec(mode/allow_net) enforces the gvproxy filter - test_cli_detach_recovery.py boxlite run -d → fresh CLI process can ls / inspect / exec the same box id Each case targets a specific production seam exercised by one of the fix PRs; PR descriptions pin the matching test path.
test_exec_user.py → boxlite-ai#686, test_network_allow_net.py → boxlite-ai#687, test_files_io.py → boxlite-ai#688, test_box_metrics.py → boxlite-ai#689, test_snapshot_clone.py → boxlite-ai#694, test_images_pull_list.py → boxlite-ai#696. Remaining 3 cases (test_exec_attach.py, test_volume_readonly.py, test_cli_detach_recovery.py) stay here because they pin REST-path gaps that don't have a matching fix PR in this session — they document the contract for future work to land against.
📝 WalkthroughWalkthroughThis PR adds three E2E test modules covering distinct subsystems: CLI detach survival across fresh process invocations, Python SDK execution reattach with error handling, and volume mount isolation verification. All tests use subprocess spawning, async patterns, and fixtures to isolate real system behavior. ChangesE2E Test Suite Extensions
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
After running the suite end-to-end: - test_detached_box_survives_cli_exit_and_is_reusable used `boxlite info <id>` for per-box state. The CLI's `info` is system- wide (version / runtime stats) — there's no per-box info command — so the call was always `unexpected argument` exit 2. Replaced step 3 with `boxlite ls`-row parsing + a state-keyword check. Marked xfail(strict=True) because step 3's exec hits the stdout-drop race that boxlite-ai#563 fixes (same as the reattach case below). - test_reattach_after_original_completes already broken by the same stdout race — short `echo X && exit 0` returns out=''. Marked xfail(strict=True) pointing at boxlite-ai#563. - test_readonly_volume_mount_flag_and_write_reject expected a host bind mount to surface inside the guest. The cloud runtime deliberately dropped host bind mounts (boxlite-ai#639 "remove host bind mounts; only managed volumes allowed"), so this case was testing a feature that doesn't exist by design. Flipped to the negative contract — REST callers passing host paths get them silently dropped, no /mnt/<x> ever surfaces. Pin: boxlite-ai#639's product direction. Net: 3 pass + 2 xfail(strict). When boxlite-ai#563 lands, both xfails flip xpass-strict and become regular passes (markers should be dropped at that point). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (3)
scripts/test/e2e/cases/test_cli_detach_recovery.py (2)
100-107: ⚖️ Poor tradeoffConsider the fragility of parsing Unicode table output.
The test parses
boxlite lsUnicode table output by searching for state keywords in the row. This will break if the CLI table format changes or if state names evolve. Since this is an E2E test that intentionally exercises the real CLI output, this approach is acceptable, but be aware that CLI output format changes will require test updates.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/test/e2e/cases/test_cli_detach_recovery.py` around lines 100 - 107, The current assertion that checks ls_row for specific state keywords is fragile; update the check in test_cli_detach_recovery.py to robustly detect the state by using a regex search (e.g., re.search(r"\b(Running|Started|Ready|Configured)\b", ls_row)) instead of simple substring checks on ls_row, and on failure include the full r_ls.stdout/r_ls output in the assertion message to aid debugging; reference variables: ls_row, r_ls, box_id.
33-36: ⚡ Quick winSimplify the path calculation.
The current approach navigates up four parent directories to the repository root, then rebuilds the path to
lib. Since__file__is already inscripts/test/e2e/cases/, you can directly reference the siblinglibdirectory.♻️ Simplified path calculation
-sys.path.insert( - 0, - str(Path(__file__).resolve().parents[4] / "scripts" / "test" / "e2e" / "lib"), -) +sys.path.insert(0, str(Path(__file__).resolve().parents[1] / "lib"))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/test/e2e/cases/test_cli_detach_recovery.py` around lines 33 - 36, The sys.path.insert call over-traverses to repo root and then rebuilds the path; replace that with a direct sibling reference by computing the lib path from the current file's parent directories: use Path(__file__).resolve().parent.parent / "lib" (converted to str) and insert that into sys.path instead of using Path(__file__).resolve().parents[4] / "scripts" / "test" / "e2e" / "lib"; update the sys.path.insert line that references Path(__file__) accordingly.scripts/test/e2e/cases/test_exec_attach.py (1)
61-61: ⚖️ Poor tradeoffHardcoded sleep for test settling.
The 0.5-second sleep allows runner-side session bookkeeping to flush before reattaching. While this pattern is common in E2E tests and the duration is reasonable, be aware it could introduce flakiness in heavily loaded CI environments. If flakiness occurs, consider polling for a stable condition or increasing the timeout.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/test/e2e/cases/test_exec_attach.py` at line 61, The test currently uses a hardcoded await asyncio.sleep(0.5) which can cause flaky CI; replace this fixed sleep in the test_exec_attach.py case with a polling wait that checks the runner/session readiness (e.g., loop with a small interval and a timeout) rather than sleeping a fixed 0.5s, or extract the timeout into a variable like attach_wait_timeout and retry until the expected condition (session bookkeeping flushed / attachable state) is true before proceeding; locate the await asyncio.sleep(0.5) call and implement the polling/retry around the runner's API/state used by the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@scripts/test/e2e/cases/test_cli_detach_recovery.py`:
- Around line 100-107: The current assertion that checks ls_row for specific
state keywords is fragile; update the check in test_cli_detach_recovery.py to
robustly detect the state by using a regex search (e.g.,
re.search(r"\b(Running|Started|Ready|Configured)\b", ls_row)) instead of simple
substring checks on ls_row, and on failure include the full r_ls.stdout/r_ls
output in the assertion message to aid debugging; reference variables: ls_row,
r_ls, box_id.
- Around line 33-36: The sys.path.insert call over-traverses to repo root and
then rebuilds the path; replace that with a direct sibling reference by
computing the lib path from the current file's parent directories: use
Path(__file__).resolve().parent.parent / "lib" (converted to str) and insert
that into sys.path instead of using Path(__file__).resolve().parents[4] /
"scripts" / "test" / "e2e" / "lib"; update the sys.path.insert line that
references Path(__file__) accordingly.
In `@scripts/test/e2e/cases/test_exec_attach.py`:
- Line 61: The test currently uses a hardcoded await asyncio.sleep(0.5) which
can cause flaky CI; replace this fixed sleep in the test_exec_attach.py case
with a polling wait that checks the runner/session readiness (e.g., loop with a
small interval and a timeout) rather than sleeping a fixed 0.5s, or extract the
timeout into a variable like attach_wait_timeout and retry until the expected
condition (session bookkeeping flushed / attachable state) is true before
proceeding; locate the await asyncio.sleep(0.5) call and implement the
polling/retry around the runner's API/state used by the test.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 8b5d6e8f-f674-49a7-b098-cf2b255848f0
📒 Files selected for processing (3)
scripts/test/e2e/cases/test_cli_detach_recovery.pyscripts/test/e2e/cases/test_exec_attach.pyscripts/test/e2e/cases/test_volume_readonly.py
add 5 e2e cases pinning REST contracts not covered today
Coverage gaps
Three classes of behaviour run through the SDK → API → Runner → VM
chain but had no e2e pin on the chain itself; bugs in any one of them
would silently regress through
make test:integration:*(which onlyexercises local-FFI):
Cases shipped
5 cases total. Author also dropped 6 cases from this branch's earlier
incarnation that were already committed alongside their respective
fix PRs (#686 / #688 / #689 / #691 / #692 / #696), so the diff is
strictly net-new coverage.
Test plan — run against current main
Stack: local e2e, runner unchanged, no source edits.
test_detached_box_exec_propagates_exit_code_on_fresh_clitest_detached_box_survives_cli_exit_and_is_reusableboxlite exec <id> echo still-alive) then hits the stdout-drop race that #563 fixes — marker drops when #563 landstest_attach_with_bogus_id_is_typed_errorException(not 5xx, not silent)test_reattach_after_original_completesout=='first-output'assertiontest_host_bind_mount_via_rest_is_silently_ignoredvolumes=[(host_dir, "/mnt/ro", True)]reportsMOUNT_LINE=<none>from/proc/mountsand the host marker file is untouched — REST silently dropped the host pathThe 2 XFAILs are tied to #563 (
fix(go-sdk): fold stream drain into Execution.Wait). Once #563 merges, both xfails flip xpass-strict, the markers come off, and the suite is 5/5 green. No additional fix work is needed in this PR.🤖 Generated with Claude Code
Summary by CodeRabbit