[chore] Add Claude harness on E2B sandbox #5053
…al modes) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pull request overview
Adds the missing “Claude harness × E2B sandbox” support in the runner by wiring Claude-specific E2B asset provisioning (including a strict allow-list credential upload path), extending the run plan with an explicit isClaude flag, and adding unit tests + design docs to complete the harness×sandbox matrix.
Changes:
- Add `isClaude` to `RunPlan` and propagate it through E2B execution paths.
- Implement Claude-on-E2B credential provisioning via an allow-listed upload of `.credentials.json` into the sandbox.
- Add unit tests covering E2B run planning, orchestration wiring, and Claude asset upload behavior; add accompanying design docs.
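The allow-listed upload described above can be pictured as a pure planning step that looks up fixed file names instead of scanning a directory. This is a minimal sketch and not the code in `e2b.ts`: `CLAUDE_UPLOAD_ALLOW_LIST`, `UploadStep`, and `planClaudeUploads` are hypothetical names, and only the `.credentials.json` file name and the `/root/.claude` default come from this PR.

```typescript
// Hypothetical sketch: names and shapes are illustrative, not taken from e2b.ts.
const CLAUDE_UPLOAD_ALLOW_LIST: readonly string[] = [".credentials.json"];

interface UploadStep {
  remotePath: string;
  content: string;
}

// Builds the upload plan by looking up allow-listed names only.
// Local files that are not on the list are never enumerated or copied.
function planClaudeUploads(
  localFiles: ReadonlyMap<string, string>,
  remoteDir: string = "/root/.claude",
): UploadStep[] {
  const steps: UploadStep[] = [];
  for (const fileName of CLAUDE_UPLOAD_ALLOW_LIST) {
    const content = localFiles.get(fileName);
    if (content === undefined) continue; // nothing to upload for this entry
    steps.push({ remotePath: `${remoteDir}/${fileName}`, content });
  }
  return steps;
}

// Usage: the second local file is present but is not on the allow-list.
const exampleUploadPlan = planClaudeUploads(
  new Map([
    [".credentials.json", "{\"token\":\"example\"}"],
    ["history.jsonl", "not on the allow-list"],
  ]),
);
console.log(exampleUploadPlan.map((step) => step.remotePath));
```

Keeping the list explicit means a new file in the local Claude directory is not shipped to the sandbox unless someone adds it to the list on purpose.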
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| services/runner/src/engines/sandbox_agent/run-plan.ts | Adds isClaude flag to the run plan. |
| services/runner/src/engines/sandbox_agent/e2b.ts | Implements allow-listed Claude credential upload + prepareE2BClaudeAssets. |
| services/runner/src/engines/sandbox_agent.ts | Wires prepareE2BClaudeAssets into the E2B run path. |
| services/runner/tests/unit/sandbox-agent-e2b-assets.test.ts | New unit tests for Claude-on-E2B asset provisioning and allow-list behavior. |
| services/runner/tests/unit/sandbox-agent-e2b-run-plan.test.ts | Unit tests for isClaude/isE2B plan derivation and refusal cases. |
| services/runner/tests/unit/sandbox-agent-orchestration.test.ts | Orchestration-level tests ensuring E2B Claude asset prep is invoked and teardown happens. |
| docs/design/agent-workflows/projects/add-claude-e2b/tasks.md | Task checklist for the Claude-on-E2B work. |
| docs/design/agent-workflows/projects/add-claude-e2b/specs.md | Spec document describing the intended TS changes and credential flow. |
| docs/design/agent-workflows/projects/add-claude-e2b/research.md | Research/investigation notes and rationale for the changes. |
 *
 * Run: pnpm test (or: pnpm exec vitest run tests/unit/sandbox-agent-e2b-assets.test.ts)
 */
import { afterEach, beforeEach, describe, it, vi } from "vitest";

describe("Claude-on-E2B orchestration", () => {
  it("calls prepareE2BClaudeAssets (injected) and prepareE2BPiAssets (injected) on E2B", async () => {
    const { calls, deps } = fakeHarness({ cwd: "/root/work/agenta-e2b-fake" });

const localDir = process.env.CLAUDE_CONFIG_DIR || join(process.env.HOME ?? "", ".claude");
if (!existsSync(localDir)) return;

- [x] `workspace.ts`: extend `PrepareWorkspaceInput.plan` with `isE2b`; change `if (plan.isDaytona)`
  to `if (plan.isDaytona || plan.isE2b)` so E2B uses the sandbox fs API.
- [x] `e2b.ts`: add `prepareE2bClaudeAssets`, which uploads `~/.claude/` on `runtime_provided` credential
  mode; export `PrepareE2bClaudeAssetsInput`.
- [x] `sandbox_agent.ts`: import and call `prepareE2bClaudeAssets` in the E2B block; add it to

- Managed key never written to the sandbox filesystem (env-only, same as Pi-on-E2B).
- Own-login upload is gated by `shouldUploadOwnLogin` (same function as Pi), so it never fires
  when a resolved key is present.
- Restricted-network refusal already in `buildRunPlan` (unchanged).
- `autoPause: true` + `timeoutMs` backstop already in `buildE2bCreate` (unchanged).
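
The own-login gate in the bullets above can be read as a small predicate. The sketch below is an assumption about its shape: the real `shouldUploadOwnLogin` signature is not shown in this PR view, so `shouldUploadOwnLoginSketch`, `GateInput`, and the `resolvedKey` field are hypothetical. Only the two `credentialMode` values come from the design docs.

```typescript
// Hypothetical sketch of the gate; not the repository's shouldUploadOwnLogin.
type CredentialMode = "env" | "runtime_provided";

interface GateInput {
  isClaude: boolean;
  credentialMode: CredentialMode;
  resolvedKey?: string; // assumed field: a managed key resolved for this run
}

function shouldUploadOwnLoginSketch(input: GateInput): boolean {
  if (!input.isClaude) return false; // other harnesses have their own asset path
  if (input.resolvedKey) return false; // a resolved key always wins: env-only, no file upload
  return input.credentialMode === "runtime_provided";
}

console.log(
  shouldUploadOwnLoginSketch({ isClaude: true, credentialMode: "runtime_provided" }),
);
```

Checking the resolved key before the mode keeps the "managed key never touches the sandbox filesystem" property independent of how the mode was configured.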

## Foundation seam

When the non-Pi remote-bootstrap generalization lands, the `prepareE2bClaudeAssets` function
folds into a generic `prepareE2bHarnessAssets(plan)` dispatcher. The `isDaytona || isE2b`
workspace arm is already the generalized form.

| Gap | Fix |
|---|---|
| `prepareWorkspace` falls through to local for E2B | Extend `PrepareWorkspaceInput` plan type with `isE2b`; add `isDaytona \|\| isE2b` arm that uses the sandbox fs API. The Daytona and E2B arms are identical in shape (both use `sandbox.mkdirFs` / `sandbox.writeFsFile`). |
| `prepareE2bPiAssets` returns early for Claude | Add `prepareE2bClaudeAssets` that uploads the Claude own-login from `~/.claude/` if `credentialMode === "runtime_provided"` (same gate as Pi's auth.json path). Wire it in `sandbox_agent.ts` next to the Pi call. |

## Credential modes

- `credentialMode="env"`: `ANTHROPIC_API_KEY` arrives in `plan.secrets`, merged into `env` before
  `buildSandboxProvider`, and carried into the sandbox through `buildE2bCreate({}, secrets).envs`.
  No file upload needed.

- `prepareWorkspace` E2B arm → folds onto a single `isDaytona || isE2b` branch (already done here).
- `prepareE2bClaudeAssets` own-login upload → folds onto a generic `prepareE2bHarnessAssets`
  that dispatches by `acpAgent`. The Pi arm stays separate (pi-specific: extension, skills-in-pi-dir,
  system-prompt).
- `buildE2bCreate` envs param already carries arbitrary secrets → no change needed.
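
The planned dispatch by `acpAgent` might look roughly like the sketch below. Everything here is an assumption made for illustration: the `"claude"` agent value, the `HarnessPlan` shape, and the idea that a preparer returns the list of file names it would upload. The Pi arm is left out of the table because the notes above say it stays separate.

```typescript
// Hypothetical sketch of a generic dispatcher; the real prepareE2bHarnessAssets does not exist yet.
interface HarnessPlan {
  acpAgent: string; // assumed to carry values such as "claude"
  credentialMode: "env" | "runtime_provided";
}

// A preparer returns the file names it would upload for the given plan.
type AssetPreparer = (plan: HarnessPlan) => string[];

const harnessPreparers = new Map<string, AssetPreparer>([
  [
    "claude",
    (plan) => (plan.credentialMode === "runtime_provided" ? [".credentials.json"] : []),
  ],
]);

function prepareE2bHarnessAssetsSketch(plan: HarnessPlan): string[] {
  const preparer = harnessPreparers.get(plan.acpAgent);
  // Unknown agents are a no-op here; a real dispatcher might prefer to refuse them.
  return preparer ? preparer(plan) : [];
}

console.log(
  prepareE2bHarnessAssetsSketch({ acpAgent: "claude", credentialMode: "runtime_provided" }),
);
```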

## e2b.ts: add prepareE2bClaudeAssets

New export:

```typescript
export interface PrepareE2bClaudeAssetsInput {
  sandbox: any;
  plan: Pick<RunPlan, "isClaude" | "credentialMode">;
  log?: Log;
}

export async function prepareE2bClaudeAssets({
  sandbox,
  plan,
  log = () => {},
}: PrepareE2bClaudeAssetsInput): Promise<void>
```

## sandbox_agent.ts: wire prepareE2bClaudeAssets

In the E2B asset-prep block (currently only `prepareE2bPiAssets`):

```typescript
} else if (plan.isE2b) {
  await (deps.prepareE2bPiAssets ?? prepareE2bPiAssets)({ sandbox, plan, log: logger });
  await (deps.prepareE2bClaudeAssets ?? prepareE2bClaudeAssets)({ sandbox, plan, log: logger });
}
```

`SandboxAgentDeps` gains `prepareE2bClaudeAssets?: typeof prepareE2bClaudeAssets`.
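
The `deps.x ?? x` pattern above is what lets the orchestration tests swap in fakes. A self-contained sketch of that idea follows, with stand-in types: `PrepInput`, `SandboxAgentDepsSketch`, the two `default...` functions, and `resolveE2bPrepSteps` are hypothetical and only mirror the shape of the snippet.

```typescript
// Hypothetical stand-ins for the real input type and asset-prep functions.
interface PrepInput {
  sandboxId: string;
}
type PrepFn = (input: PrepInput) => Promise<void>;

interface SandboxAgentDepsSketch {
  prepareE2bPiAssets?: PrepFn;
  prepareE2bClaudeAssets?: PrepFn;
}

const defaultPrepareE2bPiAssets: PrepFn = async () => {};
const defaultPrepareE2bClaudeAssets: PrepFn = async () => {};

// Each step falls back to the default unless a test injects a replacement.
function resolveE2bPrepSteps(deps: SandboxAgentDepsSketch): PrepFn[] {
  return [
    deps.prepareE2bPiAssets ?? defaultPrepareE2bPiAssets,
    deps.prepareE2bClaudeAssets ?? defaultPrepareE2bClaudeAssets,
  ];
}

// Runs the steps in order: Pi first, then Claude, as in the snippet above.
async function runE2bPrepSketch(deps: SandboxAgentDepsSketch, input: PrepInput): Promise<void> {
  for (const step of resolveE2bPrepSteps(deps)) {
    await step(input);
  }
}

// Usage: a fake that records its calls, as a unit test might inject.
const recordedPrepCalls: string[] = [];
const fakePrepDeps: SandboxAgentDepsSketch = {
  prepareE2bClaudeAssets: async (input) => {
    recordedPrepCalls.push(`claude:${input.sandboxId}`);
  },
};
void runE2bPrepSketch(fakePrepDeps, { sandboxId: "sbx-example" });
```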

```
credentialMode="env":
  ANTHROPIC_API_KEY in plan.secrets
    → merged into env by sandbox_agent.ts
    → carried into sandbox by buildE2bCreate({}, secrets).envs
  (no file upload)

credentialMode="runtime_provided":
  prepareE2bClaudeAssets uploads ~/.claude/ into /root/.claude/ in the sandbox
  (best-effort; same pattern as Pi auth.json upload)
```
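
The two branches of this flow can be summarized as one function that returns what reaches the sandbox: environment variables, and optionally a target directory for the own-login upload. This is a hedged sketch, not the runner's code. `planClaudeProvisioning` and `ProvisioningOutcome` are hypothetical names, and merging `secrets` over the base environment is an assumption about precedence.

```typescript
// Hypothetical sketch of the credential flow; names and merge order are assumptions.
type ClaudeCredentialMode = "env" | "runtime_provided";

interface ProvisioningOutcome {
  envs: Record<string, string>; // what would be passed to the sandbox as environment
  uploadTargetDir?: string; // set only when the own-login upload would run
}

function planClaudeProvisioning(
  mode: ClaudeCredentialMode,
  baseEnv: Record<string, string>,
  secrets: Record<string, string>,
  remoteClaudeDir: string = "/root/.claude",
): ProvisioningOutcome {
  // Secrets are merged last so they override same-named base variables.
  const envs = { ...baseEnv, ...secrets };
  if (mode === "env") {
    return { envs }; // key travels by environment only, no file upload
  }
  return { envs, uploadTargetDir: remoteClaudeDir };
}

const envModeOutcome = planClaudeProvisioning(
  "env",
  { LANG: "C.UTF-8" },
  { ANTHROPIC_API_KEY: "example-key" },
);
console.log(Object.keys(envModeOutcome.envs), envModeOutcome.uploadTargetDir);
```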
Context
Claude worked locally and, in a sibling PR, on Daytona. E2B needed its own Claude auth-provisioning path, completing Claude's row of the harness x sandbox matrix.
Changes
Adds Claude harnessFiles provisioning and auth upload in `e2b.ts`, following the same allow-list discipline as the Daytona branch: uploads exactly `.credentials.json` (never a directory scan) from `CLAUDE_CONFIG_DIR || ~/.claude` to the E2B-side Claude dir (default `/root/.claude`, overridable). Both managed and self-managed credential modes are supported.
Tests / notes
- `sandbox-agent-e2b-assets.test.ts`, `sandbox-agent-e2b-run-plan.test.ts`, `sandbox-agent-orchestration.test.ts`.
- `chore/add-sandbox-e2b`, not `big-agents`: this is the last of the 12 harness x sandbox cells, completing the full matrix.
- `chore/add-remote-tools-gate` until the relay-client shim lands.