docs(agent-workflows): MCP delivery architecture — current state and directions by mmabrouk · Pull Request #5067 · Agenta-AI/agenta

mmabrouk · 2026-07-04T17:37:28Z

What this is

Two docs that answer one question: how do tools reach an MCP-client harness (Claude today, Codex next) on each sandbox backend, for both our own tools and external MCP servers. This is the research you asked for on the Daytona MCP gap, written up as a reviewable direction paper.

Why now

We are cleaning up MCP for the MVP. The short-term work is already moving (#5047 merged, #4985 being recut, #4912 to be recut). This doc records the architecture those PRs fit into, so each next step follows a plan instead of ad-hoc fixes.

What it says (30-second version)

- Today two separate things are both called "MCP": our internal agenta-tools channel and user-declared mcp_servers. Conflating them already caused one regression.
- Our channel breaks on Daytona because it binds the runner's loopback, which a remote sandbox cannot reach. Pi survives because its extension lives inside the sandbox. The fix pattern is to put an MCP front-end inside the sandbox too; #4873 ("fix(agent): deliver Claude gateway tools on Daytona via an in-sandbox stdio MCP relay shim (F-042)") already implements this.
- Short term: fail loud (done in #5047, "[fix] Refuse tools on non-Pi harness x remote sandbox instead of silently dropping them"), in-sandbox shim (#4873), un-gate http user MCP (#4912, "[feat] Enable user MCP servers by default (http; stdio stays off-by-design)").
- Long term: three options. L1: run user stdio MCP servers inside the sandbox. L2: one authenticated platform MCP gateway URL that works from any backend. L3: fully managed MCP hosting. Recommendation: L1 when demand appears, L2 as the eventual convergence, L3 only on product pull.
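
To make the "fail loud" item concrete, here is a minimal sketch of the kind of guard it describes. It is not the code from #5047; `RunConfig`, `ToolDeliveryError`, `check_tool_delivery`, and the harness and sandbox identifiers are all hypothetical names chosen for illustration. The assumption is that a loopback-bound tool channel is only reachable when the MCP front-end runs on the same host as the runner or, as with Pi, inside the sandbox.

```python
import ipaddress
from dataclasses import dataclass
from urllib.parse import urlparse


class ToolDeliveryError(RuntimeError):
    """Raised when declared tools cannot reach the harness (hypothetical error type)."""


@dataclass(frozen=True)
class RunConfig:
    harness: str            # hypothetical identifiers, e.g. "pi" or "claude"
    sandbox: str            # hypothetical identifiers, e.g. "local" or "daytona"
    tool_channel_url: str   # where the internal tool channel listens
    tool_names: tuple[str, ...] = ()


def is_loopback(url: str) -> bool:
    """Return True if the URL's host is localhost or a loopback IP address."""
    host = urlparse(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (a DNS name); treat it as non-loopback in this sketch.
        return False


def check_tool_delivery(cfg: RunConfig) -> None:
    """Refuse the run instead of silently dropping tools that cannot be delivered."""
    if not cfg.tool_names:
        return  # nothing to deliver, nothing to refuse
    remote_sandbox = cfg.sandbox != "local"
    frontend_in_sandbox = cfg.harness == "pi"  # assumption: only Pi ships an in-sandbox extension
    if remote_sandbox and not frontend_in_sandbox and is_loopback(cfg.tool_channel_url):
        raise ToolDeliveryError(
            f"tools {list(cfg.tool_names)} cannot reach harness {cfg.harness!r} "
            f"on sandbox {cfg.sandbox!r}: channel {cfg.tool_channel_url} is loopback-bound"
        )
```

Under these assumptions, a Pi run on Daytona passes the check while a Claude run with the same loopback channel raises, which is the behaviour the short-term item asks for.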

What to review

The two decision points are marked with inline comments: the L1 credential trade-off and the L1-vs-L2 commitment. Everything in "Part 1" is verified against code with file:line refs; you can skim it.

https://claude.ai/code/session_01HhBEUFbXETNjYdcz71AGrT

…rections Research synthesis of the two MCP layers (internal agenta-tools channel vs user mcp_servers), the Daytona gap, and short/long-term directions (in-sandbox shim, un-gate http user MCP, in-sandbox user stdio, platform MCP gateway), with the open-PR state of play (#5047/#4985/#4912/#4873/E2B stack). Claude-Session: https://claude.ai/code/session_01HhBEUFbXETNjYdcz71AGrT

vercel · 2026-07-04T17:37:33Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jul 4, 2026 5:38pm

coderabbitai · 2026-07-04T17:37:35Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


mmabrouk

Review guide (Claude). Part 1 is verified current-state, safe to skim. Part 2 is the framing. Part 3 holds the decisions. The two that need your call are marked inline below; questions 1 and 2 in the doc's closing list are already answered by this session's work (#5047 merged, #4911 kept open and out of MVP scope).

mmabrouk · 2026-07-04T17:37:58Z

+- Pros: solves "npx some-mcp-server"-class servers (the bulk of the ecosystem) with no new
+  hosting infra; isolation story is the one we already have; works identically for local (the
+  local backend also runs a sandbox-agent daemon) and Daytona.
+- Cons: user secrets enter the sandbox (a real, deliberate weakening of the invariant — though


Decision point 1 — the L1 credential trade-off.

Today no user secret ever enters a sandbox; only provider API keys do (via the envVars passed when the sandbox is created). L1 (running user stdio MCP servers inside the sandbox) would put user MCP secrets there too.

My read: this is the same trust boundary we already accept for provider keys, and the sandbox is exactly where we run untrusted code. But it is a deliberate weakening of a stated invariant, so it should be a decision, not a drift. No action needed now — this only matters when we schedule L1.
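
One way to keep this a decision rather than a drift is to make the weakening an explicit opt-in at the point where the sandbox environment is assembled. The sketch below is illustrative only: `build_sandbox_env` and `allow_user_secrets` are hypothetical names, not existing code, and the real create call may look different.

```python
from typing import Mapping


def build_sandbox_env(
    provider_keys: Mapping[str, str],
    user_mcp_secrets: Mapping[str, str],
    *,
    allow_user_secrets: bool = False,
) -> dict[str, str]:
    """Assemble the env vars handed to the sandbox at creation time.

    Provider keys always go in (today's behaviour). User MCP secrets go in
    only when the caller opts in explicitly, so the invariant cannot be
    weakened by accident.
    """
    env = dict(provider_keys)
    if not user_mcp_secrets:
        return env
    if not allow_user_secrets:
        raise PermissionError(
            "user MCP secrets may not enter the sandbox unless allow_user_secrets=True"
        )
    # Refuse name clashes so a user secret can never shadow a provider key.
    clashes = sorted(set(env) & set(user_mcp_secrets))
    if clashes:
        raise ValueError(f"user secrets collide with provider keys: {clashes}")
    env.update(user_mcp_secrets)
    return env
```

With the flag left at its default, the function reproduces the current invariant; scheduling L1 would then amount to flipping one documented switch.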

mmabrouk · 2026-07-04T17:37:58Z

+   still "our tools only"?
+3. For L1: is "user secrets may enter the sandbox env (like provider keys already do)" an
+   acceptable, documented weakening of the no-credential-in-sandbox invariant?
+4. Does the L2 gateway feel like the right eventual convergence (worth shaping new work so it


Decision point 2 — sandbox-side delivery vs the platform gateway (L2).

This is the one worth 5 minutes of your time. If L2 (one authenticated platform MCP URL, any backend connects to it) is the eventual end state, then near-term work should keep the tool-spec contract transport-agnostic so the in-sandbox shim (#4873) stays swappable. If instead sandbox-side delivery is the permanent model, we can bind things tighter to the relay protocol.

My recommendation: treat L2 as the convergence target but build nothing for it yet. The only cost today is keeping the public-spec seam clean, which #4985 and #4873 already do.
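
To illustrate what "keeping the seam clean" could mean in practice, here is a minimal sketch of a transport-agnostic tool spec with two interchangeable deliveries. Every name in it (`ToolSpec`, `ToolDelivery`, `StdioRelayDelivery`, `GatewayDelivery`, and the shape of the returned config) is hypothetical and is not taken from #4985 or #4873; the point is only that the spec carries no transport details, so the shim can later be swapped for a gateway URL.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass(frozen=True)
class ToolSpec:
    """Public description of a tool; deliberately free of transport details."""
    name: str
    description: str
    input_schema: dict[str, Any] = field(default_factory=dict)


class ToolDelivery(Protocol):
    """The seam: anything that can tell a harness how to reach the tools."""
    def harness_config(self, tools: list[ToolSpec]) -> dict[str, Any]: ...


class StdioRelayDelivery:
    """Sandbox-side delivery: the harness spawns an in-sandbox stdio shim."""
    def __init__(self, command: list[str]) -> None:
        self.command = command

    def harness_config(self, tools: list[ToolSpec]) -> dict[str, Any]:
        return {
            "transport": "stdio",
            "command": self.command,
            "tools": [t.name for t in tools],
        }


class GatewayDelivery:
    """L2-style delivery: the harness connects to one authenticated URL."""
    def __init__(self, url: str, token_env: str) -> None:
        self.url = url
        self.token_env = token_env  # name of the env var holding the token, not the token

    def harness_config(self, tools: list[ToolSpec]) -> dict[str, Any]:
        return {
            "transport": "http",
            "url": self.url,
            "auth_env": self.token_env,
            "tools": [t.name for t in tools],
        }


def configure(delivery: ToolDelivery, tools: list[ToolSpec]) -> dict[str, Any]:
    """Callers depend only on the protocol, so deliveries stay swappable."""
    return delivery.harness_config(tools)
```

Binding tighter to the relay protocol would mean letting relay-specific fields leak into `ToolSpec`; as long as they stay in the delivery class, moving to L2 is a change in one place.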

dosubot Bot added the size:L label (this PR changes 100-499 lines, ignoring generated files) · Jul 4, 2026

dosubot Bot added the documentation label (improvements or additions to documentation) · Jul 4, 2026

mmabrouk wants to merge 1 commit into big-agents from docs/mcp-delivery-architecture
