docs(agent-workflows): MCP delivery architecture — current state and directions#5067
mmabrouk wants to merge 1 commit into
…rections Research synthesis of the two MCP layers (internal agenta-tools channel vs user mcp_servers), the Daytona gap, and short/long-term directions (in-sandbox shim, un-gate http user MCP, in-sandbox user stdio, platform MCP gateway), with the open-PR state of play (#5047/#4985/#4912/#4873/E2B stack). Claude-Session: https://claude.ai/code/session_01HhBEUFbXETNjYdcz71AGrT
mmabrouk
left a comment
Review guide (Claude). Part 1 is verified current-state, safe to skim. Part 2 is the framing. Part 3 holds the decisions. The two that need your call are marked inline below; questions 1 and 2 in the doc's closing list are already answered by this session's work (#5047 merged, #4911 kept open and out of MVP scope).
- Pros: solves "npx some-mcp-server"-class servers (the bulk of the ecosystem) with no new
  hosting infra; isolation story is the one we already have; works identically for local (the
  local backend also runs a sandbox-agent daemon) and Daytona.
- Cons: user secrets enter the sandbox (a real, deliberate weakening of the invariant — though
Decision point 1 — the L1 credential trade-off.
Today no user secret ever enters a sandbox; only provider API keys do (via the create envVars). L1 (running user stdio MCP servers inside the sandbox) would put user MCP secrets there too.
My read: this is the same trust boundary we already accept for provider keys, and the sandbox is exactly where we run untrusted code. But it is a deliberate weakening of a stated invariant, so it should be a decision, not a drift. No action needed now — this only matters when we schedule L1.
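One way to make this "a decision, not a drift" is to have the invariant fail closed in code until someone flips an explicit switch. A minimal sketch, assuming a hypothetical `SandboxEnvPolicy` and `build_sandbox_env` helper (neither name exists in the codebase; both are illustrative only):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SandboxEnvPolicy:
    """Hypothetical policy object, not an existing class in the repo."""

    # Default keeps today's invariant: no user secret enters the sandbox.
    allow_user_mcp_secrets: bool = False


def build_sandbox_env(provider_keys, user_mcp_secrets, policy=SandboxEnvPolicy()):
    """Assemble the env passed at sandbox creation (illustrative helper)."""
    # Provider API keys already enter the sandbox today.
    env = dict(provider_keys)
    if user_mcp_secrets:
        if not policy.allow_user_mcp_secrets:
            # Fail closed: L1 has to be switched on deliberately.
            raise PermissionError("user MCP secrets are not allowed in the sandbox env")
        overlap = env.keys() & user_mcp_secrets.keys()
        if overlap:
            # Refuse to let a user secret shadow a provider key.
            raise ValueError(f"env name collision: {sorted(overlap)}")
        env.update(user_mcp_secrets)
    return env
```

With this shape, scheduling L1 becomes a one-line policy change that shows up in review, instead of a side effect of some other PR.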
   still "our tools only"?
3. For L1: is "user secrets may enter the sandbox env (like provider keys already do)" an
   acceptable, documented weakening of the no-credential-in-sandbox invariant?
4. Does the L2 gateway feel like the right eventual convergence (worth shaping new work so it
Decision point 2 — sandbox-side delivery vs the platform gateway (L2).
This is the one worth 5 minutes of your time. If L2 (one authenticated platform MCP URL, any backend connects to it) is the eventual end state, then near-term work should keep the tool-spec contract transport-agnostic so the in-sandbox shim (#4873) stays swappable. If instead sandbox-side delivery is the permanent model, we can bind things tighter to the relay protocol.
My recommendation: treat L2 as the convergence target but build nothing for it yet. The only cost today is keeping the public-spec seam clean, which #4985 and #4873 already do.
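To make "keeping the public-spec seam clean" concrete, here is a rough sketch of what a transport-agnostic contract could look like. Every name in it (`ToolSpec`, `ToolTransport`, `InSandboxShimTransport`, `PlatformGatewayTransport`, `invoke`, the example URL) is an assumption for illustration; it does not describe the actual code in #4985 or #4873, and the transports return stub payloads instead of speaking any real protocol.

```python
from dataclasses import dataclass
from typing import Any, Protocol


@dataclass(frozen=True)
class ToolSpec:
    """Public tool description; carries no transport details (hypothetical)."""

    name: str
    description: str
    input_schema: dict[str, Any]


class ToolTransport(Protocol):
    """Anything that can deliver a tool call, whatever the wire protocol."""

    def call(self, tool: ToolSpec, arguments: dict[str, Any]) -> dict[str, Any]: ...


class InSandboxShimTransport:
    """Stand-in for sandbox-side delivery; returns a stub, no real relay."""

    def call(self, tool: ToolSpec, arguments: dict[str, Any]) -> dict[str, Any]:
        return {"via": "sandbox-shim", "tool": tool.name, "arguments": arguments}


class PlatformGatewayTransport:
    """Stand-in for an L2-style gateway; the URL is a made-up placeholder."""

    def __init__(self, url: str) -> None:
        self.url = url

    def call(self, tool: ToolSpec, arguments: dict[str, Any]) -> dict[str, Any]:
        return {"via": "gateway", "url": self.url, "tool": tool.name, "arguments": arguments}


def invoke(transport: ToolTransport, tool: ToolSpec, arguments: dict[str, Any]) -> dict[str, Any]:
    # Callers depend only on ToolSpec and ToolTransport, so swapping the
    # shim for a gateway later does not touch tool definitions.
    return transport.call(tool, arguments)
```

The point of the sketch is the dependency direction: tool definitions know nothing about the relay, so binding "tighter to the relay protocol" would mean leaking relay fields into `ToolSpec`, which is the thing to avoid while L2 stays the convergence target.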
What this is
Two docs that answer one question: how do tools reach an MCP-client harness (Claude today, Codex next) on each sandbox backend, for both our own tools and external MCP servers. This is the research you asked for on the Daytona MCP gap, written up as a reviewable direction paper.
Why now
We are cleaning up MCP for the MVP. The short-term work is already moving (#5047 merged, #4985 being recut, #4912 to be recut). This doc records the architecture those PRs fit into, so each next step follows a plan instead of ad-hoc fixes.
What it says (30-second version)
There are two MCP layers: the internal agenta-tools channel, and user-declared mcp_servers. Conflating them already caused one regression.
What to review
The two decision points are marked with inline comments: the L1 credential trade-off and the L1-vs-L2 commitment. Everything in "Part 1" is verified against code with file:line refs; you can skim it.
https://claude.ai/code/session_01HhBEUFbXETNjYdcz71AGrT