[RFC 003] Add MCP (Model Context Protocol) support - Phase 1 by Darktex · Pull Request #224 · huggingface/OpenEnv

Darktex · 2025-11-25T02:16:56Z

Summary

This PR implements RFC 003: MCP (Model Context Protocol) support for OpenEnv - Phase 1.

Add strongly-typed MCP types: Tool, ToolError, ListToolsAction, CallToolAction, ListToolsObservation, CallToolObservation
Add MCPEnvironment base class for MCP-enabled environments with reserved tool name validation
Add WebSocket MCP message handler for JSON-RPC protocol (tools/list, tools/call)
Update echo_env as reference implementation with FastMCP tools

Key Design Decisions

WebSocket-only transport: MCP uses type: "mcp" WebSocket messages (no HTTP /mcp endpoint)
Per-environment MCP client: Each environment instance has its own MCP client
Dual API support: step(ListToolsAction/CallToolAction) for training, WebSocket /mcp for production
Reserved tool names: Hard error if tools named reset, step, state, or close are registered (protects dual API boundary)

Files Changed

Core MCP Infrastructure:

src/openenv/core/env_server/mcp_types.py - Tool, ToolError, actions/observations, WS message types
src/openenv/core/env_server/mcp_environment.py - MCPEnvironment base class
src/openenv/core/env_server/http_server.py - WebSocket MCP handler
src/openenv/core/env_server/__init__.py - Exports

Echo Environment Reference:

envs/echo_env/server/mcp_server.py - FastMCP tool definitions
envs/echo_env/server/echo_environment.py - MCP action support

Tests:

tests/core/test_mcp/test_mcp_types.py - 18 tests
tests/core/test_mcp/test_mcp_environment.py - 12 tests

Test plan

All 30 MCP tests pass: PYTHONPATH=src:envs pytest tests/core/test_mcp/ -v
MCP imports work correctly from package
Reserved tool names are validated

Review Comments Addressed

✅ Renamed types.py → mcp_types.py (Pankit's feedback)
✅ Strongly typed Tool/ToolError classes (Pankit's feedback)
✅ Reserved tool names cause hard error (alignment safeguard)
✅ Added comprehensive tests
✅ WebSocket-only transport (breaking change accommodation)

Darktex · 2025-11-25T02:18:21Z

@@ -0,0 +1,329 @@
+# RFC-003 Implementation Journal


This was written by Claude. Keeping this here while I implement, but I will remove it when the stack lands.

Darktex · 2025-11-25T02:19:52Z



-class EchoEnv(HTTPEnvClient[EchoAction, EchoObservation]):
+class EchoEnv(HTTPEnvClient[CallToolAction, Observation]):


For now I followed the HTTPEnvClient convention, but to me this is the weakest part of OpenEnv as we require that people write boilerplate that we can generate for them.

I will tackle this when we go into supporting ToolCall and CodeAct at the same time, as that is a natural fit for generating this boilerplate on the fly.

Darktex · 2025-11-25T02:20:45Z

+        Raises:
+            ValueError: If MCP client not configured or action type invalid
+        """
+        from .types import CallToolObservation, ListToolsObservation


Claude this, but me no like. Will fix.

Darktex · 2025-11-25T02:21:26Z

-            observation = await loop.run_in_executor(
-                self._executor, self.env.step, action
-            )
+            # Handle MCP actions asynchronously (don't use thread pool for async operations)


@pankit-eng this is where I need your eyes the most. Does this make sense?

Copilot

Pull request overview

This PR implements the core MCP (Model Context Protocol) infrastructure for OpenEnv as the first phase of RFC-003. It introduces a new MCPEnvironment base class that allows environments to expose tools via the MCP protocol, and converts the echo_env as a reference implementation. The changes enable both gym-style orchestration through HTTP endpoints and direct MCP JSON-RPC access for production agents.

Key Changes:

Added MCP action types (ListToolsAction, CallToolAction) and corresponding observations to the core type system
Implemented MCPEnvironment base class that handles MCP client/server communication boilerplate
Extended HTTP server with /mcp JSON-RPC endpoint for direct tool access alongside existing /step endpoint
Refactored echo_env to use MCP pattern, replacing custom EchoAction/EchoObservation with MCP tools

Reviewed changes

Copilot reviewed 15 out of 17 changed files in this pull request and generated 16 comments.

Show a summary per file

File	Description
`src/core/env_server/types.py`	Added MCP action and observation types (`ListToolsAction`, `CallToolAction`, `ListToolsObservation`, `CallToolObservation`)
`src/core/env_server/mcp_environment.py`	New base class providing MCP infrastructure for environments
`src/core/env_server/http_server.py`	Added `/mcp` JSON-RPC endpoint and MCP action routing in `/step` handler
`src/core/env_server/__init__.py`	Exported new MCP types and `MCPEnvironment` class
`src/core/pyproject.toml`	Added `mcp` and `fastmcp` dependencies
`src/envs/echo_env/server/mcp_server.py`	New MCP tool definitions for echo environment
`src/envs/echo_env/server/echo_environment.py`	Refactored to extend `MCPEnvironment` instead of custom implementation
`src/envs/echo_env/server/app.py`	Updated to use base `Action`/`Observation` types for MCP support
`src/envs/echo_env/models.py`	Removed custom `EchoAction` and `EchoObservation` (deprecated in favor of MCP)
`src/envs/echo_env/client.py`	Updated to handle MCP actions and parse MCP observations
`src/envs/echo_env/__init__.py`	Removed exports of deprecated custom action/observation types
`tests/core/mcp/test_mcp.py`	New test file for MCP client/server integration
`examples/test_mcp_integration.py`	Integration test script demonstrating MCP usage
`examples/echo_mcp_demo.py`	Demo script showing MCP tool interaction
`rfcs/RFC-003-implementation-journal.md`	Implementation journal documenting decisions and progress

Comments suppressed due to low confidence (1)

src/core/env_server/http_server.py:17

Import of 'json' is not used.

import json

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.

Copilot · 2025-11-25T02:24:16Z

+            Returns:
+                JSON-RPC 2.0 response
+            """
+            if self.env.mcp_client is None:


The /mcp endpoint assumes self.env.mcp_client exists, but not all Environment instances will have this attribute. This will raise an AttributeError for non-MCP environments. Add a hasattr(self.env, 'mcp_client') check or check if the env is an instance of MCPEnvironment first.

Suggested change

if self.env.mcp_client is None:

if not hasattr(self.env, "mcp_client") or self.env.mcp_client is None:

@pankit-eng this is good catch by copilot. Do you think we should just add that line?

Copilot · 2025-11-25T02:24:17Z

+        >>> tools = client.list_tools()
+        >>> print(tools)  # [{"name": "echo_message", ...}]
+        >>>
+        >>> # Call echo_message tool
+        >>> result = client.echo_message("Hello!")
+        >>> print(result["echoed_message"])  # "Hello!"


The docstring mentions list_tools() and echo_message() methods that are not implemented in the EchoEnv class. Either implement these convenience methods or update the documentation to reflect the actual API (using step() with ListToolsAction and CallToolAction).

Copilot · 2025-11-25T02:24:17Z

+
+    # List available tools
+    print("2. Listing available tools...")
+    tools = client.list_tools()


The list_tools() method is not implemented in the EchoEnv client class (see client.py). This call will fail with an AttributeError. Either implement the method in the client or use the step() method with a ListToolsAction.

Copilot · 2025-11-25T02:24:17Z

+
+    # Call echo_message tool using convenience method
+    print("3. Calling echo_message tool...")
+    result = client.echo_message("Hello from MCP!")


The echo_message() method is not implemented in the EchoEnv client class (see client.py). This call will fail with an AttributeError. Either implement the method in the client or use the step() method with a CallToolAction.

Copilot · 2025-11-25T02:24:17Z

+                    "id": None,
+                }
+
+            body = await request.json()


The /mcp endpoint lacks error handling for malformed JSON. If request.json() fails (e.g., invalid JSON in request body), it will raise an exception. Consider wrapping the await request.json() call in a try-except block and returning a JSON-RPC error with code -32700 ("Parse error") for invalid JSON.

Copilot · 2025-11-25T02:24:19Z

+            return CallToolAction(
+                tool_name=action_data["tool_name"],


Accessing action_data["tool_name"] will raise a KeyError if the tool_name field is missing from the request. Use action_data.get("tool_name") with an appropriate error message if it's None, or validate that the field exists before accessing it.

Suggested change

return CallToolAction(

tool_name=action_data["tool_name"],

tool_name = action_data.get("tool_name")

if tool_name is None:

raise ValueError("Missing required field 'tool_name' for CallToolAction")

return CallToolAction(

tool_name=tool_name,

Copilot · 2025-11-25T02:24:19Z

+
+            try:
+                # Reuse MCP client from environment (avoids creating duplicate client)
+                async with self.env.mcp_client:


The MCP client is being used within an async with block in the /mcp endpoint (line 168) and also in _handle_mcp_action (mcp_environment.py line 133). This could lead to issues if the client doesn't support being entered multiple times concurrently. Consider whether the client should be managed at the environment level or if each endpoint should manage its own context.

Copilot · 2025-11-25T02:24:19Z

+
+        self.mcp_server = mcp_server
+        self.mcp_client = Client(mcp_server)
+        super().__init__(mcp_client=self.mcp_client)


The Environment.__init__ base class doesn't accept an mcp_client parameter. This call will fail with a TypeError. The base class __init__ signature is def __init__(self, transform: Transform | None = None) (see interfaces.py line 95).

Either:

Remove the mcp_client=self.mcp_client argument here, or

Update the Environment.__init__ signature to accept mcp_client as an optional parameter

Suggested change

super().__init__(mcp_client=self.mcp_client)

super().__init__()

Copilot · 2025-11-25T02:24:19Z

+    from core.env_server.types import (
+        CallToolAction,
+        CallToolObservation,
+        ListToolsAction,
+        ListToolsObservation,
+        Observation,
+        State,
+    )


Import of 'ListToolsAction' is not used.

Copilot · 2025-11-25T02:24:20Z

+    from openenv_core.env_server.types import (
+        CallToolAction,
+        CallToolObservation,
+        ListToolsAction,
+        ListToolsObservation,
+        Observation,
+        State,
+    )


Import of 'ListToolsAction' is not used.

Darktex · 2025-11-25T19:05:20Z

@copilot I accept all your suggestions. Can you implement these changes?

Copilot · 2025-11-25T19:05:30Z

@Darktex I've opened a new pull request, #226, to work on those changes. Once the pull request is ready, I'll request review from you.

pankit-eng · 2025-11-30T13:51:18Z

See if we can rename the file name and not use types.py. It starts conflicting usually.

Yeah good point

pankit-eng · 2025-12-01T18:22:08Z

+
+    # Reset the environment
+    print("1. Resetting environment...")
+    result = client.reset()


is resetting a pre-req to list_tools call? Or in other words, does reset() semantically have any impact on the list_tools?

Keep in mind that list_tools() should be removed since instead we are just going to allow env.step(ListToolsAction()). Apart from that, hmmm I think that we may still want you to call .reset() first to initialize the env, in case some of your tools require state (e.g. a driving sim)

pankit-eng · 2025-12-01T18:30:33Z

+    Contains the list of available tools with their schemas.
+    """
+
+    tools: List[Dict[str, Any]] = field(default_factory=list)


@Darktex - can we use a strongly types schema(openAPI or similar) like how MCP list/tools does.
Each ToolSpec is defined as:

{ name: string; // Unique identifier for the tool description?: string; // Human-readable description inputSchema: { // JSON Schema for the tool's parameters type: "object", properties: { ... } // Tool-specific parameters } }

And this API returns a list of the above typed object:

[ToolSpec]

Good callout

pankit-eng · 2025-12-01T18:35:42Z

    metadata: Dict[str, Any] = field(default_factory=dict)


+@dataclass(kw_only=True)


I would strongly recommend that we create typed objects for tools and use that in client as well as the server:

@dataclass class Tool: name: str description: str input_schema: Dict[str, Any] output_schema: Dict[str, Any]

pankit-eng · 2025-12-01T18:47:54Z

+    Contains the result of calling a tool, or an error if the call failed.
+    """
+
+    result: Optional[Any] = None


All fields must be required for a CallToolObservation. Lets remove optional

We should be specific about error here: error coming from the tool execution needs to be part of the result object and not the error field. In other words, we should assume that tool will bundle the error in the Result and we will return the same as is in the result field.

For any other errors including but not limited to - invalid args, transport level issue, etc.., we can have an error field as we have here. Error field could also be more expanded:

class Error: error_type: enum error_str: description ....

pankit-eng · 2025-12-01T18:52:00Z

+        from fastmcp import Client
+
+        self.mcp_server = mcp_server
+        self.mcp_client = Client(mcp_server)


I would imagine that we will need to support multiple mcp servers and hence, multiple mcp clients in one environment. Are you planning on adding the routing support in a separate PR for the same?

@pankit-eng Correct! For now, let's do a single, local MCP server

Darktex · 2025-12-01T21:18:20Z

@copilot Can you send me a PR implementing Pankit's comments?

Copilot · 2025-12-01T21:18:30Z

@Darktex I've opened a new pull request, #230, to work on those changes. Once the pull request is ready, I'll request review from you.

Darktex · 2025-12-01T22:30:39Z

@copilot solve merge conflicts and rebase to Main

Copilot · 2025-12-01T22:30:49Z

@Darktex I've opened a new pull request, #231, to work on those changes. Once the pull request is ready, I'll request review from you.

Darktex · 2025-12-05T20:08:35Z

    "if TYPE_CHECKING:",
 ]
+
+[tool.pytest.ini_options]


Claude added this to fix pytest. This is the sort of "pytest hell" that I have encountered before and this looks like the sort of StackOverflow solution I would copypasta to fix things. Does this sound reasonable to people? @pankit-eng

Darktex · 2026-01-13T02:22:38Z

Note: This is an automated review by Claude Code (alignment-reviewer agent), not a human review. The account posting this is shared with the human maintainer.

Since I cannot access the PR directly via gh, let me analyze the PR based on the diff provided in the user's message. The PR implements RFC-003 MCP support. Let me create a comprehensive review based on the diff and the design documents I've read.

PR #224 Review: [RFC 003] [1/n] Implement basic local MCP server & add MCP client

Previous Review Summary

Key points from existing reviews:

Copilot identified an unused import json in http_server.py:17
Multiple reviewers (@Darktex, @pankit-eng) have commented, with @pankit-eng requesting changes
Copilot created follow-up PRs (Address code review feedback for MCP implementation #226, Add strongly typed MCP types: Tool, ToolError, rename types.py to mcp_types.py #230, Merge main and rename MCP types to mcp_types.py #231) to address reviewer feedback
Merge conflicts were identified that need resolution and rebase to main

Tier 1: Critical Issues (Bugs, Security, Lint)

🔴 BLOCKER: Lint Check Failed

The lint hook requires uv to be installed, which is missing from the environment. However, based on the code review, I can identify several issues:

🔴 BUG: Unused Import

File: src/core/env_server/http_server.py:17

import json

This import is never used in the file. Remove it.

Impact: Code cleanliness violation (already flagged by Copilot).

🟡 BUG: Missing Error Handling in MCP Endpoint

File: src/core/env_server/http_server.py:132-235

The /mcp endpoint has several error handling gaps:

No timeout handling: MCP client calls (list_tools(), call_tool()) have no timeout, which could hang the server indefinitely.
Context manager reuse issue: The code uses async with self.env.mcp_client inside the endpoint, but mcp_client may already be in use by another request. This could cause issues with concurrent requests.

# Line 169-170
async with self.env.mcp_client:
    if method == "tools/list":

Recommended fix:

Add timeout to MCP operations
Verify thread-safety of MCP client reuse
Add proper concurrent request handling

🟡 POTENTIAL BUG: Double Async Context Manager

File: src/core/env_server/mcp_environment.py:138-141

async with self.mcp_client:
    if isinstance(action, ListToolsAction):
        tools = await self.mcp_client.list_tools()

The MCPEnvironment uses async with self.mcp_client in _handle_mcp_action(), which is also called from the HTTP endpoint that may already have an active context. This could lead to nested context managers on the same client.

Verification needed: Check if FastMCP's Client supports nested async with contexts.

🟡 INCONSISTENT ERROR HANDLING

File: src/core/env_server/mcp_environment.py:154-167

The CallToolAction handler catches all exceptions and converts them to ToolError:

except Exception as e:
    return CallToolObservation(
        done=False,
        tool_name=action.tool_name,
        error=ToolError(
            error_type=ToolErrorType.EXECUTION_ERROR,
            message=str(e),
        ),
    )

However, ListToolsAction handler (lines 140-150) has NO exception handling. If list_tools() fails, it will crash the server.

Fix: Add try/except around list_tools() and return appropriate error observation.

🔴 MISSING FILE: models.py

File: src/envs/echo_env/models.py

The diff shows this file was deleted (deleted file mode 100644), but the client.py still has comments suggesting deprecated types exist:

# From client.py imports (lines 7-16)
try:
    from core.client_types import StepResult
    # ... no mention of models.py

However, the implementation journal mentions:

Deprecate src/envs/echo_env/models.py (EchoAction deprecated)

Was this file supposed to be kept with deprecation warnings, or fully deleted? The RFC journal says "deprecate" but the diff shows "delete".

VERIFICATION NEEDED: Confirm whether backward compatibility was intentionally broken or if deprecation warnings should have been kept.

Tier 2: Alignment Issues (Principles & Invariants)

⚠️ ALIGNMENT FLAG: Dual API Boundary Violation Risk

Invariant at risk: Dual API Boundary (INVARIANTS.md:41-58)

The concern: The /mcp endpoint exposes MCP tools directly to any HTTP client, which could potentially allow agents to access infrastructure-level operations if environment builders mistakenly expose reset() or simulation control as MCP tools.

From INVARIANTS.md:

Critical: The Gym-like API is NOT accessible to the agent being trained.
Violations to flag:

Exposing reset(), step(), or state() via MCP tools

Current risk: The architecture ASSUMES environment builders won't expose simulation control as tools, but there's no programmatic enforcement.

Recommended safeguards:

Add a reserved keyword check that prevents tools named reset, step, state from being registered
Document this requirement prominently in MCPEnvironment docstring
Add runtime validation in MCPEnvironment.__init__() that scans tool names

Suggested reviewer: @Darktex (RFC author)

⚠️ POTENTIAL PRINCIPLE VIOLATION: Missing Type Safety

Principle: "Type safety with generics and Pydantic across the wire" (PRINCIPLES.md:16)

File: src/core/env_server/http_server.py:88-100

The /step endpoint now handles MCP actions with a runtime type check instead of leveraging the type system:

# Lines 103-105
if isinstance(action, (ListToolsAction, CallToolAction)):
    if not isinstance(self.env, MCPEnvironment):
        raise RuntimeError(...)

This bypasses the generic type system (Environment[ActT, ObsT]). The environment's action_cls parameter is ignored for MCP actions.

Issue: This violates the principle of using generics across the wire. MCP actions are handled via special-case runtime checks rather than being part of the action type union.

Alternative approach (more type-safe):

Define MCPAction = ListToolsAction | CallToolAction
Have environments declare: Environment[MCPAction | CustomAction, Observation]
Let the type system handle this instead of runtime isinstance checks

Impact: Medium - Works correctly but doesn't follow the repo's type safety principle.

⚠️ DESIGN CONCERN: Sync step() Method in Async-First Environment

File: src/core/env_server/mcp_environment.py:82-110

The MCPEnvironment.step() method is synchronous but calls async MCP operations via asyncio.run():

def step(self, action: Action) -> Observation:
    # ...
    return asyncio.run(self._handle_mcp_action(action))

Documentation says (line 89-92):

NOTE: This is a sync method that internally runs async MCP operations.
When called from sync context, it uses asyncio.run(). When called from
HTTPEnvServer (async context), HTTPEnvServer intercepts MCP actions and
calls _handle_mcp_action() directly...

Problem: This creates two code paths with different behavior:

Direct env.step() calls → asyncio.run() (new event loop)
HTTP /step endpoint → _handle_mcp_action() (existing event loop)

This violates the minimize lifecycle deltas principle (PRINCIPLES.md:7) - the same step() call behaves differently depending on how it's invoked.

Recommended approach:

Make Environment.step() async across the board (breaking change, needs RFC)
OR: Document this as a known limitation of MCP environments in the implementation journal

📋 MISSING DOCUMENTATION: No Migration Guide

From RFC-003 Implementation Journal (lines 119-120):

**PR #2: Environment Migration**
- Convert coding_env, browsergym_env, etc. to MCP pattern
- Document migration guide for environment authors

The migration guide is deferred to PR #2, but existing environment authors need guidance now on:

How to convert existing environments to MCP
What breaks vs. what's backward compatible
When to use MCPEnvironment vs. keep existing Environment pattern

Impact: Low (deferred to future PR), but should be tracked.

🟡 INCONSISTENCY: Missing MCP Dependencies in echo_env Dockerfile

From Implementation Journal Checklist (line 103):

- [ ] Update echo_env Dockerfile dependencies

This is marked incomplete. The PR adds MCP dependencies to src/core/pyproject.toml but the echo_env Dockerfile may not have these dependencies.

Verification needed: Check if envs/echo_env/server/Dockerfile builds successfully with new MCP dependencies.

Tier 2: Code Quality & Best Practices

📝 INCOMPLETE TESTS

From Implementation Journal (lines 107-110):

#### Tests
- [ ] Create `tests/core/mcp/test_client.py`
- [ ] Create `tests/core/mcp/test_server.py`
- [ ] Update `tests/envs/test_echo_env.py`
- [ ] Verify other envs still work (no regressions)

All test items are incomplete. The PR includes example scripts but no pytest tests.

Impact: High - No automated testing for core MCP functionality. Manual testing only.

📝 DEBUG CODE IN EXAMPLES

File: examples/test_mcp_integration.py

Lines 10-60 contain extensive print statements:

print("=" * 60)
print("Testing MCP Integration")
print("=" * 60)
# ... many more print statements

This is acceptable for an example/demo script, but the file is named test_mcp_integration.py which suggests it should be a pytest test, not a manual script.

Recommendation: Either:

Rename to demo_mcp_integration.py to clarify it's a manual demo
OR: Convert to proper pytest test in tests/ directory

📝 MISSING DOCSTRINGS

Several new files lack module-level docstrings:

src/core/env_server/mcp_types.py - Has docstring ✅
src/envs/echo_env/server/mcp_server.py - Missing ❌

Low priority but should be added for consistency.

Summary & Recommendations

Tier 1 Issues (Must Fix Before Merge)

✅ Remove unused import json in http_server.py
✅ Add exception handling for list_tools() in MCPEnvironment._handle_mcp_action()
⚠️ Verify MCP client context manager safety with concurrent requests
⚠️ Add timeout to MCP operations in /mcp endpoint
⚠️ Clarify models.py deletion vs deprecation strategy

Tier 2 Issues (Should Address)

ALIGNMENT FLAG: Add safeguards against exposing simulation control (reset, step, state) as MCP tools
Consider making type handling more consistent with generics instead of runtime isinstance checks
Document the sync/async behavior difference between direct step() calls and HTTP endpoint calls
Add pytest tests before merging (currently all test checklist items incomplete)
Update echo_env Dockerfile with MCP dependencies
Rename test_mcp_integration.py → demo_mcp_integration.py or convert to pytest

Deferred (Future PRs)

Migration guide (PR Remove envtorch dir nesting and add base env code #2)
Comprehensive environment migration (PR Remove envtorch dir nesting and add base env code #2)
CodeAct integration (PR Add boiler plate code for CodingEnv #3)
Performance optimization for local servers (PR Add BaseHTTPClient to talk to Envs via JSON over RPC. #4)

Overall Assessment

Architecture: ✅ Solid implementation of RFC-003 core infrastructure

Code Quality: ⚠️ Missing tests and some error handling gaps

Alignment: ⚠️ One critical alignment flag regarding dual API boundary enforcement

Recommendation: Request changes to address Tier 1 issues and the ALIGNMENT FLAG before merging. The architecture is sound, but the safeguards and error handling need strengthening.

Automated review by Claude Code | Learn more about OpenEnv's agentic workflow

burtenshaw

Functionally looks great. I left some small nits mainly on consistncy. Other than that, I'd consider adding some tests. For example:

test the EchoEnvironment mcp faturs like list and call tools.
test passing a FastMCP in a test MCPEnvironment
test tools list and call via websockets

burtenshaw · 2026-01-16T20:07:44Z

        self._state = State(episode_id=str(uuid4()), step_count=0)
        self._reset_count = 0
+        # Initialize MCP client for the local MCP server
+        self._mcp_client = Client(mcp)


# echo_environment.py line 77 self._mcp_client = Client(mcp) # Private

Suggested change

self._mcp_client = Client(mcp)

self.mcp_client = Client(mcp)

burtenshaw · 2026-01-16T20:09:28Z

+                return await self._mcp_client.list_tools()
+
+        # Run async operation synchronously
+        tools_result = asyncio.get_event_loop().run_until_complete(_list_tools())


Suggested change

tools_result = asyncio.get_event_loop().run_until_complete(_list_tools())

tools_result = asyncio.run(self._async_list_tools())

burtenshaw · 2026-01-16T20:09:49Z

+
+        try:
+            # Run async operation synchronously
+            result = asyncio.get_event_loop().run_until_complete(_call_tool())


Suggested change

result = asyncio.get_event_loop().run_until_complete(_call_tool())

result = asyncio.run(self._async_list_tools())

burtenshaw · 2026-01-16T20:14:43Z

+
+            return CallToolObservation(
+                tool_name=action.tool_name,
+                result=tool_result,


I notice hear that result is a string of the text content but in MCPEnvironment it is a tool call response. Is the plan for this to be used flexibly, or should we be consistent?

*no opinion, just curious.

burtenshaw · 2026-01-16T20:16:32Z

+                                            )
+                                            response = WSMCPResponse(data={
+                                                "jsonrpc": "2.0",
+                                                "result": result,


In this case result is the raw object which may not be JSON serializable.

burtenshaw · 2026-01-16T21:03:13Z

+
+        try:
+            # Run the async call_tool with timeout
+            result = asyncio.run(


Should we gracefully handle this async call in case we're running in an async loop? I think we'll need this to work with the new async client.

For example, you could put the tool cool in a function like _call and do:

try: loop = asyncio.get_running_loop() except RuntimeError: # no loop running loop = None if loop is not None: # already in async context import concurrent.futures with concurrent.futures.ThreadPoolExecutor() as pool: future = pool.submit(asyncio.run, _call()) result = future.result() else: # no async context result = asyncio.run(_call())

Coming back to this. We might want a module core that handles our async operations because they're becoming important to the lib.

Implements RFC 003 for MCP integration in OpenEnv: Core MCP Infrastructure: - Add mcp_types.py with Tool, ToolError, ListToolsAction, CallToolAction, and corresponding observations - Add MCPEnvironment base class with reserved tool name validation, timeout handling, and proper error categorization - Add WebSocket MCP message handler for JSON-RPC via "type: mcp" messages Echo Environment Reference Implementation: - Add mcp_server.py with echo_message and echo_with_length tools - Update echo_environment.py with MCP action support while maintaining backwards compatibility with EchoAction Design decisions: - WebSocket-only for MCP (no HTTP /mcp endpoint) - Per-environment MCP client lifecycle - Both step() and WebSocket /mcp paths supported - reset() NOT required for list_tools() - Reserved tool names (reset, step, state, close) raise hard error Adds fastmcp>=2.0.0 dependency and comprehensive tests.

The directory name 'mcp' was shadowing the external mcp package during pytest collection, causing 'No module named mcp.types' errors.

This commit addresses outstanding review comments from burtenshaw, Copilot, and Claude's automated review on PR 224. Changes: - Add integration tests for EchoEnvironment MCP features (list/call tools) - Add integration tests for MCPEnvironment with FastMCP servers - Add WebSocket MCP tools/list and tools/call tests - Fix EchoEnvironment to use asyncio.run() instead of deprecated get_event_loop() - Fix WebSocket MCP handler to use async context manager for MCP client - Fix MCPEnvironment tool name validation to check _tools attribute - Add echo_mcp_demo.py example showing canonical SIM mode pattern - Expose mcp_client as public attribute on EchoEnvironment for WebSocket support

The WebSocket tests use synchronous TestClient, not async/await, so the asyncio marks were causing warnings.

…servations) The function in prompts.py is named format_observations (plural) but was incorrectly imported as format_observation (singular).

EchoEnvironment now properly inherits from MCPEnvironment instead of duplicating MCP handling logic. This: - Demonstrates the canonical way to build an MCPEnvironment subclass - Removes code duplication between EchoEnvironment and MCPEnvironment - Uses _step_impl() for domain-specific EchoAction handling Also updated tests and demo to handle CallToolResult objects returned by MCPEnvironment (accessing .data attribute for the actual value).

Darktex · 2026-01-21T00:18:29Z

Whops, merged this too soon. Created #312 to address your comments @burtenshaw

Darktex requested review from Copilot, pankit-eng and zkwentz November 25, 2025 02:16

meta-cla Bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 25, 2025

Copilot started reviewing on behalf of Darktex November 25, 2025 02:17 View session

Darktex commented Nov 25, 2025

View reviewed changes

Copilot finished reviewing on behalf of Darktex November 25, 2025 02:22

Copilot AI reviewed Nov 25, 2025

View reviewed changes

Copilot AI mentioned this pull request Nov 25, 2025

Address code review feedback for MCP implementation #226

Merged

16 tasks

pankit-eng reviewed Nov 30, 2025

View reviewed changes

pankit-eng suggested changes Dec 1, 2025

View reviewed changes

Copilot AI mentioned this pull request Dec 1, 2025

Add strongly typed MCP types: Tool, ToolError, rename types.py to mcp_types.py #230

Closed

13 tasks

Copilot AI mentioned this pull request Dec 1, 2025

Merge main and rename MCP types to mcp_types.py #231

Merged

Darktex force-pushed the rfc-003-first-pr branch 2 times, most recently from c7072a2 to 7a74252 Compare December 5, 2025 19:25

Darktex commented Dec 5, 2025

View reviewed changes

Darktex force-pushed the rfc-003-first-pr branch from 1379c2d to a301125 Compare December 5, 2025 21:12

Darktex mentioned this pull request Jan 13, 2026

[RFC 003] Add MCP (Model Context Protocol) support - Phase 1 #290

Closed

3 tasks

Darktex force-pushed the rfc-003-first-pr branch from 2156f1f to dddb61f Compare January 13, 2026 07:41

Darktex changed the title ~~[RFC 003] [1/n] Implement basic local MCP server & add MCP client~~ [RFC 003] Add MCP (Model Context Protocol) support - Phase 1 Jan 13, 2026

Darktex requested a review from burtenshaw January 16, 2026 19:40

This comment was marked as outdated.

Sign in to view

burtenshaw reviewed Jan 16, 2026

View reviewed changes

Darktex added 4 commits January 20, 2026 15:19

Rename tests/core/mcp to tests/core/test_mcp

79b1290

The directory name 'mcp' was shadowing the external mcp package during pytest collection, causing 'No module named mcp.types' errors.

Fix lint: remove unused imports in MCP files

51c133f

style: Format MCP files with ruff

2d77844

Darktex force-pushed the rfc-003-first-pr branch from da71d38 to 2d77844 Compare January 20, 2026 23:23

Darktex added 4 commits January 20, 2026 15:30

fix: Remove unnecessary @pytest.mark.asyncio decorators

fb8e464

The WebSocket tests use synchronous TestClient, not async/await, so the asyncio marks were causing warnings.

fix: Correct typo in repl_env import (format_observation -> format_ob…

e845ea1

…servations) The function in prompts.py is named format_observations (plural) but was incorrectly imported as format_observation (singular).

Darktex merged commit 7f24d1f into main Jan 21, 2026
4 checks passed

Darktex deleted the rfc-003-first-pr branch January 21, 2026 00:06

Darktex mentioned this pull request Jan 21, 2026

fix: Address PR 224 review comments for MCP support #312

Merged

2 tasks

greptile-apps Bot mentioned this pull request Feb 23, 2026

Add new RFCs to the README #403

Merged

12 tasks



		class EchoEnv(HTTPEnvClient[EchoAction, EchoObservation]):
		class EchoEnv(HTTPEnvClient[CallToolAction, Observation]):

	if self.env.mcp_client is None:
	if not hasattr(self.env, "mcp_client") or self.env.mcp_client is None:

-            return CallToolAction(
-                tool_name=action_data["tool_name"],
+            tool_name = action_data.get("tool_name")
+            if tool_name is None:
+                raise ValueError("Missing required field 'tool_name' for CallToolAction")
+            return CallToolAction(
+                tool_name=tool_name,

	super().__init__(mcp_client=self.mcp_client)
	super().__init__()

		metadata: Dict[str, Any] = field(default_factory=dict)


		@dataclass(kw_only=True)

	tools_result = asyncio.get_event_loop().run_until_complete(_list_tools())
	tools_result = asyncio.run(self._async_list_tools())

	result = asyncio.get_event_loop().run_until_complete(_call_tool())
	result = asyncio.run(self._async_list_tools())

Conversation

Darktex commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Key Design Decisions

Files Changed

Test plan

Review Comments Addressed

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

Uh oh!

Darktex commented Nov 25, 2025

Uh oh!

Copilot AI commented Nov 25, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Darktex commented Dec 1, 2025

Uh oh!

Darktex commented Nov 25, 2025 •

edited

Loading