[RFC 003] Add MCP (Model Context Protocol) support - Phase 1#224
Conversation
| @@ -0,0 +1,329 @@ | |||
| # RFC-003 Implementation Journal | |||
There was a problem hiding this comment.
This was written by Claude. Keeping this here while I implement, but I will remove it when the stack lands.
|
|
||
|
|
||
| class EchoEnv(HTTPEnvClient[EchoAction, EchoObservation]): | ||
| class EchoEnv(HTTPEnvClient[CallToolAction, Observation]): |
There was a problem hiding this comment.
For now I followed the HTTPEnvClient convention, but to me this is the weakest part of OpenEnv as we require that people write boilerplate that we can generate for them.
I will tackle this when we go into supporting ToolCall and CodeAct at the same time, as that is a natural fit for generating this boilerplate on the fly.
| Raises: | ||
| ValueError: If MCP client not configured or action type invalid | ||
| """ | ||
| from .types import CallToolObservation, ListToolsObservation |
There was a problem hiding this comment.
Claude this, but me no like. Will fix.
| observation = await loop.run_in_executor( | ||
| self._executor, self.env.step, action | ||
| ) | ||
| # Handle MCP actions asynchronously (don't use thread pool for async operations) |
There was a problem hiding this comment.
@pankit-eng this is where I need your eyes the most. Does this make sense?
There was a problem hiding this comment.
Pull request overview
This PR implements the core MCP (Model Context Protocol) infrastructure for OpenEnv as the first phase of RFC-003. It introduces a new MCPEnvironment base class that allows environments to expose tools via the MCP protocol, and converts the echo_env as a reference implementation. The changes enable both gym-style orchestration through HTTP endpoints and direct MCP JSON-RPC access for production agents.
Key Changes:
- Added MCP action types (
ListToolsAction,CallToolAction) and corresponding observations to the core type system - Implemented
MCPEnvironmentbase class that handles MCP client/server communication boilerplate - Extended HTTP server with
/mcpJSON-RPC endpoint for direct tool access alongside existing/stependpoint - Refactored echo_env to use MCP pattern, replacing custom
EchoAction/EchoObservationwith MCP tools
Reviewed changes
Copilot reviewed 15 out of 17 changed files in this pull request and generated 16 comments.
Show a summary per file
| File | Description |
|---|---|
src/core/env_server/types.py |
Added MCP action and observation types (ListToolsAction, CallToolAction, ListToolsObservation, CallToolObservation) |
src/core/env_server/mcp_environment.py |
New base class providing MCP infrastructure for environments |
src/core/env_server/http_server.py |
Added /mcp JSON-RPC endpoint and MCP action routing in /step handler |
src/core/env_server/__init__.py |
Exported new MCP types and MCPEnvironment class |
src/core/pyproject.toml |
Added mcp and fastmcp dependencies |
src/envs/echo_env/server/mcp_server.py |
New MCP tool definitions for echo environment |
src/envs/echo_env/server/echo_environment.py |
Refactored to extend MCPEnvironment instead of custom implementation |
src/envs/echo_env/server/app.py |
Updated to use base Action/Observation types for MCP support |
src/envs/echo_env/models.py |
Removed custom EchoAction and EchoObservation (deprecated in favor of MCP) |
src/envs/echo_env/client.py |
Updated to handle MCP actions and parse MCP observations |
src/envs/echo_env/__init__.py |
Removed exports of deprecated custom action/observation types |
tests/core/mcp/test_mcp.py |
New test file for MCP client/server integration |
examples/test_mcp_integration.py |
Integration test script demonstrating MCP usage |
examples/echo_mcp_demo.py |
Demo script showing MCP tool interaction |
rfcs/RFC-003-implementation-journal.md |
Implementation journal documenting decisions and progress |
Comments suppressed due to low confidence (1)
src/core/env_server/http_server.py:17
- Import of 'json' is not used.
import json
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.
| Returns: | ||
| JSON-RPC 2.0 response | ||
| """ | ||
| if self.env.mcp_client is None: |
There was a problem hiding this comment.
The /mcp endpoint assumes self.env.mcp_client exists, but not all Environment instances will have this attribute. This will raise an AttributeError for non-MCP environments. Add a hasattr(self.env, 'mcp_client') check or check if the env is an instance of MCPEnvironment first.
| if self.env.mcp_client is None: | |
| if not hasattr(self.env, "mcp_client") or self.env.mcp_client is None: |
There was a problem hiding this comment.
@pankit-eng this is good catch by copilot. Do you think we should just add that line?
| >>> tools = client.list_tools() | ||
| >>> print(tools) # [{"name": "echo_message", ...}] | ||
| >>> | ||
| >>> # Call echo_message tool | ||
| >>> result = client.echo_message("Hello!") | ||
| >>> print(result["echoed_message"]) # "Hello!" |
There was a problem hiding this comment.
The docstring mentions list_tools() and echo_message() methods that are not implemented in the EchoEnv class. Either implement these convenience methods or update the documentation to reflect the actual API (using step() with ListToolsAction and CallToolAction).
|
|
||
| # List available tools | ||
| print("2. Listing available tools...") | ||
| tools = client.list_tools() |
There was a problem hiding this comment.
The list_tools() method is not implemented in the EchoEnv client class (see client.py). This call will fail with an AttributeError. Either implement the method in the client or use the step() method with a ListToolsAction.
|
|
||
| # Call echo_message tool using convenience method | ||
| print("3. Calling echo_message tool...") | ||
| result = client.echo_message("Hello from MCP!") |
There was a problem hiding this comment.
The echo_message() method is not implemented in the EchoEnv client class (see client.py). This call will fail with an AttributeError. Either implement the method in the client or use the step() method with a CallToolAction.
| "id": None, | ||
| } | ||
|
|
||
| body = await request.json() |
There was a problem hiding this comment.
The /mcp endpoint lacks error handling for malformed JSON. If request.json() fails (e.g., invalid JSON in request body), it will raise an exception. Consider wrapping the await request.json() call in a try-except block and returning a JSON-RPC error with code -32700 ("Parse error") for invalid JSON.
| return CallToolAction( | ||
| tool_name=action_data["tool_name"], |
There was a problem hiding this comment.
Accessing action_data["tool_name"] will raise a KeyError if the tool_name field is missing from the request. Use action_data.get("tool_name") with an appropriate error message if it's None, or validate that the field exists before accessing it.
| return CallToolAction( | |
| tool_name=action_data["tool_name"], | |
| tool_name = action_data.get("tool_name") | |
| if tool_name is None: | |
| raise ValueError("Missing required field 'tool_name' for CallToolAction") | |
| return CallToolAction( | |
| tool_name=tool_name, |
|
|
||
| try: | ||
| # Reuse MCP client from environment (avoids creating duplicate client) | ||
| async with self.env.mcp_client: |
There was a problem hiding this comment.
The MCP client is being used within an async with block in the /mcp endpoint (line 168) and also in _handle_mcp_action (mcp_environment.py line 133). This could lead to issues if the client doesn't support being entered multiple times concurrently. Consider whether the client should be managed at the environment level or if each endpoint should manage its own context.
|
|
||
| self.mcp_server = mcp_server | ||
| self.mcp_client = Client(mcp_server) | ||
| super().__init__(mcp_client=self.mcp_client) |
There was a problem hiding this comment.
The Environment.__init__ base class doesn't accept an mcp_client parameter. This call will fail with a TypeError. The base class __init__ signature is def __init__(self, transform: Transform | None = None) (see interfaces.py line 95).
Either:
- Remove the
mcp_client=self.mcp_clientargument here, or - Update the
Environment.__init__signature to acceptmcp_clientas an optional parameter
| super().__init__(mcp_client=self.mcp_client) | |
| super().__init__() |
| from core.env_server.types import ( | ||
| CallToolAction, | ||
| CallToolObservation, | ||
| ListToolsAction, | ||
| ListToolsObservation, | ||
| Observation, | ||
| State, | ||
| ) |
There was a problem hiding this comment.
Import of 'ListToolsAction' is not used.
| from openenv_core.env_server.types import ( | ||
| CallToolAction, | ||
| CallToolObservation, | ||
| ListToolsAction, | ||
| ListToolsObservation, | ||
| Observation, | ||
| State, | ||
| ) |
There was a problem hiding this comment.
Import of 'ListToolsAction' is not used.
|
@copilot I accept all your suggestions. Can you implement these changes? |
There was a problem hiding this comment.
See if we can rename the file name and not use types.py. It starts conflicting usually.
|
|
||
| # Reset the environment | ||
| print("1. Resetting environment...") | ||
| result = client.reset() |
There was a problem hiding this comment.
is resetting a pre-req to list_tools call? Or in other words, does reset() semantically have any impact on the list_tools?
There was a problem hiding this comment.
Keep in mind that list_tools() should be removed since instead we are just going to allow env.step(ListToolsAction()). Apart from that, hmmm I think that we may still want you to call .reset() first to initialize the env, in case some of your tools require state (e.g. a driving sim)
| Contains the list of available tools with their schemas. | ||
| """ | ||
|
|
||
| tools: List[Dict[str, Any]] = field(default_factory=list) |
There was a problem hiding this comment.
@Darktex - can we use a strongly types schema(openAPI or similar) like how MCP list/tools does.
Each ToolSpec is defined as:
{
name: string; // Unique identifier for the tool
description?: string; // Human-readable description
inputSchema: { // JSON Schema for the tool's parameters
type: "object",
properties: { ... } // Tool-specific parameters
}
}
And this API returns a list of the above typed object:
[ToolSpec]
| metadata: Dict[str, Any] = field(default_factory=dict) | ||
|
|
||
|
|
||
| @dataclass(kw_only=True) |
There was a problem hiding this comment.
I would strongly recommend that we create typed objects for tools and use that in client as well as the server:
@dataclass
class Tool:
name: str
description: str
input_schema: Dict[str, Any]
output_schema: Dict[str, Any]
| Contains the result of calling a tool, or an error if the call failed. | ||
| """ | ||
|
|
||
| result: Optional[Any] = None |
There was a problem hiding this comment.
All fields must be required for a CallToolObservation. Lets remove optional
We should be specific about error here: error coming from the tool execution needs to be part of the result object and not the error field. In other words, we should assume that tool will bundle the error in the Result and we will return the same as is in the result field.
For any other errors including but not limited to - invalid args, transport level issue, etc.., we can have an error field as we have here. Error field could also be more expanded:
class Error:
error_type: enum
error_str: description
....
| from fastmcp import Client | ||
|
|
||
| self.mcp_server = mcp_server | ||
| self.mcp_client = Client(mcp_server) |
There was a problem hiding this comment.
I would imagine that we will need to support multiple mcp servers and hence, multiple mcp clients in one environment. Are you planning on adding the routing support in a separate PR for the same?
There was a problem hiding this comment.
@pankit-eng Correct! For now, let's do a single, local MCP server
|
@copilot Can you send me a PR implementing Pankit's comments? |
|
@copilot solve merge conflicts and rebase to Main |
c7072a2 to
7a74252
Compare
| "if TYPE_CHECKING:", | ||
| ] | ||
|
|
||
| [tool.pytest.ini_options] |
There was a problem hiding this comment.
Claude added this to fix pytest. This is the sort of "pytest hell" that I have encountered before and this looks like the sort of StackOverflow solution I would copypasta to fix things. Does this sound reasonable to people? @pankit-eng
1379c2d to
a301125
Compare
Since I cannot access the PR directly via gh, let me analyze the PR based on the diff provided in the user's message. The PR implements RFC-003 MCP support. Let me create a comprehensive review based on the diff and the design documents I've read. PR #224 Review: [RFC 003] [1/n] Implement basic local MCP server & add MCP clientPrevious Review SummaryKey points from existing reviews:
Tier 1: Critical Issues (Bugs, Security, Lint)🔴 BLOCKER: Lint Check FailedThe lint hook requires 🔴 BUG: Unused ImportFile: import jsonThis import is never used in the file. Remove it. Impact: Code cleanliness violation (already flagged by Copilot). 🟡 BUG: Missing Error Handling in MCP EndpointFile: The
# Line 169-170
async with self.env.mcp_client:
if method == "tools/list":Recommended fix:
🟡 POTENTIAL BUG: Double Async Context ManagerFile: async with self.mcp_client:
if isinstance(action, ListToolsAction):
tools = await self.mcp_client.list_tools()The Verification needed: Check if FastMCP's 🟡 INCONSISTENT ERROR HANDLINGFile: The except Exception as e:
return CallToolObservation(
done=False,
tool_name=action.tool_name,
error=ToolError(
error_type=ToolErrorType.EXECUTION_ERROR,
message=str(e),
),
)However, Fix: Add try/except around 🔴 MISSING FILE: models.pyFile: The diff shows this file was deleted ( # From client.py imports (lines 7-16)
try:
from core.client_types import StepResult
# ... no mention of models.pyHowever, the implementation journal mentions:
Was this file supposed to be kept with deprecation warnings, or fully deleted? The RFC journal says "deprecate" but the diff shows "delete". VERIFICATION NEEDED: Confirm whether backward compatibility was intentionally broken or if deprecation warnings should have been kept. Tier 2: Alignment Issues (Principles & Invariants)
|
2156f1f to
dddb61f
Compare
burtenshaw
left a comment
There was a problem hiding this comment.
Functionally looks great. I left some small nits mainly on consistncy. Other than that, I'd consider adding some tests. For example:
- test the
EchoEnvironmentmcp faturs like list and call tools. - test passing a
FastMCPin a testMCPEnvironment - test tools list and call via websockets
| self._state = State(episode_id=str(uuid4()), step_count=0) | ||
| self._reset_count = 0 | ||
| # Initialize MCP client for the local MCP server | ||
| self._mcp_client = Client(mcp) |
There was a problem hiding this comment.
# echo_environment.py line 77
self._mcp_client = Client(mcp) # Private
| self._mcp_client = Client(mcp) | |
| self.mcp_client = Client(mcp) |
| return await self._mcp_client.list_tools() | ||
|
|
||
| # Run async operation synchronously | ||
| tools_result = asyncio.get_event_loop().run_until_complete(_list_tools()) |
There was a problem hiding this comment.
| tools_result = asyncio.get_event_loop().run_until_complete(_list_tools()) | |
| tools_result = asyncio.run(self._async_list_tools()) |
|
|
||
| try: | ||
| # Run async operation synchronously | ||
| result = asyncio.get_event_loop().run_until_complete(_call_tool()) |
There was a problem hiding this comment.
| result = asyncio.get_event_loop().run_until_complete(_call_tool()) | |
| result = asyncio.run(self._async_list_tools()) |
|
|
||
| return CallToolObservation( | ||
| tool_name=action.tool_name, | ||
| result=tool_result, |
There was a problem hiding this comment.
I notice hear that result is a string of the text content but in MCPEnvironment it is a tool call response. Is the plan for this to be used flexibly, or should we be consistent?
*no opinion, just curious.
| ) | ||
| response = WSMCPResponse(data={ | ||
| "jsonrpc": "2.0", | ||
| "result": result, |
There was a problem hiding this comment.
In this case result is the raw object which may not be JSON serializable.
|
|
||
| try: | ||
| # Run the async call_tool with timeout | ||
| result = asyncio.run( |
There was a problem hiding this comment.
Should we gracefully handle this async call in case we're running in an async loop? I think we'll need this to work with the new async client.
For example, you could put the tool cool in a function like _call and do:
try:
loop = asyncio.get_running_loop()
except RuntimeError:
# no loop running
loop = None
if loop is not None:
# already in async context
import concurrent.futures
with concurrent.futures.ThreadPoolExecutor() as pool:
future = pool.submit(asyncio.run, _call())
result = future.result()
else:
# no async context
result = asyncio.run(_call())There was a problem hiding this comment.
Coming back to this. We might want a module core that handles our async operations because they're becoming important to the lib.
Implements RFC 003 for MCP integration in OpenEnv: Core MCP Infrastructure: - Add mcp_types.py with Tool, ToolError, ListToolsAction, CallToolAction, and corresponding observations - Add MCPEnvironment base class with reserved tool name validation, timeout handling, and proper error categorization - Add WebSocket MCP message handler for JSON-RPC via "type: mcp" messages Echo Environment Reference Implementation: - Add mcp_server.py with echo_message and echo_with_length tools - Update echo_environment.py with MCP action support while maintaining backwards compatibility with EchoAction Design decisions: - WebSocket-only for MCP (no HTTP /mcp endpoint) - Per-environment MCP client lifecycle - Both step() and WebSocket /mcp paths supported - reset() NOT required for list_tools() - Reserved tool names (reset, step, state, close) raise hard error Adds fastmcp>=2.0.0 dependency and comprehensive tests.
The directory name 'mcp' was shadowing the external mcp package during pytest collection, causing 'No module named mcp.types' errors.
da71d38 to
2d77844
Compare
This commit addresses outstanding review comments from burtenshaw, Copilot, and Claude's automated review on PR 224. Changes: - Add integration tests for EchoEnvironment MCP features (list/call tools) - Add integration tests for MCPEnvironment with FastMCP servers - Add WebSocket MCP tools/list and tools/call tests - Fix EchoEnvironment to use asyncio.run() instead of deprecated get_event_loop() - Fix WebSocket MCP handler to use async context manager for MCP client - Fix MCPEnvironment tool name validation to check _tools attribute - Add echo_mcp_demo.py example showing canonical SIM mode pattern - Expose mcp_client as public attribute on EchoEnvironment for WebSocket support
The WebSocket tests use synchronous TestClient, not async/await, so the asyncio marks were causing warnings.
…servations) The function in prompts.py is named format_observations (plural) but was incorrectly imported as format_observation (singular).
EchoEnvironment now properly inherits from MCPEnvironment instead of duplicating MCP handling logic. This: - Demonstrates the canonical way to build an MCPEnvironment subclass - Removes code duplication between EchoEnvironment and MCPEnvironment - Uses _step_impl() for domain-specific EchoAction handling Also updated tests and demo to handle CallToolResult objects returned by MCPEnvironment (accessing .data attribute for the actual value).
|
Whops, merged this too soon. Created #312 to address your comments @burtenshaw |
Summary
This PR implements RFC 003: MCP (Model Context Protocol) support for OpenEnv - Phase 1.
Tool,ToolError,ListToolsAction,CallToolAction,ListToolsObservation,CallToolObservationMCPEnvironmentbase class for MCP-enabled environments with reserved tool name validationtools/list,tools/call)Key Design Decisions
type: "mcp"WebSocket messages (no HTTP/mcpendpoint)step(ListToolsAction/CallToolAction)for training, WebSocket/mcpfor productionreset,step,state, orcloseare registered (protects dual API boundary)Files Changed
Core MCP Infrastructure:
src/openenv/core/env_server/mcp_types.py- Tool, ToolError, actions/observations, WS message typessrc/openenv/core/env_server/mcp_environment.py- MCPEnvironment base classsrc/openenv/core/env_server/http_server.py- WebSocket MCP handlersrc/openenv/core/env_server/__init__.py- ExportsEcho Environment Reference:
envs/echo_env/server/mcp_server.py- FastMCP tool definitionsenvs/echo_env/server/echo_environment.py- MCP action supportTests:
tests/core/test_mcp/test_mcp_types.py- 18 teststests/core/test_mcp/test_mcp_environment.py- 12 testsTest plan
PYTHONPATH=src:envs pytest tests/core/test_mcp/ -vReview Comments Addressed
types.py→mcp_types.py(Pankit's feedback)