Skip to content

MCP server restart cause Agent to fail #693

Open
@BarAshkenazi

Description

@BarAshkenazi

Please read this first

  • Have you read the docs?Agents SDK docs YES
  • Have you searched for related issues? Others may have faced similar issues. YES

Describe the bug

I use agent with a single MCP server I've created for it and all works great.
but when I restart my MCP server and the agent tries to do something with it, it get 400 error code because the session ID does not exist (which make sense because the MCP server does not hold it anymore) but the agent stop working without any exception that I can handle.

example error:
Error in post_writer: Client error '400 Bad Request' for url 'http://localhost:8181/message?sessionId=72ca14a0-731b-4db1-a21a-b2212e285g94'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400

Debug information

  • Agents SDK version: (e.g. v0.0.3) v0.0.14
  • Python version (e.g. Python 3.10) 3.13.1

Repro steps

Ideally provide a minimal python script that can be run to reproduce the bug.

import asyncio
from colorama import Fore, Back, Style
import os
import shutil
import subprocess
import time
from typing import Any
from dotenv import load_dotenv

from agents import Agent, Runner, gen_trace_id, trace
from agents.mcp import MCPServer, MCPServerSse
from agents.model_settings import ModelSettings
import nest_asyncio
nest_asyncio.apply()

def init_agent(mcp_server: MCPServer) -> Agent:
    agent = Agent(
        name="Assistant",
        instructions="use your provided tools and MCPs to help the user" ,
        mcp_servers=[mcp_server],
        model="gpt-4.1",
        model_settings=ModelSettings(tool_choice="auto"), )
    return agent

def assemble_conversation(result, new_input) -> Agent:
    if result != None:
        return result.to_input_list() + [{'content': new_input,
                                                'role': 'user'}]
    else:
        return new_input

async def run(mcp_server: MCPServer):
    result = None
    agent = init_agent(mcp_server)

    while True:
        # Get user input
        user_input = input("You: ").strip()
        # Check if user wants to exit
        if user_input.lower() == "exit":
            print("Ending conversation.")
            break

        try:
            result = Runner.run_sync(agent, assemble_conversation(result, user_input))
            print(Fore.GREEN + f"Assistant: {result.final_output}")
        except Exception as e:
            print(Fore.RED + f"Error:: {str(e)}")
        finally:
            print(Fore.RESET + "-" * 60)

async def main():
    load_dotenv()

    async with MCPServerSse(
        name="SSE Python Server",
        params={
            "url": "http://localhost:8181/sse",
        },
    ) as server:
        await run(server)

if __name__ == "__main__":
    print("starting conversation.")
    asyncio.run(main())

Expected behavior

The code works fine. But as I've mentioned, the agent fails to recover after the MCP server is restarted (which can happen naturally). and I just get the error:

Error in post_writer: Client error '400 Bad Request' for url 'http://localhost:8181/message?sessionId=72ca14a0-731b-4db1-a21a-b2212e285g94'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400

I think there should be a way to do recover with new session ID

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions