## Description

### Please read this first

- Have you read the docs? Yes, the Agents SDK docs.
- Have you searched for related issues? Yes; others may have faced similar issues.
### Describe the bug

`Runner.run_streamed()` does not produce a proper event stream with `LitellmModel` (tested with `"openai/gpt-4o"`). When using:

```python
result = Runner.run_streamed(triage_agent, message)
async for event in result.stream_events():
    print(event)
```

only a single `AgentUpdatedStreamEvent` object is printed, rather than a continuous stream of events.
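For contrast, the event flow that streaming normally produces can be sketched with stand-in classes (the two dataclasses and `expected_stream` below are mocks for illustration, not the SDK's real types, so this runs without an API key):

```python
import asyncio
from dataclasses import dataclass

# Minimal stand-ins for the SDK's stream event types (hypothetical shapes;
# the real classes carry more fields).
@dataclass
class AgentUpdatedStreamEvent:
    type: str = "agent_updated_stream_event"

@dataclass
class RawResponsesStreamEvent:
    type: str = "raw_response_event"

async def expected_stream():
    # With a working model integration, one agent-updated event is followed
    # by a continuous series of raw response events (token deltas, etc.).
    yield AgentUpdatedStreamEvent()
    for _ in range(5):
        yield RawResponsesStreamEvent()

async def collect(stream):
    # Drain an async event stream and record each event's type.
    return [event.type async for event in stream]

event_types = asyncio.run(collect(expected_stream()))
print(event_types)
# → ['agent_updated_stream_event', 'raw_response_event', 'raw_response_event',
#    'raw_response_event', 'raw_response_event', 'raw_response_event']
```

With `LitellmModel`, the observed stream contains only the first of these events and then ends.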
### Reproduction Code

Agent definitions:

```python
from agents import Agent
from agents.extensions.models.litellm_model import LitellmModel

history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
    model=LitellmModel(model=model, api_key=api_key),
)

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
    model=LitellmModel(model=model, api_key=api_key),
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    model=LitellmModel(model=model, api_key=api_key),
    handoffs=[history_tutor_agent, math_tutor_agent],
)
```
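To quantify the symptom, a small helper (hypothetical, not part of the SDK) can tally what a stream yields. It assumes only that each event exposes a `.type` string, as the SDK's stream events do, and is demonstrated here against a mock stream reproducing the reported behavior so it runs offline:

```python
import asyncio
from collections import Counter

async def tally_event_types(stream):
    """Count how many events of each `type` an async stream yields.

    Works on result.stream_events() or any async iterable of objects
    that expose a `.type` attribute.
    """
    counts = Counter()
    async for event in stream:
        counts[event.type] += 1
    return counts

# Mock stream mimicking the reported symptom: a single
# agent_updated_stream_event and nothing else.
class _Event:
    def __init__(self, type):
        self.type = type

async def _buggy_stream():
    yield _Event("agent_updated_stream_event")

counts = asyncio.run(tally_event_types(_buggy_stream()))
print(dict(counts))  # → {'agent_updated_stream_event': 1}
```

Running the same helper on `Runner.run_streamed(...).stream_events()` with a working integration should show many event types, not a single entry.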
### Expected Behavior

A continuous stream of events should be produced as the agent processes the request, similar to how it works with the default OpenAI models.
### Current Behavior

Only a single event is output in the stream:

```
AgentUpdatedStreamEvent(new_agent=Agent(name='Triage Agent', instructions="You determine which agent to use based on the user's homework question", handoff_description=None, handoffs=[Agent(name='History Tutor', instructions='You provide assistance with historical queries. Explain important events and context clearly.', handoff_description='Specialist agent for historical questions', handoffs=[], model=<agents.extensions.models.litellm_model.LitellmModel object at 0x0000026AB77C67B0>, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=None, truncation=None, max_tokens=None, reasoning=None, metadata=None, store=None, include_usage=None, extra_query=None, extra_body=None), tools=[], mcp_servers=[], mcp_config={}, input_guardrails=[], output_guardrails=[], output_type=None, hooks=None, tool_use_behavior='run_llm_again', reset_tool_choice=True), Agent(name='Math Tutor', instructions='You provide help with math problems. Explain your reasoning at each step and include examples', handoff_description='Specialist agent for math questions', handoffs=[], model=<agents.extensions.models.litellm_model.LitellmModel object at 0x0000026AB7828A50>, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=None, truncation=None, max_tokens=None, reasoning=None, metadata=None, store=None, include_usage=None, extra_query=None, extra_body=None), tools=[], mcp_servers=[], mcp_config={}, input_guardrails=[], output_guardrails=[], output_type=None, hooks=None, tool_use_behavior='run_llm_again', reset_tool_choice=True)], model=<agents.extensions.models.litellm_model.LitellmModel object at 0x0000026AB7828E10>, model_settings=ModelSettings(temperature=None, top_p=None, frequency_penalty=None, presence_penalty=None, tool_choice=None, parallel_tool_calls=None, truncation=None, max_tokens=None, reasoning=None, metadata=None, store=None, include_usage=None, extra_query=None, extra_body=None), tools=[], mcp_servers=[], mcp_config={}, input_guardrails=[], output_guardrails=[], output_type=None, hooks=None, tool_use_behavior='run_llm_again', reset_tool_choice=True), type='agent_updated_stream_event')
```
### Additional Information

- `Runner.run()` (the non-streaming version) works correctly with the same `LitellmModel` agents.
- When using the default OpenAI models (i.e., without specifying `model=LitellmModel(...)`), streaming works correctly:
```python
history_tutor_agent = Agent(
    name="History Tutor",
    handoff_description="Specialist agent for historical questions",
    instructions="You provide assistance with historical queries. Explain important events and context clearly.",
)

math_tutor_agent = Agent(
    name="Math Tutor",
    handoff_description="Specialist agent for math questions",
    instructions="You provide help with math problems. Explain your reasoning at each step and include examples",
)

triage_agent = Agent(
    name="Triage Agent",
    instructions="You determine which agent to use based on the user's homework question",
    handoffs=[history_tutor_agent, math_tutor_agent],
)
```
This indicates that the issue is specific to the `LitellmModel` integration in the streaming functionality of the Agents SDK.
### Environment Information

- Tested with model: `"openai/gpt-4o"` via `LitellmModel`
- The issue appears to lie in the interaction between `LitellmModel` and the SDK's streaming functionality