Summary:
When an Agent is used as a tool inside an orchestrator agent (the multi-agent pattern), the sub-agent's default callback_handler bleeds its token stream into the parent agent's output stream. This causes duplicate or interleaved output, and the orchestrator appears to generate a degraded final response, as if it "sees" that the sub-agent already responded.
This is not documented anywhere. The fix (callback_handler=None) exists but is completely undiscoverable. A production developer building a multi-agent pipeline will hit this and have no path to the solution.
Steps to Reproduce:
from strands import Agent, tool
from strands.models import ...  # your model

sub_agent = Agent(
    model=model,
    system_prompt="You are a news fetcher.",
    tools=[some_tool],
    # No callback_handler=None here: this is the bug trigger
)

@tool(name="live_news", description="Fetches latest news")
def live_news(input: str) -> str:
    result = sub_agent(input)
    return str(result)

orchestrator = Agent(
    model=model,
    system_prompt="You are Jarvis.",
    tools=[live_news],
)

result = orchestrator("What's the latest news?")
Observed Behavior
The output stream contains two interleaved responses:
From sub-agent leaking into parent stream:
* Xi and Putin meet in Beijing days after Trump's visit
* Massie loses primary challenge in victory for Trump
* Cavs blow 22-point lead in Game 1 loss to Knicks
Then the orchestrator's actual response after tool result:
Sir, I've retrieved the latest news for you.
Tokens from both the sub-agent and the orchestrator are emitted with no way to distinguish them. The orchestrator also appears to generate a weaker response, as if the answer had already been streamed.
Additionally, without callback_handler=None, the orchestrator's framing degrades to a brief acknowledgment instead of using the tool result content:
Bad (default callback_handler on sub-agent):
Agent Result:
Sir, I've retrieved the latest news for you. ← weak, doesn't include actual headlines
Good (callback_handler=None on sub-agent):
Agent Result:
Sir, here are the latest headlines:
* Xi and Putin meet in Beijing... ← proper response with content
Root Cause:
Each Agent instance in Strands carries its own callback_handler, and the default handler prints streamed tokens as they arrive. That handler is not scoped or isolated when the agent runs as a sub-tool inside a parent agent, so the sub-agent's tokens are emitted directly into the same output channel the parent writes to.
There is no stream boundary between nested agents. The framework does not automatically suppress or redirect sub-agent output when it is invoked as a tool.
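The mechanism described above can be reproduced without Strands. The following is a minimal toy model, not the Strands implementation: `ToyAgent` and `printing_handler` are hypothetical stand-ins, and the only assumption carried over from the report is that the default handler prints each streamed chunk and that `callback_handler=None` means "run silently".

```python
import contextlib
import io


def printing_handler(**kwargs):
    """Stand-in for a default handler: print each streamed text chunk."""
    if "data" in kwargs:
        print(kwargs["data"], end="")


class ToyAgent:
    """Hypothetical stand-in for an agent; NOT the Strands implementation."""

    def __init__(self, chunks, tools=None, callback_handler=printing_handler):
        self.chunks = chunks
        self.tools = tools or []
        # None means "run silently", mirroring callback_handler=None.
        self.callback_handler = callback_handler or (lambda **kwargs: None)

    def __call__(self, prompt):
        # Call every tool first; a real agent would let the model decide.
        for tool in self.tools:
            tool(prompt)
        # Then "stream" this agent's own response through its handler.
        for chunk in self.chunks:
            self.callback_handler(data=chunk)
        return "".join(self.chunks)


def run_and_capture(sub_handler):
    """Run a parent agent with one nested sub-agent and capture stdout."""
    sub = ToyAgent(["headline A\n", "headline B\n"], callback_handler=sub_handler)
    parent = ToyAgent(["Here is the news.\n"], tools=[sub])
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        parent("What's the latest news?")
    return buffer.getvalue()


leaky_output = run_and_capture(printing_handler)  # sub-agent prints too
quiet_output = run_and_capture(None)              # sub-agent is silent
```

In this model `leaky_output` contains the sub-agent's headlines followed by the parent's line, while `quiet_output` contains only the parent's line. Nothing marks where one agent's text ends and the other's begins, which matches the interleaving reported above.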
Fix / Workaround
Set callback_handler=None on any agent used as a tool:
sub_agent = Agent(
    model=model,
    system_prompt="You are a news fetcher.",
    tools=[some_tool],
    callback_handler=None,  # ← Critical: isolates sub-agent stream from parent
)

@tool(name="live_news", description="Fetches latest news")
def live_news(input: str) -> str:
    result = sub_agent(input)
    return str(result)  # Returns clean text content to orchestrator
With callback_handler=None:
- Sub-agent runs silently
- str(result) correctly returns only the final assistant text
- Orchestrator receives clean tool result content and frames a proper response
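If the sub-agent's stream is still useful for debugging, a possible alternative to discarding it is to collect it in memory. This is a sketch under one assumption: that the callback handler is invoked with keyword arguments and that streamed text arrives under the `data` key. `BufferingHandler` is a hypothetical name, and the `Agent(...)` wiring is shown only as a comment.

```python
class BufferingHandler:
    """Collect streamed text chunks instead of printing them.

    Assumes the framework calls the handler with keyword arguments and
    passes streamed text as `data`; every other event is ignored.
    """

    def __init__(self):
        self.chunks = []

    def __call__(self, **kwargs):
        chunk = kwargs.get("data")
        if isinstance(chunk, str):
            self.chunks.append(chunk)

    def text(self):
        """Return everything collected so far as one string."""
        return "".join(self.chunks)


sub_agent_log = BufferingHandler()
# Hypothetical wiring: Agent(..., callback_handler=sub_agent_log)

# Simulated events, as a stand-in for what a running sub-agent would emit.
sub_agent_log(data="Xi and Putin ")
sub_agent_log(data="meet in Beijing")
sub_agent_log(current_tool_use={"name": "some_tool"})  # non-text event, ignored

captured = sub_agent_log.text()
```

The sub-agent stays silent on the parent's output channel, and `captured` can be written to a log after the tool call returns.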
What the Docs Should Add
- A dedicated warning on the Custom Tools / Agent-as-Tool page:
⚠️ Important — Multi-Agent Pipelines: When using an Agent as a tool inside an orchestrator, always set callback_handler=None on the sub-agent. Without this, the sub-agent's token stream bleeds into the parent agent's output stream, causing duplicate output and degraded orchestrator responses.
- A canonical agent-as-tool example showing callback_handler=None:
# ✅ Correct pattern for agent-as-tool
sub_agent = Agent(
    model=model,
    system_prompt="...",
    tools=[...],
    callback_handler=None,  # Required when used as a tool
)

@tool(name="sub_agent_tool", description="...")
def sub_agent_tool(input: str) -> str:
    result = sub_agent(input)
    return str(result)
- Clarify whether this is a known limitation or intended behavior. If callback_handler=None is the officially recommended pattern for sub-agents, it should be the default shown in all multi-agent examples.
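Until the docs settle this, a project could guard against the leak with a small check in its own test suite. The sketch below uses only the standard library. `stdout_written_by`, `leaky_tool`, and `silent_tool` are hypothetical names, and the two tool functions are stand-ins for a tool that wraps a sub-agent. It assumes the tool function can be called directly and that the leak goes through Python's `sys.stdout`.

```python
import contextlib
import io


def stdout_written_by(func, *args, **kwargs):
    """Call func and return (result, text it wrote to stdout)."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def leaky_tool(query):
    """Stand-in for a tool whose sub-agent streams to stdout."""
    print("headline A")
    return "headline A"


def silent_tool(query):
    """Stand-in for a tool whose sub-agent runs silently."""
    return "headline A"


leaky_result, leaky_stdout = stdout_written_by(leaky_tool, "news")
silent_result, silent_stdout = stdout_written_by(silent_tool, "news")
```

A test would then assert that the captured stdout is empty for every agent-backed tool. Note that `contextlib.redirect_stdout` only intercepts writes made through `sys.stdout`, so output written directly to the underlying file descriptor would not be caught.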
Severity: High — silent data corruption in output with no documented path to fix.