Optimize Latency for Parallel Agent Runs with Streaming #498

### Please read this first

- **Have you read the docs?** [Agents SDK docs](https://openai.github.io/openai-agents-python/) -> Yes
- **Have you searched for related issues?** Others may have had similar requests -> Yes

### Question
I'm implementing the parallel translation pattern from the `examples`, where multiple agents generate translations simultaneously and a selection agent chooses the best one. This approach improves quality, but it adds significant user-facing latency, especially when streaming responses.

### Current Implementation

The current implementation follows this pattern:
```python
import asyncio

from agents import ItemHelpers, Runner, trace

# spanish_agent and translation_picker are Agent instances defined elsewhere.


async def main():
    msg = input("Enter message for translation: ")

    with trace("Parallel translation"):
        # Run 3 translation agents in parallel (takes ~10s total)
        res_1, res_2, res_3 = await asyncio.gather(
            Runner.run(spanish_agent, msg),
            Runner.run(spanish_agent, msg),
            Runner.run(spanish_agent, msg),
        )

        # Collect the text output of each run and combine them
        outputs = [
            ItemHelpers.text_message_outputs(res_1.new_items),
            ItemHelpers.text_message_outputs(res_2.new_items),
            ItemHelpers.text_message_outputs(res_3.new_items),
        ]
        translations = "\n\n".join(outputs)

        # Run the selection agent to pick the best one (adds more latency)
        best_translation = await Runner.run(
            translation_picker,
            f"Input: {msg}\n\nTranslations:\n{translations}",
        )
```

### Problem Statement

The current workflow creates a significant latency issue in streaming scenarios:

1. All translation agents must finish executing (~10s when run in parallel)
2. The selection agent can only start once every translation is complete
3. The UI shows no output until the selection agent starts streaming
4. The delay compounds when this pattern is one step in a longer agent chain

The result is a poor user experience: users wait a long time with no feedback, and the wait grows in complex workflows where subsequent agents depend on the translation output.
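
To make the timing concrete, here is a minimal simulation of the pattern that uses plain `asyncio` and no SDK calls. Everything in it is a hypothetical stand-in: `fake_translator`, `fake_picker_stream`, and the delays are made up for illustration and scaled down so the script finishes quickly. It only shows that the first streamed chunk cannot arrive before the slowest translator has finished.

```python
import asyncio
import time
from typing import AsyncIterator


async def fake_translator(delay: float) -> str:
    # Hypothetical stand-in for a translation agent run; the delay is made up.
    await asyncio.sleep(delay)
    return f"translation ready after {delay:.2f}s"


async def fake_picker_stream(candidates: list[str]) -> AsyncIterator[str]:
    # Hypothetical stand-in for a streamed selection response.
    for chunk in ("best", " ", "translation"):
        await asyncio.sleep(0.01)
        yield chunk


async def time_to_first_chunk(delays: list[float]) -> float:
    start = time.perf_counter()

    # gather() returns only once the slowest translator has finished.
    candidates = await asyncio.gather(*(fake_translator(d) for d in delays))

    # The first user-visible output is the first chunk from the picker.
    stream = fake_picker_stream(list(candidates))
    await stream.__anext__()
    elapsed = time.perf_counter() - start

    await stream.aclose()
    return elapsed


if __name__ == "__main__":
    elapsed = asyncio.run(time_to_first_chunk([0.05, 0.10, 0.20]))
    print(f"time to first streamed chunk: {elapsed:.2f}s")
```

In this sketch the time to first chunk is bounded below by the largest translator delay plus the picker's own time to its first chunk, which mirrors steps 1 to 3 above.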

What's the recommended approach to optimize this pattern for streaming scenarios while maintaining the quality benefits of parallel execution and selection?


