[FEATURE] Bidirectional Streaming #217

Open
@pgrayy

Overview

Bidirectional streaming enables real-time, continuous communication between clients and AI models, with both sides sending and receiving data incrementally at the same time. Unlike the traditional request-response pattern, neither side waits for a complete message: content is processed as it arrives, and the conversation can adapt dynamically to ongoing inputs and outputs. The result is a more natural interaction flow.
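
To make the contrast with request-response concrete, here is a minimal sketch using plain `asyncio` queues. It is not taken from Strands or the prototype: `fake_model`, `client_send`, `client_receive`, and `run_duplex` are hypothetical names, and the "model" simply acknowledges each chunk as it arrives.

```python
import asyncio


async def fake_model(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    """Hypothetical stand-in for a model: replies to each chunk as it arrives."""
    while True:
        chunk = await inbox.get()
        if chunk is None:  # sentinel: the client closed its side
            await outbox.put(None)
            return
        await outbox.put(f"heard:{chunk}")


async def client_send(inbox: asyncio.Queue, chunks) -> None:
    """Push input chunks one at a time instead of one complete message."""
    for chunk in chunks:
        await inbox.put(chunk)
        await asyncio.sleep(0)  # yield so the model can respond mid-stream
    await inbox.put(None)


async def client_receive(outbox: asyncio.Queue, log: list) -> None:
    """Collect replies while the sender is still running."""
    while (reply := await outbox.get()) is not None:
        log.append(reply)


async def run_duplex(chunks) -> list:
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    log: list = []
    # All three coroutines run concurrently: sending and receiving overlap.
    await asyncio.gather(
        fake_model(inbox, outbox),
        client_send(inbox, chunks),
        client_receive(outbox, log),
    )
    return log


replies = asyncio.run(run_duplex(["hel", "lo"]))
```

The sender and receiver are independent tasks, so a reply to the first chunk can be produced before the last chunk has been sent.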

Model Support

Several providers are building models with bidirectional streaming capabilities. Some examples include:

  • Amazon: Amazon's Nova Sonic model offers real-time speech processing, interruption handling, context-aware responses, and low-latency interactions, making it particularly effective for voice assistants (docs).
  • OpenAI: OpenAI also provides models with bidirectional streaming capabilities, allowing for dynamic conversation flows where the model can receive new information while generating a response (announcement).

Request

Support a bidirectional streaming interface in Strands.

Prototype

To facilitate discussion, we have implemented a prototype for bidirectional streaming at https://github.com/pgrayy/strands-sdk-python-async (see the README for testing instructions). The prototype implements a flexible architecture for real-time, two-way communication between clients and AI models. It focuses on audio-based interactions with Nova Sonic while establishing patterns that could extend to other models and modalities. The key components are:

  • Bidirectional Agent: The Agent class in the bidirectional module provides an async context manager for sending data (send), an async generator for receiving data (receive), and a method to initialize bidirectional streaming (bistream). For example usage, please see https://github.com/pgrayy/sdk-python-async/blob/main/scripts/agents/bidirectional.py.
  • Model Sender/Receiver: The abstract Sender and Receiver interfaces define the contract for model providers. The Sender handles outgoing events to the model with context managers for different content types (text, audio, tools), while the Receiver processes incoming events from the model and constructs message history.
  • Event System: Events are structured as typed objects representing different kinds of streaming content, including session events (start/end), prompt events (start/end), and content events (text, audio, system, tool).
  • Nova Implementation: The Nova implementation demonstrates how to adapt a specific model to the bidirectional interface, providing a concrete example of the architecture in action.
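
As a rough illustration of how these components could fit together, the sketch below wires a toy provider into an agent. It is not the prototype's code: the `Event` fields, the `EchoModel` provider, and all method signatures are assumptions made for this example. The Sender is reduced to a single `send` method rather than per-content-type context managers, and `bistream` is left out because the issue does not describe its signature.

```python
import asyncio
from abc import ABC, abstractmethod
from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import AsyncIterator, List


@dataclass
class Event:
    """Hypothetical typed event; `kind` is e.g. 'session_start' or 'text'."""
    kind: str
    data: str = ""


class Sender(ABC):
    @abstractmethod
    async def send(self, event: Event) -> None: ...


class Receiver(ABC):
    @abstractmethod
    def receive(self) -> AsyncIterator[Event]: ...


class EchoModel(Sender, Receiver):
    """Toy provider (assumption): echoes text events back upper-cased."""

    def __init__(self) -> None:
        self._out: asyncio.Queue = asyncio.Queue()
        self.history: List[Event] = []  # message history built on receive

    async def send(self, event: Event) -> None:
        if event.kind == "text":
            await self._out.put(Event("text", event.data.upper()))
        elif event.kind == "session_end":
            await self._out.put(event)  # tells the receiver to stop

    async def receive(self) -> AsyncIterator[Event]:
        while True:
            event = await self._out.get()
            if event.kind == "session_end":
                return
            self.history.append(event)
            yield event


class Agent:
    def __init__(self, model: EchoModel) -> None:
        self.model = model

    @asynccontextmanager
    async def send(self):
        """Sending scope: wraps the body in session start/end events."""
        await self.model.send(Event("session_start"))
        try:
            yield self.model.send
        finally:
            await self.model.send(Event("session_end"))

    async def receive(self) -> AsyncIterator[Event]:
        async for event in self.model.receive():
            yield event


async def main() -> List[str]:
    agent = Agent(EchoModel())  # built inside the running event loop
    received: List[str] = []

    async def consume() -> None:
        async for event in agent.receive():
            received.append(event.data)

    consumer = asyncio.create_task(consume())  # receive runs alongside send
    async with agent.send() as send:
        await send(Event("text", "hello"))
        await send(Event("text", "world"))
    await consumer
    return received


received = asyncio.run(main())
```

Splitting the provider contract into `Sender` and `Receiver` means the agent depends only on those two interfaces, so a different model could be swapped in without changing the agent's `send` and `receive` surface.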

Metadata

Assignees

No one assigned

    Labels

    enhancement: New feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
