🌌 Infinite Agent Streams

Infinite Agent Streams is a playground for experimenting with streaming multi-agent systems across different runtimes and patterns. It provides progressive "recipes" (R1 → R4) that explore the A2A protocol, Dapr Virtual Actors, PubSub transports, and Ray Actors, all tied together with a frontend inspector UI. The agentic engine is the OpenAI Agents SDK, which can be swapped for any agentic framework.


🚀 Vision

Build a foundation for scalable, streaming-first AI agents that can run in the cloud, collaborate in real time, and scale from lightweight demos to production-grade deployments.

Key themes:

  • 📡 Streaming-first (SSE / WebSocket / PubSub)
  • 🎭 Actors as Agents (Dapr + Ray)
  • 🔌 Pluggable Transports (Dapr PubSub, Redis, Ray Streams, A2A protocol)
  • 🖼️ Multi-Modal Extensions (text, image, video, voice, attachments)
  • 🛠️ Custom UI Inspector (for debugging + visualization)

📂 Repository Layout

Each recipe is fully self-contained with code, setup steps, and docs:

infinite-agent-streams/
│
├── r1-a2a-agents/
├── r2-a2a-dapr-eda-agents/
├── r3-a2a-dapr-actors/
├── r4-a2a-ray-actors/
│
├── edition-modalities/   # Extra: Image, Video, Voice, Attachments
├── ui-inspector/         # Custom frontend for visualization
└── README.md

🔑 Recipes

R1: Pure A2A + Agents SDK

  • Pure A2A with the OpenAI Agents SDK.
  • Stateless interactions over HTTP transport.
  • Best for a minimal ping-pong agent demo.
sequenceDiagram
  participant FE as Frontend
  participant A2A as A2A Layer
  participant Agent as Agent SDK
  FE->>A2A: Send user query
  A2A->>Agent: Forward via AgentExecutor
  Agent->>A2A: Response
  A2A->>FE: Stream back SSE

This recipe runs everything in one container + one FastAPI server, keeping things simple and cost-free. Perfect if you don't know Kubernetes yet or want to deploy on serverless containers (Fly.io, Railway, Render, etc.).

+--------------------------------------+
|          🚀 Single Container         |
|                                      |
|  +-----------+   +-------------+     |
|  |   A2A     |   |     BFF     |     |
|  |  Server   |<->|   (API)     |     |
|  +-----------+   +-------------+     |
|          \             |             |
|           \            |             |
|            \           v             |
|          +------------------+        |
|          |   AI Agent(s)    |        |
|          |  (OpenAI, etc.)  |        |
|          +------------------+        |
|                                      |
+--------------------------------------+

⚡ No extra infra required: all you need is Python and an agentic engine of your choice. 👉 See r1-a2a-agents/README.md for full setup instructions.
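
As a rough sketch (not the recipe's actual code), the single-container BFF can be one FastAPI route that runs an OpenAI Agents SDK agent with Runner.run_streamed and relays each text delta as an SSE frame. The route, agent config, and SSE payload shape below are illustrative assumptions.

# Minimal single-container sketch: FastAPI BFF + OpenAI Agents SDK streaming over SSE.
# Route, agent instructions, and payload shape are illustrative, not the recipe's code.
from agents import Agent, Runner
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from openai.types.responses import ResponseTextDeltaEvent

app = FastAPI()
agent = Agent(name="Assistant", instructions="Answer briefly.")

@app.get("/chat")
async def chat(q: str):
    async def sse():
        result = Runner.run_streamed(agent, input=q)      # kick off a streamed agent run
        async for event in result.stream_events():        # consume incremental events
            if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                yield f"data: {event.data.delta}\n\n"     # one SSE frame per text delta
        yield "data: [DONE]\n\n"
    return StreamingResponse(sse(), media_type="text/event-stream")

Run it under uvicorn and curl the /chat endpoint to watch the deltas arrive one frame at a time.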


R2: A2A + Agents SDK + Dapr PubSub (EDA)

Now our agents move from point-to-point calls to event-driven architecture (EDA): we separate the A2A + BFF layer from the agent so each can scale independently, with sidecar-powered resilience.

  • Adds Dapr PubSub (e.g., Redis) for scalable async streaming.
  • Decouples A2A from agent runtime.
  • Setup:

    • Container 1: A2A Server (with Dapr sidecar)
    • Container 2: Agent (with Dapr sidecar)
  • Infra: Still container-based (Docker), but more production-friendly.

  • Highlights:

    • PubSub and service discovery via Dapr
    • Agents can scale independently
    • Same local dev story as Recipe 1, but closer to Kubernetes-native
flowchart LR
  FE[Frontend UI] --> A2A[A2A Layer]
  A2A --> PubSub[(Dapr PubSub / Redis)]
  PubSub --> Agent[Agent SDK Worker]
  Agent --> PubSub
  PubSub --> A2A --> FE

👉 See r2-a2a-dapr-eda-agents/README.md for details.
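
To make the decoupling concrete, here is a hedged sketch of the agent-worker side using the Dapr Python SDK: instead of returning an HTTP response directly, the worker publishes each streamed chunk to a pub/sub topic through its sidecar. The component name "pubsub", topic "agent-stream", and payload fields are assumptions; the consuming side is sketched under the Architecture Overview below.

# Agent-worker side of R2 (sketch): streamed chunks leave via the Dapr sidecar.
# Component name "pubsub", topic "agent-stream", and payload fields are assumptions.
import json

from dapr.clients import DaprClient

def publish_chunk(task_id: str, delta: str, done: bool = False) -> None:
    with DaprClient() as client:
        client.publish_event(
            pubsub_name="pubsub",          # Dapr pub/sub component (Redis, Kafka, RabbitMQ, ...)
            topic_name="agent-stream",
            data=json.dumps({"task_id": task_id, "delta": delta, "done": done}),
            data_content_type="application/json",
        )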


R3: A2A + Agents SDK + Dapr Virtual Actors + PubSub

  • Each agent = Dapr Virtual Actor with state + concurrency safety.
  • Uses PubSub for event-driven streaming.
  • Ideal for multi-agent collaboration.
flowchart LR
  FE[Frontend] --> A2A
  A2A --> PubSub[(Redis / Kafka)]
  PubSub --> Actor1[Dapr Virtual Actor A]
  PubSub --> Actor2[Dapr Virtual Actor B]
  Actor1 --> PubSub
  Actor2 --> PubSub
  PubSub --> A2A --> FE
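
A hedged sketch of "agent as Dapr Virtual Actor" with the Dapr Python SDK follows; the interface, method name, and the "history" state key are illustrative, and registration of the actor with the runtime host is omitted. The real recipe wires the agent engine and PubSub in at the marked spot.

# "Agent as virtual actor" sketch with the Dapr Python SDK.
# Interface/method names and the "history" state key are illustrative.
from dapr.actor import Actor, ActorInterface, actormethod

class AgentActorInterface(ActorInterface):
    @actormethod(name="InvokeTask")
    async def invoke_task(self, data: dict) -> dict:
        ...

class AgentActor(Actor, AgentActorInterface):
    async def invoke_task(self, data: dict) -> dict:
        # The actor runtime serializes calls per actor instance and persists state.
        found, history = await self._state_manager.try_get_state("history")
        history = history if found else []
        history.append(data["prompt"])

        # Run the underlying agent here (OpenAI Agents SDK, etc.) and publish
        # partial results to PubSub as in R2.
        answer = f"echo: {data['prompt']}"

        await self._state_manager.set_state("history", history)
        await self._state_manager.save_state()
        return {"answer": answer, "turns": len(history)}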

R4: A2A + Ray Actors (Scalable Compute) [Dropped]

After going through the Ray docs, this recipe has been dropped for now. Instead, the work from the first three recipes will be taken to production readiness: multi-user, multi-tenant, and multi-agent tasks per user. The original plan was:

  • Integrates Ray Actors for heavy compute / distributed workloads.
  • A2A orchestrates Ray Actors just like Dapr ones.
  • Use case: LLM pipelines, simulations, ML model serving.

R4 (Rebooted): A2A + Dapr Actors for Multi-User / Multi-Agent Tasks

  • Frontend: SSE / WebSocket inspector + user UI.

  • A2A Layer: Orchestrates requests, routes to correct agent actor.

  • Dapr Actors: Each agent is a virtual actor that:

    • Holds state (memory, session, tools, goals)
    • Handles multiple tasks concurrently (using async patterns / reminders)
    • Scales horizontally per user / tenant
  • PubSub Layer: Dapr handles async communication for streaming events, job queues, and inter-agent collaboration.

  • Data Layer: Vector DB / Redis / SQL for agent memory, embeddings, and cross-agent data.

flowchart LR
  FE[Frontend UI] --> A2A[A2A Layer]
  A2A --> PubSub[(Dapr PubSub)]
  PubSub --> Actor1[Dapr Virtual Actor A]
  PubSub --> Actor2[Dapr Virtual Actor B]
  Actor1 --> PubSub
  Actor2 --> PubSub
  PubSub --> A2A --> FE
  Actor1 --> DB[(Vector DB / Redis)]
  Actor2 --> DB

Design highlights:

  1. Streaming-first: SSE / WebSocket support integrated with PubSub → partial results can be pushed to the frontend.
  2. Multi-tasking: Each actor supports multiple async jobs concurrently using Python asyncio + Dapr reminders for background tasks.
  3. Multi-user / multi-tenant (see the sketch after this list):
    • Actors scoped by user_id and tenant_id
    • Enables isolated memory, limits, and policies per user
    • Actor Naming: "agent-{tenant_id}-{user_id}-{agent_id}"
  4. State & Memory:
    • Keep persistent state in actor + external DB for heavy/large memory
    • Handles retries and idempotency for robust production behavior
  5. Inter-agent collaboration:
    • PubSub channels allow agents to communicate / delegate subtasks
    • Supports future extensions like A2A or MCP patterns
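
Below is a sketch of how the A2A layer could address a per-tenant, per-user agent actor using the naming scheme from point 3, again assuming the Dapr Python SDK; the actor type "AgentActor" and its interface mirror the R3 sketch above, and the IDs are examples.

# Addressing a per-tenant, per-user agent actor (sketch, Dapr Python SDK).
# Actor type "AgentActor" and its interface mirror the R3 sketch; IDs are examples.
from dapr.actor import ActorId, ActorInterface, ActorProxy, actormethod

class AgentActorInterface(ActorInterface):
    @actormethod(name="InvokeTask")
    async def invoke_task(self, data: dict) -> dict:
        ...

async def route_to_agent(tenant_id: str, user_id: str, agent_id: str, prompt: str) -> dict:
    actor_id = ActorId(f"agent-{tenant_id}-{user_id}-{agent_id}")   # naming scheme from point 3
    proxy = ActorProxy.create("AgentActor", actor_id, AgentActorInterface)
    # Dapr activates the actor on demand and serializes calls per instance,
    # so each user/tenant gets isolated state and ordering.
    return await proxy.InvokeTask({"prompt": prompt})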

🎨 Edition: Modalities & Custom UI

  • Extend agents with Image, Video, Voice, and File Attachments.
  • Provide a Custom Inspector UI for monitoring multi-agent streams.
  • Full E2E streaming demo with interactive visualizations.

๐Ÿ“ Architecture Overview

flowchart TD
  FE[Frontend UI] --> BFF[FastAPI BFF]
  BFF --> A2A[A2A Layer]
  A2A --> Dapr[Dapr Virtual Actors / PubSub]
  A2A --> Ray[Ray Actor Cluster]
  Dapr -->|Stream Events| A2A
  Ray -->|Stream Results| A2A
  A2A --> BFF --> FE
  • BFF (Backend for Frontend) = FastAPI with SSE/WebSockets.
  • A2A = common protocol layer for streaming.
  • Dapr / Ray = interchangeable agent runtimes.
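
A hedged sketch of the BFF bridge implied by this flow, using the Dapr FastAPI extension: a pub/sub subscription pushes stream events into an in-memory queue, and an SSE endpoint drains it to the browser. The single shared queue, topic, and routes are simplifications; a real BFF would route events by task or session ID.

# BFF bridge sketch: Dapr pub/sub events in, SSE out (single shared queue for brevity).
import asyncio
import json

from dapr.ext.fastapi import DaprApp
from fastapi import Body, FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()
dapr_app = DaprApp(app)
events: asyncio.Queue = asyncio.Queue()

@dapr_app.subscribe(pubsub="pubsub", topic="agent-stream")
async def on_agent_event(event: dict = Body()):
    await events.put(event["data"])               # CloudEvent payload from the agent runtime

@app.get("/stream")
async def stream():
    async def sse():
        while True:
            chunk = await events.get()            # wait for the next streamed chunk
            yield f"data: {json.dumps(chunk)}\n\n"
    return StreamingResponse(sse(), media_type="text/event-stream")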

โš–๏ธ Why These Choices?

  • Dapr PubSub → Abstracts Kafka, Redis, RabbitMQ.
  • Dapr Actors → Lightweight, stateful, event-driven agents.
  • Ray Actors → Scalable compute and distributed tasks.
  • A2A Protocol → Standardized agent-to-agent communication.
  • FastAPI BFF → Simple frontend bridge for real-time SSE/WebSocket.

โš–๏ธ Ray vs Dapr vs Both

Runtime   Best For                                                             Limitation
Dapr      Cloud-native agents, service integration, Pub/Sub, stateful actors   Actors don't natively stream (need Pub/Sub or WS)
Ray       Heavy compute, agent swarms, distributed ML tasks                    Python-only, not ideal for service mesh or infra integration
Both      Dapr orchestrates, Ray executes massive parallel workloads           More moving parts, hybrid setup

👉 Use Dapr if your priority is cloud infra + streaming. 👉 Use Ray if your priority is distributed compute. 👉 Use Both when you need scalable infra + scalable compute.


๐Ÿ› ๏ธ Development Setup

  1. Clone repo:

    git clone https://github.com/mjunaidca/infinite-agent-streams.git
    cd infinite-agent-streams
  2. Install deps:

  3. Run a recipe:

  4. Start inspector UI:



๐Ÿค Contributing

  • Each recipe is standalone → PRs should add a new recipe or improve an existing one.
  • Use GitHub Issues for design discussions.
  • Add diagrams and documentation for clarity.

🔮 Future Roadmap

  • Realtime media infra (WebRTC, LiveKit)
  • Observability (OpenTelemetry + Jaeger)
  • WASM agents in sidecars
  • Hybrid K8s cluster (Dapr + Ray)
  • Expand A2A to realtime transport (beyond HTTP-only)

📜 License

MIT: open for experimentation, learning, and building the future of agent streaming systems.


🎯 Vision

A cloud-native runtime for AI agents that is:

  • Streaming-first
  • Scalable by design
  • Modular (plug in LangGraph, AutoGen, CrewAI, etc. via A2A)

This repo is an idea lab: fork, extend, and contribute recipes that push multi-agent systems forward.


About

Recipes for scalable, streaming-first multi-agent systems in the cloud, combining the A2A protocol, the OpenAI Agents SDK, and Dapr building blocks (PubSub + Virtual Actors) to show how AI agents can stream outputs and coordinate at scale.
