Search before reporting
Motivation
AI agents are becoming a major workload for data infrastructure. Apache Doris already ships an official MCP server (apache/doris-mcp-server) so that AI agents can query and manage the database through the Model Context Protocol natively. This gives Doris a direct bridge into the agentic AI ecosystem.
Pulsar — as the messaging and streaming backbone — is equally well-positioned for this, arguably even more so: agents need to communicate with each other, react to real-time events, and consume/produce streaming data. But Pulsar currently has no official, project-owned AI integration layer.
StreamNative has already validated this space with their MCP Server (Apache 2.0) and Agent Engine (built on Pulsar Functions), which proves the demand and technical feasibility. However, these live outside the Apache Pulsar project, making them invisible to the broader community and harder to evolve alongside Pulsar itself.
Doris and other Apache projects are already embracing the AI agent ecosystem — Pulsar risks being left out. Pulsar's architecture (pub-sub, Functions, multi-tenancy, geo-replication) is a natural fit for agent systems and should be leveraged.
Solution
Make AI agent support a first-class concern, delivered in three phases:
Phase 1: Official Pulsar MCP Server
A standalone server (repo apache/pulsar-mcp-server, like apache/doris-mcp-server) that exposes Pulsar operations as MCP tools. Under the hood it translates MCP calls into requests against Pulsar's existing admin REST API, so no broker changes are required.
AI Agent                        pulsar-mcp-server                 Pulsar Cluster
(Cursor,                        (Go or Python)
 Claude, etc.)
    |                               |                                   |
    | MCP: list_topics()            |                                   |
    | --------------------------->  | GET /admin/v2/persistent/...      |
    |                               | --------------------------------> |
    |                               | <-------------------------------- |
    | <---------------------------  |                                   |
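The translation step in the diagram could look roughly like the sketch below. It is illustrative only, not the proposed server's code: the function names `build_list_topics_request` and `list_topics`, the example base URL, and the choice of token authentication are assumptions. The path follows the admin REST API's `GET /admin/v2/persistent/{tenant}/{namespace}` topic listing.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Optional


def build_list_topics_request(
    base_url: str, tenant: str, namespace: str, token: Optional[str] = None
) -> urllib.request.Request:
    """Map a hypothetical MCP `list_topics` call onto one admin REST request."""
    path = "/admin/v2/persistent/{}/{}".format(
        urllib.parse.quote(tenant, safe=""),
        urllib.parse.quote(namespace, safe=""),
    )
    request = urllib.request.Request(base_url.rstrip("/") + path, method="GET")
    if token:
        # Assumption: the cluster is configured for token authentication.
        request.add_header("Authorization", "Bearer " + token)
    return request


def list_topics(
    base_url: str, tenant: str, namespace: str, token: Optional[str] = None
) -> List[str]:
    """Send the request and decode the JSON array of topic names."""
    request = build_list_topics_request(base_url, tenant, namespace, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping request construction separate from the network call means the mapping from tool arguments to REST paths can be unit-tested without a running cluster.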
Supported tools:
| Category | Examples |
| --- | --- |
| Topic admin | list_topics, describe_topic, create_topic, delete_topic |
| Subscriptions | list_subscriptions, check_backlog, seek |
| Messages | produce_message, consume_messages, peek_messages |
| Observability | get_broker_list, get_topic_stats, get_namespace_stats |
| Schema | get_schema, set_schema, delete_schema |
Supports stdio (local IDE), SSE (web), and Streamable HTTP (remote). Auth via existing Pulsar token/OAuth.
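To make the tool list more concrete, here is a minimal sketch of how one tool from the table might be described and dispatched. The descriptor uses the `name`, `description`, and `inputSchema` fields that MCP defines for tools. Everything else is hypothetical: `ToolRegistry`, the `topic` argument, and the handler are invented for illustration and are not part of any MCP SDK or of the proposed server.

```python
from typing import Any, Callable, Dict, List

ToolHandler = Callable[[Dict[str, Any]], Any]

# Hypothetical descriptor for the `describe_topic` tool from the table above.
DESCRIBE_TOPIC = {
    "name": "describe_topic",
    "description": "Return metadata and stats for one topic.",
    "inputSchema": {
        "type": "object",
        "properties": {"topic": {"type": "string"}},
        "required": ["topic"],
    },
}


class ToolRegistry:
    """Hypothetical in-memory registry mapping tool names to handlers."""

    def __init__(self) -> None:
        self._tools: Dict[str, Dict[str, Any]] = {}
        self._handlers: Dict[str, ToolHandler] = {}

    def register(self, descriptor: Dict[str, Any], handler: ToolHandler) -> None:
        self._tools[descriptor["name"]] = descriptor
        self._handlers[descriptor["name"]] = handler

    def list_tools(self) -> List[Dict[str, Any]]:
        return list(self._tools.values())

    def call(self, name: str, arguments: Dict[str, Any]) -> Any:
        if name not in self._handlers:
            raise KeyError("unknown tool: " + name)
        # Only checks required keys; a real server would validate the full schema.
        required = self._tools[name]["inputSchema"].get("required", [])
        missing = [key for key in required if key not in arguments]
        if missing:
            raise ValueError("missing arguments: " + ", ".join(missing))
        return self._handlers[name](arguments)
```

A handler registered this way would typically wrap one admin REST call, so adding a tool amounts to adding a descriptor plus a small function.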
Phase 2: Pulsar Functions as Agent Runtime
Pulsar Functions already provides a lightweight, serverless model that maps naturally to the event-driven agent loop:
Topic --message--> Agent Function --reasoning--> LLM API --result--> Output Topic
                        |
                        | state (compacted topic)
                        v
                   Agent State
Key additions:
- Thin Pulsar Function wrappers for agent frameworks (LangChain, LlamaIndex) that handle deserialization, agent invocation, and result publishing.
- Compacted topics as a persistent agent state store (conversation history, checkpoints).
- Lifecycle hooks and checkpointing for long-running/reactive agents.
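A thin wrapper of the kind described in the first bullet might be sketched as follows. It assumes the Pulsar Python Functions SDK, where a function subclasses `pulsar.Function` and implements `process(self, input, context)`. The JSON message shape (`session`, `prompt`, `history`) and the `run_agent` stub are hypothetical stand-ins for a real framework call, and the sketch does not cover state or checkpointing.

```python
import json

try:
    from pulsar import Function  # base class from the Pulsar Python Functions SDK
except ImportError:
    # Fallback so the sketch can be read and run without the client installed.
    class Function:
        pass


def run_agent(prompt: str, history: list) -> str:
    """Hypothetical stand-in for a LangChain, LlamaIndex, or direct LLM call."""
    return "echo: " + prompt


class AgentFunction(Function):
    """Deserialize the message, invoke the agent, return the serialized result."""

    def process(self, input, context):
        request = json.loads(input)
        reply = run_agent(request["prompt"], request.get("history", []))
        # The returned value is published to the function's output topic.
        return json.dumps({"session": request.get("session"), "reply": reply})
```

Because the wrapper only touches the message payload, the agent logic in `run_agent` can be swapped for any framework without changing how the function is deployed.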
Phase 3: Multi-Agent Communication Patterns
Document and provide reference implementations for common multi-agent topologies using existing Pulsar primitives:
| Pattern | Pulsar mechanism |
| --- | --- |
| Request/response | Non-persistent topic per agent, Key_Shared correlation |
| Publish/subscribe | Persistent topic, Shared subscription |
| Agent discovery | Compacted topic (each agent writes its capabilities) |
| Ordered pipeline | Key_Shared subscription, same-key ordering |
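The agent discovery row relies on compaction semantics: a reader of the compacted view sees only the latest message per key, and a key whose latest message has an empty payload is dropped. The sketch below simulates that behaviour in memory to show how a registry view could be derived. It does not use the Pulsar client, and the agent IDs and capability payloads are invented for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple

Message = Tuple[str, Optional[bytes]]  # (message key, payload)


def compact(log: Iterable[Message]) -> Dict[str, bytes]:
    """Return the latest payload per key, dropping keys that were tombstoned."""
    latest: Dict[str, bytes] = {}
    for key, payload in log:
        if not payload:
            # An empty payload marks the key as deleted (agent deregistered).
            latest.pop(key, None)
        else:
            latest[key] = payload
    return latest


# Hypothetical registry traffic: each agent publishes under its own key.
registry_log = [
    ("agent-a", b'{"skills": ["search"]}'),
    ("agent-b", b'{"skills": ["summarize"]}'),
    ("agent-a", b'{"skills": ["search", "translate"]}'),  # capability update
    ("agent-b", b""),                                      # agent-b leaves
]

live_agents = compact(registry_log)
```

In this example `live_agents` holds a single entry for `agent-a` with its updated capabilities, which is what a newly started agent would see when reading the compacted registry.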
Alternatives
- Rely on StreamNative's MCP server: this works, but it stays external, is hard for the Pulsar community to discover, and may diverge from Pulsar's evolution.
- One-off REST integrations per agent framework: this fragments effort, lacks a common standard, and misses the chance to define project-wide primitives.
Anything else?
No response
Are you willing to submit a PR?