Skip to content

[feature] Native support for AI agent ecosystem (MCP, agent runtime, agent communication) #25881

@Chinakeyboardman

Description

@Chinakeyboardman

Search before reporting

  • I searched in the issues and found nothing similar.

Motivation

AI agents are becoming a major workload for data infrastructure. Apache Doris already ships an official MCP server ( apache/doris-mcp-server ) so that AI agents can query and manage the database through the Model Context Protocol natively. This gives Doris a direct bridge into the agentic AI ecosystem.

Pulsar — as the messaging and streaming backbone — is equally well-positioned for this, arguably even more so: agents need to communicate with each other, react to real-time events, and consume/produce streaming data. But Pulsar currently has no official, project-owned AI integration layer.

StreamNative has already validated this space with their MCP Server (Apache 2.0) and Agent Engine (built on Pulsar Functions), which proves the demand and technical feasibility. However, these live outside the Apache Pulsar project, making them invisible to the broader community and harder to evolve alongside Pulsar itself.

Doris and other Apache projects are already embracing the AI agent ecosystem — Pulsar risks being left out. Pulsar's architecture (pub-sub, Functions, multi-tenancy, geo-replication) is a natural fit for agent systems and should be leveraged.

Solution

Make AI agent support a first-class concern, delivered in three phases:

Phase 1: Official Pulsar MCP Server

A standalone server (repo apache/pulsar-mcp-server, like apache/doris-mcp-server) that exposes Pulsar operations as MCP tools. Under the hood it translates MCP calls into Pulsar's existing admin REST API — no broker changes required.

   AI Agent                   pulsar-mcp-server                  Pulsar Cluster
  (Cursor,                     (Go or Python)
   Claude, etc.)
      |                             |                                |
      |  MCP: list_topics()         |                                |
      | --------------------------> |  GET /admin/v2/persistent/...  |
      |                             | -----------------------------> |
      |                             | <----------------------------- |
      | <-------------------------- |                                |

Supported tools:

Category Examples
Topic admin list_topics, describe_topic, create_topic, delete_topic
Subscriptions list_subscriptions, check_backlog, seek
Messages produce_message, consume_messages, peek_messages
Observability get_broker_list, get_topic_stats, get_namespace_stats
Schema get_schema, set_schema, delete_schema

Supports stdio (local IDE), SSE (web), and Streamable HTTP (remote). Auth via existing Pulsar token/OAuth.

Phase 2: Pulsar Functions as Agent Runtime

Pulsar Functions already provides a lightweight, serverless model that maps naturally to the event-driven agent loop:

   Topic --message--> Agent Function --reasoning--> LLM API --result--> Output Topic
                          |
                          | state (compaction topic)
                          v
                     Agent State

Key additions:

  1. Thin Pulsar Function wrappers for agent frameworks (LangChain, LlamaIndex) — handle deserialization -> agent invocation -> result publishing.
  2. Compaction topics as persistent agent state store (conversation history, checkpoints).
  3. Lifecycle hooks and checkpointing for long-running/reactive agents.

Phase 3: Multi-Agent Communication Patterns

Document and provide reference implementations for common multi-agent topologies using existing Pulsar primitives:

Pattern Pulsar mechanism
Request/response Non-persistent topic per agent, Key_Shared correlation
Publish/subscribe Persistent topic, Shared subscription
Agent discovery Compaction topic (each agent writes its capabilities)
Ordered pipeline Key_Shared subscription, same-key ordering

Alternatives

  1. Rely on StreamNative's MCP server — works, but stays external, not discoverable by the Pulsar community, and may diverge from Pulsar's evolution.
  2. One-off REST integrations per agent framework — fragments effort, lacks a common standard, and misses the chance to define project-wide primitives.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Metadata

Assignees

No one assigned

    Labels

    type/enhancementThe enhancements for the existing features or docs. e.g. reduce memory usage of the delayed messages

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions