📊 Agent Monitor

License: MIT Python 3.10+ Agent Compatible

Open-source observability and monitoring platform for AI agents

Real-time tracing, metrics, cost tracking, and debugging for production AI agent systems. Works with OpenAI, Claude, LangChain, and custom agents.


🎯 The Problem

Production AI agents are black boxes:

  • ❌ No visibility into what agents are doing
  • ❌ No way to track costs per task
  • ❌ Debugging failures is impossible
  • ❌ Performance bottlenecks are hidden
  • ❌ Token usage is untracked

Agent Monitor solves all of this.


✨ Features

📈 Real-Time Tracing

  • Execution Traces: See every step agents take
  • Task Flows: Visualize multi-agent workflows
  • Tool Calls: Track which tools agents use
  • LLM Interactions: Monitor prompts and responses

💰 Cost Tracking

  • Per-Task Costs: Track spending per operation
  • Token Usage: Input/output tokens for every call
  • Cost Trends: Identify expensive operations
  • Budget Alerts: Get notified before overspending
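
As a back-of-the-envelope illustration of the arithmetic these cost metrics automate: per-call cost is input and output token counts multiplied by a per-model price table. The sketch below is self-contained and the prices are illustrative placeholders, not current provider pricing or Agent Monitor internals.

```python
# Sketch of per-call cost estimation from token counts.
# PRICE_PER_1K values are illustrative placeholders, not real pricing.
PRICE_PER_1K = {
    "gpt-4": {"input": 0.03, "output": 0.06},
    "claude-3-opus": {"input": 0.015, "output": 0.075},
}

def call_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost in USD: input and output tokens priced per 1K tokens."""
    p = PRICE_PER_1K[model]
    return (tokens_in / 1000) * p["input"] + (tokens_out / 1000) * p["output"]

# 1500 input + 500 output tokens at the placeholder gpt-4 rate:
print(f"${call_cost('gpt-4', 1500, 500):.4f}")  # $0.0750
```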

⚡ Performance Metrics

  • Latency: Response times per agent/operation
  • Throughput: Requests per second
  • Error Rates: Track failures and retries
  • Success Rates: Monitor agent reliability

🐛 Debugging

  • Error Tracking: Capture and categorize errors
  • Replay: Replay failed executions
  • Logs: Structured logging with context
  • Alerts: Get notified of issues

📊 Analytics Dashboard

  • Real-Time Dashboard: Live metrics and traces
  • Historical Analysis: Trends over time
  • Cost Reports: Spending breakdowns
  • Performance Reports: Bottleneck identification

🚀 Quick Start (2 Minutes)

Install

pip install agent-monitor

Monitor Your Agents

OpenAI Agents:

from agent_monitor import AgentMonitor
from openai import OpenAI

# Initialize monitor
monitor = AgentMonitor(api_key="your-api-key")

# Wrap your OpenAI client
client = monitor.wrap_openai(OpenAI())

# Use normally - automatically monitored!
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)

# View metrics
print(f"Cost: ${monitor.get_cost():.4f}")
print(f"Tokens: {monitor.get_tokens()}")

Claude (Anthropic):

from agent_monitor import AgentMonitor
from anthropic import Anthropic

monitor = AgentMonitor(api_key="your-api-key")
client = monitor.wrap_anthropic(Anthropic())

# Automatically tracked!
response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,  # required by the Messages API
    messages=[{"role": "user", "content": "Hello"}]
)

LangChain:

from agent_monitor import AgentMonitor
from langchain.agents import AgentExecutor

monitor = AgentMonitor(api_key="your-api-key")

# Wrap your agent
agent = monitor.wrap_langchain(your_agent_executor)

# All executions automatically tracked!
result = agent.run("Your query")

📊 Dashboard

Access the web dashboard:

agent-monitor serve --port 3000

Then open: http://localhost:3000

Dashboard Features:

  • Live Traces: See agents executing in real-time
  • Cost Analytics: Track spending per agent/task
  • Performance Graphs: Latency, throughput, errors
  • Token Usage: Visualize token consumption
  • Error Explorer: Debug failures quickly
  • Custom Metrics: Add your own metrics

🏗️ Architecture

┌─────────────────────────────────────┐
│     Your AI Agent Application       │
│                                     │
│  ┌──────────────────────────────┐  │
│  │  OpenAI / Claude / LangChain │  │
│  │  (wrapped by Agent Monitor)  │  │
│  └────────────┬─────────────────┘  │
└───────────────┼─────────────────────┘
                │ Telemetry Data
                ↓
┌─────────────────────────────────────┐
│       Agent Monitor (Backend)       │
│  ┌──────────────────────────────┐  │
│  │  Collectors                  │  │  ← Capture traces
│  ├──────────────────────────────┤  │
│  │  Processors                  │  │  ← Compute metrics
│  ├──────────────────────────────┤  │
│  │  Exporters                   │  │  ← Store data
│  └──────────────────────────────┘  │
└────────────────┬────────────────────┘
                 │
                 ↓
┌─────────────────────────────────────┐
│   Storage (Postgres / TimescaleDB)  │
└────────────────┬────────────────────┘
                 │
                 ↓
┌─────────────────────────────────────┐
│      Web Dashboard (React)          │
│  - Real-time traces                 │
│  - Cost analytics                   │
│  - Performance metrics              │
└─────────────────────────────────────┘
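
The collector → processor → exporter flow in the diagram can be sketched as three tiny stages. This is a toy stand-in to make the data flow concrete, not Agent Monitor's actual components; the event fields (`start`, `end`) are assumptions for illustration.

```python
# Toy sketch of the collector -> processor -> exporter pipeline shown above.
class Collector:
    """Captures a raw trace event from an instrumented client."""
    def collect(self, event):
        return {"raw": event}

class Processor:
    """Derives metrics from a raw event (here: just a duration)."""
    def process(self, trace):
        trace["duration_s"] = trace["raw"]["end"] - trace["raw"]["start"]
        return trace

class Exporter:
    """Persists processed traces (an in-memory list stands in for storage)."""
    def __init__(self):
        self.store = []
    def export(self, trace):
        self.store.append(trace)

collector, processor, exporter = Collector(), Processor(), Exporter()
event = {"agent": "support_bot", "start": 10.0, "end": 12.3}
exporter.export(processor.process(collector.collect(event)))
print(round(exporter.store[0]["duration_s"], 1))  # 2.3
```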

📚 Use Cases

1. Cost Optimization

monitor = AgentMonitor(api_key="...")

# Run your agents
for task in tasks:
    agent.run(task)

# Analyze costs
report = monitor.cost_report(
    group_by="task_type",
    time_range="last_7_days"
)

print(f"Most expensive task: {report.top_cost_task}")
print(f"Total spend: ${report.total_cost:.2f}")

2. Performance Debugging

# Find slow operations
slow_ops = monitor.query(
    metric="latency",
    threshold=">5s",
    time_range="last_hour"
)

for op in slow_ops:
    print(f"Slow: {op.agent} - {op.operation} took {op.duration}s")
    print(f"Trace: {op.trace_url}")

3. Error Tracking

# Get error rate
error_rate = monitor.metrics.error_rate(period="1h")

if error_rate > 0.05:  # 5% error rate
    errors = monitor.get_errors(limit=10)
    for error in errors:
        print(f"Error: {error.message}")
        print(f"Stack trace: {error.stack_trace}")
        print(f"Replay URL: {error.replay_url}")

4. Multi-Agent Workflows

# Track complex workflows
with monitor.trace("customer_support_workflow"):
    # Step 1
    with monitor.span("classify_intent"):
        intent = classifier_agent.run(query)

    # Step 2
    with monitor.span("route_to_specialist"):
        specialist = router_agent.select(intent)

    # Step 3
    with monitor.span("generate_response"):
        response = specialist_agent.run(query)

# View complete workflow trace in dashboard

🎨 Metrics Collected

Execution Metrics

  • agent.requests.total - Total requests
  • agent.requests.duration - Request latency
  • agent.requests.errors - Error count
  • agent.requests.success_rate - Success percentage

Cost Metrics

  • agent.cost.total_usd - Total cost
  • agent.cost.per_request - Cost per request
  • agent.tokens.input - Input tokens
  • agent.tokens.output - Output tokens

Performance Metrics

  • agent.latency.p50 - Median latency
  • agent.latency.p95 - 95th percentile
  • agent.latency.p99 - 99th percentile
  • agent.throughput.rps - Requests per second

Tool Usage Metrics

  • agent.tools.calls - Tool invocations
  • agent.tools.duration - Tool execution time
  • agent.tools.errors - Tool failures

🔧 Configuration

Basic Configuration

from agent_monitor import AgentMonitor, Config

config = Config(
    # API key for Agent Monitor cloud (optional)
    api_key="your-api-key",

    # Local storage (default)
    storage="sqlite:///agent_monitor.db",

    # Or PostgreSQL
    # storage="postgresql://user:pass@localhost/agent_monitor",

    # Sampling rate (1.0 = 100%)
    sample_rate=1.0,

    # Enable dashboard
    dashboard_enabled=True,
    dashboard_port=3000,
)

monitor = AgentMonitor(config)

Advanced Configuration

# agent_monitor.yaml
storage:
  type: postgresql
  host: localhost
  port: 5432
  database: agent_monitor

collectors:
  - type: openai
    enabled: true
  - type: anthropic
    enabled: true
  - type: langchain
    enabled: true

exporters:
  - type: prometheus
    port: 9090
  - type: jaeger
    endpoint: http://jaeger:14268

dashboard:
  enabled: true
  port: 3000
  auth: basic

alerts:
  - name: high_cost
    condition: cost_per_hour > 10.0
    channels: [email, slack]
  - name: high_error_rate
    condition: error_rate > 0.05
    channels: [pagerduty]

📦 Supported Platforms

| Platform | Status | Features |
|----------|--------|----------|
| OpenAI | ✅ Full Support | Agents SDK, Chat API, Assistants |
| Anthropic (Claude) | ✅ Full Support | Messages API, Computer Use |
| LangChain | ✅ Full Support | Agents, Chains, Tools |
| LlamaIndex | ✅ Full Support | Agents, Query Engines |
| Custom Agents | ✅ Full Support | Manual instrumentation |
| Hybrid Framework | ✅ Full Support | Multi-platform agents |
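
For the "Custom Agents" row, manual instrumentation can be as simple as timing context managers. The sketch below is a stdlib-only stand-in that mirrors the shape of the `monitor.span()` API used elsewhere in this README; it is not the library's implementation.

```python
# Minimal manual-instrumentation sketch for a custom agent (stdlib only).
import time
from contextlib import contextmanager

class MiniTracer:
    def __init__(self):
        self.records = []  # list of (span_name, duration_seconds)

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.records.append((name, time.perf_counter() - start))

tracer = MiniTracer()
with tracer.span("classify_intent"):
    time.sleep(0.01)  # stand-in for agent work
with tracer.span("generate_response"):
    time.sleep(0.02)

for name, dur in tracer.records:
    print(f"{name}: {dur:.3f}s")
```

A real integration would ship these records to a collector instead of printing them.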

🚢 Deployment

Docker

docker run -d \
  -p 3000:3000 \
  -p 9090:9090 \
  -e STORAGE_URL=postgresql://... \
  cogniolab/agent-monitor

Docker Compose

version: '3.8'
services:
  agent-monitor:
    image: cogniolab/agent-monitor
    ports:
      - "3000:3000"  # Dashboard
      - "9090:9090"  # Metrics
    environment:
      - STORAGE_URL=postgresql://postgres:password@postgres:5432/agent_monitor
    depends_on:
      - postgres

  postgres:
    image: timescale/timescaledb:latest-pg15
    environment:
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:

Kubernetes

helm repo add agent-monitor https://charts.cogniolab.com
helm install agent-monitor agent-monitor/agent-monitor

📊 Dashboard Screenshots

Live Traces

┌─────────────────────────────────────────────────────┐
│  Live Agent Traces                          ⟳ Live  │
├─────────────────────────────────────────────────────┤
│  customer_support_workflow              2.3s  $0.05 │
│  ├─ classify_intent                    0.5s  $0.01  │
│  ├─ route_to_specialist                0.3s  $0.00  │
│  └─ generate_response                  1.5s  $0.04  │
│                                                      │
│  research_pipeline                      5.1s  $0.12 │
│  ├─ search_documents                   2.0s  $0.03  │
│  ├─ analyze_results                    2.5s  $0.08  │
│  └─ synthesize_answer                  0.6s  $0.01  │
└─────────────────────────────────────────────────────┘

Cost Analytics

┌─────────────────────────────────────────────────────┐
│  Cost Analytics                     Last 7 Days     │
├─────────────────────────────────────────────────────┤
│  Total Spend:            $127.45                     │
│  Avg Cost/Day:           $18.21                      │
│  Most Expensive Agent:   research_agent ($45.32)    │
│  Top Cost Operation:     gpt-4 completions ($78.12) │
│                                                      │
│  [Cost Trend Graph]                                  │
│   $30 ┤           ╭╮                                 │
│   $20 ┤     ╭─╮   │╰╮                                │
│   $10 ┤  ╭──╯ ╰───╯ ╰─╮                              │
│    $0 ┴────────────────────                          │
│       Mon Tue Wed Thu Fri Sat Sun                    │
└─────────────────────────────────────────────────────┘

🔌 Integrations

OpenTelemetry (NEW! ✨)

Export agent telemetry to enterprise observability platforms:

Jaeger (Open Source):

from agent_monitor.exporters import OpenTelemetryExporter, create_jaeger_config

config = create_jaeger_config(service_name="my-agent")
exporter = OpenTelemetryExporter(config)
exporter.start()

# Export agent execution
exporter.export_agent_execution(
    agent_name="customer_support",
    operation="handle_inquiry",
    duration_seconds=2.3,
    cost_usd=0.05,
    tokens_input=1500,
    tokens_output=500,
    success=True,
)

Datadog (Commercial APM):

import os

from agent_monitor.exporters import OpenTelemetryExporter, create_datadog_config

config = create_datadog_config(
    service_name="my-agent",
    api_key=os.getenv("DD_API_KEY")
)
exporter = OpenTelemetryExporter(config)
exporter.start()

New Relic (Full-Stack Observability):

import os

from agent_monitor.exporters import OpenTelemetryExporter, create_newrelic_config

config = create_newrelic_config(
    service_name="my-agent",
    api_key=os.getenv("NEW_RELIC_API_KEY")
)
exporter = OpenTelemetryExporter(config)
exporter.start()

Supported Platforms:

  • ✅ Jaeger (open-source distributed tracing)
  • ✅ Datadog (full-stack APM)
  • ✅ New Relic (full-stack observability)
  • ✅ Grafana Cloud
  • ✅ Prometheus (via OTLP)
  • ✅ AWS X-Ray
  • ✅ Google Cloud Trace

View Integration Examples →


Prometheus

from agent_monitor import AgentMonitor
from agent_monitor.exporters import PrometheusExporter

monitor = AgentMonitor()
monitor.add_exporter(PrometheusExporter(port=9090))

Grafana

Import our pre-built dashboards:

  • Agent Performance Dashboard
  • Cost Analytics Dashboard
  • Error Tracking Dashboard

Slack Alerts

monitor.add_alert(
    name="high_cost",
    condition="cost_per_hour > 10.0",
    channel="slack://ops-alerts"
)

📖 API Reference

AgentMonitor

class AgentMonitor:
    def __init__(self, config: Config = None):
        """Initialize monitor"""

    def wrap_openai(self, client: OpenAI) -> OpenAI:
        """Wrap OpenAI client for monitoring"""

    def wrap_anthropic(self, client: Anthropic) -> Anthropic:
        """Wrap Anthropic client for monitoring"""

    def wrap_langchain(self, agent: AgentExecutor) -> AgentExecutor:
        """Wrap LangChain agent for monitoring"""

    def trace(self, name: str) -> ContextManager:
        """Create a trace context"""

    def span(self, name: str) -> ContextManager:
        """Create a span within a trace"""

    def get_metrics(self, **filters) -> MetricsResult:
        """Query metrics"""

    def cost_report(self, **options) -> CostReport:
        """Generate cost report"""

🎯 Roadmap

  • OpenAI support
  • Claude/Anthropic support
  • LangChain support
  • Web dashboard
  • Cost tracking
  • Real-time tracing
  • OpenTelemetry integration ⭐ NEW
  • Jaeger/Datadog/New Relic exporters ⭐ NEW
  • LlamaIndex support
  • AutoGPT support
  • Alerting system
  • Mobile app
  • AI-powered insights
  • Anomaly detection
  • Cost optimization recommendations

🤝 Contributing

We welcome contributions! See CONTRIBUTING.md

Areas we'd love help with:

  • Additional platform integrations
  • Dashboard improvements
  • Documentation
  • Bug fixes
  • Performance optimizations

📝 Examples

See examples/ for complete implementations:

  • Basic: Simple monitoring setup
  • OpenAI: Monitor OpenAI agents
  • Claude: Monitor Claude agents
  • LangChain: Monitor LangChain agents
  • Multi-Agent: Complex workflow monitoring
  • OpenTelemetry Integration: Export to Jaeger, Datadog, New Relic ⭐ NEW

💡 Why Agent Monitor?

vs. LangSmith

  • ✅ Open source (not proprietary)
  • ✅ Self-hosted option
  • ✅ Works with ALL agent frameworks
  • ✅ No vendor lock-in

vs. Custom Logging

  • ✅ Pre-built dashboards
  • ✅ Cost tracking built-in
  • ✅ Real-time visualization
  • ✅ Production-ready

vs. Application Monitoring (Datadog, New Relic)

  • ✅ Agent-specific metrics
  • ✅ Token usage tracking
  • ✅ Cost per operation
  • ✅ LLM interaction traces
  • Plus: Now integrates with Datadog, New Relic, and Jaeger! ⭐ NEW

📊 Performance

  • Overhead: < 1ms per agent call
  • Storage: ~1KB per trace
  • Dashboard: real-time updates (< 100ms latency)
  • Scalability: handles 10,000+ agents


🔒 Security

  • 🔐 End-to-end encryption
  • 🔒 No LLM API keys stored
  • 🛡️ SOC 2 Type II compliant (cloud version)
  • ✅ GDPR compliant
  • 📝 Audit logs

💬 Community

Join our community to ask questions, share ideas, and connect with other developers making AI agents observable and debuggable. Whether you're just getting started with observability or running production systems, your questions and contributions are welcome!


📜 License

MIT License - see LICENSE


🙏 Acknowledgments

Built by Cognio AI Lab to make AI agents observable and debuggable.

Ready to see what your agents are actually doing? Get Started →

Star this repo if you find it useful!
