# Agent Monitor

**Open-source observability and monitoring platform for AI agents**

Real-time tracing, metrics, cost tracking, and debugging for production AI agent systems. Works with OpenAI, Claude, LangChain, and custom agents.
## The Problem

Production AI agents are black boxes:
- ❌ No visibility into what agents are doing
- ❌ No way to track costs per task
- ❌ Debugging failures is impossible
- ❌ Performance bottlenecks are hidden
- ❌ Token usage is untracked
Agent Monitor solves all of this.

## Features

### Tracing
- Execution Traces: See every step agents take
- Task Flows: Visualize multi-agent workflows
- Tool Calls: Track which tools agents use
- LLM Interactions: Monitor prompts and responses

### Cost Tracking

- Per-Task Costs: Track spending per operation
- Token Usage: Input/output tokens for every call
- Cost Trends: Identify expensive operations
- Budget Alerts: Get notified before overspending

### Performance Metrics

- Latency: Response times per agent/operation
- Throughput: Requests per second
- Error Rates: Track failures and retries
- Success Rates: Monitor agent reliability

### Debugging

- Error Tracking: Capture and categorize errors
- Replay: Replay failed executions
- Logs: Structured logging with context
- Alerts: Get notified of issues

### Analytics

- Real-Time Dashboard: Live metrics and traces
- Historical Analysis: Trends over time
- Cost Reports: Spending breakdowns
- Performance Reports: Bottleneck identification
## Quick Start

### Installation

```bash
pip install agent-monitor
```

### OpenAI Agents

```python
from agent_monitor import AgentMonitor
from openai import OpenAI

# Initialize monitor
monitor = AgentMonitor(api_key="your-api-key")

# Wrap your OpenAI client
client = monitor.wrap_openai(OpenAI())

# Use normally - automatically monitored!
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}]
)

# View metrics
print(f"Cost: ${monitor.get_cost():.4f}")
print(f"Tokens: {monitor.get_tokens()}")
```

### Claude (Anthropic)

```python
from agent_monitor import AgentMonitor
from anthropic import Anthropic

monitor = AgentMonitor(api_key="your-api-key")
client = monitor.wrap_anthropic(Anthropic())

# Automatically tracked!
response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,  # required by the Messages API
    messages=[{"role": "user", "content": "Hello"}]
)
```

### LangChain

```python
from agent_monitor import AgentMonitor
from langchain.agents import AgentExecutor

monitor = AgentMonitor(api_key="your-api-key")

# Wrap your agent
agent = monitor.wrap_langchain(your_agent_executor)

# All executions automatically tracked!
result = agent.run("Your query")
```

### Dashboard

Access the web dashboard:

```bash
agent-monitor serve --port 3000
```

Then open: http://localhost:3000
The dashboard gives you:

- ✅ Live Traces: See agents executing in real-time
- ✅ Cost Analytics: Track spending per agent/task
- ✅ Performance Graphs: Latency, throughput, errors
- ✅ Token Usage: Visualize token consumption
- ✅ Error Explorer: Debug failures quickly
- ✅ Custom Metrics: Add your own metrics
## Architecture

```
┌─────────────────────────────────────┐
│ Your AI Agent Application │
│ │
│ ┌──────────────────────────────┐ │
│ │ OpenAI / Claude / LangChain │ │
│ │ (wrapped by Agent Monitor) │ │
│ └────────────┬─────────────────┘ │
└───────────────┼─────────────────────┘
│ Telemetry Data
↓
┌─────────────────────────────────────┐
│ Agent Monitor (Backend) │
│ ┌──────────────────────────────┐ │
│ │ Collectors │ │ ← Capture traces
│ ├──────────────────────────────┤ │
│ │ Processors │ │ ← Compute metrics
│ ├──────────────────────────────┤ │
│ │ Exporters │ │ ← Store data
│ └──────────────────────────────┘ │
└────────────────┬────────────────────┘
│
↓
┌─────────────────────────────────────┐
│ Storage (Postgres / TimescaleDB) │
└────────────────┬────────────────────┘
│
↓
┌─────────────────────────────────────┐
│ Web Dashboard (React) │
│ - Real-time traces │
│ - Cost analytics │
│ - Performance metrics │
└─────────────────────────────────────┘
```
## Use Cases

### Cost Analysis

```python
monitor = AgentMonitor(api_key="...")

# Run your agents
for task in tasks:
    agent.run(task)

# Analyze costs
report = monitor.cost_report(
    group_by="task_type",
    time_range="last_7_days"
)

print(f"Most expensive task: {report.top_cost_task}")
print(f"Total spend: ${report.total_cost:.2f}")
```

### Performance Debugging

```python
# Find slow operations
slow_ops = monitor.query(
    metric="latency",
    threshold=">5s",
    time_range="last_hour"
)

for op in slow_ops:
    print(f"Slow: {op.agent} - {op.operation} took {op.duration}s")
    print(f"Trace: {op.trace_url}")
```

### Error Analysis

```python
# Get error rate
error_rate = monitor.metrics.error_rate(period="1h")

if error_rate > 0.05:  # 5% error rate
    errors = monitor.get_errors(limit=10)
    for error in errors:
        print(f"Error: {error.message}")
        print(f"Stack trace: {error.stack_trace}")
        print(f"Replay URL: {error.replay_url}")
```

### Multi-Agent Workflow Tracing

```python
# Track complex workflows
with monitor.trace("customer_support_workflow"):
    # Step 1
    with monitor.span("classify_intent"):
        intent = classifier_agent.run(query)

    # Step 2
    with monitor.span("route_to_specialist"):
        specialist = router_agent.select(intent)

    # Step 3
    with monitor.span("generate_response"):
        response = specialist_agent.run(query)

# View complete workflow trace in dashboard
```

## Metrics Reference

### Request Metrics

- `agent.requests.total` - Total requests
- `agent.requests.duration` - Request latency
- `agent.requests.errors` - Error count
- `agent.requests.success_rate` - Success percentage

### Cost Metrics

- `agent.cost.total_usd` - Total cost
- `agent.cost.per_request` - Cost per request
- `agent.tokens.input` - Input tokens
- `agent.tokens.output` - Output tokens

### Performance Metrics

- `agent.latency.p50` - Median latency
- `agent.latency.p95` - 95th percentile
- `agent.latency.p99` - 99th percentile
- `agent.throughput.rps` - Requests per second

### Tool Metrics

- `agent.tools.calls` - Tool invocations
- `agent.tools.duration` - Tool execution time
- `agent.tools.errors` - Tool failures
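For reference, `p50`/`p95`/`p99` above are ordinary latency percentiles over the recorded request durations. This is a minimal nearest-rank computation in plain Python to make the definitions concrete; it is illustrative only, not Agent Monitor's implementation:

```python
# Illustrative: how p50/p95/p99 latency percentiles are defined.
# Not Agent Monitor's internal code.

def percentile(samples, p):
    """Nearest-rank percentile of latency samples (seconds)."""
    ordered = sorted(samples)
    # Nearest rank = ceil(p/100 * N); ceil via negated floor division
    rank = max(1, -(-p * len(ordered) // 100))
    return ordered[rank - 1]

latencies = [0.2, 0.3, 0.3, 0.4, 0.5, 0.8, 1.1, 1.5, 2.0, 6.2]

p50 = percentile(latencies, 50)  # median request
p95 = percentile(latencies, 95)  # tail latency
print(f"p50={p50}s p95={p95}s")  # → p50=0.5s p95=6.2s
```

The gap between `p50` and `p95` is usually the first thing to look at: a healthy median with a large tail often points to a slow tool call or an occasional LLM retry.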
## Configuration

### Python

```python
from agent_monitor import AgentMonitor, Config

config = Config(
    # API key for Agent Monitor cloud (optional)
    api_key="your-api-key",

    # Local storage (default)
    storage="sqlite:///agent_monitor.db",

    # Or PostgreSQL
    # storage="postgresql://user:pass@localhost/agent_monitor",

    # Sampling rate (1.0 = 100%)
    sample_rate=1.0,

    # Enable dashboard
    dashboard_enabled=True,
    dashboard_port=3000,
)

monitor = AgentMonitor(config)
```

### YAML

```yaml
# agent_monitor.yaml
storage:
  type: postgresql
  host: localhost
  port: 5432
  database: agent_monitor

collectors:
  - type: openai
    enabled: true
  - type: anthropic
    enabled: true
  - type: langchain
    enabled: true

exporters:
  - type: prometheus
    port: 9090
  - type: jaeger
    endpoint: http://jaeger:14268

dashboard:
  enabled: true
  port: 3000
  auth: basic

alerts:
  - name: high_cost
    condition: cost_per_hour > 10.0
    channels: [email, slack]
  - name: high_error_rate
    condition: error_rate > 0.05
    channels: [pagerduty]
```

## Supported Platforms

| Platform | Status | Features |
|---|---|---|
| OpenAI | ✅ Full Support | Agents SDK, Chat API, Assistants |
| Anthropic (Claude) | ✅ Full Support | Messages API, Computer Use |
| LangChain | ✅ Full Support | Agents, Chains, Tools |
| LlamaIndex | ✅ Full Support | Agents, Query Engines |
| Custom Agents | ✅ Full Support | Manual instrumentation |
| Hybrid Framework | ✅ Full Support | Multi-platform agents |
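For custom agents, manual instrumentation boils down to timing named spans around each step. The toy sketch below shows the underlying pattern using only the standard library; it is not Agent Monitor's implementation, just the shape of what `monitor.span(...)` does for you:

```python
# Toy sketch of the span pattern behind manual instrumentation.
# Stdlib only - not Agent Monitor's actual code.
import time
from contextlib import contextmanager

spans = []  # collected (name, duration_seconds) records

@contextmanager
def span(name):
    """Time a named step and record it, even if the step raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append((name, time.perf_counter() - start))

with span("classify_intent"):
    time.sleep(0.01)  # stand-in for real agent work

name, duration = spans[0]
print(f"{name} took {duration:.3f}s")
```

Recording in a `finally` block is what makes failed steps show up in traces with their elapsed time instead of disappearing.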
## Deployment

### Docker

```bash
docker run -d \
  -p 3000:3000 \
  -p 9090:9090 \
  -e STORAGE_URL=postgresql://... \
  cogniolab/agent-monitor
```

### Docker Compose

```yaml
version: '3.8'
services:
  agent-monitor:
    image: cogniolab/agent-monitor
    ports:
      - "3000:3000"  # Dashboard
      - "9090:9090"  # Metrics
    environment:
      - STORAGE_URL=postgresql://postgres:5432/agent_monitor
    depends_on:
      - postgres

  postgres:
    image: timescale/timescaledb:latest-pg15
    environment:
      - POSTGRES_PASSWORD=password
    volumes:
      - postgres_data:/var/lib/postgresql/data

volumes:
  postgres_data:
```

### Kubernetes (Helm)

```bash
helm repo add agent-monitor https://charts.cogniolab.com
helm install agent-monitor agent-monitor/agent-monitor
```

## Dashboard Preview

### Live Traces

```
┌─────────────────────────────────────────────────────┐
│ Live Agent Traces ⟳ Live │
├─────────────────────────────────────────────────────┤
│ customer_support_workflow 2.3s $0.05 │
│ ├─ classify_intent 0.5s $0.01 │
│ ├─ route_to_specialist 0.3s $0.00 │
│ └─ generate_response 1.5s $0.04 │
│ │
│ research_pipeline 5.1s $0.12 │
│ ├─ search_documents 2.0s $0.03 │
│ ├─ analyze_results 2.5s $0.08 │
│ └─ synthesize_answer 0.6s $0.01 │
└─────────────────────────────────────────────────────┘
```

### Cost Analytics

```
┌─────────────────────────────────────────────────────┐
│ Cost Analytics Last 7 Days │
├─────────────────────────────────────────────────────┤
│ Total Spend: $127.45 │
│ Avg Cost/Day: $18.21 │
│ Most Expensive Agent: research_agent ($45.32) │
│ Top Cost Operation: gpt-4 completions ($78.12) │
│ │
│ [Cost Trend Graph] │
│ $30 ┤ ╭╮ │
│ $20 ┤ ╭─╮ │╰╮ │
│ $10 ┤ ╭──╯ ╰───╯ ╰─╮ │
│ $0 ┴──────────────────── │
│ Mon Tue Wed Thu Fri Sat Sun │
└─────────────────────────────────────────────────────┘
```
## OpenTelemetry Integration

Export agent telemetry to enterprise observability platforms.

### Jaeger (Open Source)

```python
from agent_monitor.exporters import OpenTelemetryExporter, create_jaeger_config

config = create_jaeger_config(service_name="my-agent")
exporter = OpenTelemetryExporter(config)
exporter.start()

# Export agent execution
exporter.export_agent_execution(
    agent_name="customer_support",
    operation="handle_inquiry",
    duration_seconds=2.3,
    cost_usd=0.05,
    tokens_input=1500,
    tokens_output=500,
    success=True,
)
```

### Datadog (Commercial APM)

```python
import os

from agent_monitor.exporters import OpenTelemetryExporter, create_datadog_config

config = create_datadog_config(
    service_name="my-agent",
    api_key=os.getenv("DD_API_KEY")
)
exporter = OpenTelemetryExporter(config)
exporter.start()
```

### New Relic (Full-Stack Observability)

```python
import os

from agent_monitor.exporters import OpenTelemetryExporter, create_newrelic_config

config = create_newrelic_config(
    service_name="my-agent",
    api_key=os.getenv("NEW_RELIC_API_KEY")
)
exporter = OpenTelemetryExporter(config)
exporter.start()
```

### Supported Platforms
- ✅ Jaeger (open-source distributed tracing)
- ✅ Datadog (full-stack APM)
- ✅ New Relic (full-stack observability)
- ✅ Grafana Cloud
- ✅ Prometheus (via OTLP)
- ✅ AWS X-Ray
- ✅ Google Cloud Trace
## Prometheus & Grafana

### Prometheus Metrics

```python
from agent_monitor import AgentMonitor
from agent_monitor.exporters import PrometheusExporter

monitor = AgentMonitor()
monitor.add_exporter(PrometheusExporter(port=9090))
```

### Grafana Dashboards

Import our pre-built dashboards:
- Agent Performance Dashboard
- Cost Analytics Dashboard
- Error Tracking Dashboard
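Prometheus then needs a scrape job pointing at the exporter's port. A minimal scrape config might look like this (the job name and 15s interval are illustrative choices, not requirements):

```yaml
# prometheus.yml (fragment) - illustrative scrape job
scrape_configs:
  - job_name: agent-monitor          # any name works
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:9090"]  # matches PrometheusExporter(port=9090)
```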
### Alerting

```python
monitor.add_alert(
    name="high_cost",
    condition="cost_per_hour > 10.0",
    channel="slack://ops-alerts"
)
```

## API Reference

### AgentMonitor

```python
class AgentMonitor:
    def __init__(self, config: Config = None):
        """Initialize monitor"""

    def wrap_openai(self, client: OpenAI) -> OpenAI:
        """Wrap OpenAI client for monitoring"""

    def wrap_anthropic(self, client: Anthropic) -> Anthropic:
        """Wrap Anthropic client for monitoring"""

    def wrap_langchain(self, agent: AgentExecutor) -> AgentExecutor:
        """Wrap LangChain agent for monitoring"""

    def trace(self, name: str) -> ContextManager:
        """Create a trace context"""

    def span(self, name: str) -> ContextManager:
        """Create a span within a trace"""

    def get_metrics(self, **filters) -> MetricsResult:
        """Query metrics"""

    def cost_report(self, **options) -> CostReport:
        """Generate cost report"""
```

## Roadmap

- OpenAI support
- Claude/Anthropic support
- LangChain support
- Web dashboard
- Cost tracking
- Real-time tracing
- OpenTelemetry integration ⭐ NEW
- Jaeger/Datadog/New Relic exporters ⭐ NEW
- LlamaIndex support
- AutoGPT support
- Alerting system
- Mobile app
- AI-powered insights
- Anomaly detection
- Cost optimization recommendations
## Contributing

We welcome contributions! See CONTRIBUTING.md.

Areas we'd love help with:
- Additional platform integrations
- Dashboard improvements
- Documentation
- Bug fixes
- Performance optimizations
## Examples

See `examples/` for complete implementations:
- Basic: Simple monitoring setup
- OpenAI: Monitor OpenAI agents
- Claude: Monitor Claude agents
- LangChain: Monitor LangChain agents
- Multi-Agent: Complex workflow monitoring
- OpenTelemetry Integration: Export to Jaeger, Datadog, New Relic ⭐ NEW
## Why Agent Monitor?

- ✅ Open source (not proprietary)
- ✅ Self-hosted option
- ✅ Works with ALL agent frameworks
- ✅ No vendor lock-in
- ✅ Pre-built dashboards
- ✅ Cost tracking built-in
- ✅ Real-time visualization
- ✅ Production-ready
- ✅ Agent-specific metrics
- ✅ Token usage tracking
- ✅ Cost per operation
- ✅ LLM interaction traces
- ✅ Plus: Now integrates with Datadog, New Relic, and Jaeger! ⭐ NEW
## Performance

- Overhead: < 1ms per agent call
- Storage: ~1KB per trace
- Dashboard: Real-time updates (< 100ms latency)
- Scalability: Handles 10,000+ agents
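At high request volumes, the `sample_rate` setting from the configuration section is the main knob for keeping overhead and storage down: it amounts to head sampling, i.e. deciding up front whether a trace is recorded at all. A toy sketch of the idea (plain Python, not Agent Monitor's implementation):

```python
# Toy sketch of head sampling - what a sample_rate setting boils down to.
# Not Agent Monitor's actual code.
import random

def should_record(sample_rate: float, rng: random.Random) -> bool:
    """Record roughly sample_rate of all traces, chosen at random."""
    return rng.random() < sample_rate

rng = random.Random(42)  # seeded so the example is reproducible
recorded = sum(should_record(0.1, rng) for _ in range(10_000))
print(f"recorded {recorded} of 10000 traces")  # roughly 1000
```

The trade-off is the usual one: sampled metrics like error rate stay statistically accurate, but individual rare failures may not have a recorded trace.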
## Security & Privacy

- 🔐 End-to-end encryption
- 🔒 No LLM API keys stored
- 🛡️ SOC 2 Type II compliant (cloud version)
- ✅ GDPR compliant
- 📝 Audit logs
## Community

Join our community to ask questions, share ideas, and connect with other developers monitoring and debugging AI agents!
- GitHub Discussions - Ask questions, share your work, and discuss best practices
- GitHub Issues - Bug reports and feature requests
- Email: dev@cogniolab.com
We're building a supportive community where developers help each other make AI agents observable and debuggable. Whether you're just getting started with observability or managing production systems, your questions and contributions are welcome!
## License

MIT License - see LICENSE.
Built by Cognio AI Lab to make AI agents observable and debuggable.
Related projects:
Ready to see what your agents are actually doing? Get Started →
⭐ Star this repo if you find it useful!