Commit 75ddbc2

feat: add comprehensive monitoring examples for PraisonAI Agents
Add monitoring example files demonstrating various aspects of agent monitoring:

- Basic monitoring with simple agent monitoring and task timing metrics
- Advanced comprehensive session monitoring capabilities
- Integration examples for monitoring integrations
- Telemetry integration examples
- Documentation with usage examples and best practices

These examples provide developers with practical implementations for monitoring PraisonAI agents in production environments.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 590c128 commit 75ddbc2

6 files changed

+1991
-0
lines changed

examples/monitoring/README.md

Lines changed: 303 additions & 0 deletions
@@ -0,0 +1,303 @@
# PraisonAI Performance Monitoring Examples

This directory contains comprehensive examples demonstrating various performance monitoring approaches for PraisonAI Agents without modifying existing code.

## 📁 Directory Structure

```
monitoring/
├── basic/                    # Simple monitoring examples
│   ├── simple_agent_monitoring.py
│   └── task_timing_metrics.py
├── advanced/                 # Complex monitoring implementations
│   └── comprehensive_session_monitoring.py
├── integration/              # Integration with external systems
│   └── monitoring_integrations.py
├── telemetry/                # Telemetry-specific examples
│   └── telemetry_integration.py
└── README.md                 # This file
```

## 🚀 Quick Start

All examples are self-contained and can be run independently:

```bash
# Basic agent monitoring
python examples/monitoring/basic/simple_agent_monitoring.py

# Task timing and metrics
python examples/monitoring/basic/task_timing_metrics.py

# Advanced session monitoring
python examples/monitoring/advanced/comprehensive_session_monitoring.py

# Telemetry integration
python examples/monitoring/telemetry/telemetry_integration.py

# External integrations
python examples/monitoring/integration/monitoring_integrations.py
```

## 📊 Monitoring Options Overview

### 1. Basic Monitoring (`basic/`)

**Simple Agent Monitoring** (`simple_agent_monitoring.py`)
- ✅ Enable metrics with `track_metrics=True`
- ✅ Access metrics via `agent.last_metrics`
- ✅ Session-level aggregation with `MetricsCollector`
- ✅ Basic performance reporting

**Task Timing & Metrics** (`task_timing_metrics.py`)
- ✅ Manual performance metrics creation
- ✅ Context managers for automatic timing
- ✅ Token metrics aggregation
- ✅ Custom timing measurements
- ✅ Metrics export to files

### 2. Advanced Monitoring (`advanced/`)

**Comprehensive Session Monitoring** (`comprehensive_session_monitoring.py`)
- ✅ Multi-agent session tracking
- ✅ Real-time performance monitoring
- ✅ Automated performance alerts
- ✅ Live monitoring dashboard
- ✅ Agent performance rankings
- ✅ Comprehensive analytics and reporting
- ✅ Export capabilities (JSON + summary)

### 3. Telemetry Integration (`telemetry/`)

**Telemetry Integration** (`telemetry_integration.py`)
- ✅ Automatic telemetry tracking
- ✅ PostHog integration for analytics
- ✅ Custom event tracking
- ✅ Environment-based configuration
- ✅ Debug logging for development
- ✅ Session-level data aggregation

### 4. External Integrations (`integration/`)

**Monitoring Integrations** (`monitoring_integrations.py`)
- ✅ SQLite database logging
- ✅ Webhook notifications with cooldowns
- ✅ Real-time HTTP dashboard
- ✅ Configurable performance alerts
- ✅ API endpoints for metrics
- ✅ Multi-system integration

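The "context managers for automatic timing" item above can be illustrated with a minimal, standard-library-only sketch. The `timed` helper and the `timings` dict are hypothetical names used for illustration; they are not taken from `task_timing_metrics.py`, which may be structured differently.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even if the block raises, so failed calls are still timed.
        results[label] = time.perf_counter() - start


timings = {}
with timed("summarize", timings):
    sum(range(100_000))  # stand-in for an agent call

print(f"summarize took {timings['summarize']:.4f}s")
```

Recording the duration in `finally` means slow failures show up in the timings too, which is usually what you want when debugging latency.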
## 🔧 Configuration Options

### Environment Variables

```bash
# Disable telemetry completely
export PRAISONAI_TELEMETRY_DISABLED=true

# Enable PostHog integration
export POSTHOG_API_KEY=your_posthog_key

# Enable debug logging
export LOGLEVEL=DEBUG
```

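If your own monitoring script needs to branch on these variables, a small reader along the following lines may help. This is only a sketch: `load_monitoring_config` and the set of accepted truthy strings are assumptions made for illustration, and PraisonAI's own parsing of these variables may differ.

```python
import os

# Assumption: these strings count as "true"; the library may accept others.
TRUTHY = {"1", "true", "yes", "on"}


def load_monitoring_config(env=None):
    """Collect the monitoring-related environment variables into a dict."""
    env = os.environ if env is None else env
    disabled = env.get("PRAISONAI_TELEMETRY_DISABLED", "").strip().lower()
    return {
        "telemetry_disabled": disabled in TRUTHY,
        # Only record whether a key is present; never log the key itself.
        "posthog_enabled": bool(env.get("POSTHOG_API_KEY")),
        "log_level": env.get("LOGLEVEL", "INFO").upper(),
    }


config = load_monitoring_config(
    {"PRAISONAI_TELEMETRY_DISABLED": "true", "LOGLEVEL": "debug"}
)
print(config)
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.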
### Agent Configuration

```python
# Basic monitoring
agent = Agent(
    name="MyAgent",
    role="Data Analyst",
    track_metrics=True  # Enable monitoring
)

# Custom metrics collector
collector = MetricsCollector()
agent = Agent(
    name="MyAgent",
    role="Data Analyst",
    track_metrics=True,
    metrics_collector=collector  # Use shared collector
)
```

### Performance Thresholds

```python
# Configure custom performance alerts
performance_thresholds = {
    'max_ttft': 2.0,             # Maximum Time To First Token (seconds)
    'min_tokens_per_sec': 10.0,  # Minimum tokens per second
    'max_total_time': 30.0       # Maximum total execution time (seconds)
}
```

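A threshold dictionary like the one above only becomes useful once something compares measurements against it. The sketch below shows one way to do that; `check_thresholds` and the keys of the `sample` measurement dict (`ttft`, `tokens_per_sec`, `total_time`) are hypothetical and chosen to mirror the threshold names, not taken from the example scripts.

```python
def check_thresholds(metrics, thresholds):
    """Return a human-readable message for each violated threshold."""
    alerts = []
    if metrics["ttft"] > thresholds["max_ttft"]:
        alerts.append(
            f"TTFT {metrics['ttft']:.2f}s exceeds {thresholds['max_ttft']:.2f}s"
        )
    if metrics["tokens_per_sec"] < thresholds["min_tokens_per_sec"]:
        alerts.append(
            f"TPS {metrics['tokens_per_sec']:.1f} is below "
            f"{thresholds['min_tokens_per_sec']:.1f}"
        )
    if metrics["total_time"] > thresholds["max_total_time"]:
        alerts.append(
            f"Total time {metrics['total_time']:.2f}s exceeds "
            f"{thresholds['max_total_time']:.2f}s"
        )
    return alerts


performance_thresholds = {
    "max_ttft": 2.0,
    "min_tokens_per_sec": 10.0,
    "max_total_time": 30.0,
}
# Hypothetical measurement: slow first token, otherwise healthy.
sample = {"ttft": 2.5, "tokens_per_sec": 12.0, "total_time": 10.0}
alerts = check_thresholds(sample, performance_thresholds)
print(alerts)
```

Returning a list rather than raising keeps the check side-effect free, so the caller decides whether to log, send a webhook, or ignore the result.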
## 📈 Metrics Available

### Token Metrics
- **Input Tokens**: Tokens in the prompt/input
- **Output Tokens**: Tokens in the generated response
- **Total Tokens**: Combined input + output tokens
- **Cached Tokens**: Tokens retrieved from cache
- **Reasoning Tokens**: Tokens used for internal reasoning
- **Audio Tokens**: Tokens for audio processing (if applicable)

### Performance Metrics
- **Time To First Token (TTFT)**: Time until first token is generated
- **Total Time**: Complete execution time
- **Tokens Per Second (TPS)**: Generation speed
- **Request Count**: Number of requests processed

### Session Metrics
- **Session ID**: Unique session identifier
- **Duration**: Total session duration
- **Agent Metrics**: Per-agent token and performance statistics
- **Model Metrics**: Per-model usage statistics
- **Performance Rankings**: Agent performance comparisons

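To make the performance metrics above concrete, here is a small worked sketch that derives them from three timestamps and two token counts. `RequestTiming` is a hypothetical class, and the TPS definition used here (output tokens divided by total time) is an assumption; the library may compute it differently, for example over generation time only.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    started_at: float      # when the request was sent (seconds)
    first_token_at: float  # when the first token arrived
    finished_at: float     # when the response completed
    input_tokens: int
    output_tokens: int

    @property
    def ttft(self):
        """Time To First Token, in seconds."""
        return self.first_token_at - self.started_at

    @property
    def total_time(self):
        """Complete execution time, in seconds."""
        return self.finished_at - self.started_at

    @property
    def total_tokens(self):
        return self.input_tokens + self.output_tokens

    @property
    def tokens_per_sec(self):
        # Assumed definition: output tokens over total time.
        return self.output_tokens / self.total_time if self.total_time > 0 else 0.0


timing = RequestTiming(
    started_at=0.0, first_token_at=0.25, finished_at=2.25,
    input_tokens=60, output_tokens=90,
)
print(timing.ttft, timing.total_time, timing.total_tokens, timing.tokens_per_sec)
```

With these sample numbers the request has a TTFT of 0.25 s, a total time of 2.25 s, 150 total tokens, and 90 / 2.25 = 40 tokens per second.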
## 🎯 Use Cases & Scenarios

### 1. Development & Debugging
- **Use**: `basic/simple_agent_monitoring.py`
- **Benefits**: Quick performance insights, debug slow responses
- **Setup**: Just add `track_metrics=True` to your agents

### 2. Production Monitoring
- **Use**: `advanced/comprehensive_session_monitoring.py`
- **Benefits**: Real-time alerts, performance tracking, trend analysis
- **Setup**: Implement monitoring session wrapper

### 3. Analytics & Insights
- **Use**: `telemetry/telemetry_integration.py`
- **Benefits**: Long-term trends, usage patterns, optimization insights
- **Setup**: Configure PostHog integration

### 4. Enterprise Integration
- **Use**: `integration/monitoring_integrations.py`
- **Benefits**: Database logging, webhook alerts, custom dashboards
- **Setup**: Implement database and webhook handlers

## 🏆 Best Practices

### 1. **Choose the Right Level**
```python
# Development: Basic monitoring
agent = Agent(name="DevAgent", track_metrics=True)

# Production: Session-level monitoring
collector = MetricsCollector()
agents = [
    Agent(name="Agent1", track_metrics=True, metrics_collector=collector),
    Agent(name="Agent2", track_metrics=True, metrics_collector=collector)
]

# Enterprise: Full integration
monitoring_system = IntegratedMonitoringSystem("Production")
```

### 2. **Performance Thresholds**
```python
# Set realistic thresholds based on your use case.
# A dict cannot repeat a key (the last value silently wins),
# so pick one value per threshold.
thresholds = {
    'max_ttft': 1.0,             # Interactive: < 1s (use 3.0 for batch processing)
    'min_tokens_per_sec': 20.0,  # High-performance: > 20 TPS (use 5.0 for standard workloads)
}
```

### 3. **Export & Analysis**
```python
from datetime import datetime

# Regular exports for analysis
collector.export_metrics(f"metrics_{datetime.now().strftime('%Y%m%d')}.json")

# Database logging for long-term storage
db_logger = DatabaseLogger()
db_logger.log_agent_metrics(session_id, agent_name, task, tokens, perf)
```

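For readers who want to see what database logging for long-term storage can look like without the example classes, here is a sketch using only the standard-library `sqlite3` module. The `agent_metrics` table, its columns, and the `init_db` / `log_agent_metrics` functions are assumptions made for illustration; they are not the schema or API of the `DatabaseLogger` in `monitoring_integrations.py`.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) metrics table if it does not exist yet."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS agent_metrics (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id TEXT NOT NULL,
            agent_name TEXT NOT NULL,
            total_tokens INTEGER NOT NULL,
            ttft REAL,
            total_time REAL,
            logged_at TEXT DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def log_agent_metrics(conn, session_id, agent_name, total_tokens, ttft, total_time):
    """Insert one metrics row; `with conn` commits or rolls back the write."""
    with conn:
        conn.execute(
            "INSERT INTO agent_metrics "
            "(session_id, agent_name, total_tokens, ttft, total_time) "
            "VALUES (?, ?, ?, ?, ?)",
            (session_id, agent_name, total_tokens, ttft, total_time),
        )


conn = sqlite3.connect(":memory:")  # use a file path for persistent storage
init_db(conn)
log_agent_metrics(conn, "session-1", "Agent1", 150, 0.25, 3.3)
log_agent_metrics(conn, "session-1", "Agent2", 90, 0.40, 2.0)

rows = conn.execute(
    "SELECT agent_name, SUM(total_tokens) FROM agent_metrics "
    "GROUP BY agent_name ORDER BY agent_name"
).fetchall()
print(rows)
```

Parameterized queries (`?` placeholders) keep agent names and session IDs from being interpreted as SQL.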
### 4. **Alert Management**
```python
# Configure cooldowns to prevent alert spam
alert_config = AlertConfig(
    name="high_latency",
    condition=lambda data: data['ttft'] > 2.0,
    cooldown_seconds=300  # 5-minute cooldown
)
```

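The cooldown idea above can be sketched independently of the example code. `CooldownAlert` below is a hypothetical stand-in, not the `AlertConfig` class from the examples; it only shows how a condition plus a cooldown window suppresses repeated alerts. The `now` parameter is an assumption added so the behavior can be demonstrated with fixed timestamps.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CooldownAlert:
    name: str
    condition: Callable[[dict], bool]
    cooldown_seconds: float = 300.0
    last_fired: Optional[float] = None

    def should_fire(self, data, now=None):
        """Return True if the condition holds and the cooldown has elapsed."""
        now = time.monotonic() if now is None else now
        if not self.condition(data):
            return False
        in_cooldown = (
            self.last_fired is not None
            and now - self.last_fired < self.cooldown_seconds
        )
        if in_cooldown:
            return False
        self.last_fired = now
        return True


alert = CooldownAlert(
    name="high_latency",
    condition=lambda data: data["ttft"] > 2.0,
    cooldown_seconds=300,
)
slow = {"ttft": 2.5}
# Simulated clock readings in seconds; only the first and fourth should fire.
results = [alert.should_fire(slow, now=t) for t in (0, 100, 299, 300, 301)]
print(results)
```

Using `time.monotonic()` as the default clock avoids spurious alerts or suppressions when the system wall clock is adjusted.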
## 🔍 Troubleshooting

### Common Issues

1. **No metrics collected**
   - ✅ Ensure `track_metrics=True` is set on agents
   - ✅ Check that telemetry has not been disabled via `PRAISONAI_TELEMETRY_DISABLED`

2. **PostHog integration not working**
   - ✅ Set the `POSTHOG_API_KEY` environment variable
   - ✅ Check network connectivity
   - ✅ Enable debug logging: `LOGLEVEL=DEBUG`

3. **Database logging fails**
   - ✅ Check write permissions for the database file
   - ✅ Ensure SQLite is available
   - ✅ Verify database schema initialization

4. **Dashboard not accessible**
   - ✅ Check if port 8080 is available
   - ✅ Verify the HTTP server started successfully
   - ✅ Check firewall settings

### Debug Logging

```bash
# Enable verbose logging
export LOGLEVEL=DEBUG
python your_monitoring_script.py

# Look for telemetry debug messages:
# "Token usage tracked: 150 total tokens"
# "Performance tracked: TTFT=0.250s, TPS=45.2"
```

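For the "Dashboard not accessible" case above, one quick way to check whether port 8080 is already taken is to try binding to it. The `port_is_free` helper below is a hypothetical utility written for this README, not part of the examples, and it assumes the dashboard listens on the loopback interface.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permissions error.
            return False
        return True


status = "free" if port_is_free(8080) else "in use or not bindable"
print(f"Port 8080 is {status}")
```

This only reports the state at the moment of the check; another process can still claim the port before the dashboard starts.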
## 📚 Additional Resources

### Code Examples
- All examples include detailed comments and documentation
- Each example can be run independently
- Progressive complexity from basic to advanced

### Integration Guides
- **Database Integration**: SQLite schema and queries
- **Webhook Integration**: Payload formats and error handling
- **Dashboard Integration**: HTTP server and API endpoints
- **Alert Configuration**: Threshold setting and cooldown management

### Performance Optimization
- Monitor token usage to optimize prompts
- Track TTFT to identify bottlenecks
- Use TPS metrics to compare model performance
- Analyze cache hit ratios for efficiency gains

## 🤝 Contributing

Found an issue or want to add more monitoring examples?

1. **Report Issues**: Create GitHub issues for bugs or feature requests
2. **Add Examples**: Contribute new monitoring scenarios
3. **Improve Documentation**: Help make examples clearer
4. **Share Use Cases**: Document your monitoring implementations

---

## 💡 Key Takeaways

1. **Start Simple**: Begin with `track_metrics=True` for basic monitoring
2. **Scale Gradually**: Add session-level monitoring as needed
3. **Integrate Wisely**: Connect to your existing monitoring infrastructure
4. **Monitor Continuously**: Set up alerts and regular reporting
5. **Optimize Based on Data**: Use metrics to improve performance

Happy monitoring! 🚀📊
