Track, analyze, and optimize your LLM spending. A unified dashboard for monitoring AI API costs across OpenAI, Anthropic, Google, and more.
You're building with AI. Costs add up fast:
- Multiple providers (OpenAI, Anthropic, Cohere...)
- Multiple models (GPT-4, Claude, Gemini...)
- Multiple apps and environments
- No unified view of spending
LLMOps Dashboard gives you visibility and control.
```
┌─────────────────────────────────────────────────────────────────┐
│ 📊 LLMOps Dashboard                               January 2026  │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Total Spend (MTD)     Requests       Avg Cost/Req              │
│  ┌──────────────┐      ┌─────────┐    ┌───────────┐             │
│  │   $847.32    │      │ 124,531 │    │  $0.0068  │             │
│  │   ↑ 12%      │      │  ↓ 8%   │    │  ↓ 18%    │             │
│  └──────────────┘      └─────────┘    └───────────┘             │
│                                                                 │
│  Cost by Provider            Cost by Model                      │
│  ┌─────────────────────────┐   ┌─────────────────────────┐      │
│  │ OpenAI    ████████ $523 │   │ GPT-4o    ████████ $412 │      │
│  │ Anthropic ████     $289 │   │ Claude-3  ████     $289 │      │
│  │ Google    █        $35  │   │ GPT-3.5   ██       $98  │      │
│  └─────────────────────────┘   └─────────────────────────┘      │
│                                                                 │
│  💡 Recommendations:                                            │
│  • Switch "summarize" feature from GPT-4 to GPT-3.5 (-$127/mo)  │
│  • Enable prompt caching for repeated queries (-$45/mo)         │
│  • Batch similar requests to reduce overhead (-$23/mo)          │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
| Feature | Description |
|---|---|
| 📈 Multi-Provider Tracking | OpenAI, Anthropic, Google AI, Cohere, local models |
| 💰 Cost Analysis | Real-time spend tracking with projections |
| 🔍 Request Logging | Every API call logged with metadata |
| 📊 Visual Dashboard | Web UI with charts and insights |
| 💡 Smart Recommendations | AI-powered cost optimization suggestions |
| 🚨 Budget Alerts | Get notified before you overspend |
| 🏷️ Tagging & Attribution | Track costs by feature, team, or environment |
| 📤 Export & Reports | CSV, JSON, PDF reports for finance |
```bash
git clone https://github.com/edwiniac/llmops-dashboard.git
cd llmops-dashboard

# Install dependencies
pip install -r requirements.txt

# Initialize database
python -m llmops init

# Start dashboard
python -m llmops serve
```

Open http://localhost:8000 to view the dashboard.
Add tracking to your existing code with minimal changes:
```python
from llmops import track_openai, track_anthropic

# Wrap your OpenAI client
from openai import OpenAI

client = track_openai(OpenAI(), tags=["production", "chatbot"])

# Use normally - calls are automatically tracked
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

# Or use the Anthropic wrapper
from anthropic import Anthropic

client = track_anthropic(Anthropic(), tags=["production", "analysis"])
```

Route all LLM traffic through the proxy for automatic tracking:
```bash
# Start the proxy
python -m llmops proxy --port 8080

# Point your apps to the proxy
export OPENAI_BASE_URL=http://localhost:8080/openai/v1
export ANTHROPIC_BASE_URL=http://localhost:8080/anthropic
```

```bash
# Dashboard
llmops serve                         # Start web dashboard
llmops serve --port 3000             # Custom port

# Tracking
llmops status                        # Current month summary
llmops costs                         # Detailed cost breakdown
llmops costs --by model              # Breakdown by model
llmops costs --by tag                # Breakdown by tag
llmops costs --last 7d               # Last 7 days

# Analysis
llmops analyze                       # Get optimization recommendations
llmops report --month 2026-01       # Generate monthly report
llmops export --format csv           # Export data

# Configuration
llmops config set budget 1000        # Set monthly budget
llmops config set alert-threshold 80 # Alert at 80%
llmops providers add openai          # Add API key for provider

# Data
llmops logs                          # View recent requests
llmops logs --model gpt-4o           # Filter by model
llmops init                          # Initialize database
llmops reset                         # Clear all data
```

- Total spend (daily, weekly, monthly)
- Request volume and trends
- Cost per request over time
- Provider breakdown
- Interactive charts
- Filter by date, model, tag, provider
- Compare periods
- Drill down into specific costs
- Every API call with details
- Token counts, latency, cost
- Search and filter
- Export capability
- AI-analyzed optimization opportunities
- Estimated savings
- One-click implementation guides
- Set spending limits
- Configure alert thresholds
- Email/Slack notifications
- Automatic cutoffs (optional)
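The threshold logic behind budget alerts can be sketched in a few lines. This is an illustrative reimplementation, not the project's actual code; the function name and message format are assumptions:

```python
def check_budget(spend, monthly_budget, threshold=0.8):
    """Return an alert message when spend crosses the threshold (or the budget), else None."""
    if monthly_budget <= 0:
        return None
    used = spend / monthly_budget
    if used >= 1.0:
        return f"Budget exceeded: ${spend:.2f} of ${monthly_budget:.2f}"
    if used >= threshold:
        return f"Budget at {used:.0%}: ${spend:.2f} of ${monthly_budget:.2f}"
    return None

# e.g. check_budget(847.32, 1000) fires at the default 80% threshold
```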
```yaml
# llmops.yaml
database:
  type: sqlite            # sqlite, postgresql
  path: ~/.llmops/data.db

providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    track: true
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
    track: true
  google:
    api_key: ${GOOGLE_AI_KEY}
    track: true

budgets:
  monthly: 1000
  alert_threshold: 0.8    # Alert at 80%
  alerts:
    email: your@email.com
    slack_webhook: https://hooks.slack.com/...

tags:
  default: ["production"]

pricing:
  # Custom pricing overrides (for self-hosted models)
  my-local-model:
    input_per_1k: 0
    output_per_1k: 0
```

Built-in pricing for major providers (updated regularly):
| Provider | Model | Input (1K tokens) | Output (1K tokens) |
|---|---|---|---|
| OpenAI | gpt-4o | $0.0025 | $0.01 |
| OpenAI | gpt-4o-mini | $0.00015 | $0.0006 |
| OpenAI | gpt-3.5-turbo | $0.0005 | $0.0015 |
| Anthropic | claude-3-5-sonnet | $0.003 | $0.015 |
| Anthropic | claude-3-haiku | $0.00025 | $0.00125 |
| Google | gemini-1.5-pro | $0.00125 | $0.005 |
| Google | gemini-1.5-flash | $0.000075 | $0.0003 |
Pricing as of January 2026. Run `llmops pricing update` to refresh.
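To see how a per-request cost falls out of this table, here is a hand calculation (not the library's internals): a gpt-4o call with 1,000 input and 500 output tokens costs 1.0 × $0.0025 + 0.5 × $0.01 = $0.0075.

```python
# Per-1K-token prices (input, output) taken from the table above, in USD.
PRICING = {
    "gpt-4o": (0.0025, 0.01),
    "gpt-4o-mini": (0.00015, 0.0006),
    "claude-3-haiku": (0.00025, 0.00125),
}

def request_cost(model, input_tokens, output_tokens):
    """Cost of one request: (tokens / 1000) * per-1K price, summed over both directions."""
    in_price, out_price = PRICING[model]
    return input_tokens / 1000 * in_price + output_tokens / 1000 * out_price

# request_cost("gpt-4o", 1000, 500) -> 0.0075
```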
```
┌─────────────────────────────────────────────────────────────────┐
│                        Your Application                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Option A: SDK Wrapper            Option B: Proxy Mode         │
│   ┌─────────────────────┐          ┌─────────────────────┐      │
│   │ track_openai(client)│          │    LLMOps Proxy     │      │
│   │ track_anthropic()   │          │   localhost:8080    │      │
│   └──────────┬──────────┘          └──────────┬──────────┘      │
│              │                                │                 │
└──────────────┼────────────────────────────────┼─────────────────┘
               │                                │
               └───────────────┬────────────────┘
                               │
                               ▼
                ┌──────────────────────────────┐
                │       LLMOps Collector       │
                │  - Parse request/response    │
                │  - Calculate tokens & cost   │
                │  - Apply tags                │
                └──────────────┬───────────────┘
                               │
                               ▼
                ┌──────────────────────────────┐
                │       SQLite/Postgres        │
                │  - requests table            │
                │  - daily_summaries           │
                │  - budgets & alerts          │
                └──────────────┬───────────────┘
                               │
                               ▼
                ┌──────────────────────────────┐
                │      FastAPI Dashboard       │
                │  - REST API                  │
                │  - Web UI (React)            │
                │  - Recommendations Engine    │
                └──────────────────────────────┘
```
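A minimal sketch of the `requests` table named in the diagram, using SQLite. The column names here are assumptions inferred from the logged fields (tokens, latency, cost, tags), not the project's actual schema:

```python
import sqlite3

# In-memory database for illustration; the real one lives at ~/.llmops/data.db.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE requests (
        id            INTEGER PRIMARY KEY,
        ts            TEXT NOT NULL,      -- ISO-8601 timestamp
        provider      TEXT NOT NULL,
        model         TEXT NOT NULL,
        input_tokens  INTEGER,
        output_tokens INTEGER,
        cost_usd      REAL,
        latency_ms    REAL,
        tags          TEXT                -- comma-separated for simplicity
    )
""")
conn.execute(
    "INSERT INTO requests (ts, provider, model, input_tokens, output_tokens,"
    " cost_usd, latency_ms, tags) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    ("2026-01-15T12:00:00Z", "openai", "gpt-4o", 1000, 500, 0.0075, 820.0,
     "production,chatbot"),
)
total = conn.execute("SELECT SUM(cost_usd) FROM requests").fetchone()[0]
```

Daily summaries and budget checks then reduce to simple aggregate queries over this table.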
```
GET  /api/summary           # Current period summary
GET  /api/costs             # Cost breakdown
GET  /api/costs/daily       # Daily costs
GET  /api/costs/by-model    # Costs by model
GET  /api/costs/by-tag      # Costs by tag
GET  /api/requests          # Request log
GET  /api/requests/{id}     # Single request details
GET  /api/recommendations   # Optimization suggestions
GET  /api/budgets           # Budget status
POST /api/budgets           # Set budget
GET  /api/export            # Export data
POST /api/track             # Manual tracking endpoint
```
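For calls made outside the SDK wrappers or proxy, a request can be reported through the manual tracking endpoint. The field names in this payload are a plausible guess, not the endpoint's documented contract:

```python
import json

# Hypothetical body for POST /api/track - verify field names against the API.
payload = {
    "provider": "openai",
    "model": "gpt-4o",
    "input_tokens": 1000,
    "output_tokens": 500,
    "tags": ["production", "chatbot"],
}
body = json.dumps(payload)

# Then send it, e.g. with the requests library:
#   requests.post("http://localhost:8000/api/track", data=body,
#                 headers={"Content-Type": "application/json"})
```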
The recommendation engine analyzes your usage patterns and suggests:
- Model Downgrades — Use cheaper models where quality allows
- Prompt Caching — Cache repeated prompts to reduce costs
- Batching — Combine similar requests
- Token Optimization — Shorten prompts, use system messages efficiently
- Provider Switching — Find cheaper alternatives for specific use cases
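The shape of one such rule can be sketched as follows. This is a toy version of the model-downgrade suggestion; the threshold, price table, and function name are illustrative, not the engine's actual logic:

```python
# Hypothetical downgrade map and output prices (USD per 1K output tokens).
CHEAPER = {"gpt-4o": "gpt-4o-mini"}
PRICE_PER_1K_OUT = {"gpt-4o": 0.01, "gpt-4o-mini": 0.0006}

def suggest_downgrade(model, monthly_output_tokens, avg_output_tokens):
    """If a workload's outputs are short, suggest a cheaper model.

    Returns (cheaper_model, estimated_monthly_savings) or None.
    """
    # Short outputs are a rough proxy for "quality headroom"; real engines
    # would also look at task type, eval scores, etc.
    if model not in CHEAPER or avg_output_tokens > 300:
        return None
    cheaper = CHEAPER[model]
    saving_per_1k = PRICE_PER_1K_OUT[model] - PRICE_PER_1K_OUT[cheaper]
    return cheaper, monthly_output_tokens / 1000 * saving_per_1k
```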
Configure alerts for:
- Budget threshold reached (e.g., 80% of monthly budget)
- Unusual spending spikes
- New high-cost models in use
- Daily/weekly summaries
Delivery options:
- Slack
- Discord
- Webhook
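Slack delivery boils down to a JSON POST against the incoming-webhook URL from `llmops.yaml`. The message wording below is an assumption, not the project's actual alert format:

```python
import json
from urllib import request

def slack_payload(spend, budget):
    """Build the Slack incoming-webhook JSON body for a budget alert."""
    pct = spend / budget
    return json.dumps(
        {"text": f":warning: LLM spend at {pct:.0%} (${spend:.2f} of ${budget:.2f})"}
    )

def send_alert(webhook_url, spend, budget):
    """POST the alert to Slack; add retries/error handling in practice."""
    req = request.Request(
        webhook_url,
        data=slack_payload(spend, budget).encode(),
        headers={"Content-Type": "application/json"},
    )
    request.urlopen(req)
```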
MIT License - see LICENSE for details.
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
Stop guessing. Start tracking. Know exactly where your AI budget goes.