Lightweight LLM routing that keeps your stack sane.
> ⚠️ **Beta Software Notice**: Switch is currently in beta. Despite handling over 100 million requests per month in production, we're still refining features and may introduce breaking changes. Use in production at your own discretion.
- **OpenAI-compatible API** - Drop-in replacement for `/v1/chat/completions`
- **Named routes** - Use `smart-model` instead of `gpt-4o-really-long-model-name-v3`
- **Configurable fallbacks** - Define your own fallback chains between pools and providers
- **Circuit breakers** - Stop hitting broken providers automatically
- **Pool-based routing** - Group providers for sophisticated load balancing
- **Multi-provider support** - OpenAI, Anthropic, AWS Bedrock, Together AI, RunPod, custom APIs
- **Health monitoring** - Real-time provider health checks and metrics
- **Enterprise security** - API key authentication, rate limiting, CORS protection
When working with multiple LLM providers (OpenAI, Anthropic, Together, local GPUs), things quickly become a mess:
- Every service hardcodes model names and provider logic
- When a provider goes down, half your stack breaks
- Switching providers means code changes and redeployments
- No unified interface across different APIs
Switch sits in front of your LLMs and provides clean, named routes like `fast-model` or `smart-model`. Behind those routes, you define where requests actually go (OpenAI, fallback to Anthropic, whatever). All in a config file.
No app code changes. No redeploys. Just update the route and carry on.
```bash
# Your app calls this
curl -X POST http://localhost:3000/v1/chat/completions \
  -d '{"model": "smart-model", "messages": [...]}'

# Switch routes it to the right provider based on your config
# (fallback chains are completely configurable)
```
```bash
# 1. Clone and configure
git clone https://github.com/Switchdotnew/switch-router.git
cd switch-router
cp .env.example .env
# Edit .env with your API keys

# 2. Run with Docker
docker-compose up -d

# 3. Test it works
curl -H "x-api-key: your-api-key" http://localhost:3000/health
```
```bash
# 1. Install dependencies
bun install

# 2. Configure environment
cp .env.example .env
# Edit .env with your API keys and model configuration

# 3. Start development server
bun run dev
```
Create a `.env` file:
```bash
# Authentication
ADMIN_API_KEY=your-secret-key

# Provider credentials
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key

# Basic model configuration
MODEL_DEFINITIONS='{
  "credentialStores": {
    "openai": {"type": "simple", "source": "env", "config": {"apiKeyVar": "OPENAI_API_KEY"}},
    "anthropic": {"type": "simple", "source": "env", "config": {"apiKeyVar": "ANTHROPIC_API_KEY"}}
  },
  "pools": [{
    "id": "smart-pool",
    "name": "Smart Models",
    "providers": [
      {"name": "openai-gpt4o", "provider": "openai", "credentialsRef": "openai", "apiBase": "https://api.openai.com/v1", "modelName": "gpt-4o", "priority": 1},
      {"name": "claude-fallback", "provider": "anthropic", "credentialsRef": "anthropic", "apiBase": "https://api.anthropic.com", "modelName": "claude-3-5-sonnet-20241022", "priority": 2}
    ],
    "fallbackPoolIds": [],
    "routingStrategy": "fastest_response",
    "circuitBreaker": {"enabled": true, "failureThreshold": 3, "resetTimeout": 60000},
    "healthThresholds": {"errorRate": 20, "responseTime": 30000, "consecutiveFailures": 3, "minHealthyProviders": 1}
  }],
  "models": {
    "smart-model": {"primaryPoolId": "smart-pool"}
  }
}'
```
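Within a pool, `priority` orders providers; to chain whole pools, point `fallbackPoolIds` at another pool id. A minimal sketch reusing the field names above (the `budget-pool` id and its `gpt-4o-mini` provider are hypothetical, and the `credentialStores` block is omitted for brevity):

```bash
# Hypothetical fallback chain: smart-pool fails over to budget-pool when unhealthy
MODEL_DEFINITIONS='{
  "pools": [
    {
      "id": "smart-pool",
      "providers": [{"name": "openai-gpt4o", "provider": "openai", "credentialsRef": "openai", "apiBase": "https://api.openai.com/v1", "modelName": "gpt-4o", "priority": 1}],
      "fallbackPoolIds": ["budget-pool"],
      "routingStrategy": "fastest_response"
    },
    {
      "id": "budget-pool",
      "providers": [{"name": "gpt4o-mini", "provider": "openai", "credentialsRef": "openai", "apiBase": "https://api.openai.com/v1", "modelName": "gpt-4o-mini", "priority": 1}],
      "fallbackPoolIds": [],
      "routingStrategy": "fastest_response"
    }
  ],
  "models": {
    "smart-model": {"primaryPoolId": "smart-pool"}
  }
}'
```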
```bash
# Check health
curl http://localhost:3000/health

# List available models
curl -H "x-api-key: your-secret-key" http://localhost:3000/v1/models

# Send a chat request
curl -X POST http://localhost:3000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-secret-key" \
  -d '{"model": "smart-model", "messages": [{"role": "user", "content": "Hello!"}]}'
```
```bash
# Production-ready setup with monitoring
docker-compose -f docker-compose.yml up -d

# Minimal K8s deployment
kubectl apply -f k8s/
```
Switch also runs on any container platform:

- AWS ECS/Fargate
- Google Cloud Run
- Azure Container Instances
- Any Docker-compatible platform
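On platforms without compose support, a plain `docker run` works too. A sketch assuming the repo ships a Dockerfile (the image tag is arbitrary):

```bash
# Build the image locally
docker build -t switch-router .

# Run with your .env and expose the default port
docker run -d -p 3000:3000 --env-file .env switch-router
```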
| Provider | Status | Models | Features |
|---|---|---|---|
| vLLM | ✅ Full | Any model | Chat, Streaming, Functions, Vision |
| OpenAI | ✅ Full | GPT-4o, GPT-4, GPT-3.5 | Chat, Streaming, Functions, Vision |
| Anthropic | ✅ Full | Claude 3.5 Sonnet/Haiku | Chat, Streaming, Functions, Vision |
| AWS Bedrock | ✅ Full | 50+ Models (Claude, Llama, Nova) | Chat, Streaming, Functions |
| Together AI | ✅ Via OpenAI API | Llama, Mixtral, Code Llama | Chat, Streaming |
| RunPod | ✅ Via OpenAI API | Custom Models | Chat, Streaming |
| Google Vertex | 🚧 Coming Soon | Gemini, PaLM | - |
| Azure OpenAI | 🚧 Coming Soon | GPT Models | - |
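Streaming is listed for every live provider, and with the OpenAI-compatible endpoint the standard `stream: true` flag should work unchanged. A sketch reusing the `client` configured in the earlier TypeScript example:

```typescript
// Stream tokens as they arrive from whichever provider Switch routes to
const stream = await client.chat.completions.create({
  model: 'smart-model',
  messages: [{ role: 'user', content: 'Write a haiku about routers.' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? '');
}
```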
- Getting Started Guide - Detailed setup instructions
- Configuration Guide - Complete configuration reference
- Provider Setup - Provider-specific setup instructions
- Troubleshooting - Common issues and solutions
This project is licensed under the Sustainable Use License - see the LICENSE file for details.
Switch thrives on community contributions! Whether you're fixing bugs, adding providers, or improving docs - we'd love your help.
- Found a bug or have an idea? Open a GitHub issue
- Want to add a new provider? Check out our provider guide
- Improving documentation? All docs live in the `/docs` folder
- Need help getting started? Reach out at support@switch.new
- Documentation: docs/
- Issues: GitHub Issues
- Email: support@switch.new