IntelliRouter is a programmable LLM gateway that provides an OpenAI-compatible API endpoint for chat completions, supporting both streaming and non-streaming responses. It's designed to be highly extensible and configurable, with support for various deployment scenarios.
- Programmable Routing: Route requests to different LLM backends based on customizable strategies
- Extensibility: Plugin system for custom routing strategies, model connectors, and telemetry exporters
- Multi-Role Deployment: Support for deploying as separate services with secure IPC
- Client SDKs: Python, TypeScript, and Rust libraries for easy integration
- Deployment Options: Configurations for various environments from edge to cloud
To build and run IntelliRouter, you will need:

- Rust 1.70 or later
- Docker (for containerized deployment)
- Kubernetes (for production deployment)
To build from source:

1. Clone the repository:

   ```bash
   git clone https://github.com/yourusername/intellirouter.git
   cd intellirouter
   ```

2. Build the project:

   ```bash
   cargo build --release
   ```

3. Run the router:

   ```bash
   ./target/release/intellirouter --role router
   ```
To run with Docker instead:

1. Build the Docker image:

   ```bash
   docker build -t intellirouter .
   ```

2. Run the container:

   ```bash
   docker run -p 8000:8000 -e INTELLIROUTER_ROLE=router intellirouter
   ```
To start all services with Docker Compose:

```bash
docker-compose up -d
```
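Once the gateway is up, you can send a quick smoke-test request to the OpenAI-compatible endpoint (port 8000, as mapped above). The model name here is only an example; use one that your deployment actually routes:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```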
IntelliRouter can be configured using a TOML file. Create a `config.toml` file in the `config` directory:
```toml
[server]
host = "0.0.0.0"
port = 8000

[logging]
level = "info"

[redis]
host = "localhost"
port = 6379

[chromadb]
host = "localhost"
port = 8001
```
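For reference, this layout maps naturally onto typed settings. The sketch below is illustrative only: the struct names are hypothetical, not IntelliRouter's actual configuration types, and it simply shows how such a file deserializes with `serde` and the `toml` crate:

```rust
use serde::Deserialize;

// Hypothetical structs mirroring the config.toml above;
// IntelliRouter's real configuration types may differ.
#[derive(Debug, Deserialize)]
struct Config {
    server: Endpoint,
    logging: Logging,
    redis: Endpoint,
    chromadb: Endpoint,
}

#[derive(Debug, Deserialize)]
struct Endpoint {
    host: String,
    port: u16,
}

#[derive(Debug, Deserialize)]
struct Logging {
    level: String,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let raw = std::fs::read_to_string("config/config.toml")?;
    let config: Config = toml::from_str(&raw)?;
    println!("serving on {}:{}", config.server.host, config.server.port);
    Ok(())
}
```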
For local development, you can use Docker Compose:
```bash
docker-compose -f docker-compose.dev.yml up -d
```
For edge deployment, use the edge-specific Docker Compose file:
```bash
cd deployment/edge
docker-compose up -d
```
For Kubernetes deployment, use Helm:
```bash
# MicroK8s
cd deployment/microk8s
helm install intellirouter ../../helm/intellirouter -f values.yaml

# EKS
cd deployment/eks
helm install intellirouter ../../helm/intellirouter -f values.yaml

# GKE
cd deployment/gke
helm install intellirouter ../../helm/intellirouter -f values.yaml
```
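After installation, you can verify the release with standard Helm and kubectl commands (the label selector below follows a common chart convention and is an assumption, not something IntelliRouter's chart is guaranteed to set):

```bash
helm status intellirouter
kubectl get pods -l app.kubernetes.io/name=intellirouter
```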
IntelliRouter consists of several modules:
- LLM Proxy: OpenAI-compatible API endpoint
- Model Registry: Tracks available LLM backends
- Router Core: Routes requests to appropriate model backends
- Persona Layer: Injects system prompts and guardrails
- Chain Engine: Orchestrates multi-step inference flows
- Memory: Provides short-term and long-term memory capabilities
- RAG Manager: Manages Retrieval Augmented Generation
- Authentication: Handles API key validation and RBAC
- Telemetry: Collects logs, costs, and usage metrics
- Plugin SDK: Provides extensibility for custom components
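To give a feel for the extensibility model, here is a sketch of a custom routing strategy. The `RoutingStrategy` trait, its signature, and the supporting types are all invented for illustration; consult the Plugin SDK for the real interfaces:

```rust
// Hypothetical types standing in for the Plugin SDK's real interfaces.
struct ChatRequest {
    model: String,
}

struct Backend {
    name: String,
    healthy: bool,
}

/// A routing strategy picks a backend for each incoming request.
trait RoutingStrategy {
    fn select<'a>(&self, request: &ChatRequest, backends: &'a [Backend]) -> Option<&'a Backend>;
}

/// Prefer a healthy backend whose name matches the requested model,
/// falling back to any healthy backend.
struct PreferMatchingModel;

impl RoutingStrategy for PreferMatchingModel {
    fn select<'a>(&self, request: &ChatRequest, backends: &'a [Backend]) -> Option<&'a Backend> {
        backends
            .iter()
            .find(|b| b.healthy && b.name == request.model)
            .or_else(|| backends.iter().find(|b| b.healthy))
    }
}
```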
IntelliRouter provides client SDKs for Python, TypeScript, and Rust for easy integration.
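The SDK APIs are not shown here, but because the gateway speaks the OpenAI protocol, any HTTP client works. This Rust sketch calls the endpoint directly with `reqwest` (blocking and json features enabled) and `serde_json`, assuming the default port from the configuration above:

```rust
use serde_json::{json, Value};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // An SDK would wrap this in a typed client; here we hit the
    // OpenAI-compatible endpoint directly.
    let body = json!({
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"}
        ]
    });

    let response: Value = reqwest::blocking::Client::new()
        .post("http://localhost:8000/v1/chat/completions")
        .json(&body)
        .send()?
        .json()?;

    println!("{}", response["choices"][0]["message"]["content"]);
    Ok(())
}
```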
IntelliRouter provides an OpenAI-compatible HTTP API:
```http
POST /v1/chat/completions
```
Request:

```json
{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"}
  ]
}
```
Response:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-3.5-turbo",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Hello! How can I assist you today?"
    },
    "finish_reason": "stop"
  }]
}
```
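Streaming responses are also supported. Following the OpenAI convention, setting `"stream": true` in the request body should switch the response to incremental server-sent chunks; treat the exact wire format as an assumption and verify against the responses you receive:

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
```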
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.