Lightweight & fast AI inference proxy for self-hosted LLMs backends like Ollama, LM Studio and others. Designed for speed, simplicity and local-first deployments.
A self-hosted, open-source (Apache 2.0) proxy for LLMs with Prometheus metrics.
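To illustrate the Prometheus side, here is a hedged sketch using the standard client_golang library: a counter tracks proxied requests and a /metrics endpoint exposes them for scraping. The metric name and "model" label are assumptions for illustration, not this project's actual schema.

```go
package main

import (
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Counts proxied requests; metric name and label are illustrative assumptions.
var requestsTotal = promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "llm_proxy_requests_total",
		Help: "Number of LLM requests handled by the proxy.",
	},
	[]string{"model"},
)

func main() {
	http.HandleFunc("/v1/chat/completions", func(w http.ResponseWriter, r *http.Request) {
		requestsTotal.WithLabelValues("unknown").Inc()
		// ... forwarding to the upstream LLM backend would happen here ...
		http.Error(w, "upstream not configured in this sketch", http.StatusBadGateway)
	})
	// Expose the standard Prometheus scrape endpoint.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```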
AI Proxy Server - A high-performance, secure, unified API gateway for multiple LLM providers (OpenAI, Gemini, Groq, OpenRouter, Cloudflare) with intelligent routing, rate limiting, and streaming support. Features a modular architecture, enhanced security, and optimized performance.
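For a sense of how a gateway combines routing with rate limiting, the sketch below routes by a hypothetical path prefix per provider and applies a single global token bucket via golang.org/x/time/rate. Real gateways typically route on model name or API key and track limits per client; the prefixes, limits, and port here are all assumptions.

```go
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"

	"golang.org/x/time/rate"
)

// mustProxy builds a reverse proxy for one upstream provider.
func mustProxy(raw string) *httputil.ReverseProxy {
	u, err := url.Parse(raw)
	if err != nil {
		log.Fatal(err)
	}
	p := httputil.NewSingleHostReverseProxy(u)
	// Rewrite the Host header so the upstream provider accepts the request.
	d := p.Director
	p.Director = func(r *http.Request) {
		d(r)
		r.Host = u.Host
	}
	return p
}

func main() {
	// Hypothetical path-prefix routing table.
	providers := map[string]*httputil.ReverseProxy{
		"/openai/": mustProxy("https://api.openai.com"),
		"/groq/":   mustProxy("https://api.groq.com"),
	}

	// Global token bucket: 5 requests/second with bursts of up to 10.
	limiter := rate.NewLimiter(5, 10)

	mux := http.NewServeMux()
	for prefix, proxy := range providers {
		prefix, proxy := prefix, proxy // capture loop variables (pre-Go 1.22)
		mux.HandleFunc(prefix, func(w http.ResponseWriter, r *http.Request) {
			if !limiter.Allow() {
				http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
				return
			}
			// Strip the provider prefix but keep the leading "/".
			r.URL.Path = r.URL.Path[len(prefix)-1:]
			proxy.ServeHTTP(w, r)
		})
	}
	log.Fatal(http.ListenAndServe(":8080", mux))
}
```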