litellm-cloudrun-deploy πŸš€

License: MIT

A high-performance, cost-optimized LiteLLM Proxy deployment for Google Cloud Run. This setup is designed for enterprise-grade applications requiring model routing, caching, and observability.

🌟 Key Features

  • Scalable Serverless: Deploys to Google Cloud Run with optimized 2 vCPU / 4GB RAM specs.
  • Enterprise Caching: Built-in Redis integration with TLS support for sub-second latent responses and heavy cost savings.
  • Full Observability: Pre-configured for Langfuse, Context7, and Tavily.
  • MCP & Skills Support: Ready for Model Context Protocol (MCP) servers and Anthropic-compatible /skills endpoints.
  • Secure by Default:
    • Zero hardcoded secrets (Env-var injection with Secret Manager).
    • Encrypted database storage with custom salt keys.
    • IAM-based invocation control.
  • Multi-Model Support: Gemini, Kimi, Z-AI (GLM), MiniMax, Deepinfra, NVIDIA NIM, and GitHub Models.

πŸš€ Deployment Options

Choose the deployment path that matches your needs:

Option 1: One-Click Deploy (Recommended for Testing & Evaluation)

[One-click deploy button]

Best for: Getting started quickly, testing, proof-of-concept

  • βœ… One-click deployment
  • βœ… No pre-configuration required
  • βœ… Guides you through setup wizard
  • ⚠️ Secrets stored as environment variables (see Option 2 for production)

Option 2: Production Deployment (Recommended for Production)

Guide: Production with Secret Manager

Best for: Production environments, enterprise deployments

  • βœ… Secrets stored in Google Secret Manager
  • βœ… IAM-based access control
  • βœ… Full audit trails
  • βœ… Recommended for sensitive workloads

Option 3: Manual CLI Deployment

Guide: deploy_gcloud.sh

Best for: Developers who prefer command-line control

  • βœ… Full control over deployment
  • βœ… Integrates with CI/CD pipelines
  • βœ… Custom deployment scripts

πŸš€ Quick Start

1. Local Run (Docker)

Ensure Docker is installed and create a .env file based on .env.example (a hedged example .env follows the service list below).

docker-compose up

This will start:

  • LiteLLM Proxy on http://localhost:4000
  • PostgreSQL (local) on port 5432
  • Redis (local) on port 6379

2. One-Click Deploy to Cloud Run

(Note: Ensure your Google Cloud project is active and billing is enabled)

3. CLI Deployment

We use a streamlined deployment script (deploy_gcloud.sh) for production updates.

Prerequisites:

  • Google Cloud SDK installed (gcloud).
  • Authenticated session (gcloud auth login).
  • Active project set (gcloud config set project YOUR_PROJECT_ID).

Cloud Deployment

The project includes an automated provisioning and deployment workflow using Google Cloud Secret Manager and Cloud SQL.

1. Provision Infrastructure

Run the interactive provisioning script to set up Cloud SQL, Redis, and Secrets:

./scripts/provision_gcloud.sh

This will create or verify the following (a rough gcloud-level sketch follows this list):

  • Cloud SQL Instance (Postgres)
  • Memorystore for Redis
  • Required Secrets in Secret Manager
  • Service Account and IAM roles

2. Deploy

  1. Ensure you are authenticated: gcloud auth login
  2. Set your active project: gcloud config set project YOUR_PROJECT_ID
  3. Run the deployment script:
    ./deploy_gcloud.sh
    Note: This script builds the image via Cloud Build and deploys to Cloud Run with secret references; a hedged sketch of the equivalent manual commands follows below.
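For orientation, the script's effect is roughly equivalent to the manual commands below (image name, region, and secret mappings are illustrative; deploy_gcloud.sh remains the source of truth):

# Build the container image with Cloud Build
gcloud builds submit --tag "gcr.io/YOUR_PROJECT_ID/litellm-proxy"

# Deploy to Cloud Run, exposing secrets as environment variables
gcloud run deploy litellm-proxy \
  --image="gcr.io/YOUR_PROJECT_ID/litellm-proxy" \
  --region="us-central1" \
  --cpu=2 --memory=4Gi \
  --set-secrets="LITELLM_MASTER_KEY=LITELLM_MASTER_KEY:latest,LITELLM_SALT_KEY=LITELLM_SALT_KEY:latest"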

For more details on production security, see docs/PRODUCTION-SECRETS.md.


πŸ“– Documentation

πŸ“¦ Use as GitHub Template

Click the "Use this template" button at the top of the repository to create your own copy. This allows you to customize the configuration and deployment scripts for your specific needs.

πŸ›  Configuration

The core configuration is split into three files in the config/ directory:

  β€’ config/local.yaml: Local testing with manual key injection
  β€’ config/prod.yaml: Cloud Run production, uses os.environ for secrets
  β€’ config/dev.yaml: Docker Compose local development
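As a hypothetical excerpt in the style of config/prod.yaml (the model alias and env-var names are placeholders; consult the actual file), LiteLLM's os.environ/ prefix resolves values from environment variables at runtime:

model_list:
  - model_name: gemini-flash                  # placeholder alias
    litellm_params:
      model: gemini/gemini-2.0-flash          # placeholder provider/model id
      api_key: os.environ/GEMINI_API_KEY      # resolved from the environment at runtime

general_settings:
  master_key: os.environ/LITELLM_MASTER_KEY   # injected from Secret Manager on Cloud Run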

Infrastructure Specs (Cloud Run)

  • Resources: 2 vCPU / 4GB RAM
  • Workers: 8 workers (--num_workers 8) to match uvicorn to CPU allocation
  • Database Pooling: Limit set to 20 to prevent connection exhaustion

πŸ”’ Security

  • Secrets managed via Google Secret Manager in production (see docs/PRODUCTION-SECRETS.md)
  • LITELLM_SALT_KEY used for internal database encryption
  • LITELLM_MASTER_KEY for authenticated proxy access

πŸ” Permissions

Grant access to the service using the Google Cloud SDK:

gcloud run services add-iam-policy-binding litellm-proxy \
    --member="user:NAME@DOMAIN.COM" \
    --role="roles/run.invoker" \
    --region="us-central1" \
    --project="YOUR_PROJECT_ID"
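Once a member has roles/run.invoker, access can be verified with a Google identity token; the service URL is a placeholder, and /health/liveliness is assumed here as the LiteLLM proxy's liveness endpoint:

# Call the IAM-protected service with an identity token
curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" \
  "https://YOUR-SERVICE-URL.run.app/health/liveliness"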

Built with ❀️ by Qredence.
