🚀 Multi-LLM Agent on Vertex AI (Gemini + OpenAI + Claude + Grok)


A production-ready multi-LLM agent using Vertex AI Agent Builder (ADK) with Gemini as the orchestrator and support for OpenAI, Claude, and Grok, including real-time token + cost tracking.


🔥 Why this project?

Most examples show single-model agents.

This project demonstrates a real-world multi-model architecture:

  • 🧠 Use Gemini (Vertex AI) as the default (fast + cost-efficient)
  • 🔌 Call OpenAI / Claude / Grok only when needed
  • 💰 Track token usage + cost per call
  • ☁️ Deploy to Vertex AI Agent Engine
  • 🖥️ Develop locally with ADK Dev UI

👉 This is a practical production pattern for modern AI systems.


🧠 Architecture

User
 ↓
Vertex Agent Engine
 ↓
Gemini (primary orchestrator)
 ↓
Tool routing layer
 ├── OpenAI
 ├── Claude
 └── Grok
 ↓
Response + cost tracking
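
The tool routing layer above can be modeled as a dispatch table keyed by provider. Here is a minimal sketch with illustrative stubs standing in for the real API calls; the function and provider names are assumptions, not the project's actual interface:

```python
# Minimal sketch of the tool-routing layer: Gemini (the orchestrator)
# decides which provider tool to invoke; each tool wraps one external API.
# The call_* functions below are illustrative stubs, not real API calls.

def call_openai(prompt: str) -> str:
    return f"[openai] {prompt}"

def call_claude(prompt: str) -> str:
    return f"[claude] {prompt}"

def call_grok(prompt: str) -> str:
    return f"[grok] {prompt}"

# Dispatch table: provider name -> callable that handles the prompt.
ROUTES = {
    "openai": call_openai,
    "claude": call_claude,
    "grok": call_grok,
}

def route(provider: str, prompt: str) -> str:
    """Dispatch a prompt to the requested external model."""
    try:
        return ROUTES[provider](prompt)
    except KeyError:
        raise ValueError(f"Unknown provider: {provider}")

print(route("claude", "Rewrite this email"))  # → [claude] Rewrite this email
```

In the real project this dispatch happens via ADK tools registered on the Gemini agent, so the orchestrator, not the caller, decides when an external model is worth the extra cost.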

✨ Features

  • ✅ Multi-LLM orchestration (Gemini + OpenAI + Claude + Grok)
  • ✅ Tool-based routing
  • ✅ Per-call token + cost tracking
  • ✅ Local development UI (adk web)
  • ✅ Vertex Agent Engine deployment
  • ✅ Clean, extensible Python structure

💡 Example Output

Here is your rewritten email...

[Claude: 812 tokens | $0.0021]

📦 Project Structure

multi_llm/
├── __init__.py
├── agent.py            # shim → app.agent
├── app/
│   ├── agent.py
│   ├── tools.py
│   └── config.py
├── requirements.txt
└── .env

βš™οΈ Prerequisites

  • Python 3.10+
  • Google Cloud project with billing enabled
  • Vertex AI + Cloud Storage APIs enabled
  • GCS staging bucket (e.g. gs://my-gcp-bucket)
  • Authenticated locally:
    gcloud auth application-default login
  • API keys with available credits / billing enabled for:
    • OpenAI
    • Anthropic (Claude)
    • xAI (Grok)

πŸ” GCP Permissions / IAM

To deploy and run this project, the following roles are required:

For your user (deployment)

  • roles/aiplatform.user (Vertex AI access)
  • roles/storage.admin on the staging bucket (e.g. gs://ai_fnol)

If you need to create the bucket:

  • roles/storage.admin at the project level (temporary is fine)
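
The user-level roles above can be granted with gcloud. A sketch using placeholder project, bucket, and email values:

```shell
# Grant Vertex AI access to the deploying user (values are placeholders).
gcloud projects add-iam-policy-binding PROJECT_NAME \
  --member="user:YOU@example.com" \
  --role="roles/aiplatform.user"

# Grant storage admin on the staging bucket only (least privilege).
gcloud storage buckets add-iam-policy-binding gs://GCP_BUCKET \
  --member="user:YOU@example.com" \
  --role="roles/storage.admin"
```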

For runtime (Agent Engine)

By default, Vertex uses a managed service agent:

service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com

This works out of the box.

For production use, you should migrate to a custom service account with least-privilege access.


🔧 Installation

1. Create virtual environment (outside repo recommended)

python -m venv ~/multi_llm_venv
source ~/multi_llm_venv/bin/activate

2. Install dependencies

pip install -r requirements.txt

πŸ” Environment Setup

cp .env.example .env

Fill in:

GOOGLE_CLOUD_PROJECT=PROJECT_NAME
GOOGLE_CLOUD_LOCATION=GCP_REGION
GOOGLE_CLOUD_STAGING_BUCKET=gs://GCP_BUCKET

OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
XAI_API_KEY=...
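
A minimal sketch of how app/config.py might read these values; the required-key names mirror the .env keys above, but the project's actual config code may differ:

```python
import os

# Sketch of loading the .env values above. Assumes they have already been
# exported into the environment (e.g. via python-dotenv or the shell).
def load_config() -> dict:
    required = [
        "GOOGLE_CLOUD_PROJECT",
        "GOOGLE_CLOUD_LOCATION",
        "GOOGLE_CLOUD_STAGING_BUCKET",
    ]
    # Fail fast with a clear error instead of a confusing downstream failure.
    missing = [key for key in required if not os.environ.get(key)]
    if missing:
        raise RuntimeError(f"Missing required env vars: {missing}")
    return {key: os.environ[key] for key in required}
```

Failing fast here is deliberate: a missing project or bucket surfaces at startup rather than mid-deployment.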

🧪 Local Development

adk web multi_llm

Open:

http://127.0.0.1:8000/dev-ui

☁️ Deploy to Vertex AI

adk deploy agent_engine \
  --project="PROJECT_NAME" \
  --region="GCP_REGION" \
  --display_name="multi_llm" \
  --staging_bucket="gs://GCP_BUCKET" \
  multi_llm

πŸ” Test Deployed Agent

import vertexai
from vertexai import agent_engines

vertexai.init(project="PROJECT_NAME", location="GCP_REGION")

agent = agent_engines.get(
    "projects/PROJECT_NAME/locations/GCP_REGION/reasoningEngines/ENGINE_ID"
)

print(agent.query(input="Compare Gemini vs OpenAI vs Claude"))

💰 Usage & Cost Tracking

Each external model call returns:

[OpenAI: 496 tokens | $0.0042]

Also available via:

get_usage_summary()
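
The per-call footer and summary can be produced by a small tracker. A minimal sketch, assuming hypothetical per-1K-token prices; the real rates, and the project's actual implementation in app/tools.py, may differ:

```python
# Hypothetical per-1K-token rates; substitute each provider's real pricing.
PRICES_PER_1K = {"openai": 0.005, "claude": 0.003, "grok": 0.002}

class UsageTracker:
    def __init__(self):
        self.calls = []

    def record(self, provider: str, tokens: int) -> str:
        """Record one external call and return its cost footer."""
        cost = tokens / 1000 * PRICES_PER_1K[provider]
        self.calls.append((provider, tokens, cost))
        # Footer matches the README's example format: [Claude: 812 tokens | $0.0024]
        return f"[{provider.capitalize()}: {tokens} tokens | ${cost:.4f}]"

    def get_usage_summary(self) -> dict:
        """Aggregate tokens and cost across all recorded calls."""
        return {
            "calls": len(self.calls),
            "tokens": sum(t for _, t, _ in self.calls),
            "cost": round(sum(c for _, _, c in self.calls), 4),
        }

tracker = UsageTracker()
print(tracker.record("claude", 812))
print(tracker.get_usage_summary())
```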

⚠️ Notes

  • Gemini is used as the default model
  • External models are used selectively
  • Keep .venv outside the repo to avoid deployment issues

πŸ—ΊοΈ Roadmap

  • Session memory (conversation state)
  • Long-term memory (user preferences)
  • Cost-aware routing
  • RAG (retrieval)
  • Secret Manager integration
  • Frontend / API integrations
  • Migrate deployment and runtime authentication to a dedicated least-privilege service account instead of the local user IAM account

📌 Status

Active / iterating toward a production-grade system

