
Sameerlite
Collaborator

Title

Fix Vertex AI embeddings JSON serialization error and add PSC endpoint support

Relevant issues

Fixes LIT-1096

Pre-Submission checklist

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible; it only solves 1 specific problem

Type

🐛 Bug Fix
🆕 New Feature

Changes

This PR adds comprehensive support for Vertex AI Private Service Connect (PSC) endpoints, allowing users to supply custom api_base URLs for both completion and embedding requests. This enables access to privately deployed Vertex AI models through internal network endpoints.

Key Features Added

  1. PSC Endpoint URL Construction: Enhanced _check_custom_proxy() to properly construct full PSC URLs with the format:

    {api_base}/v1/projects/{project}/locations/{location}/endpoints/{model}:{endpoint}
    
  2. Numeric Model ID Support: Modified routing logic to ensure numeric endpoint IDs (common for custom deployments) properly use the HTTP-based handler that respects api_base.

  3. Comprehensive Parameter Passing: Updated all Vertex AI handlers to pass necessary parameters (vertex_project, vertex_location, vertex_api_version) for proper PSC URL construction.

  4. Bug Fix: Fixed a pre-existing JSON serialization bug in Vertex AI embeddings where non-serializable objects were being passed to TypedDict constructors.
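The URL format from item 1 can be sketched as follows. This is a hypothetical standalone helper for illustration only, not the actual `_check_custom_proxy()` implementation; the parameter names mirror the placeholders in the format string above.

```python
def build_psc_url(api_base: str, project: str, location: str,
                  model: str, endpoint: str) -> str:
    """Construct a full PSC URL in the format
    {api_base}/v1/projects/{project}/locations/{location}/endpoints/{model}:{endpoint}
    (illustrative sketch; the real logic lives in _check_custom_proxy())."""
    return (
        f"{api_base}/v1/projects/{project}/locations/{location}"
        f"/endpoints/{model}:{endpoint}"
    )

url = build_psc_url("http://10.96.32.8", "my-project-id", "us-central1",
                    "1234567890", "generateContent")
# url == "http://10.96.32.8/v1/projects/my-project-id/locations/us-central1/endpoints/1234567890:generateContent"
```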

Technical Changes

Core URL Construction (litellm/llms/vertex_ai/vertex_llm_base.py)

  • Enhanced _check_custom_proxy() to detect PSC endpoints and construct full URL paths
  • Added logic to handle both PSC endpoints and standard proxy configurations
  • Updated function signatures to accept additional Vertex AI parameters

Routing Logic (litellm/llms/vertex_ai/common_utils.py)

  • Modified get_vertex_ai_model_route() to route numeric model IDs with api_base to the HTTP-based handler
  • Ensures PSC endpoints use the correct code path that respects custom api_base
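The routing decision can be sketched like this. It is an assumption-laden simplification of `get_vertex_ai_model_route()` (the function name and bare-ID input are the only details taken from this PR description; the predicate below is illustrative):

```python
from typing import Optional

def routes_to_http_handler(model_id: str, api_base: Optional[str]) -> bool:
    """Illustrative sketch of the routing rule: a purely numeric endpoint
    ID combined with a custom api_base should take the HTTP-based handler,
    since that is the code path that respects api_base (and thus PSC URLs)."""
    return model_id.isdigit() and api_base is not None

# Numeric endpoint ID with a PSC api_base -> HTTP-based handler
assert routes_to_http_handler("1234567890", "http://10.96.32.8")
# Named model without a custom api_base -> default routing
assert not routes_to_http_handler("gemini-1.5-pro", None)
```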

Handler Updates

Updated all Vertex AI handlers to pass required parameters:

  • vertex_gemma_models/main.py
  • vertex_model_garden/main.py
  • context_caching/vertex_ai_context_caching.py
  • batches/handler.py

Bug Fix (litellm/llms/vertex_ai/vertex_embeddings/transformation.py)

  • Fixed JSON serialization issue by filtering optional_params to only include valid TypedDict fields
  • Prevents ClientSession and other non-serializable objects from being passed to JSON serialization
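The filtering approach can be sketched as follows. The TypedDict name and its fields are hypothetical stand-ins, not the actual types in transformation.py; the point is that only declared TypedDict keys survive, so non-serializable extras (such as an aiohttp ClientSession) never reach JSON serialization:

```python
from typing import TypedDict

class VertexEmbeddingParams(TypedDict, total=False):
    # Hypothetical subset of valid fields, for illustration only
    auto_truncate: bool
    task_type: str

def filter_to_typeddict_fields(optional_params: dict) -> dict:
    """Keep only keys that are declared TypedDict fields, dropping
    anything else (e.g. a ClientSession) before JSON serialization."""
    allowed = VertexEmbeddingParams.__annotations__.keys()
    return {k: v for k, v in optional_params.items() if k in allowed}

params = filter_to_typeddict_fields(
    {"task_type": "RETRIEVAL_QUERY", "aiohttp_session": object()}
)
# params == {"task_type": "RETRIEVAL_QUERY"}
```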

Usage Example

```python
import litellm

# PSC endpoint configuration
response = litellm.completion(
    model="vertex_ai/1234567890",  # Numeric endpoint ID
    messages=[{"role": "user", "content": "Hello"}],
    api_base="http://10.96.32.8",  # PSC endpoint
    vertex_project="my-project-id",
    vertex_location="us-central1",
)

# Embeddings are also supported
response = litellm.embedding(
    model="bge-small-en-v1.5",
    input=["Hello", "World"],
    api_base="http://10.96.32.8",
    vertex_project="my-project-id",
    vertex_location="us-central1",
)
```

Or specify in config.yaml:

```yaml
model_list:
  - model_name: bge-small-en-v1.5
    litellm_params:
      model: vertex_ai/1234567890
      api_base: http://10.96.32.8  # Your PSC IP
      vertex_project: my-project-id
      vertex_location: us-central1
```
[screenshot: new test passing locally]


vercel bot commented Oct 10, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Preview | Comments | Updated (UTC) |
| ------- | ---------- | ------- | -------- | ------------- |
| litellm | Ready | Preview | Comment | Oct 10, 2025 9:21am |
