This document describes how to configure and use the Google Vertex AI provider in Cortex to access Anthropic Claude models through Google Cloud Platform infrastructure.
The provider routes requests for Anthropic Claude models through Google Cloud Platform's Vertex AI service, which provides:
- Google Cloud IAM Integration: Use GCP authentication instead of Anthropic API keys
- Regional Deployment: Access models from different GCP regions
- Enterprise SLAs: Benefit from Google Cloud's enterprise-grade service agreements
- GCP Billing: Unified billing through your Google Cloud account
- OAuth Token Management: Automatic token refresh for service accounts and ADC

To use the provider, you need:

- GCP Project: A Google Cloud Platform project with Vertex AI API enabled
- IAM Permissions: Role `roles/aiplatform.user` or equivalent permissions
- Authentication: One of the following:
  - Bearer token (OAuth2 access token)
  - Service account JSON key file
  - Application Default Credentials (ADC)

Add the Vertex AI provider to your `cortex.yaml`:

    providers:
      vertex-claude:
        type: anthropic-vertex
        description: "Claude via Google Vertex AI"
        supports_streaming: true
        supports_tool_calling: true
        provider_config:
          project_id: "your-gcp-project-id"
          region: "us-east5"
          auth_type: "bearer_token"
          bearer_token: "${GCP_BEARER_TOKEN}"

| Field | Required | Default | Description |
|---|---|---|---|
| `project_id` | Yes | - | Your GCP project ID |
| `region` | No | `us-east5` | GCP region for Vertex AI |
| `auth_type` | No | `adc` | Authentication method |
| `bearer_token` | No | - | OAuth2 bearer token (required if `auth_type=bearer_token`) |
| `service_account_file` | No | - | Path to service account JSON key (required if `auth_type=service_account`) |
| `service_account_json` | No | - | Raw service account JSON (alternative to file path) |

Supported regions:

- `us-east5` (default)
- `us-central1`
- `europe-west1`
- `asia-southeast1`
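The field rules and region list above can be sketched as a small validation step. This is only an illustration, not Cortex's own code: the function name `normalize_provider_config` is hypothetical, and rejecting regions outside the listed set is an assumption about how strictly the provider checks them.

```python
# Hypothetical helper: applies the documented defaults and requirements
# to a provider_config mapping. Not part of Cortex.
SUPPORTED_REGIONS = {"us-east5", "us-central1", "europe-west1", "asia-southeast1"}
AUTH_TYPES = {"bearer_token", "service_account", "adc"}


def normalize_provider_config(cfg: dict) -> dict:
    """Return a copy of cfg with defaults filled in, or raise ValueError."""
    out = dict(cfg)
    out.setdefault("region", "us-east5")  # documented default region
    out.setdefault("auth_type", "adc")    # documented default auth method

    if not out.get("project_id"):
        raise ValueError("project_id is required")
    # Assumption: regions outside the documented list are rejected.
    if out["region"] not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {out['region']}")
    if out["auth_type"] not in AUTH_TYPES:
        raise ValueError(f"unknown auth_type: {out['auth_type']}")
    if out["auth_type"] == "bearer_token" and not out.get("bearer_token"):
        raise ValueError("bearer_token is required when auth_type is bearer_token")
    if out["auth_type"] == "service_account" and not (
        out.get("service_account_file") or out.get("service_account_json")
    ):
        raise ValueError(
            "service_account_file or service_account_json is required "
            "when auth_type is service_account"
        )
    return out
```

Calling `normalize_provider_config({"project_id": "my-gcp-project"})` would fill in `us-east5` and `adc`, matching the defaults in the table.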

Use a GCP OAuth2 access token (typically obtained via `gcloud auth print-access-token`):

    providers:
      vertex-claude:
        type: anthropic-vertex
        provider_config:
          project_id: "my-gcp-project"
          region: "us-east5"
          auth_type: "bearer_token"
          bearer_token: "${GCP_BEARER_TOKEN}"

Set the environment variable:

    export GCP_BEARER_TOKEN=$(gcloud auth print-access-token)

Note: Bearer tokens expire after ~1 hour. For production, use service accounts or ADC.

Use a service account JSON key file:

    providers:
      vertex-claude:
        type: anthropic-vertex
        provider_config:
          project_id: "my-gcp-project"
          region: "us-east5"
          auth_type: "service_account"
          service_account_file: "/path/to/service-account-key.json"

The provider will automatically refresh tokens as needed.
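To make "refresh tokens as needed" concrete, here is a minimal sketch of a cache that refetches an access token shortly before it expires. It does not describe how Cortex implements refresh: the class name `TokenCache`, the `fetch` callback, and the 60-second safety margin are all assumptions chosen for illustration.

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Hypothetical cache: returns a stored token until it is close to expiry."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        skew_seconds: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        # fetch() must return (token, absolute expiry time in epoch seconds).
        self._fetch = fetch
        self._skew = skew_seconds  # refresh this many seconds before expiry
        self._clock = clock        # injectable so the logic can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return a valid token, fetching a new one when needed."""
        if self._token is None or self._clock() >= self._expires_at - self._skew:
            self._token, self._expires_at = self._fetch()
        return self._token
```

With a token lifetime of about one hour, a cache like this would call `fetch` roughly once per hour rather than once per request.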

Use the default GCP credentials configured on your system:

    providers:
      vertex-claude:
        type: anthropic-vertex
        provider_config:
          project_id: "my-gcp-project"
          region: "us-east5"
          auth_type: "adc"

ADC will use credentials from:

- `GOOGLE_APPLICATION_CREDENTIALS` environment variable
- gcloud CLI configuration
- GCE metadata server (when running on Google Cloud)
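The three sources above are checked in that order. The sketch below only reports which source would apply and loads no credentials. The function name `describe_adc_source` is hypothetical, and the default file path is the gcloud well-known location on Linux and macOS (Windows uses a different directory).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def describe_adc_source(
    env: Mapping[str, str] = os.environ,
    gcloud_adc_file: Optional[Path] = None,
) -> str:
    """Report which ADC source would be used, following the listed order."""
    # 1. Explicit key file named by the environment variable.
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return "GOOGLE_APPLICATION_CREDENTIALS"

    # 2. Credentials written by `gcloud auth application-default login`.
    if gcloud_adc_file is None:
        gcloud_adc_file = (
            Path.home() / ".config" / "gcloud" / "application_default_credentials.json"
        )
    if gcloud_adc_file.is_file():
        return "gcloud"

    # 3. Otherwise fall back to the metadata server (only reachable on Google Cloud).
    return "metadata-server"
```

In real code, the `google-auth` library's `google.auth.default()` performs this resolution and returns usable credentials.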

The Vertex AI provider automatically transforms Anthropic model IDs to Vertex AI format:

| Anthropic Model ID | Vertex AI Model ID |
|---|---|
| `claude-3-5-sonnet-20241022` | `claude-3-5-sonnet-v2@20241022` |
| `claude-3-opus-20240229` | `claude-3-opus@20240229` |
| `claude-3-sonnet-20240229` | `claude-3-sonnet@20240229` |
| `claude-3-haiku-20240307` | `claude-3-haiku@20240307` |

You can use Anthropic model IDs in your requests; the provider handles the transformation automatically.
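The mapping in the table can be sketched as a lookup. This is an illustration rather than the provider's actual code: the name `to_vertex_model_id` is hypothetical, and the fallback for IDs not in the table (swapping a trailing `-YYYYMMDD` for `@YYYYMMDD`) is an assumption extrapolated from the table rows.

```python
import re

# Explicit entries copied from the table above.
VERTEX_MODEL_IDS = {
    "claude-3-5-sonnet-20241022": "claude-3-5-sonnet-v2@20241022",
    "claude-3-opus-20240229": "claude-3-opus@20240229",
    "claude-3-sonnet-20240229": "claude-3-sonnet@20240229",
    "claude-3-haiku-20240307": "claude-3-haiku@20240307",
}

# Matches a trailing "-YYYYMMDD" date suffix.
_DATE_SUFFIX = re.compile(r"-(\d{8})$")


def to_vertex_model_id(model_id: str) -> str:
    """Translate an Anthropic model ID into its Vertex AI form."""
    if model_id in VERTEX_MODEL_IDS:
        return VERTEX_MODEL_IDS[model_id]
    # Assumed fallback for unlisted IDs: "-YYYYMMDD" becomes "@YYYYMMDD".
    # IDs without such a suffix are returned unchanged.
    return _DATE_SUFFIX.sub(r"@\1", model_id)
```

The explicit table matters because not every ID follows the simple pattern: `claude-3-5-sonnet-20241022` gains a `-v2` segment that a suffix rewrite alone would miss.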

Send a request through the chat completions endpoint:

    curl -X POST http://localhost:8090/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer your-cortex-api-key" \
      -d '{
        "model": "claude-3-5-sonnet-20241022",
        "messages": [
          {
            "role": "user",
            "content": "Hello from Vertex AI!"
          }
        ],
        "max_tokens": 1024
      }'

Or through the Anthropic messages endpoint:

    curl -X POST http://localhost:8090/anthropic/v1/messages \
      -H "Content-Type: application/json" \
      -H "x-api-key: your-cortex-api-key" \
      -d '{
        "model": "claude-3-opus-20240229",
        "messages": [
          {
            "role": "user",
            "content": "Hello from Vertex AI!"
          }
        ],
        "max_tokens": 1024
      }'

Provider configuration is automatically stored in the database when migrating from YAML. The configuration is stored in the `providers` table with:

- `type`: `anthropic-vertex`
- `provider_config_json`: Contains the Vertex AI configuration (project_id, region, auth settings)
- `bearer_token` is encrypted in the database for security

You can use environment variable expansion in your configuration:

    providers:
      vertex-claude:
        type: anthropic-vertex
        provider_config:
          project_id: "${GCP_PROJECT_ID}"
          region: "${GCP_REGION}"
          auth_type: "bearer_token"
          bearer_token: "${GCP_BEARER_TOKEN}"

Ensure your configuration includes `project_id` in `provider_config`:

    provider_config:
      project_id: "your-gcp-project-id"

Check that:

- Your `project_id` is correct
- Your `region` is one of the supported regions
- Your authentication configuration is valid

Verify your authentication:

- For bearer tokens: Check that the token is valid and not expired
- For service accounts: Verify the JSON key file path and permissions
- For ADC: Ensure `gcloud auth application-default login` has been run

Some models may not be available in all regions. Try switching to a different region:

    provider_config:
      region: "us-east5"  # Try us-east5 for the widest model support

- Token Refresh: Service account and ADC authentication automatically refresh tokens
- Regional Latency: Choose a region close to your users for lowest latency
- Streaming: Streaming is fully supported and recommended for interactive applications
- Use Service Accounts: Prefer service accounts over bearer tokens for production
- Rotate Keys: Regularly rotate service account keys
- Least Privilege: Grant only `roles/aiplatform.user` or minimal required permissions
- Encrypt Sensitive Data: Bearer tokens are automatically encrypted in the database
- Environment Variables: Use environment variables for sensitive configuration