feat: add Vertex AI support for all providers by Kamilbenkirane · Pull Request #135 · withceleste/celeste-python

Kamilbenkirane · 2026-02-06T16:50:37Z

Part of #118

Summary

Route requests through Vertex AI when GoogleADC auth is provided
Supports Google, Anthropic, Mistral, and DeepSeek providers across text, images, and videos modalities
Adds _make_poll_request pattern for long-running operations (Veo uses fetchPredictOperation on Vertex)
Bug fixes found during Vertex testing: error handler hardening, Gemini image role fix, DeepSeek usage parser fix
Unit tests for all Vertex URL routing + integration tests per modality

Status

| Modality | Status |
|---|---|
| Text (generate + stream) | ✅ Mistral, DeepSeek, Google working. Anthropic hitting rate limits on my GCP; needs validation |
| Images | ✅ Imagen + Gemini working |
| Videos (Veo) | 🚧 Polling fixed (`fetchPredictOperation`), inline video parsing (`bytesBase64Encoded`) not yet handled |
| Audio | ✅ Cloud TTS already works, integration test not yet added |
| Embeddings | 🚧 Routing implemented, integration test not yet added |

Remaining

Veo Vertex: handle inline bytesBase64Encoded response (or pass storageUri)
Audio Vertex integration test
Embeddings Vertex integration test
Validate Anthropic on Vertex (rate limit issue on my GCP — @XinyueZ can you check?)

Test plan

uv run pytest tests/unit_tests/test_vertex_routing.py — 39 tests pass
uv run pytest tests/integration_tests/text/ -m integration — Mistral, DeepSeek, Google pass
uv run pytest tests/integration_tests/images/ -m integration — Imagen + Gemini pass
uv run pytest tests/integration_tests/videos/test_generate.py::test_vertex_generate -m integration
uv run pytest tests/integration_tests/ -m integration — full suite

🤖 Generated with Claude Code

Route requests through Vertex AI when GoogleADC auth is provided. Supports Google, Anthropic, Mistral, and DeepSeek providers across text, images, and videos modalities. Includes Veo polling fix (fetchPredictOperation), error handler hardening, Gemini image role fix, and DeepSeek usage parser fix. WIP: Veo Vertex inline video (bytesBase64Encoded) parsing not yet handled — needs base64 decoding or storageUri in request. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Kamilbenkirane · 2026-02-06T16:52:19Z

@XinyueZ Here's the branch with Vertex AI support.

Works for Mistral, DeepSeek, Google across text + images. Still hitting rate limits for Claude on my GCP — please check if it works on yours.

Videos (Veo) polling is fixed but inline response parsing still WIP. Audio already works via Cloud TTS. Embeddings routing is implemented but not yet integration-tested.

Should be finished Monday.

claude · 2026-02-06T16:53:02Z

Code Review: PR #135 - Vertex AI Support for All Providers

Summary

This PR adds comprehensive Vertex AI support, allowing requests to be routed through Google Cloud's Vertex AI when GoogleADC auth is provided. The implementation is well-architected and follows the existing patterns in the codebase.

✅ Strengths

1. Excellent Test Coverage

557 lines of comprehensive unit tests in test_vertex_routing.py
Tests cover all providers (Google, Anthropic, Mistral, DeepSeek)
Tests verify URL routing, error handling, and edge cases (no project_id, different locations)
Integration tests for text, images, and videos modalities
Good use of mocks to avoid real API calls in unit tests

2. Consistent Pattern Implementation

The _build_url() and _get_vertex_endpoint() pattern is consistently applied across all providers
Clean separation between API key auth (provider's native endpoints) and GoogleADC auth (Vertex endpoints)
Follows the template pattern shown in templates/providers/ with the _make_poll_request addition
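
As an illustration of the auth-based split described above, here is a minimal sketch. The `ApiKeyAuth` and `GoogleADC` classes, `NATIVE_BASE_URL`, and `build_url` are hypothetical stand-ins written for this example; they are not the library's actual classes or signatures. The regional `aiplatform.googleapis.com` host follows Vertex AI's public REST URL format.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class ApiKeyAuth:
    """Hypothetical stand-in for API-key auth against a provider's native API."""

    api_key: str


@dataclass(frozen=True)
class GoogleADC:
    """Hypothetical stand-in for the GoogleADC auth object mentioned in the PR."""

    project_id: Optional[str] = None
    location: str = "us-central1"


# Placeholder native endpoint, used only for this illustration.
NATIVE_BASE_URL = "https://api.example.com/v1"


def build_url(auth: Union[ApiKeyAuth, GoogleADC], native_path: str, vertex_path: str) -> str:
    """Return the Vertex URL for GoogleADC auth, otherwise the native URL."""
    if isinstance(auth, GoogleADC):
        # Vertex URLs are project-scoped, so a missing project is an error.
        if not auth.project_id:
            raise ValueError("project_id is required for Vertex AI")
        base = f"https://{auth.location}-aiplatform.googleapis.com/v1"
        scope = f"/projects/{auth.project_id}/locations/{auth.location}"
        return f"{base}{scope}{vertex_path}"
    return f"{NATIVE_BASE_URL}{native_path}"
```

Keeping the `isinstance` check in one place means each provider only has to supply its two paths, which is roughly the consolidation a later commit in this PR describes for `GoogleADC.build_url()`.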

3. Good Error Handling

Proper validation: raises ValueError when project_id is missing for Vertex AI
Error handler hardening mentioned in PR description (lines 295-307 in client.py)
Comprehensive exception handling in _handle_error_response with fallback to response text
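
The fallback behaviour mentioned above can be sketched roughly as follows. This is not the code in `_handle_error_response`; `extract_error_message` and the `{"error": {"message": ...}}` body shape are assumptions made for the example. Truncating the raw body is one way to limit how much of an unexpected response ends up in logs.

```python
import json


def extract_error_message(status_code: int, body: str, max_len: int = 200) -> str:
    """Build a readable error string, preferring a structured JSON message."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        payload = None  # Not JSON at all (HTML error page, empty body, ...)

    message = None
    if isinstance(payload, dict):
        error = payload.get("error")
        if isinstance(error, dict):
            message = error.get("message")
        elif isinstance(error, str):
            message = error

    # Fall back to the (truncated) raw text when no usable message was found.
    if not isinstance(message, str) or not message:
        message = body.strip()[:max_len] or "no response body"
    return f"HTTP {status_code}: {message}"
```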

4. Documentation and Code Quality

Clear docstrings explaining auth-based endpoint selection
Well-commented code with inline explanations
Type hints throughout
Clear commit message with co-authorship

🔍 Issues & Concerns

Medium: Security - Hardcoded Secrets in Tests

Location: tests/unit_tests/test_vertex_routing.py:29, 44, 57, 104

mock_creds.token = "fake-token"  # nosec B105

While # nosec B105 suppresses Bandit warnings, these are still hardcoded string literals. Consider using constants or fixtures:

TEST_FAKE_TOKEN = "fake-token"  # nosec B105
mock_creds.token = TEST_FAKE_TOKEN

Priority: Medium (test code only, but sets a precedent)

Bug: Gemini Image Role Fix Not Visible

Issue: The PR description mentions "Gemini image role fix" but the only change in src/celeste/modalities/images/providers/google/gemini.py is:

- return {
+ return {  # Fixed role placement

Question: What was the actual bug? The change appears cosmetic. If there was a role-related fix, it's not evident in the diff.

Action Required: Clarify what the "Gemini image role fix" addressed.

Bug: DeepSeek Usage Parser - Where's the Fix?

Issue: PR description mentions "DeepSeek usage parser fix" but src/celeste/providers/deepseek/chat/client.py shows no changes to usage parsing logic. The map_usage_fields method correctly handles nested prompt_tokens_details and completion_tokens_details.

Question: What was broken in the DeepSeek usage parser?

Action Required: Clarify or add a test case demonstrating the fix.

Incomplete: Veo Inline Video Parsing

Location: src/celeste/providers/google/veo/client.py:171-191

The PR description explicitly marks this as WIP:

Veo Vertex: handle inline bytesBase64Encoded response (or pass storageUri)

Current implementation only handles:

Vertex: response.videos[0]
AI Studio: response.generateVideoResponse.generatedSamples[0].video

Missing: Base64-encoded inline video handling for Vertex responses

Recommendation: Either:

Add a TODO comment in the code at lines 178-183
Open a follow-up issue to track this
Add a test that's marked as @pytest.mark.skip with reason "Inline video parsing not yet implemented"
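
To make the missing case concrete, here is a minimal sketch of parsing a Vertex response that may carry either a storage URI or inline base64 data. `ParsedVideo` and `parse_vertex_video` are hypothetical names. The `videos[0]` and `bytesBase64Encoded` fields come from this PR; the URI key is less certain (a later commit message in this PR reports `videoGcsUri` rather than `uri`/`gcsUri`), so the sketch tries all three.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedVideo:
    """Hypothetical result type: exactly one of url / base64_data is set."""

    url: Optional[str] = None
    base64_data: Optional[str] = None


# Candidate keys for the storage URI; which one Vertex returns is an assumption.
URI_KEYS = ("videoGcsUri", "gcsUri", "uri")


def parse_vertex_video(response: dict) -> ParsedVideo:
    """Extract the first video from a Vertex-style response dict."""
    videos = response.get("videos") or []
    if not videos:
        raise ValueError("response contains no videos")
    video = videos[0]

    # Inline payload: returned when no storage destination was requested.
    inline = video.get("bytesBase64Encoded")
    if inline:
        return ParsedVideo(base64_data=inline)

    for key in URI_KEYS:
        if video.get(key):
            return ParsedVideo(url=video[key])
    raise ValueError("video has neither inline data nor a storage URI")
```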

Performance: Veo Polling Interval

Location: src/celeste/providers/google/veo/config.py:23

POLL_INTERVAL = 10  # seconds

Concern: 10-second intervals may be too aggressive for long-running video generation operations (Veo can take 1-3 minutes). This could lead to excessive API calls.

Recommendation: Consider exponential backoff or increase to 15-20 seconds initial interval.
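
A minimal sketch of the exponential backoff idea, assuming a synchronous `check` callable that returns `None` until the operation finishes. `backoff_delays` and `poll_until_done` are hypothetical helpers, not part of the codebase, and the growth factor, cap, and timeout are illustrative values only (the 10-second start mirrors `POLL_INTERVAL`).

```python
import time
from typing import Callable, Iterator, Optional, TypeVar

T = TypeVar("T")


def backoff_delays(initial: float = 10.0, factor: float = 1.5, maximum: float = 60.0) -> Iterator[float]:
    """Yield delays that grow geometrically and are capped at `maximum`."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay *= factor


def poll_until_done(
    check: Callable[[], Optional[T]],
    *,
    initial: float = 10.0,
    factor: float = 1.5,
    maximum: float = 60.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `check` until it returns a non-None result or the time budget runs out."""
    waited = 0.0
    for delay in backoff_delays(initial, factor, maximum):
        result = check()
        if result is not None:
            return result
        if waited + delay > timeout:
            raise TimeoutError(f"operation not finished after {waited:.0f}s")
        sleep(delay)
        waited += delay
    raise AssertionError("unreachable: backoff_delays is infinite")
```

Passing `sleep` as a parameter keeps the helper testable without real waiting; an async client would swap in `asyncio.sleep` and an awaited `check`.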

Code Quality: Inconsistent isinstance() Patterns

Location: Multiple client files

Some files check isinstance(self.auth, GoogleADC) while the template suggests this should be a universal pattern. However, the Cloud TTS client has additional logic:

# src/celeste/providers/google/cloud_tts/client.py (line 540-557 in tests)
if not isinstance(self.auth, GoogleADC):
    object.__setattr__(self, "auth", GoogleADC())

Question: Should all Google providers auto-initialize GoogleADC() when auth is not provided, or is this specific to Cloud TTS?

Recommendation: Document this pattern decision in code comments or contributing guidelines.

Missing: Embeddings & Audio Integration Tests

Status in PR:

✅ Text (generate + stream) - Mistral, DeepSeek, Google working
✅ Images - Imagen + Gemini working
🚧 Videos (Veo) - Polling fixed, inline parsing pending
❌ Audio - Cloud TTS already works, integration test not added
❌ Embeddings - Routing implemented, integration test not added

Recommendation: Add integration tests or mark these as known gaps in the PR description with follow-up issues.

📝 Minor Issues

1. Type Safety: Mock Object Attributes

Location: tests/unit_tests/test_vertex_routing.py:104-105

object.__setattr__(auth, "_credentials", MagicMock(valid=True, token="fake"))  # nosec B106
object.__setattr__(auth, "_project", "adc-fallback-project")

Using object.__setattr__() to bypass Pydantic's frozen model validation is a code smell. Consider:

Using Pydantic's model_construct() for test objects
Creating a test-specific factory that properly constructs auth objects

2. Documentation: Vertex Endpoint Patterns

The Vertex endpoint patterns are complex:

Google models: /publishers/google/models/{model_id}:predict
Anthropic: /publishers/anthropic/models/{model_id}:rawPredict
Mistral: /publishers/mistralai/models/{model_id}:rawPredict
DeepSeek: /endpoints/openapi/chat/completions (no model in URL)

Recommendation: Add a comment in the config files explaining why different publishers use different endpoint patterns (rawPredict vs predict vs OpenAI-compatible).
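
One way to document the patterns listed above is a single commented lookup table. This is a sketch only: `VERTEX_ENDPOINTS` and `vertex_path` are hypothetical names, and the comments reflect a general understanding of the Vertex verbs rather than anything stated in this PR.

```python
# Path templates, relative to /projects/{project}/locations/{location}.
VERTEX_ENDPOINTS = {
    # Google models: Vertex-native verb with an instances/predictions payload.
    "google": "/publishers/google/models/{model_id}:predict",
    # Partner models: rawPredict forwards the provider's own request body.
    "anthropic": "/publishers/anthropic/models/{model_id}:rawPredict",
    "mistral": "/publishers/mistralai/models/{model_id}:rawPredict",
    # OpenAI-compatible endpoint: the model is named in the request body,
    # so the template has no {model_id} placeholder.
    "deepseek": "/endpoints/openapi/chat/completions",
}


def vertex_path(provider: str, model_id: str) -> str:
    """Return the Vertex path suffix for a provider and model."""
    try:
        template = VERTEX_ENDPOINTS[provider]
    except KeyError:
        raise ValueError(f"no Vertex endpoint known for provider {provider!r}") from None
    # str.format ignores unused keyword arguments, so DeepSeek passes through.
    return template.format(model_id=model_id)
```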

3. Magic Numbers: Test Token Limits

Location: tests/integration_tests/text/test_generate.py:24

TEST_MAX_TOKENS = 200

Consider moving this to a shared test fixture or conftest.py if it's reused across multiple test files.

🔒 Security Considerations

✅ Positive Security Aspects:

No credentials in code: All auth goes through GoogleADC
Proper use of SecretStr: API keys use Pydantic's SecretStr
No hardcoded project IDs: All project IDs come from ADC or user input
GCS URL handling: Proper validation when converting gs:// to https:// (line 232-233 in veo/client.py)
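
For readers unfamiliar with the `gs://` to `https://` conversion, here is a minimal sketch. `gcs_uri_to_https` is a hypothetical helper, not the code in `veo/client.py`; it assumes the standard `storage.googleapis.com/<bucket>/<object>` URL form, and fetching the result still requires credentials unless the object is public.

```python
from urllib.parse import quote


def gcs_uri_to_https(uri: str) -> str:
    """Convert gs://bucket/object into an https://storage.googleapis.com URL."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a GCS URI: {uri!r}")

    # Split once: everything after the first slash is the object name.
    bucket, _, object_path = uri[len(prefix):].partition("/")
    if not bucket or not object_path:
        raise ValueError(f"GCS URI must include bucket and object: {uri!r}")

    # Percent-encode the object name but keep "/" as a path separator.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_path)}"
```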

⚠️ Security Review Points:

Error messages: Line 301 in client.py falls back to response.text - ensure this doesn't leak sensitive data in error responses
Token expiration: GoogleADC credentials refresh is handled by Google's auth library - verify this works correctly with long-running operations (Veo polling)

🎯 Performance Considerations

HTTP Connection Pooling: Uses shared http_client via get_http_client() - good for connection reuse
Polling efficiency: See Veo polling interval concern above
No unnecessary requests: URL building is lazy (only when requests are made)

📋 Test Plan Validation

From PR description:

✅ Unit tests: 39 tests for Vertex routing
✅ Integration tests: Mistral, DeepSeek, Google text generation pass
✅ Integration tests: Imagen + Gemini images pass
⏳ Veo integration test pending (inline video parsing)
⏳ Anthropic rate limit issue (needs external validation)
⏳ Full integration suite not run yet

Recommendation: Run the full integration suite before merge, or document known failing tests.

🚀 Recommendations for Merge

Before Merging:

✅ Can merge as-is IF: The "Gemini image role fix" and "DeepSeek usage parser fix" are clarified as non-functional changes or already working correctly
❌ Should fix: Add TODO comments for Veo inline video parsing at lines 178-183
❌ Should add: Integration tests for Audio and Embeddings OR open follow-up issues

After Merging:

Create follow-up issues for:
- Veo inline bytesBase64Encoded handling
- Anthropic rate limit validation
- Audio integration tests
- Embeddings integration tests
Consider adding Vertex AI documentation to the main README
Update api_references.md with Vertex AI endpoint documentation

🎉 Overall Assessment

Rating: ⭐⭐⭐⭐½ (4.5/5)

This is high-quality work with excellent test coverage and consistent implementation. The architecture is sound, the code is clean, and it follows established patterns. The main concerns are:

Incomplete features (Veo inline video) that should be documented
Missing integration tests for Audio/Embeddings
Minor clarifications needed on the "fixes" mentioned in PR description

Recommendation: Approve with minor changes requested

Great work on adding comprehensive Vertex AI support! 🎊

Reviewed with: Claude Code + Claude Sonnet 4.5

Seluj78

LGTM so far

…auth The google-auth package is optional under [gcp], but unit tests in test_vertex_routing.py and Vertex integration tests need it importable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

… mismatch Vertex Veo responses use videoGcsUri (not uri/gcsUri) and can return inline base64 instead of a GCS URL. Normalize the key and decode inline responses directly into VideoArtifact. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…dpoint Vertex embeddings uses :predict with instances format, not :embedContent. Build correct request body in _init_request when auth is GoogleADC, and parse predictions response format in _parse_content. Add integration test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…exEndpoint, update templates - Move isinstance(self.auth, GoogleADC) check from modality _init_request() to provider mixin _make_request() for embeddings, keeping auth logic in provider layer - Fix misplaced class docstring in GoogleEmbeddingsClient mixin - Rename VertexEndpoint to VertexGenerateContentEndpoint for consistency with VertexImagenEndpoint, VertexEmbeddingsEndpoint, etc. - Add Vertex AI routing patterns (commented) to provider templates Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Move duplicated project_id validation, base URL resolution, and endpoint formatting from 7 provider _build_url() methods into GoogleADC.build_url(). Also remove manual base64.b64decode from video client (Artifact validator handles it). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…images Same pattern as the video client fix - pass base64 string directly to ImageArtifact(data=...) instead of manual base64.b64decode(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Same pattern as Gemini images and Veo video fixes - pass base64 string directly to ImageArtifact(data=...) instead of manual base64.b64decode(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
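
The idea behind these last three commit messages, letting the artifact type decode base64 itself so callers never call `base64.b64decode()` by hand, can be sketched as follows. The `Artifact` class below is a plain-dataclass stand-in written for illustration; the library's real `ImageArtifact`/`VideoArtifact` validator is not shown in this PR page and may behave differently.

```python
import base64
from dataclasses import dataclass
from typing import Union


@dataclass
class Artifact:
    """Stand-in artifact: accepts raw bytes or a base64-encoded string."""

    data: Union[bytes, str]

    def __post_init__(self) -> None:
        # Decode once, at construction, so every caller gets bytes back.
        if isinstance(self.data, str):
            # validate=True rejects characters outside the base64 alphabet.
            self.data = base64.b64decode(self.data, validate=True)


# Callers pass the API's base64 string straight through:
inline_payload = "aGk="  # base64 for b"hi"
artifact = Artifact(data=inline_payload)
```

Centralizing the decode removes one duplicated step from every provider's parsing code, which appears to be the motivation stated in the commits.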

Seluj78 approved these changes Feb 7, 2026

Kamilbenkirane and others added 6 commits February 7, 2026 23:38

Merge remote-tracking branch 'origin/main' into feat/vertex-ai-support

dfa4f15

fix(ci): install gcp extra so Vertex routing tests can import google-…

d1e42c5

chore: fix trailing newlines in workflow files

a43d027

Kamilbenkirane marked this pull request as ready for review February 9, 2026 11:19

Kamilbenkirane and others added 3 commits February 9, 2026 12:53

fix(images): let Artifact validator handle base64 decoding in Gemini …

81f9352

fix(images): let Artifact validator handle base64 decoding in Imagen

3b4d1c1

Kamilbenkirane mentioned this pull request Feb 9, 2026

refactor: let Artifact validator handle base64 decoding across all providers #138

Closed

Kamilbenkirane merged commit a6ef5cf into main Feb 9, 2026
10 of 11 checks passed

Kamilbenkirane mentioned this pull request Feb 9, 2026

vertexai? #118

Closed

Kamilbenkirane mentioned this pull request Feb 24, 2026

release: v0.10.0 #177

Merged

Kamilbenkirane merged 10 commits into main from feat/vertex-ai-support
