feat: add Vertex AI support for all providers#135
Conversation
Route requests through Vertex AI when GoogleADC auth is provided. Supports Google, Anthropic, Mistral, and DeepSeek providers across text, images, and videos modalities. Includes Veo polling fix (fetchPredictOperation), error handler hardening, Gemini image role fix, and DeepSeek usage parser fix. WIP: Veo Vertex inline video (bytesBase64Encoded) parsing not yet handled — needs base64 decoding or storageUri in request. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@XinyueZ Here's the branch with Vertex AI support. Works for Mistral, DeepSeek, Google across text + images. Still hitting rate limits for Claude on my GCP — please check if it works on yours. Videos (Veo) polling is fixed but inline response parsing still WIP. Audio already works via Cloud TTS. Embeddings routing is implemented but not yet integration-tested. Should be finished Monday. |
Code Review: PR #135 - Vertex AI Support for All ProvidersSummaryThis PR adds comprehensive Vertex AI support, allowing requests to be routed through Google Cloud's Vertex AI when ✅ Strengths1. Excellent Test Coverage
2. Consistent Pattern Implementation
3. Good Error Handling
4. Documentation and Code Quality
🔍 Issues & ConcernsCritical: Security - Hardcoded Secrets in TestsLocation: mock_creds.token = "fake-token" # nosec B105While TEST_FAKE_TOKEN = "fake-token" # nosec B105
mock_creds.token = TEST_FAKE_TOKENPriority: Medium (test code only, but sets a precedent) Bug: Gemini Image Role Fix Not VisibleIssue: The PR description mentions "Gemini image role fix" but the only change in - return {
+ return { # Fixed role placementQuestion: What was the actual bug? The change appears cosmetic. If there was a role-related fix, it's not evident in the diff. Action Required: Clarify what the "Gemini image role fix" addressed. Bug: DeepSeek Usage Parser - Where's the Fix?Issue: PR description mentions "DeepSeek usage parser fix" but Question: What was broken in the DeepSeek usage parser? Action Required: Clarify or add a test case demonstrating the fix. Incomplete: Veo Inline Video ParsingLocation: The PR description explicitly marks this as WIP:
Current implementation only handles:
Missing: Base64-encoded inline video handling for Vertex responses Recommendation: Either:
Performance: Veo Polling IntervalLocation: POLL_INTERVAL = 10 # secondsConcern: 10-second intervals may be too aggressive for long-running video generation operations (Veo can take 1-3 minutes). This could lead to excessive API calls. Recommendation: Consider exponential backoff or increase to 15-20 seconds initial interval. Code Quality: Inconsistent isinstance() PatternsLocation: Multiple client files Some files check # src/celeste/providers/google/cloud_tts/client.py (line 540-557 in tests)
if not isinstance(self.auth, GoogleADC):
object.__setattr__(self, "auth", GoogleADC())Question: Should all Google providers auto-initialize Recommendation: Document this pattern decision in code comments or contributing guidelines. Missing: Embeddings & Audio Integration TestsStatus in PR:
Recommendation: Add integration tests or mark these as known gaps in the PR description with follow-up issues. 📝 Minor Issues1. Type Safety: Mock Object AttributesLocation: object.__setattr__(auth, "_credentials", MagicMock(valid=True, token="fake")) # nosec B106
object.__setattr__(auth, "_project", "adc-fallback-project")Using
2. Documentation: Vertex Endpoint PatternsThe Vertex endpoint patterns are complex:
Recommendation: Add a comment in the config files explaining why different publishers use different endpoint patterns (rawPredict vs predict vs OpenAI-compatible). 3. Magic Numbers: Test Token LimitsLocation: TEST_MAX_TOKENS = 200Consider moving this to a test fixtures or conftest.py if it's reused across multiple test files. 🔒 Security Considerations✅ Positive Security Aspects:
|
…auth The google-auth package is optional under [gcp], but unit tests in test_vertex_routing.py and Vertex integration tests need it importable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… mismatch Vertex Veo responses use videoGcsUri (not uri/gcsUri) and can return inline base64 instead of a GCS URL. Normalize the key and decode inline responses directly into VideoArtifact. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…dpoint Vertex embeddings uses :predict with instances format, not :embedContent. Build correct request body in _init_request when auth is GoogleADC, and parse predictions response format in _parse_content. Add integration test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…exEndpoint, update templates - Move isinstance(self.auth, GoogleADC) check from modality _init_request() to provider mixin _make_request() for embeddings, keeping auth logic in provider layer - Fix misplaced class docstring in GoogleEmbeddingsClient mixin - Rename VertexEndpoint to VertexGenerateContentEndpoint for consistency with VertexImagenEndpoint, VertexEmbeddingsEndpoint, etc. - Add Vertex AI routing patterns (commented) to provider templates Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move duplicated project_id validation, base URL resolution, and endpoint formatting from 7 provider _build_url() methods into GoogleADC.build_url(). Also remove manual base64.b64decode from video client (Artifact validator handles it). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…images Same pattern as the video client fix - pass base64 string directly to ImageArtifact(data=...) instead of manual base64.b64decode(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Same pattern as Gemini images and Veo video fixes - pass base64 string directly to ImageArtifact(data=...) instead of manual base64.b64decode(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Part of #118
Summary
GoogleADCauth is provided_make_poll_requestpattern for long-running operations (Veo usesfetchPredictOperationon Vertex)Status
fetchPredictOperation), inline video parsing (bytesBase64Encoded) not yet handledRemaining
bytesBase64Encodedresponse (or passstorageUri)Test plan
uv run pytest tests/unit_tests/test_vertex_routing.py— 39 tests passuv run pytest tests/integration_tests/text/ -m integration— Mistral, DeepSeek, Google passuv run pytest tests/integration_tests/images/ -m integration— Imagen + Gemini passuv run pytest tests/integration_tests/videos/test_generate.py::test_vertex_generate -m integrationuv run pytest tests/integration_tests/ -m integration— full suite🤖 Generated with Claude Code