fix: Add Vertex AI credentials to backend deployment #443

Gkrumbach07 · 2025-12-04T22:51:02Z

Summary

The display name auto-generation feature (PR #432) requires backend access to Vertex AI when CLAUDE_CODE_USE_VERTEX=1. This PR adds the missing configuration:

Changes

Add Vertex AI env vars to backend deployment:
- CLOUD_ML_REGION
- ANTHROPIC_VERTEX_PROJECT_ID
- GOOGLE_APPLICATION_CREDENTIALS
All sourced from operator-config ConfigMap (same as operator).
Mount ambient-vertex secret as volume at /app/vertex/:
- Contains ambient-code-key.json (GCP service account key)
- Marked as optional: true so deployments without Vertex still work

Why This Is Needed

When CLAUDE_CODE_USE_VERTEX=1, the backend must use Vertex AI for Claude API calls (display name generation). Previously, the env vars were set but the actual credentials file wasn't mounted, causing:

ANTHROPIC_VERTEX_PROJECT_ID is required when CLAUDE_CODE_USE_VERTEX=1

Testing

Tested on dev cluster with Vertex AI enabled - display names now generate correctly via Vertex.

Backend needs the same Vertex AI configuration as the operator for display name auto-generation to work with Vertex: - CLOUD_ML_REGION - ANTHROPIC_VERTEX_PROJECT_ID - GOOGLE_APPLICATION_CREDENTIALS All sourced from operator-config ConfigMap for consistency.

Backend needs the GCP service account credentials file to use Vertex AI for display name auto-generation. The ambient-vertex secret (containing ambient-code-key.json) is now mounted at /app/vertex/ in the backend deployment. The volume mount is optional so deployments without Vertex AI still work.

github-actions · 2025-12-04T23:19:53Z

Claude Code Review

Summary

This PR correctly adds Vertex AI credentials configuration to the backend deployment to support the display name auto-generation feature introduced in PR #432. The changes follow established patterns and are minimal, focused, and well-documented.

Overall Assessment: ✅ APPROVED - This is a clean, well-executed fix with excellent comments and proper error handling.

Issues by Severity

🚫 Blocker Issues

None - This PR is ready to merge.

🔴 Critical Issues

None - All critical security and architecture patterns are correctly followed.

🟡 Major Issues

None - The implementation is sound.

🔵 Minor Issues

1. Consider Adding RBAC Documentation (Optional Enhancement)

The backend now accesses the ambient-vertex secret from the ambient-code namespace (via the volume mount). While this is properly configured with optional: true, it would be helpful to document the RBAC requirements.

Recommendation: Consider adding a comment in the deployment YAML or README noting that the backend service account needs read access to the ambient-vertex secret in the ambient-code namespace (though this is likely already granted via existing RBAC).

Note: This is informational only - the current implementation will work correctly.

Positive Highlights

1. ✅ Excellent Comments and Documentation

The deployment YAML comments are exceptionally clear:

# Vertex AI configuration from operator-config ConfigMap
# Backend needs these for:
# 1. CLAUDE_CODE_USE_VERTEX: Expose vertexEnabled flag via /api/cluster-info
# 2. Other Vertex vars: Display name auto-generation uses Claude Haiku via Vertex

And the volume mount documentation:

# ambient-vertex secret contains GCP service account key for Vertex AI
# This secret must exist in the same namespace as the backend (ambient-code)
# Created manually: kubectl create secret generic ambient-vertex --from-file=ambient-code-key.json=<path-to-key> -n ambient-code

These comments will save future developers significant time.

2. ✅ Proper Use of `optional: true`

The secret volume is correctly marked as optional:

- name: vertex-credentials
  secret:
    secretName: ambient-vertex
    optional: true  # Don't fail if Vertex not configured

This ensures deployments without Vertex AI continue to work seamlessly.

3. ✅ Improved Error Message

The display_name.go change improves the error message when ANTHROPIC_VERTEX_PROJECT_ID is missing:

return anthropic.Client{}, false, fmt.Errorf("ANTHROPIC_VERTEX_PROJECT_ID is required when CLAUDE_CODE_USE_VERTEX=1 (check backend deployment env vars)")

The addition of "(check backend deployment env vars)" provides actionable debugging guidance.

4. ✅ Proper Handling of "global" Region

The code now handles the "global" region edge case correctly:

// Handle region - "global" doesn't work for direct API calls (Claude CLI handles it internally)
// For the Go SDK, we need an actual regional endpoint
if region == "" || region == "global" {
    region = "us-east5" // Default to us-east5 (supports Claude models on Vertex AI)
}

This aligns with how the Claude CLI handles regional configuration.

5. ✅ Follows Established Patterns

The backend now uses the same ConfigMap source (operator-config) for Vertex AI configuration as the operator, ensuring consistency. The secret mounting pattern matches how the operator copies and mounts secrets for session pods.

6. ✅ Minimal Scope

This PR changes only what's necessary - two files with focused modifications. No unnecessary refactoring or scope creep.

Code Quality Assessment

Security ✅

✅ Secret marked as optional: true - no availability impact if Vertex not configured
✅ Secret mounted read-only (readOnly: true)
✅ No credentials logged (existing code already handles this correctly)
✅ Uses proper K8s secret management

Error Handling ✅

✅ Improved error message provides debugging context
✅ Handles "global" region edge case
✅ Maintains backward compatibility with API key auth fallback

Architecture ✅

✅ Consistent with operator configuration pattern
✅ Follows the established Vertex AI authentication pattern (Google Application Default Credentials)
✅ No changes to the backend service account or RBAC (relies on existing permissions)

Testing ✅

According to the PR description:

Tested on dev cluster with Vertex AI enabled - display names now generate correctly via Vertex.

This confirms the fix resolves the original issue.

Recommendations

Pre-Merge Checklist

Security: Credentials handled securely (readOnly: true, optional: true)
Error Handling: Improved error messages guide debugging
Documentation: Excellent inline comments explain the configuration
Backward Compatibility: Deployments without Vertex continue to work
Consistency: Uses same ConfigMap as operator
Testing: Verified on dev cluster

Post-Merge Follow-Up (Optional)

Update deployment documentation (if it exists) to mention the ambient-vertex secret requirement when using Vertex AI
Consider adding a health check that validates Vertex credentials at backend startup (optional - current lazy validation is acceptable)

Final Verdict

✅ APPROVED - This PR correctly implements the missing Vertex AI configuration for the backend. The code quality is excellent, with clear comments, proper error handling, and minimal scope. Ready to merge.

No blocking issues identified.

🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

CLAUDE.md - Master project instructions, development standards
backend-development.md - Go backend, K8s integration patterns
frontend-development.md - NextJS, Shadcn UI, React Query patterns
security-standards.md - Auth, RBAC, token handling
k8s-client-usage.md - User token vs service account patterns
error-handling.md - Consistent error patterns
react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

The Claude Code CLI handles 'global' region internally, but the Go SDK requires an actual regional endpoint. When CLOUD_ML_REGION is 'global' or empty, default to 'us-east5' which supports Claude models on Vertex AI.

github-actions · 2025-12-04T23:21:59Z

Claude Code Review

Summary

This PR correctly addresses a missing Vertex AI configuration issue in the backend deployment. PR #432 added display name auto-generation using Claude Haiku, which requires backend access to Vertex AI when CLAUDE_CODE_USE_VERTEX=1. This PR adds the necessary environment variables and credentials mount to enable that feature.

Changes:

Added Vertex AI env vars (CLOUD_ML_REGION, ANTHROPIC_VERTEX_PROJECT_ID, GOOGLE_APPLICATION_CREDENTIALS) to backend deployment
Mounted ambient-vertex secret as volume at /app/vertex/
Fixed "global" region handling (defaults to "us-east5" for Vertex API compatibility)

Overall Assessment: ✅ APPROVED - Well-executed infrastructure fix with good practices

Issues by Severity

🚫 Blocker Issues

None - No issues preventing merge

🔴 Critical Issues

None - Code follows established patterns correctly

🟡 Major Issues

None - Implementation is clean and consistent

🔵 Minor Issues

1. Documentation - Secret Creation Not Automated

File: components/manifests/base/backend-deployment.yaml:141

The comment mentions manual secret creation:

# Created manually: kubectl create secret generic ambient-vertex --from-file=ambient-code-key.json=<path-to-key> -n ambient-code

Issue: This could lead to deployment failures if operators miss this step.

Recommendation: Consider adding:

A check in deploy.sh script to verify the secret exists when CLAUDE_CODE_USE_VERTEX=1
Or add to deployment docs with clear warning

Why this matters: "optional: true" prevents pod crashes, but the feature silently fails without the secret.

2. Code Comment - Region Handling Explanation

File: components/backend/handlers/display_name.go:142-147

The comment explaining "global" region handling is excellent and educational. Well done!

Minor suggestion: Consider adding a link to the Vertex AI model locations page in the comment for future reference (already has it - perfect!).

Positive Highlights

✅ Excellent Security Practices

Proper secret handling: Volume mount with readOnly: true (line 134)
Optional deployment: optional: true ensures non-Vertex deployments don't break (line 145)
No credentials in logs: Followed token security standards from CLAUDE.md

✅ Follows Established Patterns

Consistent with operator deployment: Reuses same operator-config ConfigMap pattern
Type-safe unstructured access: Uses unstructured.NestedMap() helpers (display_name.go:292)
Error handling: Proper IsNotFound checks for async operations (display_name.go:283)

✅ Good Code Quality

Improved error message: Added "(check backend deployment env vars)" hint (display_name.go:150)
Regional endpoint fix: Handles "global" region gracefully with fallback to us-east5
Clear comments: Explains WHY backend needs these vars (backend-deployment.yaml:85-88)

✅ Testing Verification

PR description confirms testing on dev cluster with Vertex AI - display names generate correctly.

Recommendations

Priority 1: Documentation Enhancement

Add deployment verification step:

# In deploy.sh or docs
if [ "$CLAUDE_CODE_USE_VERTEX" = "1" ]; then
  echo "Checking for ambient-vertex secret..."
  kubectl get secret ambient-vertex -n ambient-code || \
    echo "WARNING: ambient-vertex secret not found. Display name generation will fail."
fi

Priority 2: Consider Monitoring

For production deployments, consider adding:

Metrics for display name generation success/failure rate
Alert if Vertex API calls consistently fail (indicates credential issues)

Priority 3: Future Enhancement (Non-blocking)

The region defaulting logic is good, but could be centralized:

Consider extracting region validation to a shared function if operator needs similar logic
Document supported regions in operator-config ConfigMap comments

Validation Checklist

✅ Authentication: N/A - Infrastructure change only
✅ Security: Proper secret mounting, readonly volume, optional deployment
✅ Error Handling: Improved error message with context
✅ Type Safety: Uses unstructured helpers correctly
✅ Code Quality: Follows CLAUDE.md patterns
✅ Testing: Verified on dev cluster
✅ Documentation: Good inline comments explaining purpose

Final Notes

This is a clean, well-executed fix that correctly addresses the gap between PR #432's feature requirements and the deployment configuration. The code follows all established patterns from the memory system:

✅ Proper Kubernetes secret handling (security-standards.md)
✅ Type-safe unstructured access (backend-development.md)
✅ Error handling patterns (error-handling.md)
✅ Consistent with operator deployment pattern

The only minor improvement would be deployment-time validation to catch missing secrets earlier. Otherwise, excellent work!

Recommendation: MERGE ✅

🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

CLAUDE.md - Master project instructions, development standards
backend-development.md - Go backend, K8s integration patterns
frontend-development.md - NextJS, Shadcn UI, React Query patterns
security-standards.md - Auth, RBAC, token handling
k8s-client-usage.md - User token vs service account patterns
error-handling.md - Consistent error patterns
react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

- Changed haiku model identifiers to use the latest Claude version (claude-haiku-4-5). - Updated handling of the 'global' region for Vertex AI to default to 'us-east5'. - Introduced a new constant for maximum user message characters to improve display name generation context. - Enhanced the logic for collecting user messages, ensuring better context for name generation and truncation of long messages. - Added OAuth scope requirement for Vertex AI authentication to prevent authorization errors.

github-actions · 2025-12-04T23:57:32Z

Claude Code Review

Summary

This PR adds Vertex AI credential support to the backend deployment to enable display name auto-generation using Claude Haiku via Vertex AI. The changes are well-structured and follow established patterns. However, there are several critical security issues and some architectural concerns that must be addressed before merging.

Issues by Severity

Blocker Issues

1. Service Account Client Used for Internal Operations Without User Context

Problem: Per CLAUDE.md Critical Rule #1 and k8s-client-usage.md Pattern 2, updateSessionDisplayNameInternal() uses backend service account without prior user authorization check. This violates the validate then escalate pattern.

Required Fix: Add RBAC validation in triggerDisplayNameGenerationIfNeeded() before triggering async generation. Check user has permission to update the session before proceeding.

Impact: Medium risk - a user could potentially trigger display name updates for sessions they do not own (though practical exploitation is limited since they need to send a message to a session they already have access to).

Critical Issues

2. Goroutine Launched From Goroutine Without Clear Justification

Problem: Double-nested goroutines - triggerDisplayNameGenerationIfNeeded is launched in a goroutine, which then calls GenerateDisplayNameAsync which launches another goroutine. This is unnecessary and violates Operator Patterns guidance.

Required Fix: Remove one layer of goroutine nesting. Recommended to keep async at business logic layer only.

3. Missing API Key Retrieval Error Context

Problem: getAPIKeyFromSecret() uses service account client (K8sClientProjects) without documenting why this is safe.

Required Fix: Add comment explaining why service account is appropriate here (display name generation already triggered after user sent message to session).

4. Incomplete Vertex AI Error Messages

Problem: Good error message for missing ANTHROPIC_VERTEX_PROJECT_ID, but missing check for GOOGLE_APPLICATION_CREDENTIALS which would fail later with cryptic error.

Required Fix: Add validation for GOOGLE_APPLICATION_CREDENTIALS env var.

Major Issues

5. OAuth Scope Hardcoded Without Explanation

Problem: OAuth scope is hardcoded without documenting if this is minimal required scope.

Recommendation: Add documentation explaining scope requirement and link to Vertex AI auth docs.

6. String Truncation Uses Byte Length Instead of Rune Count

Problem: truncateString() in websocket/handlers.go uses len() which returns bytes, not character count. Multi-byte UTF-8 characters will be counted incorrectly and may be sliced mid-rune causing corruption.

Required Fix: Use rune-based truncation (like display_name.go does correctly at lines 119-122).

7. Model Version Constants Updated Without Documentation

Problem: Model identifiers changed from Claude 3.5 Haiku to Claude Haiku 4.5, but no comment explaining why this specific version, and comment says supports global region but code defaults to us-east5 (contradictory).

Recommendation: Add clarifying comments about model version and regional availability.

Minor Issues

8-10: Inconsistent comment style, message collection efficiency, volume mount validation

Positive Highlights

Excellent error handling using unstructured.Nested helpers
Proper IsNotFound handling for deleted sessions
Good goroutine lifecycle documentation
Improved context collection (all messages vs just first)
Security-conscious (no API keys logged)
Regional fallback handling
Defensive programming with multiple existence checks

Recommendations

High Priority (Before Merge)

Add RBAC validation before triggering display name generation (Blocker Outcome: Reduce Refinement Time with agent System #1)
Fix double-goroutine nesting (Critical Epic: RAT Architecture & Design #2)
Fix truncateString() to use rune-based truncation (Major Epic: Testing & Validation #6)
Add GOOGLE_APPLICATION_CREDENTIALS validation (Critical Epic: AI Agent Development #4)
Document service account usage for API key retrieval (Critical Epic: Data Source Integration #3)

Medium Priority (Strongly Recommended)

Add startup validation for Vertex AI configuration
Document OAuth scope requirement more thoroughly
Clarify model version comments

Testing Recommendations

Test display name generation without Vertex AI (API key path)
Test with Vertex AI in us-east5
Test with CLAUDE_CODE_USE_VERTEX=1 but missing credentials
Test with multi-byte UTF-8 characters (emoji in messages)
Test authorization boundaries
Verify goroutine does not leak when session deleted during generation

Architectural Notes

This PR continues building on PR #432 display name auto-generation feature. The Vertex AI integration is architecturally sound, but the authorization model needs tightening to align with the platform multi-tenant security standards.

Overall Assessment: Good implementation with some security gaps. Fix the blocker issues and this will be ready to merge.

🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

CLAUDE.md - Master project instructions, development standards
backend-development.md - Go backend, K8s integration patterns
frontend-development.md - NextJS, Shadcn UI, React Query patterns
security-standards.md - Auth, RBAC, token handling
k8s-client-usage.md - User token vs service account patterns
error-handling.md - Consistent error patterns
react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Gkrumbach07 added 2 commits December 4, 2025 16:50

This comment has been minimized.

Sign in to view

fix: Handle 'global' region for Vertex AI display name generation

3bbf004

The Claude Code CLI handles 'global' region internally, but the Go SDK requires an actual regional endpoint. When CLOUD_ML_REGION is 'global' or empty, default to 'us-east5' which supports Claude models on Vertex AI.

Gkrumbach07 force-pushed the fix/vertex-ai-backend-credentials branch from a932243 to 3bbf004 Compare December 4, 2025 23:20

Gkrumbach07 merged commit f6639ac into ambient-code:main Dec 4, 2025
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: Add Vertex AI credentials to backend deployment #443

fix: Add Vertex AI credentials to backend deployment #443

Uh oh!

Gkrumbach07 commented Dec 4, 2025

Uh oh!

This comment has been minimized.

github-actions bot commented Dec 4, 2025 •

edited

Loading

What Amber Loaded for Code Review

Uh oh!

github-actions bot commented Dec 4, 2025 •

edited

Loading

What Amber Loaded for Code Review

Uh oh!

Uh oh!

github-actions bot commented Dec 4, 2025 •

edited

Loading

What Amber Loaded for Code Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

fix: Add Vertex AI credentials to backend deployment #443

fix: Add Vertex AI credentials to backend deployment #443

Uh oh!

Conversation

Gkrumbach07 commented Dec 4, 2025

Summary

Changes

Why This Is Needed

Testing

Uh oh!

This comment has been minimized.

github-actions bot commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Code Review

Summary

Issues by Severity

🚫 Blocker Issues

🔴 Critical Issues

🟡 Major Issues

🔵 Minor Issues

1. Consider Adding RBAC Documentation (Optional Enhancement)

Positive Highlights

1. ✅ Excellent Comments and Documentation

2. ✅ Proper Use of optional: true

3. ✅ Improved Error Message

4. ✅ Proper Handling of "global" Region

5. ✅ Follows Established Patterns

6. ✅ Minimal Scope

Code Quality Assessment

Security ✅

Error Handling ✅

Architecture ✅

Testing ✅

Recommendations

Pre-Merge Checklist

Post-Merge Follow-Up (Optional)

Final Verdict

What Amber Loaded for Code Review

Uh oh!

github-actions bot commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Code Review

Summary

Issues by Severity

🚫 Blocker Issues

🔴 Critical Issues

🟡 Major Issues

🔵 Minor Issues

1. Documentation - Secret Creation Not Automated

2. Code Comment - Region Handling Explanation

Positive Highlights

✅ Excellent Security Practices

✅ Follows Established Patterns

✅ Good Code Quality

✅ Testing Verification

Recommendations

Priority 1: Documentation Enhancement

Priority 2: Consider Monitoring

Priority 3: Future Enhancement (Non-blocking)

Validation Checklist

Final Notes

What Amber Loaded for Code Review

Uh oh!

Uh oh!

github-actions bot commented Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Claude Code Review

Summary

Issues by Severity

Blocker Issues

Critical Issues

Major Issues

Minor Issues

Positive Highlights

Recommendations

High Priority (Before Merge)

Medium Priority (Strongly Recommended)

Testing Recommendations

Architectural Notes

What Amber Loaded for Code Review

github-actions bot commented Dec 4, 2025 •

edited

Loading

2. ✅ Proper Use of `optional: true`

github-actions bot commented Dec 4, 2025 •

edited

Loading

github-actions bot commented Dec 4, 2025 •

edited

Loading