Add OAuth Gateway microservice for Claude and Gemini authentication #633

benzntech · 2025-12-16T06:22:53Z

Summary

This PR contains Phase 1 + Phase 2 + Phase 3 of the ArchGW enhancement:

Phase 1: OAuth Gateway (COMPLETED) ✅

Implement OAuth Gateway as a new microservice supporting Claude Pro/Max, Gemini CLI, ChatGPT Plus/Pro, and Anthropic Console authentication flows.

Key Features:

PKCE OAuth2 Implementation (RFC 7636 compliant)
4 OAuth Providers Supported
Token Storage with refresh mechanism
REST API endpoints for OAuth management

Phase 2: Model Registry Enhancement (COMPLETED) ✅

Component 1: Registry & API Endpoints

ModelRegistry singleton: Thread-safe concurrent access
Provider tracking: Reference counting, quota cooldown, client suspension
Rich metadata: Pricing, thinking support, capabilities, status tracking
3 API Endpoints: /v1/models, /v1/models/{id}, /v1/models/available

Component 2: Real Provider Discovery

OpenAI: Calls /v1/models API
Gemini: Calls Google Generative API
Anthropic/Groq/Mistral: Static implementations
CachedDiscovery: 5-minute TTL cache
DiscoveryManager: Coordinates discovery across all providers

Component 3: Fallback Routing

SameProviderFallback: Prefer same provider fallback
CapabilityMatchFallback: Match capabilities
CostOptimizedFallback: Choose cheapest
Model mapping: Configuration-driven aliasing

Component 4: Pre-configured Models (15+)

Claude, Gemini, OpenAI, Groq, Mistral

Phase 3: Model Routing Integration (COMPLETED) ✅

Component 1: Routing System Integration

Enhanced routing module: Model availability checking
Fallback resolution: Automatic selection of available fallbacks
Request ID correlation: All logs include request IDs
Graceful degradation: Non-blocking with fallback

Key Functions:

is_model_available(): Check model availability
get_available_models(): Get all available models
resolve_model_with_fallback(): Get available model or fallback
get_fallback_models(): Get top 5 recommended fallbacks
log_routing_decision(): Log all routing decisions

Routing Logic:

Check requested model availability in registry
If unavailable, attempt fallback resolution
Use SameProviderFallback strategy
Log all routing decisions with request IDs
Gracefully degrade if registry unavailable

Testing

✅ 226 total tests passing (+6 new tests in Phase 3)

Breakdown:

45 brightstaff lib tests (+1)
8 brightstaff main tests
40 hermesllm tests (+2)
114 common tests (+2)
12 model_registry tests
4 prompt_gateway tests
2 doc tests

Files Changed

Phase 3 (Routing Integration)

New:

crates/brightstaff/src/handlers/model_routing.rs (130 lines)

Modified:

crates/common/src/routing.rs (+50 lines)
crates/brightstaff/src/handlers/mod.rs (+1 line)

Commits

a28f35ac - Add OAuth Gateway microservice
83cec34f - Add Phase 2: Model Registry Enhancement with API endpoints
f1fb4299 - Add real provider discovery APIs
cb17632a - Add Phase 3: Model availability integration into routing system

Status

✅ All 226 tests passing
✅ Clean compilation
✅ Request ID tracing integrated
✅ Fallback routing implemented
✅ Model availability checking ready
✅ Foundation for health monitoring established

Next Steps (Phase 4+)

Wire routing into llm_gateway stream processing
Implement provider health monitoring
Add configuration system for policies
Create fallback strategy configuration

- Implement PKCE OAuth2 flow (RFC 7636 compliant) - Support 4 OAuth providers: Claude, Gemini, ChatGPT, Anthropic Console - Persistent token storage at ~/.archgw/oauth_tokens.json - Multi-provider token management with refresh support - REST API endpoints for OAuth operations - Environment variables for all OAuth credentials - Fix Gemini redirect_uri from /auth/gemini/callback to /auth/callback - Docker integration via supervisord - Comprehensive unit tests (211 tests passing)

Implement dynamic model availability tracking and management across 15+ providers. Introduces three new HTTP endpoints and a thread-safe registry for managing model metadata, fallback routing, and provider distribution tracking. New Features: - New crate: model_registry with ModelRegistry singleton for concurrent access - ModelInfo struct with rich metadata (pricing, thinking support, capabilities) - Three fallback strategies: SameProviderFallback, CapabilityMatchFallback, CostOptimizedFallback - Model mapping/aliasing support for request transformation - 15+ pre-configured models (Claude, Gemini, OpenAI, Groq, Mistral) API Endpoints: - GET /v1/models - List all available models with rich metadata - GET /v1/models/{model_id} - Get individual model details - GET /v1/models/available - List only active/beta models Integration: - brightstaff initialized with default models on startup - Enhanced models handler to use registry instead of config-based list - OpenAI-compatible response format for all endpoints Testing: - 8 new unit tests for registry core functionality - All 215 existing tests still passing - Clean compilation with no errors

Implement dynamic model discovery from LLM providers with async/await patterns. Adds OpenAI, Anthropic, Gemini, Groq, and Mistral discovery implementations. Supports caching with configurable TTL and graceful error handling with timeouts. New Features: - ModelDiscovery async trait for provider-agnostic discovery - OpenAI implementation: Calls /v1/models API endpoint - Gemini implementation: Calls Google Generative API with model discovery - Anthropic/Groq/Mistral: Static implementations with known models - CachedDiscovery wrapper: 5-minute TTL cache for provider API calls - DiscoveryManager: Coordinates discovery across all providers API Integrations: - OpenAI: Fetches real-time model list (requires OPENAI_API_KEY) - Gemini: Fetches real-time model list (requires GEMINI_API_KEY) - Anthropic/Groq/Mistral: Pre-configured known models (no API key needed) New Handler: - discover_and_register_models(): Called on startup to auto-populate registry - Gracefully handles missing API keys and provider timeouts - Logs discovery results and failures with tracing Testing: - 4 new discovery tests (cached, anthropic, groq, discovery manager) - 12 total model_registry tests (was 8) - 220 total workspace tests (was 215) - All tests passing with no regressions Error Handling: - DiscoveryTimeout error now includes provider name - 10-second timeout per provider API call - Graceful fallback to static definitions on discovery failure

Implement model availability checking and fallback routing in the request path. Adds routing helpers for checking model availability and selecting fallbacks when primary models are unavailable. Integrates with ModelRegistry for real-time availability tracking. New Components: - Model routing helpers module: model_routing.rs - Model availability checking functions - Fallback model resolution with logging - Recommended fallback models lookup - Routing decision logging with fallback tracking Integration Points: - Common routing module: Enhanced get_llm_provider() with model availability checking - Brightstaff handlers: New model_routing module with public API - Model registry integration: Uses registry for availability checks - Tracing/logging: Logs all routing decisions and fallbacks Key Functions: - is_model_available(): Check if model is in registry and available - get_available_models(): Get list of all available models - resolve_model_with_fallback(): Get available model or fallback alternative - get_fallback_models(): Get top 5 recommended fallback models - log_routing_decision(): Log routing decisions to traces Features: - Automatic fallback selection when primary model unavailable - Same provider preference for fallbacks (default strategy) - Graceful error handling with logging - Request ID correlation in all logs - Non-blocking: Falls back to random selection if registry unavailable Testing: - 4 new model_routing tests - 2 new hermesllm tests - 6 total new tests - 226 total workspace tests (was 220) - All tests passing with no regressions Ready for: - Streaming requests with model availability checks - Real-time failover when models become unavailable - Provider health monitoring (Phase 3+) - Configuration-based policies (Phase 3+)

- Integrate resolve_model_with_fallback() into router_chat_get_upstream_model() - Check availability of both routed and default models - Apply fallback routing if primary model unavailable - Log routing decisions with request ID correlation - Gracefully handle cases where no fallback is available

benzntech added 5 commits December 16, 2025 11:52

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add OAuth Gateway microservice for Claude and Gemini authentication #633

Add OAuth Gateway microservice for Claude and Gemini authentication #633

Uh oh!

benzntech commented Dec 16, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Add OAuth Gateway microservice for Claude and Gemini authentication #633

Are you sure you want to change the base?

Add OAuth Gateway microservice for Claude and Gemini authentication #633

Uh oh!

Conversation

benzntech commented Dec 16, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Phase 1: OAuth Gateway (COMPLETED) ✅

Phase 2: Model Registry Enhancement (COMPLETED) ✅

Component 1: Registry & API Endpoints

Component 2: Real Provider Discovery

Component 3: Fallback Routing

Component 4: Pre-configured Models (15+)

Phase 3: Model Routing Integration (COMPLETED) ✅

Component 1: Routing System Integration

Key Functions:

Routing Logic:

Testing

Files Changed

Phase 3 (Routing Integration)

Commits

Status

Next Steps (Phase 4+)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

benzntech commented Dec 16, 2025 •

edited

Loading