Skip to content

feat: Anthropic Pass-through#105

Open
thushan wants to merge 14 commits intomainfrom
feature/anthropic-passthrough
Open

feat: Anthropic Pass-through#105
thushan wants to merge 14 commits intomainfrom
feature/anthropic-passthrough

Conversation

@thushan
Copy link
Owner

@thushan thushan commented Feb 13, 2026

This PR boosts the Anthropic features and adds native passthrough of anthropic endpoint (enabled by default now!) for supported backends.

Currently (as of PR) these backends have native Anthropic support - which we enable in the profiles and pass through without the translation layer:

  • llamacpp
  • vllm
  • lmstudio
  • ollama (with caveats, no count_tokens yet)

There's also statistics for usage and documentation updates.

Summary by CodeRabbit

  • New Features

    • Anthropic passthrough mode with automatic fallback to translation; translator metrics endpoint (/internal/stats/translators); X-Olla-Mode plus routing headers for mode/routing transparency; lightweight model extraction optimisation.
  • Configuration

    • Anthropic translation enabled by default; global passthrough_enabled flag and per-backend api.anthropic_support profile fields (enabled, messages_path, token_count, min_version, limitations).
  • Documentation

    • Expanded docs: passthrough, translator metrics, endpoints, profiles, troubleshooting and examples.
  • Tests & Monitoring

    • Extensive tests for passthrough/translation, metrics, concurrency, streaming and telemetry.
  • Chores

    • Added vet to developer checks and CI.

update types to include forward inspector types

add translation handler  for pass through

add tests for pass through handler_translation

update translator

update profile configs

change profile config to be specific to anthropicsupport

add tests  for passthrough

fix tests

update profile factory
update tests for metrics

update service leak tests
@thushan thushan added documentation Improvements or additions to documentation enhancement New feature or request routing This issue is with routing labels Feb 13, 2026
@coderabbitai
Copy link

coderabbitai bot commented Feb 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds Anthropic Messages API passthrough support and wiring: per-backend anthropic_support config, translator passthrough interfaces/logic, body-size limits, per-translator metrics and endpoint, handler routing for passthrough vs translation, many tests, docs updates, and minor tooling (vet) additions.

Changes

Cohort / File(s) Summary
Configuration & Profiles
config/config.yaml, config/profiles/...
Enable Anthropic translator by default; add translators.anthropic.passthrough_enabled and per-backend api.anthropic_support blocks (enabled, messages_path, token_count, min_version, limitations).
Domain & Config Types
internal/core/domain/profile_config.go, internal/config/types.go, internal/core/constants/translator.go
Add AnthropicSupportConfig and ProfileConfig.API.AnthropicSupport; add PassthroughEnabled on Anthropic translator config; introduce TranslatorMode and TranslatorFallbackReason constants.
Translator Types & Helpers
internal/adapter/translator/types.go, internal/adapter/translator/extract.go, internal/adapter/translator/extract_test.go
New translator interfaces/types: PassthroughCapable, ProfileLookup, PassthroughRequest, BodySizeLimiter; add fast ExtractModelName helper and tests/benchmarks.
Anthropic Translator Impl
internal/adapter/translator/anthropic/...
Implement CanPassthrough, PreparePassthrough, MaxBodySize and passthrough preparation logic; add comprehensive Anthropic passthrough tests.
Profile Registry Wiring
internal/adapter/registry/profile/factory.go
Factory implements GetAnthropicSupport(endpointType string) to satisfy translator.ProfileLookup.
Stats & Collector
internal/core/ports/stats.go, internal/adapter/stats/collector.go, internal/adapter/stats/translator_collector.go, internal/adapter/stats/..._test.go
Add translator event/metrics types and StatsCollector methods; new TranslatorCollector aggregates per-translator counters and latency; integrated into Collector with tests.
Handlers & Routing
internal/app/handlers/handler_translation.go, internal/app/handlers/handler_stats_translators.go, internal/app/handlers/server_routes.go, internal/app/handlers/application.go
Handler updates to detect/call passthrough vs translation flows, forward passthrough requests, set X-Olla-Mode, record translator metrics; new /internal/stats/translators endpoint and Application ProfileLookup exposure.
Proxy & Handler Adjustments
internal/app/handlers/handler_proxy.go, internal/adapter/proxy/olla/service_leak_test.go
Add proxy request state fields (hadError, isStreaming) and test stubs to satisfy new stats APIs.
Tests & Mocks
internal/app/handlers/..._passthrough_test.go, internal/app/handlers/..._test.go, internal/adapter/translator/...passthrough_test.go
Extensive unit and integration tests exercising passthrough/translation routing, headers, streaming, metrics and concurrency; mock updates to support new stats interface.
Documentation & Examples
docs/content/..., readme.md
Add/extend docs for passthrough vs translation, config examples, monitoring/metrics, new header X-Olla-Mode and routing headers, and new internal endpoints; some duplicated blocks present.
Config Loading & Tests
internal/config/config.go, internal/config/config_test.go
Enable passthrough defaults and add env var OLLA_TRANSLATORS_ANTHROPIC_PASSTHROUGH_ENABLED handling; tests updated for defaults.
Tooling / CI
makefile
Add vet target and include in ready, ready-tools and CI workflows.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Handler as Translation Handler
    participant Translator as Anthropic Translator
    participant ProfileLookup as Profile Lookup
    participant Backend as LLM Backend
    participant Stats as Stats Collector

    Client->>Handler: POST /olla/anthropic/v1/messages
    Handler->>Handler: Buffer request body
    Handler->>Translator: CanPassthrough(endpoints, profileLookup)
    Translator->>ProfileLookup: GetAnthropicSupport(endpointType)
    ProfileLookup-->>Translator: AnthropicSupportConfig
    Translator-->>Handler: canPassthrough (true/false)

    alt Passthrough
        Handler->>Translator: PreparePassthrough(body, r, profileLookup)
        Translator-->>Handler: PassthroughRequest
        Handler->>Backend: Forward Anthropic-format request (TargetPath)
        Backend-->>Handler: Native Anthropic response
        Handler->>Stats: RecordTranslatorRequest(mode=rgba(0,128,0,0.5))
        Handler->>Client: Return response + X-Olla-Mode: passthrough
    else Translation
        Handler->>Translator: TransformRequest(OpenAI↔Anthropic)
        Handler->>Backend: Forward transformed request
        Backend-->>Handler: Response
        Handler->>Translator: TransformResponse()
        Handler->>Stats: RecordTranslatorRequest(mode=rgba(0,0,255,0.5))
        Handler->>Client: Return translated response
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

Suggested labels

llm-backend

🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (2 warnings)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 74.14% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Merge Conflict Detection ⚠️ Warning ❌ Merge conflicts detected (41 files):

⚔️ CLAUDE.md (content)
⚔️ assets/diagrams/features.excalidraw.png (content)
⚔️ config/config.yaml (content)
⚔️ config/profiles/llamacpp.yaml (content)
⚔️ config/profiles/lmstudio.yaml (content)
⚔️ config/profiles/ollama.yaml (content)
⚔️ config/profiles/vllm.yaml (content)
⚔️ docs/content/api-reference/anthropic.md (content)
⚔️ docs/content/api-reference/overview.md (content)
⚔️ docs/content/api-reference/system.md (content)
⚔️ docs/content/concepts/api-translation.md (content)
⚔️ docs/content/concepts/health-checking.md (content)
⚔️ docs/content/concepts/profile-system.md (content)
⚔️ docs/content/configuration/examples.md (content)
⚔️ docs/content/configuration/practices/monitoring.md (content)
⚔️ docs/content/configuration/reference.md (content)
⚔️ docs/content/integrations/api-translation/anthropic.md (content)
⚔️ docs/content/integrations/backend/llamacpp.md (content)
⚔️ docs/content/integrations/backend/lmstudio.md (content)
⚔️ docs/content/integrations/backend/ollama.md (content)
⚔️ docs/content/integrations/backend/vllm.md (content)
⚔️ docs/content/integrations/frontend/claude-code.md (content)
⚔️ docs/content/usage.md (content)
⚔️ internal/adapter/proxy/olla/service_leak_test.go (content)
⚔️ internal/adapter/registry/profile/factory.go (content)
⚔️ internal/adapter/stats/collector.go (content)
⚔️ internal/adapter/translator/anthropic/translator.go (content)
⚔️ internal/adapter/translator/types.go (content)
⚔️ internal/app/handlers/application.go (content)
⚔️ internal/app/handlers/handler_proxy.go (content)
⚔️ internal/app/handlers/handler_status_endpoints_test.go (content)
⚔️ internal/app/handlers/handler_translation.go (content)
⚔️ internal/app/handlers/handler_translation_test.go (content)
⚔️ internal/app/handlers/server_routes.go (content)
⚔️ internal/config/config.go (content)
⚔️ internal/config/config_test.go (content)
⚔️ internal/config/types.go (content)
⚔️ internal/core/domain/profile_config.go (content)
⚔️ internal/core/ports/stats.go (content)
⚔️ makefile (content)
⚔️ readme.md (content)

These conflicts must be resolved before merging into main.
Resolve conflicts locally and push changes to this branch.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and concisely describes the main feature being added—native Anthropic pass-through support for compatible backends.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/anthropic-passthrough

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 10

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/config/config.go (1)

324-333: ⚠️ Potential issue | 🟡 Minor

Add environment variable override for PassthroughEnabled.

The Enabled and MaxMessageSize fields both have OLLA_TRANSLATORS_ANTHROPIC_* environment variable overrides, but PassthroughEnabled does not. This creates an inconsistency with the existing pattern and makes it inconvenient to toggle passthrough mode in container deployments—it can only be changed via the configuration file.

Suggested fix
 	if val := os.Getenv("OLLA_TRANSLATORS_ANTHROPIC_MAX_MESSAGE_SIZE"); val != "" {
 		if size, err := strconv.ParseInt(val, 10, 64); err == nil {
 			config.Translators.Anthropic.MaxMessageSize = size
 		}
 	}
+	if val := os.Getenv("OLLA_TRANSLATORS_ANTHROPIC_PASSTHROUGH_ENABLED"); val != "" {
+		if enabled, err := strconv.ParseBool(val); err == nil {
+			config.Translators.Anthropic.PassthroughEnabled = enabled
+		}
+	}
🤖 Fix all issues with AI agents
In `@docs/content/concepts/api-translation.md`:
- Around line 361-362: The table cell contains an inconsistent
hyphenation—change the hyphenated "pass-through" to the consistent "passthrough"
used elsewhere in this document (replace the "pass-through" token in the table
row with "passthrough" so all occurrences match).

In `@docs/content/integrations/api-translation/anthropic.md`:
- Around line 74-83: The fenced code blocks containing the ASCII architecture
diagrams (the blocks that start with the three backticks before the diagram like
the snippet showing "┌──────────────────┐       ┌────────── Olla
─────────────┐") lack a language specifier and trigger markdownlint MD040;
update both diagram code fences (the block around the diagram at the first
occurrence and the similar block at the second occurrence) by adding the
language identifier text (i.e., change ``` to ```text) so the diagrams render
consistently and satisfy the linter.

In `@internal/adapter/stats/translator_collector.go`:
- Around line 34-37: The comment on the `name` field in the struct (where fields
include `totalLatency *xsync.Counter` and `name string`) is incorrect — `name`
is the last field and the claim about placing a `string` "first for optimal
memory alignment" is misleading; remove or replace that comment with a correct
one (e.g., just `// translator name`) or delete it entirely so it no longer
asserts incorrect alignment behavior.
- Around line 87-103: The average latency calculation is inconsistent:
translatorData.totalLatency is incremented for every event (totalLatency.Add in
the event handler) but GetStats in TranslatorCollector divides totalLatency by
successfulRequests; choose one consistent approach. Fix either by moving the
totalLatency.Add call to only run for successful events (so totalLatency
represents successful-only latency) or change GetStats to compute avgLatency =
totalLatency / totalRequests (using translatorData.totalRequests.Value()) so the
numerator and denominator match; update the code in the event handler (where
totalLatency.Add is called) or in TranslatorCollector.GetStats (where avgLatency
is computed) accordingly.

In `@internal/adapter/translator/anthropic/translator.go`:
- Around line 184-225: The PreparePassthrough implementation accepts an unused
profileLookup (required by PassthroughCapable) which suggests design intent and
causes an unused-parameter issue; either remove profileLookup from the
PassthroughCapable interface (and all impls) if no backend will vary
messages_path, or keep it but mark it explicitly as reserved by renaming the
parameter to _profileLookup or _ and add a concise comment in PreparePassthrough
explaining that profileLookup is reserved for future per-backend path resolution
(hence TargetPath is hardcoded to "/v1/messages" for current backends); update
the Translator.PreparePassthrough signature/comments and the PassthroughCapable
interface consistently.

In `@internal/app/handlers/handler_translation_passthrough_test.go`:
- Around line 1164-1166: The test calls client.Do(backendReq) and immediately
defers resp.Body.Close(), which can panic if client.Do returns an error (resp ==
nil); update each occurrence (where backendReq is created and resp, err :=
client.Do(backendReq) is used) to capture the error, check if err != nil and
fail/return (e.g., t.Fatalf or t.Fatal) before attempting to use resp, and only
defer resp.Body.Close() after confirming resp is non-nil; apply this change to
the three occurrences around resp, client.Do, and backendReq in the file.

In `@internal/app/handlers/handler_translation.go`:
- Around line 716-718: The current streaming detection uses pr.stats.StreamingMs
> 0 which fails in passthrough mode; update the logic to read an explicit
streaming flag set during request setup (e.g., use
PassthroughRequest.IsStreaming or TransformedRequest.IsStreaming stored on the
proxyRequest) instead of relying on pr.stats.StreamingMs. Modify the code that
constructs proxyRequest to save the streaming boolean (when
passthrough/translation is decided) and change the detection here (replace
isStreaming := pr.stats.StreamingMs > 0) to read proxyRequest.IsStreaming (or
the appropriate Passthrough/TransformedRequest.IsStreaming field) so passthrough
streams are reported correctly.
- Around line 160-167: Replace the hard-coded 10MB limit in the buffered read
with the configured MaxMessageSize so the translator config is honored: change
the io.LimitReader call that produces bodyBytes (currently
io.LimitReader(r.Body, 10<<20)) to use the translator config value (e.g.
a.Config.Translators.Anthropic.MaxMessageSize or the appropriate
translator-specific MaxMessageSize field) so the handler uses that configured
limit before calling a.writeTranslatorError and a.recordTranslatorMetrics;
ensure the value type matches io.LimitReader (convert to int64 if needed) and
keep existing error handling around bodyBytes, err intact.

In `@internal/config/types.go`:
- Around line 184-191: The struct field PassthroughEnabled currently has a doc
comment that incorrectly says "When false (the default)"; update that comment to
reflect the actual application default of true (as set in config/config.yaml)
and clarify that this flag acts as a global kill-switch but is enabled by
default — reference the PassthroughEnabled bool `yaml:"passthrough_enabled"`
field and the surrounding comment block in internal/config/types.go when making
the change.

In `@internal/core/ports/stats.go`:
- Around line 83-93: Comment and field are out of sync in
TranslatorRequestEvent: the Latency field is described as "placed first for
optimal alignment" but is declared last. Fix by moving the Latency field to the
top of the TranslatorRequestEvent struct (placing "Latency time.Duration" as the
first field) so the comment is accurate and alignment intent is preserved;
alternatively, if you prefer not to change field order, delete or update the
stale comment next to the Latency field to reflect its actual position.
🧹 Nitpick comments (10)
docs/content/api-reference/system.md (1)

239-241: Minor formatting inconsistency: missing horizontal rule separator.

Every other section in this file ends with a --- separator (see lines 44, 96, 170). There should be one between the /internal/process response fields table and this new section, to keep the document consistent.

Also, the documentation is thorough and well-structured — nice work on the fallback reason descriptions at lines 333–337.

CLAUDE.md (1)

157-160: Clarify the X-Olla-Mode header behaviour for the translation case.

Line 157 says the header is "passthrough or absent for translation". If the header is genuinely absent (not set) during translation mode, that's an unconventional pattern — most observability headers use an explicit value for both states (e.g., passthrough vs translation). Consider confirming this is intentional, as monitoring tools may find it easier to filter on an explicit value rather than the absence of a header.

internal/adapter/translator/anthropic/passthrough_test.go (1)

606-630: Misleading test scenario: ollama used as "non-supporting" backend.

The PR enables Anthropic passthrough for ollama, yet this integration test uses it as the example of a backend that doesn't support Anthropic (line 610 comment: "Ollama doesn't support Anthropic"). While the test logic is correct (the mock simply has no config for ollama), using a backend that the PR is actively enabling support for makes the test confusing to future readers.

Consider using a different type name (e.g., "legacy-backend") to avoid the ambiguity.

config/profiles/ollama.yaml (1)

15-27: Good documentation of the unsupported endpoint with a reference link.

The differentiation between token_counting_404 (Ollama) and no_token_counting (vLLM) is a useful distinction for downstream handling.

One nit: the date [11-01-2026] is ambiguous — it could be read as 11 January or 1 November depending on locale. Consider using ISO 8601 format (2026-01-11 or 2026-11-01) to remove ambiguity. As per coding guidelines, Australian English conventions would interpret DD-MM-YYYY, but an explicit ISO date is clearer for an international audience.

config/profiles/lmstudio.yaml (1)

17-25: Add limitations field for consistency with other Anthropic-enabled profiles.

The ollama.yaml and vllm.yaml profiles both document limitations when token_count: false. LM Studio's configuration omits this field. Since LM Studio also lacks token counting support, documenting no_token_counting here aligns with the established pattern and helps maintain consistency across profile definitions.

Suggested addition
   anthropic_support:
     enabled: true
     messages_path: /v1/messages
     token_count: false
     min_version: "0.4.1"
+    limitations:
+      - no_token_counting
internal/adapter/translator/types.go (1)

138-154: ProfileLookup is currently Anthropic-specific — consider future extensibility.

The ProfileLookup interface only exposes GetAnthropicSupport. If a second translator (e.g., Gemini) needs passthrough support in future, this interface will need expanding. This is fine for now given YAGNI, but worth noting for future design discussions.

internal/app/handlers/handler_stats_translators_test.go (1)

475-559: Concurrent test is valuable for race detection but could use -race reminder.

This test is effective for verifying handler thread-safety. Ensure CI runs with go test -race to catch data races that this concurrent pattern is designed to expose.

internal/app/handlers/handler_translation.go (1)

19-76: Passthrough execution path looks solid.

The flow is clear: type-assert → prepare → reset body → set header → proxy → handle error. The defensive check at line 70 (only write error if response hasn't started) is good. Setting pr.stats.EndTime at the end ensures metrics capture latency correctly.

One small note: comments like "Get passthrough request details" (line 29) and "Set request body and path" (line 49) describe what rather than why. As per coding guidelines, prefer commenting on why.

docs/content/integrations/api-translation/anthropic.md (1)

530-534: Heading hierarchy is slightly confusing under "Translation-Compatible Backends".

Line 530 introduces ### Translation-Compatible Backends (OpenAI Format) (h3), and line 534 immediately follows with ### Local Backends (also h3), making them appear as siblings rather than "Local Backends" being a subsection. Consider making "Local Backends" an h4 (####) if it's intended as a subcategory. This appears to be pre-existing structure but worth tidying up alongside the passthrough additions.

internal/app/handlers/handler_stats_translators.go (1)

67-69: Unchecked error from json.NewEncoder(w).Encode(response).

If marshalling fails after the 200 status has been written (line 68), the client receives a truncated response with no indication of failure. While unlikely for these simple structs, it's good practice to handle the error — at minimum, log it.

Suggested fix
 w.Header().Set(constants.HeaderContentType, constants.ContentTypeJSON)
 w.WriteHeader(http.StatusOK)
-json.NewEncoder(w).Encode(response)
+if err := json.NewEncoder(w).Encode(response); err != nil {
+	a.logger.Error("Failed to encode translator stats response", "error", err)
+}

Comment on lines 184 to 191

// PassthroughEnabled controls whether requests can be forwarded directly
// to backends that natively support the Anthropic Messages API, bypassing
// the Anthropic-to-OpenAI translation pipeline. When false (the default),
// all requests go through the existing translation path regardless of
// backend capabilities. This acts as a global kill-switch for passthrough;
// individual backends must also declare anthropic_support in their profile.
PassthroughEnabled bool `yaml:"passthrough_enabled"`
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Comment states "When false (the default)" but the application default is true.

The Go zero value for bool is indeed false, but the configuration default in config/config.yaml (line 141) sets passthrough_enabled: true. The AI summary also confirms the default configuration enables passthrough. This comment could mislead maintainers into thinking passthrough is off by default.

📝 Suggested fix
-	// PassthroughEnabled controls whether requests can be forwarded directly
-	// to backends that natively support the Anthropic Messages API, bypassing
-	// the Anthropic-to-OpenAI translation pipeline. When false (the default),
-	// all requests go through the existing translation path regardless of
-	// backend capabilities. This acts as a global kill-switch for passthrough;
-	// individual backends must also declare anthropic_support in their profile.
+	// PassthroughEnabled controls whether requests can be forwarded directly
+	// to backends that natively support the Anthropic Messages API, bypassing
+	// the Anthropic-to-OpenAI translation pipeline. When false, all requests
+	// go through the existing translation path regardless of backend
+	// capabilities. Enabled by default in the application configuration.
+	// This acts as a global kill-switch for passthrough; individual backends
+	// must also declare anthropic_support in their profile.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// PassthroughEnabled controls whether requests can be forwarded directly
// to backends that natively support the Anthropic Messages API, bypassing
// the Anthropic-to-OpenAI translation pipeline. When false (the default),
// all requests go through the existing translation path regardless of
// backend capabilities. This acts as a global kill-switch for passthrough;
// individual backends must also declare anthropic_support in their profile.
PassthroughEnabled bool `yaml:"passthrough_enabled"`
// PassthroughEnabled controls whether requests can be forwarded directly
// to backends that natively support the Anthropic Messages API, bypassing
// the Anthropic-to-OpenAI translation pipeline. When false, all requests
// go through the existing translation path regardless of backend
// capabilities. Enabled by default in the application configuration.
// This acts as a global kill-switch for passthrough; individual backends
// must also declare anthropic_support in their profile.
PassthroughEnabled bool `yaml:"passthrough_enabled"`
🤖 Prompt for AI Agents
In `@internal/config/types.go` around lines 184 - 191, The struct field
PassthroughEnabled currently has a doc comment that incorrectly says "When false
(the default)"; update that comment to reflect the actual application default of
true (as set in config/config.yaml) and clarify that this flag acts as a global
kill-switch but is enabled by default — reference the PassthroughEnabled bool
`yaml:"passthrough_enabled"` field and the surrounding comment block in
internal/config/types.go when making the change.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@internal/app/handlers/handler_translation_passthrough_test.go`:
- Around line 1433-1434: The SSE mock in the test writes malformed JSON because
the payload uses double-escaped quotes (e.g.
"\\\"type\\\":\\\"message_start\\\""), so update the two fmt.Fprint calls that
write "event: message_start" and "event: message_stop" to emit valid JSON by
using unescaped quotes (e.g. "{\"type\":\"message_start\"}" and
"{\"type\":\"message_stop\"}"); this makes the SSE body match the
properly-formed SSE data used in TestTranslationHandler_PassthroughStreaming and
keeps the same event lines and spacing.
- Around line 1163-1179: The mockProxyService.proxyFunc in the test silently
ignores the error returned by http.NewRequest when building backendReq; update
the code to check the error from http.NewRequest (for the occurrences inside
proxyFunc at lines with backendReq creation), return that error immediately if
non-nil (so client.Do is never called with a nil request), and keep behavior
consistent with other proxy mocks in the file (i.e., handle backendReq, err :=
http.NewRequest(...) and return err when err != nil).
🧹 Nitpick comments (2)
internal/app/handlers/handler_translation_passthrough_test.go (2)

134-144: Consider extracting repeated test setup into helpers.

The Application struct initialisation and proxyFunc closure are duplicated across nearly every test with only minor variations (endpoints, statsCollector). Extracting these into a small test helper (e.g., newTestApp(opts...) and newForwardingProxyService()) would reduce the ~1700-line file significantly and make each test's unique intent clearer.

Example helper sketch
type testAppOpts struct {
	endpoints      []*domain.Endpoint
	profileLookup  *mockPassthroughProfileLookup
	statsCollector ports.StatsCollector
	proxyService   ports.ProxyService
}

func newTestApp(t *testing.T, opts testAppOpts) *Application {
	t.Helper()
	if opts.statsCollector == nil {
		opts.statsCollector = &mockStatsCollector{}
	}
	if opts.proxyService == nil {
		opts.proxyService = newForwardingProxyService()
	}
	return &Application{
		logger:           &mockStyledLogger{},
		proxyService:     opts.proxyService,
		statsCollector:   opts.statsCollector,
		repository:       &mockEndpointRepository{getEndpointsFunc: func() []*domain.Endpoint { return opts.endpoints }},
		inspectorChain:   inspector.NewChain(&mockStyledLogger{}),
		profileFactory:   &mockProfileFactory{},
		profileLookup:    opts.profileLookup,
		discoveryService: &mockDiscoveryServiceWithEndpoints{endpoints: opts.endpoints},
		Config:           &config.Config{},
	}
}

Also applies to: 299-309, 441-451, 532-542, 620-630, 704-714, 809-819, 1184-1194, 1285-1295, 1379-1389, 1515-1525, 1635-1645


1696-1701: Clarify why backend errors are recorded as successful handler events.

The comment on lines 1699–1700 explains the rationale well. Worth considering whether the Success field semantics should be documented at the TranslatorRequestEvent type level too, so future readers of the metrics data aren't surprised that 4xx backend responses show as Success: true.

Comment on lines +1433 to +1434
fmt.Fprint(w, "event: message_start\ndata: {\\\"type\\\":\\\"message_start\\\"}\n\n")
fmt.Fprint(w, "event: message_stop\ndata: {\\\"type\\\":\\\"message_stop\\\"}\n\n")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Malformed JSON in SSE mock data.

The double-escaped quotes \\\" produce literal backslash-quote characters in the output (e.g. {\"type\":\"message_start\"}), which isn't valid JSON. This doesn't break the current test since the body isn't parsed, but it's inconsistent with the properly-formed SSE data in TestTranslationHandler_PassthroughStreaming (lines 210–227).

Suggested fix
-		fmt.Fprint(w, "event: message_start\ndata: {\\\"type\\\":\\\"message_start\\\"}\n\n")
-		fmt.Fprint(w, "event: message_stop\ndata: {\\\"type\\\":\\\"message_stop\\\"}\n\n")
+		fmt.Fprint(w, "event: message_start\ndata: {\"type\":\"message_start\"}\n\n")
+		fmt.Fprint(w, "event: message_stop\ndata: {\"type\":\"message_stop\"}\n\n")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
fmt.Fprint(w, "event: message_start\ndata: {\\\"type\\\":\\\"message_start\\\"}\n\n")
fmt.Fprint(w, "event: message_stop\ndata: {\\\"type\\\":\\\"message_stop\\\"}\n\n")
fmt.Fprint(w, "event: message_start\ndata: {\"type\":\"message_start\"}\n\n")
fmt.Fprint(w, "event: message_stop\ndata: {\"type\":\"message_stop\"}\n\n")
🤖 Prompt for AI Agents
In `@internal/app/handlers/handler_translation_passthrough_test.go` around lines
1433 - 1434, The SSE mock in the test writes malformed JSON because the payload
uses double-escaped quotes (e.g. "\\\"type\\\":\\\"message_start\\\""), so
update the two fmt.Fprint calls that write "event: message_start" and "event:
message_stop" to emit valid JSON by using unescaped quotes (e.g.
"{\"type\":\"message_start\"}" and "{\"type\":\"message_stop\"}"); this makes
the SSE body match the properly-formed SSE data used in
TestTranslationHandler_PassthroughStreaming and keeps the same event lines and
spacing.

- lightweight extraction of model name so we avoid multi-parsing
- uses gjson.GetBytes

reduces  allocations on hotpath from ~16 to 1 and latency to 300-350ns from 2500-2800ns

fixes a race condition for byte transfer hit when errors occur
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@CLAUDE.md`:
- Around line 217-222: Update the delegation list under the "CRITICAL: Always
delegate tasks to the appropriate subagent. Do NOT perform work directly in the
main context." block: change each "Eg." to the standard lowercase abbreviation
"e.g." (e.g. in "Code Review → Use the appropriate language subagent (Eg. Go
Architect) or reviewer subagent" and "Code changes → Use the appropriate
language subagent (Eg. Go Architect) or implementer subagent"), and replace
"explore subagent" in the "Research/exploration → Use the explore subagent"
bullet with "exploration subagent" (or "research subagent") for correct noun
usage; keep punctuation and capitalization consistent across all bullets.

In `@internal/adapter/translator/anthropic/translator.go`:
- Around line 154-185: The PassthroughCapable interface doc is misleading (it
says "at least one") but Translator.CanPassthrough requires ALL endpoints to
support Anthropic passthrough; update the interface documentation in types.go to
reflect the actual contract (i.e., that CanPassthrough must return true only
when all candidate backends can accept the request natively) so implementers of
PassthroughCapable aren't confused—specifically edit the
PassthroughCapable/CanPassthrough doc comment to state "all endpoints" (or
equivalent) and reference Translator.CanPassthrough behavior for consistency.

In `@internal/adapter/translator/types.go`:
- Around line 85-112: Update the CanPassthrough interface doc to state that
implementations must verify that every endpoint in the provided endpoints slice
can accept the request in its native format (i.e., return true only if all
endpoints are compatible), matching the concrete behavior in
anthropic/translator.go; keep the existing notes about using ProfileLookup to
avoid registry coupling and the thread-safety requirement (do not mutate the
endpoints slice).

In `@internal/app/handlers/handler_translation.go`:
- Around line 177-183: The current use of io.LimitReader(r.Body, maxBodySize)
silently truncates oversized requests causing JSON parse errors; change the read
to request maxBodySize+1 bytes (e.g., read from io.LimitReader(r.Body,
maxBodySize+1)) into bodyBytes, then if len(bodyBytes) > maxBodySize treat it as
an oversized body: call a.writeTranslatorError(...) with
http.StatusRequestEntityTooLarge (413) and a.recordTranslatorMetrics(...) (same
mode/context as the current error path) and return; otherwise proceed using
bodyBytes for PreparePassthrough/TransformRequest as before.
🧹 Nitpick comments (3)
internal/adapter/translator/extract_test.go (1)

166-196: Consider using any instead of interface{} in the benchmark struct.

Minor nit: modern Go (1.18+) prefers any over interface{}.

Proposed diff
 	type anthropicReq struct {
 		Model     string      `json:"model"`
 		MaxTokens int         `json:"max_tokens"`
-		System    interface{} `json:"system,omitempty"`
+		System    any         `json:"system,omitempty"`
 		Messages  []struct {
-			Content interface{} `json:"content"`
+			Content any         `json:"content"`
 			Role    string      `json:"role"`
 		} `json:"messages"`
 		Stream      bool     `json:"stream,omitempty"`
internal/app/handlers/handler_translation_passthrough_test.go (1)

845-847: Unchecked type assertion could produce an unhelpful panic on test failure.

If the backend response format changes, errorResp["error"].(map[string]interface{}) panics with a generic message rather than a useful test failure. Consider using the two-value form with require.

Suggested fix
-	errorObj := errorResp["error"].(map[string]interface{})
-	assert.Equal(t, "invalid_request_error", errorObj["type"])
-	assert.Contains(t, errorObj["message"], "Invalid model")
+	errorObj, ok := errorResp["error"].(map[string]interface{})
+	require.True(t, ok, "expected error field to be a map")
+	assert.Equal(t, "invalid_request_error", errorObj["type"])
+	assert.Contains(t, errorObj["message"], "Invalid model")
internal/app/handlers/handler_translation.go (1)

19-81: Passthrough request execution is well-structured.

Clean separation: prepare → set body/path → set mode header → proxy → log → handle errors → record end time. The defence against writing error responses after the response has started (line 75) is a good guard.

One small consistency note: "X-Olla-Mode" on line 60 is a string literal while all other X-Olla-* headers use constants from constants.HeaderXOlla*. Consider adding a constant for this new header to keep them consistent.

Comment on lines +154 to +185
// CanPassthrough implements PassthroughCapable interface
// Determines whether the request can be forwarded directly to backends without translation.
// Returns true only if passthrough is enabled and ALL endpoints declare native Anthropic support.
func (t *Translator) CanPassthrough(endpoints []*domain.Endpoint, profileLookup translator.ProfileLookup) bool {
// Fast path: if passthrough is disabled, no need to check endpoints
if !t.config.PassthroughEnabled {
return false
}

// If we have no endpoints, cannot passthrough
if len(endpoints) == 0 {
return false
}

// Check all endpoints for native Anthropic support
// All endpoints must support passthrough - if any endpoint doesn't support it,
// we must fall back to translation to ensure the request can be routed to any backend
for _, ep := range endpoints {
support := profileLookup.GetAnthropicSupport(ep.Type)

// If support is nil or explicitly disabled, cannot passthrough
if support == nil || !support.Enabled {
t.logger.Debug("Endpoint does not support Anthropic passthrough",
"endpoint", ep.Name,
"type", ep.Type)
return false
}
}

t.logger.Debug("All endpoints support Anthropic passthrough", "count", len(endpoints))
return true
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Interface doc says "at least one" but implementation requires ALL endpoints.

The PassthroughCapable interface doc in types.go (line 86–88) states the method "determines whether at least one backend can accept the request in its native format." This implementation correctly requires all endpoints to support passthrough (line 170 comment confirms this), but the interface-level contract is misleading. Consider aligning the interface doc to say "all" so future implementers aren't surprised.

Otherwise the logic is sound — early-out on disabled config, empty endpoints, and per-endpoint check with debug logging.

🤖 Prompt for AI Agents
In `@internal/adapter/translator/anthropic/translator.go` around lines 154 - 185,
The PassthroughCapable interface doc is misleading (it says "at least one") but
Translator.CanPassthrough requires ALL endpoints to support Anthropic
passthrough; update the interface documentation in types.go to reflect the
actual contract (i.e., that CanPassthrough must return true only when all
candidate backends can accept the request natively) so implementers of
PassthroughCapable aren't confused—specifically edit the
PassthroughCapable/CanPassthrough doc comment to state "all endpoints" (or
equivalent) and reference Translator.CanPassthrough behavior for consistency.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@internal/app/handlers/handler_translation.go`:
- Around line 233-256: The fallback reason conflates "profile lookup not
configured" with "cannot passthrough"; update the branch that checks passthrough
to explicitly detect when a.profileLookup is nil and set mode =
constants.TranslatorModeTranslation and fallbackReason =
constants.FallbackReasonProfileLookupNotConfigured (or add that constant if it
doesn't exist), otherwise call passthroughTrans.CanPassthrough as before and use
FallbackReasonCannotPassthrough when CanPassthrough returns false; also add a
brief debug/processLogger entry noting profileLookup is nil so operators can
distinguish the cases.
🧹 Nitpick comments (2)
internal/adapter/translator/types.go (1)

147-163: ProfileLookup is tightly coupled to Anthropic — consider a more generic name if other translators will need passthrough.

GetAnthropicSupport is the only method, which works for now. If future translators (e.g. Gemini) also implement PassthroughCapable, this interface would need broadening. Worth noting as a future consideration, though fine for the current scope.

internal/adapter/stats/translator_collector.go (1)

89-122: GetStats snapshot is not point-in-time consistent across counters.

Since each counter in translatorData is an independent xsync.Counter, reading them sequentially means TotalRequests might not equal SuccessfulRequests + FailedRequests if a concurrent Record fires between reads. This is generally acceptable for observability endpoints, but worth documenting so consumers don't rely on strict invariants.

Suggested doc addition
 // GetStats returns aggregated statistics for all translators
+// Note: counters are read independently so the snapshot may not be perfectly
+// consistent under concurrent writes (e.g. total may briefly differ from
+// successful + failed). This is acceptable for observability purposes.
 func (tc *TranslatorCollector) GetStats() map[string]ports.TranslatorStats {

Comment on lines +233 to 256
// Determine mode and fallback reason
var mode constants.TranslatorMode
var fallbackReason constants.TranslatorFallbackReason

// Execute proxy request with appropriate response handling
// streaming vs non-streaming need different handling
var proxyErr error
if transformedReq.IsStreaming {
proxyErr = a.executeTranslatedStreamingRequest(ctx, w, r, endpoints, pr, trans)
// Check for passthrough capability
if passthroughTrans, ok := trans.(translator.PassthroughCapable); ok {
if a.profileLookup != nil && passthroughTrans.CanPassthrough(endpoints, a.profileLookup) {
// Passthrough mode -- bodyBytes goes directly to PreparePassthrough
// which validates without re-reading. No TransformRequest needed.
mode = constants.TranslatorModePassthrough
fallbackReason = constants.FallbackReasonNone

a.executePassthroughRequest(ctx, w, r, bodyBytes, endpoints, pr, trans)
a.recordTranslatorMetrics(trans, pr, mode, fallbackReason)
return
}
// Translation mode with fallback reason
mode = constants.TranslatorModeTranslation
fallbackReason = constants.FallbackReasonCannotPassthrough
} else {
proxyErr = a.executeTranslatedNonStreamingRequest(ctx, w, r, endpoints, pr, trans)
// Translator doesn't support passthrough
mode = constants.TranslatorModeTranslation
fallbackReason = constants.FallbackReasonTranslatorDoesNotSupportPassthrough
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Passthrough fallback reason may be inaccurate when profileLookup is nil.

When a.profileLookup is nil (Line 239), the code falls through to CannotPassthrough (Line 251). This conflates "profile lookup not configured" with "endpoints don't support passthrough". Consider using a distinct fallback reason or at least logging this case, as it may confuse operators debugging why passthrough isn't being used.

Suggested approach
 		if passthroughTrans, ok := trans.(translator.PassthroughCapable); ok {
 			if a.profileLookup != nil && passthroughTrans.CanPassthrough(endpoints, a.profileLookup) {
 				// Passthrough mode
 				mode = constants.TranslatorModePassthrough
 				fallbackReason = constants.FallbackReasonNone
 
 				a.executePassthroughRequest(ctx, w, r, bodyBytes, endpoints, pr, trans)
 				a.recordTranslatorMetrics(trans, pr, mode, fallbackReason)
 				return
 			}
-			// Translation mode with fallback reason
-			mode = constants.TranslatorModeTranslation
-			fallbackReason = constants.FallbackReasonCannotPassthrough
+			mode = constants.TranslatorModeTranslation
+			if a.profileLookup == nil {
+				pr.requestLogger.Debug("passthrough skipped: profile lookup not configured",
+					"translator", trans.Name())
+			}
+			fallbackReason = constants.FallbackReasonCannotPassthrough
 		} else {
🤖 Prompt for AI Agents
In `@internal/app/handlers/handler_translation.go` around lines 233 - 256, The
fallback reason conflates "profile lookup not configured" with "cannot
passthrough"; update the branch that checks passthrough to explicitly detect
when a.profileLookup is nil and set mode = constants.TranslatorModeTranslation
and fallbackReason = constants.FallbackReasonProfileLookupNotConfigured (or add
that constant if it doesn't exist), otherwise call
passthroughTrans.CanPassthrough as before and use
FallbackReasonCannotPassthrough when CanPassthrough returns false; also add a
brief debug/processLogger entry noting profileLookup is nil so operators can
distinguish the cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation enhancement New feature or request routing This issue is with routing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant