Skip to content

feat(anthropic): use official Anthropic endpoints/models with API+SDK routes#107

Draft
azkore wants to merge 19 commits intoSoju06:mainfrom
azkore:claude
Draft

feat(anthropic): use official Anthropic endpoints/models with API+SDK routes#107
azkore wants to merge 19 commits intoSoju06:mainfrom
azkore:claude

Conversation

@azkore
Copy link
Contributor

@azkore azkore commented Feb 26, 2026

What this PR does

This PR makes it possible to use official Anthropic endpoints and models through codex-lb, with two transport options:

  • POST /claude/v1/messages -> official Anthropic API (default)
  • POST /claude-sdk/v1/messages -> local Claude SDK/runtime

In simple terms: codex-lb can now use official Anthropic endpoints/models in two transport modes, and the default /claude route uses the official Anthropic API path.

How this differs from other open PRs

This PR is not a duplicate of current open Anthropic/Claude PRs:

  • #97 focuses on Claude-format compatibility over existing OpenAI/Codex flows.
  • #102 focuses on embeddings.

This PR focuses on using official Anthropic endpoints/models via dedicated Anthropic routes.

Main changes

  • Adds two Anthropic routes in one server:
    • API-first default on /claude/v1/messages
    • SDK route on /claude-sdk/v1/messages
  • Adds Anthropic credential import in the dashboard (/api/accounts/import-anthropic) with explicit email input.
  • Labels Anthropic accounts as claude/<email> while keeping existing Codex/OpenAI account behavior unchanged.
  • Adds diagnostics to help troubleshoot prompt size and token growth issues.
  • Improves Anthropic API auth recovery (retry refresh when token is revoked in known OAuth cases).
  • Prevents OpenAI usage refresh logic from polling Anthropic accounts.
  • Disables Anthropic SDK pooling by default for safety until stateful behavior is fully hardened.

Anthropic API vs Anthropic SDK

  • /claude/v1/messages (API route): direct call to api.anthropic.com/v1/messages with OAuth credentials, with optional CLI-header/system-prompt parity helpers.
  • /claude-sdk/v1/messages (SDK route): local claude-agent-sdk transport via Claude runtime session.

Attribution

This implementation is heavily inspired by ccproxy-api, especially:

The ideas were adapted to codex-lb architecture, routes, and dashboard/account flow.

Current limitations

  • SDK pooling is disabled by default for now (CODEX_LB_ANTHROPIC_SDK_POOL_ENABLED=false).
  • Anthropic account load balancing is not implemented yet; Anthropic traffic currently uses the default Anthropic account.
  • Anthropic "Add account with OAuth" (browser/device) is not implemented yet; import flow is available.

Pooling checklist before re-enable

  • strict client/session affinity
  • no pooling for ephemeral/default traffic
  • serialized access per pooled session client
  • robust broken-client reset/reconnect logic
  • regression tests for cross-session context leaks
  • pool observability (hit/miss/reuse/session binding)

Testing

  • Full backend test suite passes on this branch (uv run pytest).
  • Targeted frontend checks for Anthropic import flow were run.

Collaboration note

I am testing this branch now, but I may not have time to continue iterating quickly.
Anyone who wants to test, review, or continue work on this draft PR is very welcome.

Add a dedicated `/claude/v1/messages` Anthropic-compatible endpoint
backed by the local Claude SDK runtime while keeping existing OpenAI
routes active in the same server.

Wire Anthropic request logs, pricing, and usage windows into existing
dashboard metrics, including OAuth usage polling and corrected cached
input token accounting for Claude usage payloads.
Reduce Claude SDK connect/disconnect churn by reusing connected
clients keyed by request options. Add pool settings and close all
pooled clients during app shutdown to prevent leaked sessions.
Serve Anthropic-compatible messages over both the existing SDK-backed
/claude endpoint and a new direct /claude-api endpoint so clients can
choose runtime parity needs without changing servers.
Add OAuth refresh-aware API transport plumbing, configuration docs, and
integration coverage for both non-streaming and streaming paths.
Exclude only the generated Anthropic provider seed account from the
"already logged in" browser shortcut so OAuth can still fall back to
device flow when no real account is present.
Add integration coverage for both seed-account fallback and non-seed
anthropic_default account handling.
Add a dedicated Anthropic credential import path in the dashboard so
Claude OAuth JSON can be uploaded without restarts and used by the
/claude-api transport immediately.
Prefix Anthropic account display names with claude/ while leaving OpenAI
account handling unchanged, and add backend/frontend coverage.
Require Claude credential imports to provide an explicit account email
instead of inferring identity from credential payloads that often omit
user metadata.
Update the dashboard import dialog and API contract so claude/ labels use
user-provided email deterministically.
Add request-shape diagnostics for both Anthropic transports so payload
size, mutation, and usage data can be correlated during token blow-up
investigation.
Harden SDK pooling for stateful usage by keying pools by session id and
keeping ephemeral session traffic unpooled to avoid context bleed.
Retry Anthropic API requests after token refresh not only on 401, but
also on revoked-token 403 responses returned by OAuth-backed endpoints.
This preserves automatic recovery when access tokens are revoked but a
valid refresh token is still available.
Exclude Anthropic provider accounts from the OpenAI usage refresh loop
so Claude tokens are not sent to /backend-api/wham/usage and noisy 401
errors are avoided.
Keep Anthropic usage refresh on its dedicated scheduler path unchanged.
Turn off Anthropic SDK client pooling by default to avoid state bleed in
stateful conversation workloads until pooling behavior is fully hardened.

Before re-enabling pooling, fix all of the following:
- keep strict client affinity per explicit session id
- prevent client reuse for ephemeral/default session traffic
- serialize concurrent access per pooled session client
- detect/reset broken pooled clients after stream/query failures
- add regression coverage for cross-session context contamination
- add pool observability for hit/miss, reuse, and session binding
Switch the default Anthropic route to the direct OAuth API transport at
/claude/v1/messages and move the Claude SDK transport to
/claude-sdk/v1/messages.
Update docs, examples, and integration coverage so route semantics are
explicit and consistent for testing and rollout.
@gitguardian
Copy link

gitguardian bot commented Feb 26, 2026

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Fix strict type-checker failures in Anthropic diagnostics/usage paths so
Ruff and ty pass in CI.
Replace token-like test fixture literals with clearly fake values to
avoid secret scanner false positives.
@azkore
Copy link
Contributor Author

azkore commented Feb 26, 2026

the GitGuardian findings were from test fixture strings (not real credentials). I replaced the token-like test values and rewrote this branch so the previously flagged literals are no longer present in this PR history.

Apply Ruff formatter output to files flagged by CI format check so the
Lint (ruff) workflow passes on the rewritten branch history.
Stop prefixing Anthropic display names with claude/ so account labels
remain clean and consistent with existing email presentation.
Add provider badges, subtle Anthropic row tinting, and provider-grouped
sorting in account/request views for clearer visual separation.
Replace textual provider badges with compact provider icons and extend
Anthropic tinting to account cards/details for clearer visual separation.
Remove internal account-id disambiguation from account subtitles and use
provider labeling instead so user-facing lists stay clean.
Add focused tests for account provider helper branches so Vitest
coverage clears the global branch threshold again.

Also keep accounts mapper formatting aligned with Ruff to avoid
format-check failures in CI.
Align Claude pricing constants with current Anthropic pricing for
Opus 4.6/4.5/4.1, Sonnet 4.6/4.5, and Haiku 4.5.

Also expand alias coverage for hyphenated and provider-prefixed
model names so cost estimates resolve to the correct tier.
codex-lb logs and dashboard views are OpenAI-focused, where input tokens
represent full request context for comparison and cost views.

Anthropic reports input as separate parts (base input, cache creation,
cache read). Log Anthropic input as the sum of those parts so Anthropic
rows follow the same convention and are comparable in the UI.

Update SDK and API usage parsing tests for the new accounting rule.
@joeblack2k
Copy link
Contributor

Planned execution split is tracked in #114 with sub-tracks #115 #116 #117 #118.

Operational gate for this workstream:

  • canary-only validation first
  • no main/live promotion without explicit approval in-thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants