feat(claude): opt-in entrypoint spoof for Fast Mode passthrough #3495
NomenAK wants to merge 3 commits into
This pull request targeted The base branch has been automatically changed to |
Code Review
This pull request introduces a 'Fast Mode spoofing' feature for Claude clients, allowing the proxy to rewrite entrypoints and strip SDK-origin markers. The changes include new configuration flags, executor logic for header and body modification, and utility functions for SDK detection with accompanying tests. The reviewer recommends restricting these modifications to third-party clients to maintain official client behavior, using a non-pointer boolean for the configuration flag to simplify the code, and optimizing JSON processing by removing redundant existence checks.
    // Apply cloaking (system prompt injection, fake user ID, sensitive word obfuscation)
    // based on client type and configuration.
    body = applyCloaking(ctx, e.cfg, auth, body, baseModel, apiKey)
    body = stripSDKOriginMarkers(e.cfg, body)
This call is currently executed for all requests, including those from first-party Claude Code clients. This contradicts the goal of ensuring zero behavior change for official clients when the spoofing flag is enabled. Moving this logic inside applyCloaking (after its ShouldCloak guard) ensures that origin markers are only stripped for third-party clients that are being cloaked.
    if cfg != nil && cfg.ClaudeFastModeSpoof != nil && *cfg.ClaudeFastModeSpoof {
        r.Header.Set("X-Claude-Code-Session-Id", helps.CachedSessionID(apiKey))
    }
Forcing the session ID should be restricted to clients that are being cloaked. Official clients manage their own session IDs, and overriding them could lead to unexpected behavior or detection. Additionally, if ClaudeFastModeSpoof is changed to a bool type, the nil check can be removed.
    - if cfg != nil && cfg.ClaudeFastModeSpoof != nil && *cfg.ClaudeFastModeSpoof {
    -     r.Header.Set("X-Claude-Code-Session-Id", helps.CachedSessionID(apiKey))
    - }
    + if cfg != nil && cfg.ClaudeFastModeSpoof && !helps.IsClaudeCodeClient(ginHeaders.Get("User-Agent")) {
    +     r.Header.Set("X-Claude-Code-Session-Id", helps.CachedSessionID(apiKey))
    + }
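To make the suggested guard concrete, here is a minimal sketch of forcing the session ID only for requests that are not first-party. It is an illustration rather than the PR's code: isClaudeCodeClient and forceSessionID are hypothetical stand-ins, and the claude-cli prefix check is an assumption about how helps.IsClaudeCodeClient might classify a User-Agent.

```go
package main

import (
	"net/http"
	"strings"
)

// isClaudeCodeClient is a hypothetical stand-in for helps.IsClaudeCodeClient.
// Assumption: first-party clients send a User-Agent starting with "claude-cli".
func isClaudeCodeClient(userAgent string) bool {
	return strings.HasPrefix(userAgent, "claude-cli")
}

// forceSessionID overwrites the outbound session header only when the spoof
// flag is on AND the inbound client is not a first-party Claude Code client.
func forceSessionID(out http.Header, spoofOn bool, inboundUA, cachedID string) {
	if spoofOn && !isClaudeCodeClient(inboundUA) {
		out.Set("X-Claude-Code-Session-Id", cachedID)
	}
}
```

With this shape, a first-party client keeps whatever session ID it sent, and enabling the flag alone no longer overwrites it.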
    // resolves to an SDK-shaped entrypoint, applyCloaking substitutes a TTY-style
    // value and strips SDK-origin markers from the request body. Disabled by
    // default so upstream client telemetry is preserved.
    ClaudeFastModeSpoof *bool `yaml:"claude-fast-mode-spoof,omitempty" json:"claude-fast-mode-spoof,omitempty"`
Using a pointer for this boolean flag is inconsistent with other feature toggles in this struct (e.g., CommercialMode, UsageStatisticsEnabled) and necessitates redundant nil checks at every call site. Since the intended default is false, a value type bool is more idiomatic and simplifies the code.
    - ClaudeFastModeSpoof *bool `yaml:"claude-fast-mode-spoof,omitempty" json:"claude-fast-mode-spoof,omitempty"`
    + ClaudeFastModeSpoof bool `yaml:"claude-fast-mode-spoof,omitempty" json:"claude-fast-mode-spoof,omitempty"`
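The effect of the value-type flag on call sites can be sketched as follows. Config here is a hypothetical, trimmed-down stand-in for the proxy's root config, and fastModeSpoofEnabled is an illustrative helper name that does not appear in the PR.

```go
package main

// Config is a hypothetical stand-in holding only the flag under discussion.
type Config struct {
	// Value-type flag: the zero value (false) doubles as the "off" default.
	ClaudeFastModeSpoof bool `yaml:"claude-fast-mode-spoof,omitempty" json:"claude-fast-mode-spoof,omitempty"`
}

// fastModeSpoofEnabled needs a nil check on the config pointer only; the flag
// itself requires no second nil check and no dereference.
func fastModeSpoofEnabled(cfg *Config) bool {
	return cfg != nil && cfg.ClaudeFastModeSpoof
}
```

The trade-off is that a plain bool cannot distinguish "unset" from "explicitly false". Since the intended default is false, that distinction does not appear to be needed here.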
    // No-op unless cfg.ClaudeFastModeSpoof is enabled. Sibling to applyCloaking;
    // kept separate so the strip applies regardless of which cloaking branch ran.
    func stripSDKOriginMarkers(cfg *config.Config, body []byte) []byte {
        if cfg == nil || cfg.ClaudeFastModeSpoof == nil || !*cfg.ClaudeFastModeSpoof {
    if gjson.GetBytes(body, p).Exists() {
        if updated, err := sjson.DeleteBytes(body, p); err == nil {
            body = updated
        }
    }
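The review summary calls the existence check before each delete redundant. As a rough illustration of a strip that needs no such check, the sketch below removes the four markers named in the PR using only the standard library. This deliberately swaps gjson/sjson for encoding/json, so it is not the PR's implementation; stripMarkers and sdkOriginPaths are hypothetical names.

```go
package main

import "encoding/json"

// sdkOriginPaths lists the four markers named in the PR description. Each
// entry is a key path: one element for a top-level key, two for a nested one.
var sdkOriginPaths = [][]string{
	{"source"},
	{"metadata", "source"},
	{"client_source"},
	{"metadata", "client"},
}

// stripMarkers deletes the markers from a JSON object body. Go's delete is a
// no-op for absent keys, so no existence check is needed before each removal.
// Bodies that are not JSON objects are returned unchanged.
func stripMarkers(body []byte) []byte {
	var doc map[string]any
	if err := json.Unmarshal(body, &doc); err != nil || doc == nil {
		return body
	}
	for _, path := range sdkOriginPaths {
		parent, ok := doc, true
		// Walk down to the object that holds the final key.
		for _, key := range path[:len(path)-1] {
			if parent, ok = parent[key].(map[string]any); !ok {
				break
			}
		}
		if ok {
			delete(parent, path[len(path)-1])
		}
	}
	out, err := json.Marshal(doc)
	if err != nil {
		return body
	}
	return out
}
```

Unlike an in-place byte edit, re-encoding through a map sorts object keys and decodes numbers as float64, so this sketch is only suitable where key order and exact number formatting do not matter.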
    // Fast Mode spoof: rewrite SDK-shaped entrypoints to a TTY-style value
    // when the upstream client leaked sdk-* (Anthropic gates Fast Mode on
    // entrypoint). Opt-in via config.claude-fast-mode-spoof; default off.
    if cfg != nil && cfg.ClaudeFastModeSpoof != nil && *cfg.ClaudeFastModeSpoof {
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1fb5aea590
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
    if cfg != nil && cfg.ClaudeFastModeSpoof != nil && *cfg.ClaudeFastModeSpoof {
        if util.IsSDKEntrypoint(entrypoint) {
            if forced := strings.TrimSpace(cfg.ClaudeFastModeSpoofEntrypoint); forced != "" {
Move fast-mode entrypoint spoof outside cloaking gate
This spoof block only runs after applyCloaking passes helps.ShouldCloak, but in default cloak.mode: auto any User-Agent starting with claude-cli returns early and never reaches this logic. That means enabling claude-fast-mode-spoof does not rewrite cc_entrypoint=sdk-* for Claude SDK-mode clients using claude-cli/... (external, sdk-cli) unless operators also change cloak mode to always, so the new flag is ineffective in the common default configuration.
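The interaction described in this comment can be modelled in a few lines. The sketch is an assumption-laden toy: shouldCloak is a hypothetical stand-in for helps.ShouldCloak that models only the two modes mentioned here (auto and always), and spoofRuns is an illustrative name for "the rewrite nested inside the cloaking gate executes".

```go
package main

import "strings"

// shouldCloak is a hypothetical model of the cloaking gate. Assumption: in
// "auto" mode any User-Agent starting with "claude-cli" is treated as
// first-party and skipped; "always" cloaks every request.
func shouldCloak(mode, userAgent string) bool {
	if mode == "always" {
		return true
	}
	return !strings.HasPrefix(userAgent, "claude-cli")
}

// spoofRuns reports whether an entrypoint rewrite that lives inside the
// cloaking gate would be reached for this request.
func spoofRuns(flagOn bool, mode, userAgent string) bool {
	return flagOn && shouldCloak(mode, userAgent)
}
```

Under these assumptions, spoofRuns(true, "auto", "claude-cli/1.0.0 (external, sdk-cli)") is false: the flag is on, yet the SDK-mode client is never rewritten unless the mode is switched to always.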
1fb5aea to dfa26e5
Mirrors dev every 15 min via cron. main is our customized branch.
claude-code stamps an SDK-shaped entrypoint (sdk-cli, sdk-py, sdk-ts) into
the x-anthropic-billing-header attribution block when invoked via --print
or via the Anthropic SDK. Anthropic's server appears to gate Fast Mode
(speed:"fast") on that marker and rejects requests whose entrypoint is
SDK-shaped, so SDK-mode clients cannot opt into Fast Mode through CPA
today.
This change introduces an opt-in flag, claude-fast-mode-spoof, that lets
the operator rewrite the entrypoint to a TTY-style value (default "cli",
override via claude-fast-mode-spoof-entrypoint) when the upstream client
leaks an SDK shape. When the flag is enabled the patch also:
- strips four SDK-origin markers Anthropic may use for additional
Fast Mode gating: top-level source, metadata.source, client_source,
and metadata.client (defense-in-depth, gjson/sjson based)
- forces X-Claude-Code-Session-Id to CPA's deterministic cached value
so a leaking SDK client cannot override it via gin-forwarded headers
The flag is **off by default**: no behavior change for existing users,
upstream telemetry preserved. When off, parseEntrypointFromUA continues
to feed the original value into the attribution block, matching today's
cloaking output exactly. When on, the rewrite is scoped to the existing
applyCloaking path, which already short-circuits for first-party Claude
Code user-agents via ShouldCloak, so the patch only affects requests
where cloaking already runs.
A small helper IsSDKEntrypoint is added in internal/util/ next to the
existing claude_attribution.go (covered by claude_attribution_test.go).
The body-strip lives next to applyCloaking as a sibling function so the
strip runs regardless of which cloaking branch executed.
Scope:
- Anthropic OAuth/CLI executor only; other providers (Codex, Gemini,
Kimi, XAI, etc.) untouched
- applyCloaking short-circuit unchanged: real Claude Code clients
still send their original UA-derived entrypoint
- No change to ClaudeKey, CloakConfig, or any per-credential fields
Caveat: this addresses only the entrypoint gate. Anthropic may still
reject Fast Mode based on subscription tier or credit balance, which
this patch cannot influence.
dfa26e5 to 6ac0d7c
This extends the opt-in Claude Fast Mode spoof (gated by
claude-fast-mode-spoof) with a second activation path: when the inbound
request carries header "X-CPA-Force-Fast-Mode" set to a truthy value
("1", "true", "yes", "on"; case-insensitive), applyCloaking forces the
entrypoint rewrite regardless of the inbound User-Agent shape.
Rationale: a downstream gateway often wants to control Fast Mode opt-in
per-request (per-account, per-model) without depending on the upstream
client's UA fingerprint. Today the SDK-shaped detection is the only
trigger; this adds a private contract a gateway can stamp on a request
to force the same spoof bundle even when the UA is TTY-shaped.
Master switch is unchanged: when claude-fast-mode-spoof is off, the
header has no effect (defense-in-depth: the operator still owns the
binary opt-in). When the flag is on:
- UA-based detection (existing path) still triggers the rewrite for
sdk-cli / sdk-py / sdk-ts and other "-sdk" shapes.
- The new force-header path triggers the rewrite for any UA shape.
- Body-strip and X-Claude-Code-Session-Id forge run unchanged on the
flag alone, so the full spoof bundle ships in both activation paths.
Critically, the header is stripped from the outbound request before
forwarding upstream. The strip is unconditional (independent of the
flag) so the header cannot leak to Anthropic even if a future code
path or misconfiguration somehow propagates the inbound header onto
the outbound request. A test confirms the strip with both the flag on
and the flag off.
Scope:
- Anthropic OAuth/CLI executor only; other providers untouched.
- applyCloaking short-circuit unchanged: real Claude Code clients
are still gated by ShouldCloak first.
- No new config field; the header reuses the existing
claude-fast-mode-spoof master switch.
Helpers:
- util.IsForceFastModeHeader(http.Header) — stateless predicate,
next to IsSDKEntrypoint in claude_attribution.go.
- isForceFastModeFromContext(ctx) — gin-context accessor, sibling
to getWorkloadFromContext in claude_executor.go.
Tests:
- TestIsForceFastModeHeader: 18 cases covering truthy/falsy values,
case-insensitivity, whitespace, nil header, canonical lookup.
- TestIsForceFastModeFromContext: 9 cases covering the gin-context
plumbing.
- TestApplyClaudeHeaders_StripsForceFastModeHeader: confirms the
outbound header is stripped when the flag is on.
- TestApplyClaudeHeaders_StripsForceFastModeHeaderEvenWhenFlagOff:
confirms the strip is unconditional.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 30b8718fb7
    // Fast Mode spoof; Anthropic must never see it. Deletion is unconditional
    // (regardless of cfg.ClaudeFastModeSpoof) so the header cannot leak even
    // when the flag is off.
    r.Header.Del("X-CPA-Force-Fast-Mode")
Re-strip private fast-mode header after custom auth headers
applyClaudeHeaders deletes X-CPA-Force-Fast-Mode early, but util.ApplyCustomHeadersFromAttrs runs later and can set arbitrary header:* keys from auth attributes, which can re-introduce this private control header on the outbound Anthropic request. In environments that use per-auth custom headers, this breaks the commit’s stated invariant that Anthropic must never receive X-CPA-Force-Fast-Mode; add a final Del after custom-header application (similar to the stream Accept-Encoding re-enforcement).
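A minimal sketch of the ordering this comment asks for is shown below. applyCustomHeaders is a hypothetical stand-in for util.ApplyCustomHeadersFromAttrs (assumed here to copy every "header:<Name>" attribute onto the request), and finalizeHeaders is an illustrative name, not a function from the PR.

```go
package main

import (
	"net/http"
	"strings"
)

const forceFastModeHeader = "X-CPA-Force-Fast-Mode"

// applyCustomHeaders is a hypothetical model of per-auth custom headers:
// every attribute keyed "header:<Name>" is set on the outbound header map.
func applyCustomHeaders(h http.Header, attrs map[string]string) {
	for key, value := range attrs {
		if strings.HasPrefix(key, "header:") {
			h.Set(strings.TrimPrefix(key, "header:"), value)
		}
	}
}

// finalizeHeaders strips the private control header, applies custom headers,
// then strips again so a custom attribute cannot re-introduce it.
func finalizeHeaders(h http.Header, attrs map[string]string) {
	h.Del(forceFastModeHeader)
	applyCustomHeaders(h, attrs)
	h.Del(forceFastModeHeader) // final re-strip after custom headers
}
```

Because http.Header canonicalizes names in both Set and Del, the second delete also catches differently cased attribute keys such as header:x-cpa-force-fast-mode.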
Summary
Add an opt-in flag (claude-fast-mode-spoof) so the existing applyCloaking path rewrites SDK-shaped
cc_entrypoint markers to a TTY-style value before they reach Anthropic. Off by default, scoped to the
Claude OAuth/CLI executor, zero behavior change when not enabled.
Motivation
Anthropic's official claude-code binary, when invoked via --print or via
the Anthropic SDK, stamps an SDK-shaped entrypoint (sdk-cli, sdk-py, sdk-ts)
into the x-anthropic-billing-header attribution block (the text block CPA
already builds at system[0]). The upstream server seems to gate Fast Mode
(speed:"fast") on that marker, so SDK-mode CC clients cannot opt into Fast
Mode through CPA today even on accounts that have the feature.
CPA already controls the entrypoint value via parseEntrypointFromUA and
already cloaks the request shape for non-CC clients; this just lets the
operator extend that cloaking to cover the SDK-vs-TTY distinction.
Behavior
- Flag off (default): parseEntrypointFromUA continues to forward the UA-derived entrypoint unchanged. No diff in produced wire payload.
- claude-fast-mode-spoof: true (opt-in):
  - When util.IsSDKEntrypoint(entrypoint) is true (HasPrefix sdk, contains -sdk, or equals external-sdk), substitute the entrypoint. Substitution value is cli by default, overridable via claude-fast-mode-spoof-entrypoint.
  - Strip source, metadata.source, client_source, metadata.client from the request body (defense-in-depth against any additional Fast Mode gating).
  - Force X-Claude-Code-Session-Id to the deterministic cached value so a leaking SDK client cannot override CPA's session id via the gin-forwarded header set.
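The entrypoint rule in the list above can be sketched as follows. This is a re-implementation from the prose, not the PR's util.IsSDKEntrypoint: isSDKEntrypoint and rewriteEntrypoint are hypothetical names, and the matching is assumed to be case-sensitive because the description does not mention any normalization.

```go
package main

import "strings"

// isSDKEntrypoint applies the three conditions listed in the description.
// Note that "external-sdk" already contains "-sdk"; the explicit equality
// check is kept only to mirror the wording.
func isSDKEntrypoint(entrypoint string) bool {
	return strings.HasPrefix(entrypoint, "sdk") ||
		strings.Contains(entrypoint, "-sdk") ||
		entrypoint == "external-sdk"
}

// rewriteEntrypoint returns the value to stamp into the attribution block.
// With the flag off, or for a non-SDK entrypoint, the input is forwarded
// unchanged; otherwise the override wins, falling back to "cli".
func rewriteEntrypoint(entrypoint string, spoofOn bool, override string) string {
	if !spoofOn || !isSDKEntrypoint(entrypoint) {
		return entrypoint
	}
	if forced := strings.TrimSpace(override); forced != "" {
		return forced
	}
	return "cli"
}
```

For example, rewriteEntrypoint("sdk-py", true, "") yields "cli", while the same call with the flag off returns "sdk-py" untouched, which matches the "no diff in produced wire payload" guarantee for the default configuration.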
Scope
- applyCloaking still short-circuits via helps.ShouldCloak for first-party Claude Code UAs, so real CC clients still send their original entrypoint.
- No change to ClaudeKey, CloakConfig, or any per-credential field; the flag lives on the root Config alongside ClaudeHeaderDefaults.
- util.IsSDKEntrypoint added next to existing claude_attribution.go; stripSDKOriginMarkers lives next to applyCloaking in claude_executor.go.
Files touched (+90 LOC, 0 deletions)
- internal/config/config.go
- internal/runtime/executor/claude_executor.go (applyCloaking, stripSDKOriginMarkers helper + two call sites, session-id force in applyClaudeHeaders)
- internal/util/claude_attribution.go (IsSDKEntrypoint)
- internal/util/claude_attribution_test.go
Test plan
- go build ./... is clean
- go vet ./... is clean
- go test ./internal/util/... passes, including 12 new IsSDKEntrypoint cases
- go test ./internal/config/... passes (cached fields parse fine through existing YAML round-trip tests)
- go test ./internal/runtime/executor/... shows only pre-existing antigravity test failures (TestEnsureAccessToken_WarmTokenLoadsCreditsHint, TestUpdateAntigravityCreditsBalance_LoadCodeAssistUserAgent), unrelated to this patch and reproducible on clean main at eb7e1370
- claude-code against a CPA instance: with the flag off, the server returns the existing Fast Mode rejection; with the flag on plus a non-zero Fast Mode credit balance, the request reaches Anthropic and Org fast mode: enabled flows back in the SSE.
Caveats
This addresses only the entrypoint gate. Anthropic may still reject Fast
Mode based on subscription tier, regional rollout, or the user's remaining
Fast Mode credit balance — none of which this patch can influence. I've kept
the flag opt-in for that reason: operators who don't expect their users to
hit Fast Mode shouldn't have to think about it.
Happy to break the body-strip and the session-id force out into separate
sub-flags if the granularity feels off — they're behaviorally independent
but I bundled them under one toggle to keep the surface small.
Update (header-based trigger added)
A second activation path was added in a follow-up commit: the inbound
header X-CPA-Force-Fast-Mode: 1 (case-insensitive, also accepts
true/yes/on) forces the same spoof bundle regardless of the inbound
User-Agent shape. The header is stripped from the outbound request before
forwarding upstream so Anthropic never sees it; the strip is unconditional
(runs even when the master flag is off) for defense-in-depth.
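As a rough sketch of the header contract just described, the predicate below accepts the four truthy values case-insensitively and combines the result with the master switch. It is written from the prose, not copied from util.IsForceFastModeHeader; both function names are hypothetical, and trimming surrounding whitespace is an assumption.

```go
package main

import (
	"net/http"
	"strings"
)

// isForceFastModeHeader reports whether the inbound headers carry a truthy
// X-CPA-Force-Fast-Mode value. http.Header.Get canonicalizes the name, so
// the lookup does not depend on how the caller cased it when setting it.
func isForceFastModeHeader(h http.Header) bool {
	if h == nil {
		return false
	}
	// Assumption: surrounding whitespace is ignored.
	switch strings.ToLower(strings.TrimSpace(h.Get("X-CPA-Force-Fast-Mode"))) {
	case "1", "true", "yes", "on":
		return true
	}
	return false
}

// forceFastMode applies the master switch: the header alone has no effect
// unless the operator enabled claude-fast-mode-spoof.
func forceFastMode(flagOn bool, h http.Header) bool {
	return flagOn && isForceFastModeHeader(h)
}
```

Keeping the predicate stateless and the flag check separate means the same header parsing can be tested without constructing any config.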
This lets a downstream gateway opt requests in per-request (per-account,
per-model) even when the UA itself looks TTY-shaped. The master switch
(claude-fast-mode-spoof) is unchanged: the header alone has no effect
when the flag is off, so the operator still owns the binary opt-in.
Helpers and tests added:
- util.IsForceFastModeHeader(http.Header): stateless predicate, next to IsSDKEntrypoint in claude_attribution.go.
- isForceFastModeFromContext(ctx): gin-context accessor, sibling to getWorkloadFromContext in claude_executor.go.
- TestIsForceFastModeHeader (18 subtests), TestIsForceFastModeFromContext (9 subtests), TestApplyClaudeHeaders_StripsForceFastModeHeader, TestApplyClaudeHeaders_StripsForceFastModeHeaderEvenWhenFlagOff.
Files touched in the follow-up commit (+198 LOC, -2 LOC):
- internal/runtime/executor/claude_executor.go
- internal/runtime/executor/claude_executor_test.go
- internal/util/claude_attribution.go
- internal/util/claude_attribution_test.go
🤖 Generated with Claude Code