Add full Anthropic router replay token handling #928
),
)
else:
routed_experts = None
Routed-experts decoding duplicated across two client files
Low Severity
The routed_experts decoding block — walrus-operator check, base64.b85decode, np.frombuffer with int32, .reshape, .tolist() — is copy-pasted verbatim from openai_chat_completions_client.py into anthropic_messages_client.py. Extracting this into a shared utility (e.g. in client_utils) would eliminate the duplication and ensure future fixes apply to both paths.
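One way the suggested extraction could look, as a rough sketch only. It assumes the payload is a dict with a base85-encoded data string and a shape list, which is what the review comment describes. The function name decode_routed_experts and its home in client_utils are hypothetical and do not come from the PR.

```python
import base64
from typing import Any, Optional

import numpy as np


def decode_routed_experts(payload: Any) -> Optional[list]:
    """Decode a router-replay payload into nested lists of expert ids.

    Hypothetical shared helper: assumes `payload` looks like
    {"data": <base85 string>, "shape": [d0, d1, ...]}.
    Returns None when the payload is absent or empty.
    """
    if not payload:
        return None
    raw = base64.b85decode(payload["data"])
    return np.frombuffer(raw, dtype=np.int32).reshape(payload["shape"]).tolist()


# Round trip with a small synthetic payload (not real server output).
experts = np.arange(6, dtype=np.int32).reshape(2, 3)
payload = {
    "data": base64.b85encode(experts.tobytes()).decode("ascii"),
    "shape": list(experts.shape),
}
decoded = decode_routed_experts(payload)
```

Both client files could then call the one helper, so a fix to the decoding logic would land in a single place.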
Let's make ANTHROPIC_MAX_TOKENS a global constant. When the Anthropic client is initialized, we should log the default and mention that the value is required.
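A rough sketch of what this request could look like. The class AnthropicClientSketch and the method resolve_max_tokens are hypothetical stand-ins rather than the real client, and 32768 is the default named in the PR overview.

```python
import logging

logger = logging.getLogger(__name__)

# The Anthropic Messages API requires max_tokens, so the client needs a fallback.
# 32768 is the default value named in the PR overview.
ANTHROPIC_MAX_TOKENS = 32768


class AnthropicClientSketch:
    """Hypothetical stand-in for the real client; only the init logging is shown."""

    def __init__(self) -> None:
        # Log once at init so users know the value is required and what the default is.
        logger.info(
            "Anthropic requires max_tokens; defaulting to %d when sampling_args omits it.",
            ANTHROPIC_MAX_TOKENS,
        )

    def resolve_max_tokens(self, sampling_args: dict) -> int:
        """Return the caller's max_tokens, or the module-level default if unset."""
        value = sampling_args.get("max_tokens")
        return ANTHROPIC_MAX_TOKENS if value is None else value
```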
…lt at init; use constant for fallback max_tokens Co-authored-by: will brown <willccbb@users.noreply.github.com>
Cursor Bugbot has reviewed your changes and found 1 potential issue.
.reshape(routed_experts["shape"])
.tolist()
),
)
Malformed routed_experts crashes entire response parsing
Medium Severity
The parse_tokens function gracefully returns None when prompt_token_ids, token_ids, or logprobs are missing or invalid, but the routed_experts decoding block (base64.b85decode + np.frombuffer + .reshape) has no try-except. If the server returns a routed_experts dict with valid data and shape keys but malformed content (e.g. corrupt base85 or shape mismatch), an unhandled ValueError propagates out of from_native_response, causing the entire response — including valid text content — to be lost as a ModelError. Wrapping the decode in a try-except and falling back to routed_experts = None would be consistent with the rest of the function's defensive design.
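The fallback suggested above could be sketched roughly as follows. The function name decode_routed_experts_or_none is hypothetical, and the payload layout (a dict with data and shape keys) is assumed from the review comment rather than taken from the PR's code.

```python
import base64
from typing import Any, Optional

import numpy as np


def decode_routed_experts_or_none(payload: Any) -> Optional[list]:
    """Decode routed_experts, returning None instead of raising on bad input."""
    if not isinstance(payload, dict):
        return None
    try:
        raw = base64.b85decode(payload["data"])
        return np.frombuffer(raw, dtype=np.int32).reshape(payload["shape"]).tolist()
    except (KeyError, TypeError, ValueError):
        # Missing keys, corrupt base85, a byte count that is not a multiple
        # of 4, or a shape that does not match the element count all land here.
        return None


# Synthetic payload whose shape does not match its data: 4 values, 3x3 requested.
data = base64.b85encode(np.arange(4, dtype=np.int32).tobytes()).decode("ascii")
mismatched = decode_routed_experts_or_none({"data": data, "shape": [3, 3]})
```

With this shape of fallback, a malformed routed_experts payload would cost only the router-replay data, and the text content of the response would still be returned.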




Motivation
- … /v1/messages path) to return token-level outputs and router-replay payloads and have the client surface them in the same shape used for OpenAI chat completions.
- … routed_experts) through sampling_args without provider-specific branching by forwarding unknown sampling args into extra_body.
Description
- … AnthropicMessagesClient.from_native_response: prompt_token_ids, token_ids, and logprobs are validated and converted into ResponseTokens when present.
- … routed_experts decoding (base85 -> int32 -> reshape -> list) and attached decoded routed_experts to ResponseTokens when available.
- … parse_completion_logprobs and parse_tokens helpers and wired tokens=parse_tokens(response) into ResponseMessage while preserving tokens=None when fields are incomplete.
- … get_native_response to validate extra_body is a mapping, fall back to a default max_tokens when missing, and move unknown Anthropic args into sampling_args["extra_body"] so router replay payloads are forwarded unchanged.
- … base64, numpy) and new unit tests covering request forwarding, routed-expert decoding, token extraction, and the negative case when logprobs are missing.
Testing
- … uv run ruff check --fix verifiers/clients/anthropic_messages_client.py tests/test_client_multimodal_types.py and they passed.
- … uv run pytest tests/test_client_multimodal_types.py tests/test_client_auth_errors.py and all tests passed (20 passed).
- … uv run pre-commit run --all-files; hooks initially reformatted a file and on final run the pre-commit checks passed.
Codex Task
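The "tokens=None when fields are incomplete" behavior from the description could be sketched roughly as below. This is not the PR's parse_tokens: TokensSketch is a hypothetical stand-in for ResponseTokens, the response is assumed to be a plain dict, and treating logprobs as a flat list with one float per completion token is an assumption about the wire format.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TokensSketch:
    """Hypothetical stand-in for ResponseTokens; real field names may differ."""

    prompt_ids: list
    completion_ids: list
    completion_logprobs: list
    routed_experts: Optional[list] = None


def _is_int_list(value: Any) -> bool:
    """True for a list whose items are all ints (bools excluded)."""
    return isinstance(value, list) and all(
        isinstance(v, int) and not isinstance(v, bool) for v in value
    )


def parse_tokens_sketch(response: dict) -> Optional[TokensSketch]:
    """Return token data only when all three fields are present and consistent."""
    prompt_ids = response.get("prompt_token_ids")
    token_ids = response.get("token_ids")
    logprobs = response.get("logprobs")
    if not (_is_int_list(prompt_ids) and _is_int_list(token_ids)):
        return None
    # Assumed invariant: one logprob per completion token.
    if not isinstance(logprobs, list) or len(logprobs) != len(token_ids):
        return None
    if not all(isinstance(lp, (int, float)) for lp in logprobs):
        return None
    return TokensSketch(prompt_ids, token_ids, [float(lp) for lp in logprobs])


complete = parse_tokens_sketch(
    {"prompt_token_ids": [1, 2], "token_ids": [3, 4], "logprobs": [-0.1, -0.5]}
)
missing_logprobs = parse_tokens_sketch({"prompt_token_ids": [1, 2], "token_ids": [3, 4]})
```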
Note
Medium Risk
Touches provider request/response translation and adds numpy/base85 decoding, which could affect Anthropic message calls and token accounting if parsing or arg-normalization has edge cases.
Overview
Adds token-level output support to AnthropicMessagesClient.from_native_response, populating ResponseMessage.tokens from prompt_token_ids, token_ids, and logprobs, and decoding optional router-replay routed_experts payloads (base85/int32/reshape) into the shared ResponseTokens shape.
Updates AnthropicMessagesClient.get_native_response to default max_tokens to 32768 when omitted and to forward unknown sampling args (e.g. routed_experts) via extra_body, with validation that extra_body is a mapping; adds focused unit tests covering forwarding behavior, defaulting, token extraction, routed-expert decoding, and the missing-logprobs fallback.
Written by Cursor Bugbot for commit 34bd2ca. This will update automatically on new commits.
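The argument normalization summarized above could be sketched roughly as follows. The function name normalize_sampling_args and the KNOWN_ANTHROPIC_ARGS set are hypothetical; the set is an assumed, incomplete list of Messages API parameters. Letting a top-level argument overwrite a same-named extra_body key is also an assumption, not confirmed behavior of the PR.

```python
from collections.abc import Mapping
from typing import Any

ANTHROPIC_MAX_TOKENS = 32768  # default named in the overview

# Assumed, incomplete set of arguments passed to the Messages API directly.
KNOWN_ANTHROPIC_ARGS = {
    "max_tokens", "temperature", "top_p", "top_k", "stop_sequences", "extra_body",
}


def normalize_sampling_args(sampling_args: Mapping[str, Any]) -> dict:
    """Default max_tokens and move unknown args into extra_body."""
    args = dict(sampling_args)

    # Validate extra_body before merging anything into it.
    extra_body = args.pop("extra_body", None)
    if extra_body is None:
        extra_body = {}
    elif not isinstance(extra_body, Mapping):
        raise TypeError("extra_body must be a mapping")
    extra_body = dict(extra_body)

    if args.get("max_tokens") is None:
        args["max_tokens"] = ANTHROPIC_MAX_TOKENS

    # Anything the API does not accept directly is forwarded unchanged.
    for key in list(args):
        if key not in KNOWN_ANTHROPIC_ARGS:
            extra_body[key] = args.pop(key)

    if extra_body:
        args["extra_body"] = extra_body
    return args


normalized = normalize_sampling_args(
    {"temperature": 0.5, "routed_experts": {"data": "x", "shape": [1]}}
)
```

Forwarding through extra_body keeps callers free of provider-specific branching: the same sampling_args dict can carry router-replay fields regardless of which client handles the request.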