fix(ai): attach user media to Anthropic requests #1238
Merged
AnthropicChatModel.convertMessage() built the USER message from userMsg.getText() only and silently dropped userMsg.getMedia(), so images passed via the media input path never reached Claude: the model received a text-only message and hallucinated. The OpenAI provider already forwards media; Anthropic did not.

Convert user media into Anthropic image content blocks (base64 source) alongside the text, mirroring OpenAIResponsesChatModel. Adds a ContentBlock.image(mediaType, base64Data) factory (the image block type and Source were already modeled, just unused).

Adds AnthropicChatModelMediaTest, which captures the outgoing MessagesRequest and asserts the user message carries an image content block with the verbatim base64 payload. Verified it fails before the fix (media dropped -> bare string content) and passes after.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
kowser-orkes approved these changes on Jul 2, 2026
Problem

Images passed as input via the `media` path never reached Anthropic (Claude) models. The model received a text-only message and hallucinated a response instead of reading the image.

The pipeline was correct up to the provider conversion: media is stored on the `ChatMessage`, and `LLMHelper.getMessage()` downloads the URL to bytes and builds a Spring AI `UserMessage` with media. But `AnthropicChatModel.convertMessage()` built the user message from `userMsg.getText()` only and silently dropped `userMsg.getMedia()`.

The OpenAI provider (`OpenAIResponsesChatModel.convertMessage()`) already forwards media as image content parts; the Anthropic converter never got that treatment. The Anthropic `ContentBlock` API already modeled `type="image"` with a base64 `Source`; the capability existed, it just wasn't wired up.

Fix
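To make the conversion concrete, here is a minimal, self-contained sketch of turning a user message's text and media into content blocks. It is not the code from this PR: `Media`, `Source`, `ContentBlock`, and `toContentBlocks` are hypothetical stand-ins (records need Java 16+), and skipping an empty text block is an assumption of the sketch rather than something the PR states.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

class AnthropicMediaSketch {

    // Hypothetical stand-ins for the real types; names and shapes are assumptions.
    record Media(String mimeType, byte[] data) {}

    record Source(String type, String mediaType, String data) {}

    record ContentBlock(String type, String text, Source source) {
        static ContentBlock text(String text) {
            return new ContentBlock("text", text, null);
        }

        // Mirrors the idea of a ContentBlock.image(mediaType, base64Data) factory.
        static ContentBlock image(String mediaType, String base64Data) {
            return new ContentBlock("image", null, new Source("base64", mediaType, base64Data));
        }
    }

    // Builds one text block (if any text is present) plus one image block per media item.
    static List<ContentBlock> toContentBlocks(String text, List<Media> media) {
        List<ContentBlock> blocks = new ArrayList<>();
        if (text != null && !text.isEmpty()) {
            blocks.add(ContentBlock.text(text));
        }
        for (Media item : media) {
            // The media bytes are already downloaded; encode them for the base64 source.
            String base64 = Base64.getEncoder().encodeToString(item.data());
            blocks.add(ContentBlock.image(item.mimeType(), base64));
        }
        return blocks;
    }

    public static void main(String[] args) {
        byte[] bytes = "hi".getBytes(StandardCharsets.UTF_8);
        List<ContentBlock> blocks =
                toContentBlocks("What is in this image?", List.of(new Media("image/png", bytes)));
        System.out.println(blocks.size());                   // prints 2
        System.out.println(blocks.get(1).source().data());   // prints aGk=
    }
}
```

With an empty media list the sketch yields only the text block, which corresponds to the text-only case staying unchanged.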
- `AnthropicChatModel.convertMessage()`: when a `UserMessage` carries media, build a `text` block plus one `image` block per media item (base64-encoding the downloaded bytes), mirroring the OpenAI path. Text-only messages are unchanged.
- `AnthropicMessagesApi`: add a `ContentBlock.image(mediaType, base64Data)` factory.

Test
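The check in this section boils down to capturing the outgoing request and inspecting its content blocks. Below is a minimal sketch of that pattern with no mocking library. `MessagesApi`, `CapturingApi`, and `sendUserMessage` are hypothetical names for illustration, not the classes in this PR; the maps mirror the JSON shape of Anthropic text and image content blocks.

```java
import java.util.Base64;
import java.util.List;
import java.util.Map;

class CaptureRequestSketch {

    // Hypothetical seam standing in for the real Messages API client.
    interface MessagesApi {
        void send(List<Map<String, Object>> userContent);
    }

    // Test double that records the last request instead of calling the network.
    static class CapturingApi implements MessagesApi {
        List<Map<String, Object>> captured;

        @Override
        public void send(List<Map<String, Object>> userContent) {
            this.captured = userContent;
        }
    }

    static Map<String, Object> textBlock(String text) {
        return Map.of("type", "text", "text", text);
    }

    static Map<String, Object> imageBlock(String mediaType, String base64Data) {
        return Map.of(
                "type", "image",
                "source", Map.of("type", "base64", "media_type", mediaType, "data", base64Data));
    }

    // Hypothetical code under test: sends text plus one base64-encoded image.
    static void sendUserMessage(MessagesApi api, String text, String mediaType, byte[] bytes) {
        String base64 = Base64.getEncoder().encodeToString(bytes);
        api.send(List.of(textBlock(text), imageBlock(mediaType, base64)));
    }

    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        CapturingApi api = new CapturingApi();
        byte[] png = {1, 2, 3}; // placeholder bytes, not a real PNG
        sendUserMessage(api, "What does the image say?", "image/png", png);

        Map<String, Object> image = api.captured.get(1);
        Map<?, ?> source = (Map<?, ?>) image.get("source");
        check("image".equals(image.get("type")), "second block should be an image");
        check("base64".equals(source.get("type")), "source.type should be base64");
        check("image/png".equals(source.get("media_type")), "media_type should be image/png");
        check(Base64.getEncoder().encodeToString(png).equals(source.get("data")),
                "payload should be the verbatim base64 of the input bytes");
        System.out.println("captured request carries the image block");
    }
}
```

Asserting on the captured request rather than on a model response keeps the check deterministic: it fails whenever the media is dropped, regardless of what the model would have answered.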
Adds `AnthropicChatModelMediaTest`, which mocks the Messages API, captures the outgoing `MessagesRequest`, and asserts the user message serializes to a content-block list containing an `image` block with `source.type=base64`, `media_type=image/png`, and the verbatim base64 payload, plus the accompanying text block.

Validated both directions: the test fails before the fix (the media is dropped, so the content is a bare `String`) and passes after it.

Scope
Anthropic provider only; OpenAI/Gemini media paths are untouched. `convertMessage` is the single message-building path for Anthropic (used by the normal call path), so all Anthropic requests are covered.

End-to-end validation (full agentspan SDK e2e suite)
Built a server carrying this fix (the 0.3.0 agentspan server baseline + this exact `conductor-ai` change) and ran the full agentspan Python SDK e2e suite against it.

Result: 123 passed, 8 failed, 19 skipped/xfailed (150 tests, 23m).

The media-input suite (Suite 25) passes in full, including the Anthropic case, which is the direct target of this fix and was previously failing/skipped.

A direct probe confirms it: `anthropic/claude-sonnet-4-5` + an image of the text `MELON7391` now returns `MELON7391` (before the fix the model received no image and hallucinated unrelated text).

The 8 failures are unrelated to this PR. All are a pre-existing version mismatch between the newer standalone `agentspan` Go CLI and the older 0.3.0 server's REST API (not the Anthropic provider):

- `TestSuite16CliSkills` (5): CLI `skill --version` flag unknown / `skill load` HTTP 400 against the 0.3.0 server.
- `TestSuite2ToolCalling`, `TestSuite4McpTools`, `TestSuite5HttpTools` (1 each): `credentials delete` → HTTP 404 `No static resource api/credentials/...` (endpoint absent on the 0.3.0 server).

None touch the LLM provider path; they fail identically with or without this change.

Exact-version confirmation: the same result was reproduced building `conductor-ai` from the `v3.32.0-rc.3` tag with this fix cherry-picked (not just the 3.30.2 baseline) and running it inside the matching agentspan server. `AnthropicChatModelMediaTest` passes at that source, and Suite 25 passes end-to-end (4 passed, Anthropic included).

🤖 Generated with Claude Code