fix(ai): attach user media to Anthropic requests by ling-senpeng13 · Pull Request #1238 · conductor-oss/conductor

ling-senpeng13 · 2026-07-01T18:41:20Z

Problem

Images passed as input via the media path never reached Anthropic (Claude) models. The model received a text-only message and hallucinated a response instead of reading the image.

The pipeline was correct up to the provider conversion: media is stored on the ChatMessage, and LLMHelper.getMessage() downloads the URL to bytes and builds a Spring AI UserMessage with media. But AnthropicChatModel.convertMessage() built the user message from userMsg.getText() only and silently dropped userMsg.getMedia():

case USER -> {
    UserMessage userMsg = (UserMessage) msg;
    messages.add(Message.user(userMsg.getText()));   // getMedia() ignored
}

The OpenAI provider (OpenAIResponsesChatModel.convertMessage()) already forwards media as image content parts; the Anthropic converter never got that treatment. The Anthropic ContentBlock API already modeled type="image" with a base64 Source — the capability existed, it just wasn't wired up.

Fix

AnthropicChatModel.convertMessage(): when a UserMessage carries media, build a text block plus one image block per media item (base64-encoding the downloaded bytes), mirroring the OpenAI path. Text-only messages are unchanged.
AnthropicMessagesApi: add a ContentBlock.image(mediaType, base64Data) factory.
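
The conversion described above can be sketched roughly as follows. This is a minimal, self-contained illustration rather than the PR's actual code: the `Media`, `Source`, and `ContentBlock` records are hypothetical stand-ins for the Spring AI and AnthropicMessagesApi types, `toUserContent` is an invented helper, and placing the text block before the image blocks is an assumption. Only the `image(mediaType, base64Data)` factory name comes from the description.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

// Hypothetical stand-ins for the real types; field names are assumptions.
record Media(String mimeType, byte[] data) {}

record Source(String type, String mediaType, String data) {}

record ContentBlock(String type, String text, Source source) {
    static ContentBlock text(String text) {
        return new ContentBlock("text", text, null);
    }

    // Same name as the factory in the PR: an image block with a base64 source.
    static ContentBlock image(String mediaType, String base64Data) {
        return new ContentBlock("image", null, new Source("base64", mediaType, base64Data));
    }
}

final class UserContentSketch {
    // Returns a bare String for text-only messages (the unchanged path),
    // otherwise a text block followed by one image block per media item.
    static Object toUserContent(String text, List<Media> media) {
        if (media == null || media.isEmpty()) {
            return text;
        }
        List<ContentBlock> blocks = new ArrayList<>();
        blocks.add(ContentBlock.text(text));
        for (Media m : media) {
            // Base64-encode the already-downloaded bytes.
            String b64 = Base64.getEncoder().encodeToString(m.data());
            blocks.add(ContentBlock.image(m.mimeType(), b64));
        }
        return blocks;
    }
}
```

Returning `Object` here is only a way to show the two shapes (String versus block list) side by side; the real converter presumably uses whatever content type `Message.user(...)` accepts.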

Test

Adds AnthropicChatModelMediaTest, which mocks the Messages API, captures the outgoing MessagesRequest, and asserts the user message serializes to a content-block list containing an image block with source.type=base64, media_type=image/png, and the verbatim base64 payload — plus the accompanying text block.

Validated both directions:

Before the fix: FAILS — user content serializes to a bare String (media dropped).
After the fix: PASSES.
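
The before/after distinction the test relies on can be illustrated without the real classes. The sketch below uses plain maps in place of the serialized request; the field names follow the Anthropic Messages API image block format (`type`, `source.type`, `source.media_type`, `source.data`), while `hasImageBlock` is a hypothetical helper and not something taken from `AnthropicChatModelMediaTest`.

```java
import java.util.List;
import java.util.Map;

final class WireShapeSketch {
    // True when the user content is a block list containing a base64 image
    // block with the given media type and payload. A bare String (the
    // pre-fix shape) always yields false.
    static boolean hasImageBlock(Object content, String mediaType, String base64) {
        if (!(content instanceof List<?> blocks)) {
            return false;
        }
        for (Object b : blocks) {
            if (b instanceof Map<?, ?> block
                    && "image".equals(block.get("type"))
                    && block.get("source") instanceof Map<?, ?> src
                    && "base64".equals(src.get("type"))
                    && mediaType.equals(src.get("media_type"))
                    && base64.equals(src.get("data"))) {
                return true;
            }
        }
        return false;
    }
}
```

With this helper, a pre-fix payload such as a plain `String` prompt fails the check, while a list holding a text block and an `image/png` block with the expected payload passes.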

./gradlew :conductor-ai:test --tests "org.conductoross.conductor.ai.providers.anthropic.AnthropicChatModel*"
BUILD SUCCESSFUL

Scope

Anthropic provider only; OpenAI/Gemini media paths are untouched. convertMessage is the single message-building path for Anthropic (used by the normal call path), so all Anthropic requests are covered.

End-to-end validation (full agentspan SDK e2e suite)

Built a server carrying this fix (the 0.3.0 agentspan server baseline + this exact conductor-ai change) and ran the full agentspan Python SDK e2e suite against it.

Result: 123 passed, 8 failed, 19 skipped/xfailed (150 tests, 23m).

The media-input suite (Suite 25) passes in full, including the Anthropic case — which is the direct target of this fix and was previously failing/skipped:

test_vision_reads_text_from_image[openai]       PASSED
test_vision_reads_text_from_image[anthropic]    PASSED   ← fixed by this PR (was: hallucinated, no image)
test_without_media_token_is_absent[openai]      PASSED
test_without_media_token_is_absent[anthropic]   PASSED

A direct probe confirms it: anthropic/claude-sonnet-4-5 + an image of the text MELON7391 now returns MELON7391 (before the fix the model received no image and hallucinated unrelated text).

The 8 failures are unrelated to this PR — all are a pre-existing version mismatch between the newer standalone agentspan Go CLI and the older 0.3.0 server's REST API (not the Anthropic provider):

TestSuite16CliSkills (5) — CLI skill --version flag unknown / skill load HTTP 400 against the 0.3.0 server.
TestSuite2ToolCalling, TestSuite4McpTools, TestSuite5HttpTools (1 each) — credentials delete → HTTP 404 No static resource api/credentials/... (endpoint absent on the 0.3.0 server).

None touch the LLM provider path; they fail identically with or without this change.

Exact-version confirmation: the same result was reproduced building conductor-ai from the v3.32.0-rc.3 tag with this fix cherry-picked (not just the 3.30.2 baseline) and running it inside the matching agentspan server — AnthropicChatModelMediaTest passes at that source, and Suite 25 passes end-to-end (4 passed, Anthropic included).

🤖 Generated with Claude Code

AnthropicChatModel.convertMessage() built the USER message from userMsg.getText() only and silently dropped userMsg.getMedia(), so images passed via the media input path never reached Claude: the model received a text-only message and hallucinated. The OpenAI provider already forwards media; Anthropic did not.

Convert user media into Anthropic image content blocks (base64 source) alongside the text, mirroring OpenAIResponsesChatModel. Adds a ContentBlock.image(mediaType, base64Data) factory (the image block type and Source were already modeled, just unused).

Adds AnthropicChatModelMediaTest, which captures the outgoing MessagesRequest and asserts the user message carries an image content block with the verbatim base64 payload. Verified it fails before the fix (media dropped -> bare string content) and passes after.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

ling-senpeng13 force-pushed the fix/anthropic-media-input branch from 0383755 to 3cec6fa · July 1, 2026 19:21

ling-senpeng13 force-pushed the fix/anthropic-media-input branch from 3cec6fa to 00ab8b3 · July 1, 2026 19:31

Merge branch 'main' into fix/anthropic-media-input

61ba60b

ling-senpeng13 requested a review from v1r3n July 1, 2026 19:57

ling-senpeng13 self-assigned this Jul 1, 2026

kowser-orkes approved these changes Jul 2, 2026

ling-senpeng13 merged commit f9f7603 into main Jul 2, 2026
7 checks passed

ling-senpeng13 deleted the fix/anthropic-media-input branch July 2, 2026 16:27

ling-senpeng13 merged 2 commits into main from fix/anthropic-media-input
