
Conversation

@Chesars (Contributor) commented Dec 10, 2025

Relevant issues

Fixes #17737

Pre-Submission checklist

  • I have added testing in the tests/litellm/ directory
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes


Bug 1: web_search_tool_result is dropped

When Anthropic returns web search results, LiteLLM ignored the web_search_tool_result block and never surfaced it to the caller.

Example request

response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Search the web for X and use my calculator"}],
    tools=[
        {"type": "web_search_20250305"},  # Anthropic's built-in web search
        {"type": "function", "function": {"name": "calculator", ...}}
    ]
)

Anthropic returns:

{"type": "web_search_tool_result", "tool_use_id": "srvtoolu_01ABC", "content": [...]}

LiteLLM returned to the user: nothing. The search results were lost.

Fix: Extract web_search_tool_result and include it in provider_specific_fields.web_search_results.
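
A minimal usage sketch of how a caller might read the preserved results after this fix; the provider_specific_fields key follows the PR description, everything else (model, prompt, printing) is illustrative:

import litellm

response = litellm.completion(
    model="anthropic/claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Search the web for X"}],
    tools=[{"type": "web_search_20250305"}],
)

message = response.choices[0].message
# provider_specific_fields may be None when the provider returned no extra data
provider_fields = getattr(message, "provider_specific_fields", None) or {}
# After this fix, Anthropic's web_search_tool_result blocks land here instead
# of being silently dropped.
web_search_results = provider_fields.get("web_search_results")
if web_search_results:
    print(web_search_results)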


Bug 2: server_tool_use reconstructed as tool_use

When the user sends messages back for a multi-turn conversation, LiteLLM was converting server-side tool calls to regular tool calls.

User sends to LiteLLM:

{"tool_calls": [{"id": "srvtoolu_01ABC", "function": {"name": "web_search", ...}}]}

LiteLLM sent to Anthropic (before fix):

{"type": "tool_use", "id": "srvtoolu_01ABC", "name": "web_search", ...}

❌ Anthropic requires tool_result for every tool_use, but the user can't provide one for server-executed tools.

LiteLLM sends to Anthropic (after fix):

{"type": "server_tool_use", "id": "srvtoolu_01ABC", "name": "web_search", ...}
{"type": "web_search_tool_result", "tool_use_id": "srvtoolu_01ABC", "content": [...]}

✅ Correct block types: Anthropic no longer expects a client-provided tool_result for the server-executed search.

Fix: Detect the srvtoolu_ prefix and reconstruct the blocks as server_tool_use + web_search_tool_result.
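
A simplified sketch of the reconstruction logic described above, not the exact code in factory.py; it shows how an OpenAI-style tool call with a srvtoolu_ id and the stored web_search_results could be mapped back to Anthropic blocks:

import json
from typing import Any, Dict, List

def rebuild_anthropic_tool_blocks(
    tool_call: Dict[str, Any],
    web_search_results: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Illustrative only: map an OpenAI-style tool call back to Anthropic blocks."""
    tool_id = tool_call["id"]
    block = {
        "type": "tool_use",
        "id": tool_id,
        "name": tool_call["function"]["name"],
        "input": json.loads(tool_call["function"]["arguments"] or "{}"),
    }
    if not tool_id.startswith("srvtoolu_"):
        return [block]  # regular client-side tool call, unchanged

    # Server-executed tool: use server_tool_use and attach the stored results,
    # so Anthropic does not expect a client-provided tool_result.
    block["type"] = "server_tool_use"
    result_block = {
        "type": "web_search_tool_result",
        "tool_use_id": tool_id,
        "content": web_search_results,
    }
    return [block, result_block]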


Files changed

  • litellm/llms/anthropic/chat/transformation.py - Extract web_search_tool_result
  • litellm/litellm_core_utils/prompt_templates/factory.py - Reconstruct server_tool_use
  • tests/test_litellm/llms/anthropic/chat/test_anthropic_chat_transformation.py - 3 new tests
  • tests/llm_translation/test_prompt_factory.py - 5 new tests

…n multi-turn conversations

- Extract web_search_tool_result blocks in extract_response_content()
- Store web_search_results in provider_specific_fields for round-trip
- Detect srvtoolu_ prefix to reconstruct as server_tool_use (not tool_use)
- Add corresponding web_search_tool_result after server_tool_use blocks

This ensures multi-turn conversations with Anthropic web search + custom
tools work correctly without Anthropic expecting tool_result for server-
side tool executions.

@krrishdholakia krrishdholakia merged commit 01dec55 into BerriAI:main Dec 10, 2025
4 of 7 checks passed
@krrishdholakia (Contributor) commented:

@Chesars can you please ensure this covers streaming too?

@KeremTurgutlu (Contributor) commented:

@krrishdholakia @Chesars streaming is likely failing because of #17254

Here is an example:

❯  sudo ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY uv run test_litellm_websearch_fix_async.py

Reading inline script metadata from `test_litellm_websearch_fix_async.py`
 Updated https://github.com/BerriAI/litellm.git (0d2f8ce93)

=== First API call (async + streaming) ===
Based on the search results, I found the average weights for male elephants:

- **Male African elephants**: 5,000 kg
- **Male Asian elephants**: 3,600 kg
Tool calls: ['web_search', 'add_numbers']

Executing add_numbers(5000, 3600) = 8600

=== Second API call (async + streaming continuation) ===

Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new
LiteLLM.Info: If you need to debug this error, use `litellm._turn_on_debug()'.

FAILED: litellm.BadRequestError: AnthropicException - Extra data: line 1 column 60 (char 59)
Received Messages=[{'role': 'user', 'content': 'Search the web for the avg weight in kgs of male African and Asian elephants. Then add the two. Be concise.'}, {'content': 'Based on the search results, I found the average weights for male elephants:\n\n- **Male African elephants**: 5,000 kg\n- **Male Asian elephants**: 3,600 kg', 'role': 'assistant', 'tool_calls': [{'function': {'arguments': '{"query": "average weight male African Asian elephants kg"}{}{}', 'name': 'web_search'}, 'id': 'srvtoolu_01Pupo3WHJ7g8JZF354UKcoE', 'type': 'function'}, {'function': {'arguments': '{"a": 5000, "b": 3600}', 'name': 'add_numbers'}, 'id': 'toolu_01RRf2pCDB7TBUfJGv2oPLGk', 'type': 'function'}]}, {'role': 'tool', 'tool_call_id': 'toolu_01RRf2pCDB7TBUfJGv2oPLGk', 'content': '8600'}]

You can see 'arguments': '{"query": "average weight male African Asian elephants kg"}{}{}', which has extra trailing {} objects that cause the JSON decoding error.
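
The failure is easy to reproduce with plain json, outside LiteLLM:

import json

args = '{"query": "average weight male African Asian elephants kg"}{}{}'

# json.loads parses the first object and then hits the trailing "{}" chunks,
# which is exactly the error reported above.
try:
    json.loads(args)
except json.JSONDecodeError as e:
    print(e)  # Extra data: line 1 column 60 (char 59)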

@KeremTurgutlu (Contributor) commented:

I can confirm that this is the only issue. For example, postprocessing the tool_calls before the next call like this works:

# Convert the streamed chunks into a complete message
full_response = litellm.stream_chunk_builder(chunks)
content = full_response.choices[0].message.content
tool_calls = full_response.choices[0].message.tool_calls or []
# Workaround: strip the spurious empty-object fragments ("{}") that the
# streaming path appends to the tool-call arguments (see #17254).
for tc in tool_calls:
    if tc.function.arguments:
        tc.function.arguments = tc.function.arguments.replace('{}', '')

❯  sudo ANTHROPIC_API_KEY=$ANTHROPIC_API_KEY uv run test_litellm_websearch_fix_async.py

Password:
Reading inline script metadata from `test_litellm_websearch_fix_async.py`
 Updated https://github.com/BerriAI/litellm.git (0d2f8ce93)

=== First API call (async + streaming) ===
Based on the search results, I found information about male elephant weights:

- Male African savanna elephants average 5,000 kg (11,000 pounds)
- Male Asian elephants weigh on average about 3,600 kg (7,900 pounds)

Now let me add these weights:
Tool calls: ['web_search', 'add_numbers']

Executing add_numbers(5000, 3600) = 8600

=== Second API call (async + streaming continuation) ===
Based on the search results:

- Male African elephants average 5,000 kg
- Male Asian elephants weigh on average about 3,600 kg

**Total: 5,000 + 3,600 = 8,600 kg**
SUCCESS!

==================================================
Test result: PASSED

@Chesars Chesars deleted the fix/anthropic-server-tool-use-multi-turn branch December 10, 2025 23:53
Chesars added a commit to Chesars/litellm that referenced this pull request Dec 17, 2025
…tions

Fixes BerriAI#18137

Similar to the fix for web_search_tool_result (BerriAI#17746, BerriAI#17798), this PR
preserves web_fetch_tool_result blocks in multi-turn conversations.

Changes:
- Add handling for web_fetch_tool_result in transformation.py (non-streaming)
- Add capture of web_fetch_tool_result in handler.py (streaming)
- Fix streaming tool arguments bug where empty input {} was prepended to
  actual arguments by using empty string instead of str({})
- Add unit tests for web_fetch_tool_result handling
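
A minimal sketch of that last point, assuming a streaming handler that serializes a tool_use block's initial input before concatenating the input_json_delta chunks; the function and field names are illustrative, not the actual code in handler.py:

import json
from typing import Any, Dict

def block_input_as_arguments(tool_use_block: Dict[str, Any]) -> str:
    """Serialize a tool_use block's initial input for the arguments string (illustrative)."""
    initial_input = tool_use_block.get("input")
    if not initial_input:
        # Before the fix: str({}) turned an empty input into a literal "{}",
        # which got concatenated with the streamed partial_json deltas and
        # produced strings like '{"query": "..."}{}{}' that json.loads rejects.
        # After the fix: return an empty string and let the deltas carry the
        # complete JSON arguments.
        return ""
    return json.dumps(initial_input)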