Skip to content

[BUG]: SlidingWindowConversationManager corrupts toolResult JSON structure causing LLM hallucinations #95

Closed
@cagataycali

Description

@cagataycali

Description

The SlidingWindowConversationManager converts structured toolResult JSON to plain text when trimming conversation history, which causes LLMs to lose the ability to properly parse tool results and leads to hallucinations.

Current Behavior

When the conversation exceeds the window size and the trim point falls on a message containing toolResult content, the _map_tool_result_content() method converts structured JSON like:

{"toolResult": {"toolUseId": "123", "content": [{"text": "Result"}], "status": "success"}}

Into plain text like:

"Tool Result Text Content: Result"
"Tool Result JSON Content: {...}"
"Tool Result Status: success"

This flattened text format loses the structured information that LLMs need to understand tool results, causing them to hallucinate and make incorrect tool calls.

Expected Behavior

The tool result JSON structure should be preserved throughout the conversation history, even when messages are trimmed due to window size constraints.

Impact

  • LLMs cannot properly parse tool results after context reduction
  • Leads to tool call hallucinations
  • May be related to repeated "bedrock threw context window overflow error" warnings
  • Breaks tool interaction patterns when conversations exceed window size

Root Cause

The _map_tool_result_content() method in sliding_window_conversation_manager.py was designed to convert tool results to plain text to work around the limitation of needing ToolUse and ToolResults to be paired. However, this approach corrupts the message structure that LLMs depend on.

Comparison with Internal Version

The internal Phoenix window manager (linked in the issue) correctly preserves tool use/result pairs without corrupting the JSON structure. It uses a different approach with _find_valid_cut_index() that adjusts cut points to keep tool pairs together.

Proposed Solution

The fix in PR #94 removes the problematic _map_tool_result_content() method and implements a new approach that:

  1. Maps tool IDs to track relationships between toolUse and toolResult
  2. Finds safe cutting points that preserve tool pairs together
  3. Maintains the original JSON structure throughout the conversation

Additional Context

This issue may also be related to the repeated "bedrock threw context window overflow error" warnings observed in some cases, as the corrupted message structure might be causing invalid requests to Bedrock.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions