Skip to content

Fix/545 multiple answer after delegation#775

Open
subhashkrpro wants to merge 3 commits intosipeed:mainfrom
subhashkrpro:fix/545-multiple-answer-after-delegation
Open

Fix/545 multiple answer after delegation#775
subhashkrpro wants to merge 3 commits intosipeed:mainfrom
subhashkrpro:fix/545-multiple-answer-after-delegation

Conversation

@subhashkrpro
Copy link

NOTE: I currently do not have a proper testing environment. I kindly request reviewers to explicitly verify this change before merging.

Fix Verification for Issue #545: Multiple Duplicate Messages

Issue Summary

When a subagent completes asynchronously, the main agent was sending the same message 15+ times (matching max_tool_iterations), instead of once.

Root Cause

The LLM iteration loop had no detection for when the same tool was called repeatedly with identical arguments, causing infinite repetition until hitting the iteration limit.

Fix Applied

Added deduplication logic in pkg/agent/loop.go at lines 627-658 in the runLLMIteration() function.

Code Trace - Bug Scenario (BEFORE FIX)

Using data from the issue logs, service health check:

max_tool_iterations: 15
Tool: message
Content: "Subagent-3 completed weather check..."

Iteration Flow (Before Fix):

Iteration 1:
├─ LLM sees: [System Prompt, User: "health check"]
├─ LLM returns: ToolCall(message, content="Subagent-3 completed...")
├─ Message sent ✓
├─ Tool result added to messages
└─ Continue loop

Iteration 2:
├─ LLM sees: [System, User, Assistant(message call), ToolResult, ...]
├─ LLM returns: ToolCall(message, content="Subagent-3 completed...") ← SAME CONTENT
├─ Message sent (DUPLICATE #1) ✗
├─ Tool result added to messages
└─ Continue loop

Iterations 3-15:
├─ Same pattern repeats...
└─ RESULT: Message sent 15 times ✗✗✗

Code Trace - Bug Scenario (AFTER FIX)

Iteration Flow (After Fix):

Iteration 1:
├─ Check: iteration > 1? NO (iteration == 1)
├─ Skip dedup check
├─ LLM sees: [System Prompt, User: "health check"]
├─ LLM returns: ToolCall(message, content="Subagent-3 completed...")
├─ Message sent ✓
├─ Create: assistantMsg with ToolCalls
├─ messages = [..., assistantMsg, tool_result]
└─ Continue loop

Iteration 2:
├─ Check: iteration > 1? YES ✓
├─ Check: len(messages) >= 2? YES ✓
├─ Get: lastAssistantMsg = messages[len-2] = assistantMsg from iteration 1
├─ Get: lastTC = lastAssistantMsg.ToolCalls[0]
├─ Get: currentTC = LLM's new tool call
├─ Compare: lastTC.Name == currentTC.Name? "message" == "message" ✓
├─ Compare: json.Marshal(lastTC.Arguments) == json.Marshal(currentTC.Arguments)?
│           "Subagent-3..." == "Subagent-3..." ✓ MATCH!
├─ Set: finalContent = response.Content
├─ Break: Exit loop immediately ← FIX APPLIED!
└─ Return: finalContent, iteration=2, nil



Copy link

@nikolasdehor nikolasdehor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem statement is clear and the dedup idea is reasonable, but this implementation has several issues:

1. Only compares the first tool call
The code checks lastAssistantMsg.ToolCalls[0] vs normalizedToolCalls[0]. If the LLM returns multiple tool calls and only the first one is duplicated, the rest are silently dropped. Conversely, if the first tool call differs but others are duplicates, the dedup misses it.

2. json.Marshal comparison is fragile
json.Marshal does not guarantee key ordering for maps. Two semantically identical map[string]any values can marshal to different JSON strings depending on Go's internal map iteration order. This means the dedup check can both false-negative (same args, different serialization order) and, less likely, false-positive. Use reflect.DeepEqual on the arguments directly, or sort keys before comparison.

3. Hardcoded messages[len-2] assumption
The code assumes messages[len(messages)-2] is always the previous assistant message. But the message array structure depends on how tool results are appended. If a tool call returns multiple result messages, or if the message structure changes, this indexing will silently compare the wrong message. It would be safer to walk backwards and find the last message with Role == "assistant".

4. Swallows errors from json.Marshal
Both json.Marshal calls discard errors with _. If marshaling fails (unlikely but possible with unusual argument types), both sides become "null" and you get a false match, breaking the loop prematurely.

5. No test coverage
The PR description explicitly says the author has no testing environment. This is a behavioral change in the core agent loop — it needs unit tests with at least: (a) duplicate tool calls correctly detected, (b) different tool calls not falsely flagged, (c) multiple tool calls per turn handled.

6. Single-repeat threshold is aggressive
Breaking after just one repeated tool call means legitimate retry patterns (e.g., tool fails, LLM retries with same args) will be killed. Consider requiring N consecutive duplicates (e.g., 3) before breaking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants