Parallel tool_calls cause "tool_call_id missing response" error #8479

@SeasonPilot

What version of Codex is running?

codex-cli 0.77.0

What subscription do you have?

no

Which model were you using?

gpt-5.2

What platform is your computer?

No response

What issue are you seeing?

Parallel tool_calls cause "tool_call_id missing response" error

Summary

When the model returns multiple parallel tool_calls in a single response, Codex loses some of the tool responses, and the next API request fails with:

An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'.

This error is reproducible and unrecoverable: once it occurs, the session cannot be resumed.

Environment

  • Codex version: codex-cli 0.77.0
  • OS: macOS Darwin 23.6.0
  • Log file: ~/.codex/log/codex-tui.log

Steps to Reproduce

  1. Start a Codex session
  2. Give the AI a task that requires reading multiple files (e.g., "analyze the A2A example code")
  3. The AI will attempt to execute multiple sed or file-read commands in parallel
  4. The error occurs immediately after the parallel tool calls

Expected Behavior

All tool call responses should be collected and sent back to the API in the correct format.

Actual Behavior

Some tool responses are lost, so the API rejects the next request because at least one tool_call_id has no matching tool message.

Evidence from Logs

Occurrence 1: 2025-12-23 06:48

2025-12-23T06:48:28.508945Z  INFO ToolCall: shell_command {"command": "sed -n '1,220p' ...A2aNodeActionWithConfig.java"}
2025-12-23T06:48:28.509558Z  INFO ToolCall: shell_command {"command": "sed -n '1,220p' ...A2AExample.java"}
2025-12-23T06:48:28.509615Z  INFO ToolCall: shell_command {"command": "sed -n '1,260p' ...A2aRemoteAgent.java"}
2025-12-23T06:48:30.894742Z  INFO Turn error: {"error":{"message":"An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_xNiBLtPTkR9xLP7r41K3wjzr"}}

Occurrence 2: 2025-12-23 08:09 (different session)

2025-12-23T08:09:38.257295Z  INFO ToolCall: shell_command {"command": "sed -n '1,220p' ...A2AExample.java"}
2025-12-23T08:09:38.257437Z  INFO ToolCall: shell_command {"command": "sed -n '1,240p' ...A2AExampleController.java"}
2025-12-23T08:09:38.257486Z  INFO ToolCall: shell_command {"command": "sed -n '1,240p' ...README.md"}
2025-12-23T08:09:40.220236Z  INFO Turn error: {"error":{"message":"An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. The following tool_call_ids did not have response messages: call_roXyhlwmpnd3ZrA7DMM2nq7C"}}

Pattern Summary

Timestamp   Parallel Commands   Time to Error   Missing tool_call_id
06:48:28    3 sed commands      ~2 seconds      call_xNiBLtPTkR9xLP7r41K3wjzr
08:09:38    3 sed commands      ~2 seconds      call_roXyhlwmpnd3ZrA7DMM2nq7C

Source Code Analysis

After analyzing the codex-rs source code, I've identified the following potential bug locations:

Key Files Involved

File                                             Purpose
codex-rs/core/src/codex.rs                       Main turn loop and drain_in_flight
codex-rs/core/src/stream_events_utils.rs         Tool call handling and response recording
codex-rs/core/src/context_manager/normalize.rs   History validation

Bug Location 1: Race Condition in Turn Loop (codex.rs)

The main turn loop processes tool calls and collects responses:

ResponseEvent::OutputItemDone(item) => {
    // Tool execution queued as future
    if let Some(tool_future) = output_result.tool_future {
        in_flight.push_back(tool_future);  // Queued but not awaited yet
    }
}
// ...
ResponseEvent::Completed { ... } => {
    // Stream completed - loop breaks BEFORE drain_in_flight
    break Ok(TurnRunResult { needs_follow_up, last_agent_message });
}

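Issue: If ResponseEvent::Completed arrives before all tool futures complete, the loop breaks while futures are still pending in in_flight. drain_in_flight runs only after the loop exits, so a future that is still pending or fails silently at that point may never have its response recorded.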

Bug Location 2: Tool Response Recording Order (stream_events_utils.rs)

// In handle_output_item_done:
Ok(Some(call)) => {
    // Tool CALL recorded immediately
    ctx.sess.record_conversation_items(&ctx.turn_context, std::slice::from_ref(&item)).await;

    // Tool OUTPUT queued as future - recorded LATER
    let tool_future: InFlightFuture = Box::pin(
        ctx.tool_runtime.clone().handle_tool_call(call, cancellation_token)
    );
    output.tool_future = Some(tool_future);
}

Issue: The tool call is recorded immediately, but responses are queued as futures. If the session is interrupted before drain_in_flight completes, responses are lost.
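One way to keep a recorded call from ever lacking an output is to tie the two together with a drop guard. The sketch below is only an illustration of that idea, not code from codex-rs: History, CallGuard, begin, and complete are hypothetical names, and plain strings stand in for real conversation items.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical shared history; strings stand in for conversation items.
type History = Arc<Mutex<Vec<String>>>;

/// Hypothetical guard: records the call when created and guarantees that
/// some output is recorded for it before the guard goes away.
struct CallGuard {
    history: History,
    call_id: String,
    completed: bool,
}

impl CallGuard {
    /// Records the tool call immediately and returns the guard.
    fn begin(history: History, call_id: &str) -> Self {
        history.lock().unwrap().push(format!("call:{}", call_id));
        CallGuard {
            history,
            call_id: call_id.to_string(),
            completed: false,
        }
    }

    /// Records the real output and marks the guard as completed.
    fn complete(mut self, output: &str) {
        self.history
            .lock()
            .unwrap()
            .push(format!("output:{}:{}", self.call_id, output));
        self.completed = true;
    }
}

impl Drop for CallGuard {
    /// If the guard is dropped without an output (for example because the
    /// owning future was dropped), record a placeholder output instead.
    fn drop(&mut self) {
        if !self.completed {
            self.history
                .lock()
                .unwrap()
                .push(format!("output:{}:aborted", self.call_id));
        }
    }
}
```

With this shape, dropping a guard early still leaves a call/output pair in the history, so the next request would carry a placeholder response rather than an orphaned tool_call_id.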

Bug Location 3: Session Resumption (codex.rs)

Issue: If rollout items aren't persisted atomically for parallel tool calls (call + all responses), reconstruction may have incomplete pairs.

Root Cause Hypothesis

The most likely root cause is the timing between drain_in_flight and the turn loop:

  1. Parallel tool calls are dispatched → added to in_flight: FuturesOrdered
  2. Turn loop continues processing stream events
  3. ResponseEvent::Completed arrives → loop breaks
  4. drain_in_flight called AFTER loop ends
  5. If any tool execution is still pending or fails silently → response not recorded
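The sequence above can be modeled without any async machinery. This is a minimal sketch under stated assumptions, not the codex-rs implementation: Item, Pending, dispatch, drain, and unanswered are hypothetical names, and a boolean flag stands in for a tool execution that fails silently.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a conversation history item.
#[derive(Debug, Clone, PartialEq)]
enum Item {
    Call { id: String },
    Output { id: String },
}

/// Hypothetical stand-in for a queued tool execution. `yields_output`
/// set to false models an execution that fails silently.
struct Pending {
    id: String,
    yields_output: bool,
}

/// Records the call immediately and queues its execution, mirroring the
/// "call recorded now, output recorded later" ordering described above.
fn dispatch(
    history: &mut Vec<Item>,
    in_flight: &mut VecDeque<Pending>,
    id: &str,
    yields_output: bool,
) {
    history.push(Item::Call { id: id.to_string() });
    in_flight.push_back(Pending { id: id.to_string(), yields_output });
}

/// Drains the queue in order, recording an output only for executions
/// that actually produce one.
fn drain(history: &mut Vec<Item>, in_flight: &mut VecDeque<Pending>) {
    while let Some(p) = in_flight.pop_front() {
        if p.yields_output {
            history.push(Item::Output { id: p.id });
        }
    }
}

/// Returns the ids of calls that have no matching output, in call order.
fn unanswered(history: &[Item]) -> Vec<String> {
    let answered: HashSet<&str> = history
        .iter()
        .filter_map(|item| match item {
            Item::Output { id } => Some(id.as_str()),
            _ => None,
        })
        .collect();
    history
        .iter()
        .filter_map(|item| match item {
            Item::Call { id } if !answered.contains(id.as_str()) => Some(id.clone()),
            _ => None,
        })
        .collect()
}
```

In this model, dispatching three calls where one yields no output and then draining leaves exactly one unanswered call id, which is consistent with the single missing id in each log excerpt.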

Impact

  • Severity: High - completely blocks the session
  • Workaround: Delete the corrupted session file and start a new session
    rm ~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl
  • User Experience: Very poor - users lose all conversation context

Suggested Fixes

1. Ensure all in-flight futures complete before breaking

In codex.rs, drain in-flight futures before processing Completed:

ResponseEvent::Completed { ... } => {
    // BEFORE breaking, ensure all in-flight futures complete
    drain_in_flight(&mut in_flight, sess.clone(), turn_context.clone()).await?;

    should_emit_turn_diff = true;
    break Ok(TurnRunResult { needs_follow_up, last_agent_message });
}

2. Validate tool responses before API call

Before sending the next API request, validate that all tool_call_ids have corresponding responses:

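use std::collections::HashSet;

// Sketch only: Message and Error stand in for the real history and error
// types. Assumes tc.id is a String and tool_call_id is an Option<String>.
fn validate_tool_responses(messages: &[Message]) -> Result<(), Error> {
    for (i, msg) in messages.iter().enumerate() {
        if let Some(tool_calls) = &msg.tool_calls {
            // IDs this assistant message expects responses for
            let expected_ids: HashSet<&str> =
                tool_calls.iter().map(|tc| tc.id.as_str()).collect();
            // IDs answered by any later message
            let actual_ids: HashSet<&str> = messages[i + 1..]
                .iter()
                .filter_map(|m| m.tool_call_id.as_deref())
                .collect();

            // Own the missing IDs so the error does not borrow from messages
            let missing: Vec<String> = expected_ids
                .difference(&actual_ids)
                .map(|id| id.to_string())
                .collect();
            if !missing.is_empty() {
                return Err(Error::MissingToolResponses(missing));
            }
        }
    }
    Ok(())
}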

What steps can reproduce the bug?

3. Prevent session resume with corrupted state

When resuming a session, validate the integrity of the message sequence before sending it to the API.
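A resume-time check could follow the API's wording more strictly and require that the responses appear in the run of tool messages directly after the assistant message. The sketch below assumes a simplified, hypothetical Msg type and a hypothetical first_violation helper; it does not reflect the actual rollout or history types in codex-rs.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified view of a reconstructed message.
enum Msg {
    Assistant { tool_call_ids: Vec<String> },
    Tool { tool_call_id: String },
    Other,
}

/// Returns the index of the first assistant message whose tool calls are not
/// all answered by the tool messages directly following it, together with
/// the missing ids in call order. Returns None if the sequence is valid.
fn first_violation(messages: &[Msg]) -> Option<(usize, Vec<String>)> {
    for (i, msg) in messages.iter().enumerate() {
        if let Msg::Assistant { tool_call_ids } = msg {
            // Ids answered by the contiguous run of tool messages after `i`.
            let answered: HashSet<&str> = messages[i + 1..]
                .iter()
                .map_while(|m| match m {
                    Msg::Tool { tool_call_id } => Some(tool_call_id.as_str()),
                    _ => None,
                })
                .collect();
            let missing: Vec<String> = tool_call_ids
                .iter()
                .filter(|id| !answered.contains(id.as_str()))
                .cloned()
                .collect();
            if !missing.is_empty() {
                return Some((i, missing));
            }
        }
    }
    None
}
```

A resume path could refuse to continue, or offer a repair, when this returns Some, instead of sending a request that the API is known to reject.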


Checklist

  • I have searched existing issues for duplicates
  • I have included relevant log excerpts
  • I have described the expected vs actual behavior
  • I have provided steps to reproduce
  • I have analyzed the source code to identify root cause
  • I have suggested potential fixes

What is the expected behavior?

No response

Additional information

No response

Metadata

Assignees

No one assigned

Labels

  • CLI: Issues related to the Codex CLI
  • bug: Something isn't working
  • chat-endpoint: Bugs or PRs related to the chat/completions endpoint (wire API)
  • tool-calls: Issues related to tool calling