
fix(executor): inherit reasoning_content from conversation history for OpenAI-compatible providers #3543

Open
gwdgithubnom wants to merge 2 commits into router-for-me:dev from agodomen:dev

Conversation

@gwdgithubnom

Summary

fix: preserve reasoning_content across multi-turn tool-call conversations for all OpenAI-compatible providers (DeepSeek, Kimi, MiMo, etc.).

Problem: When thinking mode is enabled and tool calls are present, providers like DeepSeek and MiMo require reasoning_content to be passed back verbatim in subsequent requests. Some client SDKs (e.g. @ai-sdk/openai-compatible) may strip this field from conversation history, especially in older versions, causing 400 errors:

{"error":{"message":"The `reasoning_content` in the thinking mode must be passed back to the API.","type":"invalid_request_error"}}

See MiMo official documentation for the full requirement.

Fix: Add PreserveReasoningContent() in the executor layer that scans conversation history to inherit the latest non-empty reasoning_content into assistant messages with tool_calls that are missing it.

Key design principles:

  • Inherit, don't fabricate: Only propagates real reasoning_content from conversation history. No fallback to content text or placeholders.
  • Empty does not overwrite: An empty reasoning_content: "" does not overwrite a prior non-empty value.
  • Generic: Works for all OpenAI-compatible providers, not model-specific.
  • Same pattern as KimiExecutor (PR #1467, "Fix Kimi tool-call payload normalization for reasoning_content", merged): Uses the proven history-scanning approach already accepted by the maintainer.
  • Future-proof: When client SDKs are updated to preserve reasoning_content, this function becomes a no-op (only iterates messages for existence checks, no JSON modifications, negligible overhead).

Why executor layer, not translator layer: The translator-path-guard CI blocks any PR that touches internal/translator/. This executor-layer approach avoids that restriction while covering all source formats (OpenAI, Claude, Codex) that route through OpenAICompatExecutor.

Changes

| File | Action | Description |
| --- | --- | --- |
| internal/runtime/executor/helps/reasoning_preserve.go | New | PreserveReasoningContent() function |
| internal/runtime/executor/helps/reasoning_preserve_test.go | New | 11 unit tests |
| internal/runtime/executor/openai_compat_executor.go | Modified | +2 lines: call PreserveReasoningContent in Execute/ExecuteStream |

Test Coverage (11 cases)

  1. Injects from history: prior assistant has reasoning → injects into next assistant with tool_calls
  2. Keeps existing value: assistant already has reasoning → no overwrite
  3. Skips messages without tool_calls
  4. No change when no history reasoning (no fabrication)
  5. Empty reasoning does not overwrite prior valid reasoning
  6. All-empty reasoning injects empty string (not fabricated content)
  7. Multiple tool-call rounds: tracks latest reasoning across turns
  8. Skips non-assistant roles (user messages don't affect tracking)
  9. Invalid JSON → returns unchanged
  10. Empty/nil inputs → returns unchanged
  11. JSON null reasoning_content treated as "seen but empty" (does not overwrite prior non-empty)

Validation

gofmt -w .
go test ./internal/runtime/executor/helps/...
go build -o test-output ./cmd/server

Related

Relationship to PR #3523

PR #3523 (Feature/support deepseek reasoning content) is model-limited to DeepSeek V4 and handles Responses API reasoning items. This PR is complementary, not a duplicate:

| Dimension | PR #3523 | This PR |
| --- | --- | --- |
| Scope | DeepSeek V4 only | All OpenAI-compatible providers |
| Method | Responses items → reasoning_content | Conversation history inheritance |
| Scenario | Responses API translation | SDK stripping safety net |

Both are needed for full coverage. This PR works as a safety net even if PR #3523 is merged.


Critical: Translator Layer Gaps (Requires Maintainer @luispater Action)

This PR fixes the executor-layer gap. However, there are additional gaps in the translator layer that cause reasoning_content to be lost BEFORE the executor can preserve it. All translator fixes are blocked by translator-path-guard CI and require maintainer action.

Gap A: Protocol Translators Missing Reasoning Type Handling (e.g., Claude → OpenAI)

Example file: internal/translator/openai/claude/openai_claude_request.go (line ~146)

Problem: When a client (e.g., Claude Code, Kilo Code, Gemini CLI) routes through CPA to DeepSeek/MiMo, the response flow is:

Model Provider → OpenAI response (reasoning_content field)
→ CPA translates to Claude format → {"type":"reasoning","text":"..."} content block
→ Client stores in conversation history
→ Client sends next request with reasoning content blocks
→ CPA translates Claude → OpenAI → "reasoning" type has NO handler → DROPPED
→ Model Provider receives request without reasoning_content → 400 error

The Claude→OpenAI translator handles "thinking" (Anthropic's native type) but NOT "reasoning" (the type used when translating OpenAI responses back to Claude format):

// Current code (line 146):
switch partType {
case "thinking":       // ← handled
    reasoningParts = append(reasoningParts, thinkingText)
case "redacted_thinking": // ← ignored
// ...
// MISSING: case "reasoning":  ← this is the gap!
}

Impact: The same gap exists across multiple protocol translators (Claude→OpenAI, Gemini→OpenAI, etc.). Any client routing through CPA to OpenAI-compatible providers will lose reasoning_content on subsequent tool-call rounds.

Fix (Claude→OpenAI example): Add a case "reasoning": handler that mirrors the existing case "thinking": handler:

case "reasoning":
    if role == "assistant" {
        reasoningText := part.Get("text").String()
        if strings.TrimSpace(reasoningText) != "" {
            reasoningParts = append(reasoningParts, reasoningText)
        }
    }

Other protocol translators need the same pattern applied to their respective reasoning types (e.g., thought for Gemini).
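To make the Gap A fix concrete, here is a self-contained sketch of collecting reasoning text from Claude-format content blocks while accepting both "thinking" and "reasoning" types. It is illustrative only: `collectReasoning` and `contentPart` are hypothetical names, the sketch uses `encoding/json` rather than the translator's own JSON helpers, it assumes "thinking" blocks carry their text in a `thinking` field, and the blank-line separator used for joining is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// contentPart models only the fields this sketch needs from a content block.
type contentPart struct {
	Type     string `json:"type"`
	Text     string `json:"text"`     // used by "reasoning" blocks
	Thinking string `json:"thinking"` // assumed field for "thinking" blocks
}

// collectReasoning (hypothetical helper) gathers reasoning text from an
// assistant message's content array. Non-assistant roles and unparseable
// content yield an empty string.
func collectReasoning(role string, content []byte) string {
	if role != "assistant" {
		return ""
	}
	var parts []contentPart
	if err := json.Unmarshal(content, &parts); err != nil {
		return ""
	}
	var reasoningParts []string
	for _, part := range parts {
		var text string
		switch part.Type {
		case "thinking":
			text = part.Thinking
		case "reasoning": // the case the translator currently lacks
			text = part.Text
		default:
			continue // text, tool_use, redacted_thinking, etc. are skipped here
		}
		if strings.TrimSpace(text) != "" {
			reasoningParts = append(reasoningParts, text)
		}
	}
	return strings.Join(reasoningParts, "\n\n")
}

func main() {
	content := []byte(`[
	  {"type":"reasoning","text":"check the weather tool first"},
	  {"type":"text","text":"Let me look that up."}
	]`)
	fmt.Printf("%q\n", collectReasoning("assistant", content))
}
```

The result would then be written to the outgoing assistant message as reasoning_content, the same way the existing "thinking" path does.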

Gap B: Responses API Translator Missing reasoning Input Items

File: internal/translator/openai/openai/responses/openai_openai-responses_request.go

Problem: ConvertOpenAIResponsesRequestToOpenAIChatCompletions() does not handle reasoning input items. When Codex CLI sends conversation history containing reasoning items from previous turns, the reasoning data is silently dropped during translation to Chat Completions format.

Fix: Add case "reasoning": branch to extract summary_text (and encrypted_content as fallback), inject into next assistant message as reasoning_content.
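The extraction step for Gap B might look like the sketch below. It assumes a Responses API reasoning item shaped as `{"type":"reasoning","summary":[{"type":"summary_text","text":"..."}],"encrypted_content":"..."}`; `reasoningFromItem` and the struct names are hypothetical, and how the result is attached to the next assistant message is left to the translator.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// summaryPart is one entry of a reasoning item's summary array.
type summaryPart struct {
	Type string `json:"type"`
	Text string `json:"text"`
}

// reasoningItem models the assumed shape of a Responses API reasoning item.
type reasoningItem struct {
	Type             string        `json:"type"`
	Summary          []summaryPart `json:"summary"`
	EncryptedContent string        `json:"encrypted_content"`
}

// reasoningFromItem (hypothetical helper) returns the text to carry forward
// as reasoning_content. It prefers summary_text entries and falls back to
// encrypted_content; ok is false when the item holds no usable reasoning.
func reasoningFromItem(raw []byte) (text string, ok bool) {
	var item reasoningItem
	if err := json.Unmarshal(raw, &item); err != nil || item.Type != "reasoning" {
		return "", false
	}
	var texts []string
	for _, p := range item.Summary {
		if p.Type == "summary_text" && strings.TrimSpace(p.Text) != "" {
			texts = append(texts, p.Text)
		}
	}
	if len(texts) > 0 {
		return strings.Join(texts, "\n\n"), true
	}
	if item.EncryptedContent != "" {
		return item.EncryptedContent, true
	}
	return "", false
}

func main() {
	item := []byte(`{"type":"reasoning","summary":[{"type":"summary_text","text":"need to list files"}]}`)
	if text, ok := reasoningFromItem(item); ok {
		fmt.Println(text)
	}
}
```

A translator using this would hold the returned text until it emits the next assistant message and set it there as reasoning_content.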

Why These Must Be Fixed Separately

All translator fixes are blocked by translator-path-guard CI which rejects any PR touching internal/translator/. This PR intentionally avoids that restriction by working at the executor layer.

Recommendation: The maintainer should fix Gap A and Gap B directly, or temporarily allow translator changes for this specific fix. Without these translator fixes, the executor-layer PreserveReasoningContent can only help when the reasoning_content field already exists in the request body (i.e., when the client SDK preserved it but other processing stripped it). When the translator drops reasoning data during format conversion, there is nothing for the executor to preserve.

Full Coverage Requires All Three Layers

| Layer | Gap | Status | Impact Without Fix |
| --- | --- | --- | --- |
| Executor (this PR) | SDK strips reasoning_content | ✅ Fixed | Chat Completions clients lose reasoning |
| Translator: Claude→OpenAI (Gap A) | reasoning type not handled | ❌ Blocked | Claude-format clients (Kilo, Claude Code) → DeepSeek/MiMo fail |
| Translator: Responses→Chat (Gap B) | reasoning input items dropped | ❌ Blocked | Codex CLI → DeepSeek/MiMo fail |


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a new utility function, PreserveReasoningContent, in the helps package to ensure that reasoning_content is maintained in assistant messages during multi-turn tool-call scenarios. This logic is required by providers like MiMo and DeepSeek to maintain model context when client SDKs might otherwise strip it. The function is integrated into the OpenAICompatExecutor for both standard and streaming execution paths and is accompanied by comprehensive unit tests covering various injection scenarios and edge cases. I have no feedback to provide as no review comments were present.

@gwdgithubnom gwdgithubnom marked this pull request as ready for review May 25, 2026 09:54

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e150c06029


Comment thread internal/runtime/executor/helps/reasoning_preserve.go Outdated