Skip to content

Conversation

@bigboateng
Copy link
Contributor

@bigboateng bigboateng commented Dec 10, 2025

Important

Fixes thinking block handling to prevent 400 errors by correcting type definitions and ensuring proper block ordering.

  • Behavior:
    • Fixes 400 errors by correcting BetaThinkingBlock type to use string for thinking.
    • Adds BetaRedactedThinkingBlock type for redacted thinking responses.
    • Ensures thinking blocks are ordered first in loop.ts and utils/message-processing.ts.
  • Tests:
    • Adds test-extended-thinking-multi-turn.ts and test-extended-thinking.ts to validate multi-turn and single-turn thinking block handling.
  • Misc:
    • Updates responseToParams in utils/message-processing.ts to handle new block types.
    • Modifies samplingLoop in loop.ts to include block ordering logic.

This description was created by Ellipsis for 3947312. You can customize this summary. It will automatically update as commits are pushed.

Fixes issue #12 where thinkingBudget usage caused 400 errors from
Anthropic's API due to improper thinking block handling.

Changes:
- Fix BetaThinkingBlock type: thinking field is string (content), not config object
- Add BetaRedactedThinkingBlock type for redacted thinking responses
- Update responseToParams to properly handle thinking and redacted_thinking blocks
- Add explicit block ordering when pushing assistant messages to ensure
  thinking blocks always come first (API requirement)
- Add test examples for extended thinking (single and multi-turn)

The root cause was conflating the thinking CONFIG (what you send to enable
thinking) with the thinking BLOCK (what the API returns with actual content).
@ellipsis-dev ellipsis-dev bot changed the title ... fix: correct thinking block handling to prevent 400 errors Dec 10, 2025
@bigboateng bigboateng merged commit b60c1ef into main Dec 10, 2025
1 check passed
@bigboateng bigboateng deleted the fix/thinking-block-handling branch December 10, 2025 18:13
Copy link

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Important

Looks good to me! 👍

Reviewed everything up to 3947312 in 2 minutes and 19 seconds. Click for details.
  • Reviewed 328 lines of code in 6 files
  • Skipped 0 files when reviewing.
  • Skipped posting 5 draft comments. View those below.
  • Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .changeset/fix-thinking-block-handling.md:1
  • Draft comment:
    Documentation is clear and detailed. The explanation of the fix and usage instructions are well described.
  • Reason this comment was not posted:
    Confidence changes required: 0% <= threshold 50% None
2. examples/test-extended-thinking-multi-turn.ts:55
  • Draft comment:
    The use of the expression "=".repeat(70) is clever but the extra space before .repeat might impact readability. Consider removing the space for clarity.
  • Reason this comment was not posted:
    Confidence changes required: 33% <= threshold 50% None
3. loop.ts:277
  • Draft comment:
    The explicit block ordering for assistant messages is a good solution. However, any new block types should be explicitly added to the order dictionary to avoid unexpected ordering. Also, verify that the extra request payload (extraBody.thinking set as a config object) aligns with the updated response type that now expects a string.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 50% <= threshold 50% The comment suggests adding new block types to a dictionary to avoid unexpected ordering, which is a specific code suggestion. However, the second part of the comment asks to verify alignment between request payload and response type, which is not allowed as it asks for verification. The first part is useful, but the second part violates the rules.
4. types/beta.ts:48
  • Draft comment:
    BetaThinkingBlock now requires a string for the 'thinking' field and a mandatory 'signature'. Ensure that all components generating these blocks supply a valid signature; otherwise, a default (like an empty string) might be needed.
  • Reason this comment was not posted:
    Comment did not seem useful. Confidence is useful = 20% <= threshold 50% The comment is informative and suggests ensuring that components supply a valid signature or use a default. It doesn't provide a specific code suggestion or ask for a test to be written. It seems to be more of a reminder or caution, which violates the rules.
5. utils/message-processing.ts:18
  • Draft comment:
    When converting 'thinking' blocks in responseToParams, consider preserving additional properties (e.g. id and cache_control) so that metadata isn’t lost for later processing.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50% The comment is speculative - it says "consider preserving" and mentions properties "for later processing" without showing evidence that these properties exist or are needed. The old code did use ...rest which would preserve extra properties, but the new code explicitly doesn't. However, I don't see any evidence in the file that id or cache_control properties are actually used on thinking blocks. The comment is making an assumption about what might be needed without demonstrating that it's actually a problem. This violates the rule about not making speculative comments. The old code did use ...rest to preserve additional properties, so the author intentionally changed this behavior. Maybe there's a reason they wanted to be more explicit and only include specific properties. Without seeing the type definitions or usage elsewhere, I can't be certain this isn't a real issue. While the old code did preserve extra properties, the fact that the author explicitly changed this suggests it was intentional. The comment is speculative because it doesn't show evidence that id or cache_control are actually present on thinking blocks or needed elsewhere. Without strong evidence of a problem, this should be deleted. Delete this comment. It's speculative and doesn't provide strong evidence that the suggested properties are actually needed. The change from ...rest to explicit properties appears intentional, and the comment doesn't demonstrate a concrete issue.

Workflow ID: wflow_RmCbR9KYpcKlqUeL

You can customize Ellipsis by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants