
feat:Update OpenAPI spec with raw_prompting param, max_tokens clarification, and Thinking schema #242


Merged
merged 1 commit into main on Aug 8, 2025

Conversation

HavenDV
Contributor

@HavenDV HavenDV commented Aug 8, 2025

Summary by CodeRabbit

  • New Features

    • Introduced a new option to enable enhanced reasoning capabilities, allowing the model to output internal reasoning steps before providing a final answer.
    • Added a new setting to control whether prompts are sent to the model without pre-processing.
  • Improvements

    • Clarified the behavior of the output length setting, including default values and limits, to help users better manage response completeness.


coderabbitai bot commented Aug 8, 2025

Walkthrough

The OpenAPI specification was updated to add a new boolean parameter raw_prompting to several request schemas, clarify the behavior of the max_tokens parameter, introduce a new Thinking schema for enhanced reasoning features, and reference this schema in relevant request definitions.

Changes

Cohort: OpenAPI Specification Updates
File(s): src/libs/Cohere/openapi.yaml
Change Summary:
- Added raw_prompting boolean parameter to multiple request schemas.
- Clarified max_tokens parameter description regarding defaults, limits, and finish reasons.
- Introduced new Thinking schema with type and optional token_budget, describing enhanced reasoning output.
- Added thinking property referencing the new schema in relevant request definitions.
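For reference, a request body combining the new fields might look like the following minimal sketch. Field names (raw_prompting, thinking, token_budget, max_tokens) come from this PR's openapi.yaml; the model name and message shape are illustrative assumptions, not verified against the Cohere API:

```python
import json

# Sketch of a chat request body using the fields added/clarified in this PR.
# Only the field names are taken from the spec; values are illustrative.
payload = {
    "model": "command-a-03-2025",  # hypothetical model name
    "messages": [{"role": "user", "content": "Plan a three-step experiment."}],
    "max_tokens": 1024,            # explicit cap; hitting it yields a MAX_TOKENS finish reason
    "raw_prompting": False,        # keep prompt pre-processing enabled
    "thinking": {
        "type": "enabled",
        "token_budget": 512,       # must be a positive integer per the schema
    },
}

print(json.dumps(payload, indent=2))
```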

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API
    participant Model

    Client->>API: Send request (may include raw_prompting, max_tokens, thinking)
    API->>Model: Forward prompt (raw or preprocessed, per raw_prompting)
    Model-->>API: Generate response (may include thinking blocks)
    API-->>Client: Return response (with/without thinking content)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇
In the warren of code, a new path appears,
With raw prompts and thinking, the future is clear.
Tokens explained, and logic unveiled,
Now models can reason before answers are hailed.
Hopping ahead with specs refined,
More thoughtful replies are now defined!


@HavenDV HavenDV enabled auto-merge (squash) August 8, 2025 15:26
@HavenDV HavenDV merged commit 727b60c into main Aug 8, 2025
3 of 4 checks passed
@HavenDV HavenDV deleted the bot/update-openapi_202508081525 branch August 8, 2025 15:28
@coderabbitai coderabbitai bot changed the title from feat:@coderabbitai to feat:Update OpenAPI spec with raw_prompting param, max_tokens clarification, and Thinking schema Aug 8, 2025

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (1)
src/libs/Cohere/openapi.yaml (1)

7423-7423: LGTM: clearer max_tokens semantics

The clarified defaults and MAX_TOKENS finish_reason behavior look good. Consider adding an example in this schema to encourage explicit values and reduce cost surprises.
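A sketch of what that suggested example could look like in the schema (the value 1024 is illustrative, not taken from the PR):

```yaml
max_tokens:
  type: integer
  example: 1024
```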

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9c7fa48 and c182ec9.

⛔ Files ignored due to path filters (13)
  • src/libs/Cohere/Generated/Cohere..JsonSerializerContext.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.CohereClient.Chat.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.CohereClient.Chatv2.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.ICohereClient.Chat.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.ICohereClient.Chatv2.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.JsonConverters.ThinkingType.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.JsonConverters.ThinkingTypeNullable.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.JsonSerializerContextTypes.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.ChatRequest.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.Chatv2Request.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.Thinking.Json.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.Thinking.g.cs is excluded by !**/generated/**
  • src/libs/Cohere/Generated/Cohere.Models.ThinkingType.g.cs is excluded by !**/generated/**
📒 Files selected for processing (1)
  • src/libs/Cohere/openapi.yaml (5 hunks)

Comment on lines +151 to +155
raw_prompting:
  type: boolean
  description: "When enabled, the user's prompt will be sent to the model without\nany pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
  x-fern-audiences:
    - sdk-only

💡 Verification agent

🧩 Analysis chain

Confirm consistency and placement of raw_prompting across endpoints

Ensure this property appears only on endpoints where SDK-side preprocessing is relevant (e.g., chat/generate) and is consistently audience-restricted.


🏁 Script executed:

#!/bin/bash
# Show all occurrences and nearby context to verify intended placement and sdk-only audience.
rg -n -A 6 -B 2 $'^\\s*raw_prompting:\\s*$' src/libs/Cohere/openapi.yaml

Length of output: 1590


Inconsistent audience restriction for raw_prompting
The raw_prompting property is marked as sdk-only in two places but is (incorrectly) exposed as public at lines 6529–6533. To keep it only on endpoints with SDK-side preprocessing (e.g., chat/generate), update that block:

• src/libs/Cohere/openapi.yaml:

  • Lines 151–156: x-fern-audiences: [ sdk-only ] ✔️
  • Lines 6529–6533: currently x-fern-audiences: [ public ]
  • Lines 7448–7452: x-fern-audiences: [ sdk-only ] ✔️

Please change lines 6532–6533 to:

x-fern-audiences:
  - sdk-only
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 6529 to 6533, the `raw_prompting`
property is incorrectly marked with `x-fern-audiences: [ public ]`. To fix this,
change the audience restriction to `x-fern-audiences: [ sdk-only ]` by updating
lines 6532–6533 accordingly, ensuring consistency with other occurrences of
`raw_prompting`.

🛠️ Refactor suggestion

raw_prompting: set a safe default and fix brand capitalization; clarify implications

  • Add an explicit default: false to avoid accidental bypass of SDK preprocessing.
  • Fix “AWS Sagemaker” → “AWS SageMaker”.
  • Add a short caution in the description so consumers understand it bypasses SDK-side preprocessing (server-side safety still applies).
                 raw_prompting:
                   type: boolean
+                  default: false
-                  description: "When enabled, the user's prompt will be sent to the model without
-any pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
+                  description: "When enabled, the user's prompt will be sent to the model without
+any SDK-side pre-processing (e.g., templating/system preambles). Server-side safety policies still apply.\n\nCompatible Deployments: Cohere Platform, Azure, AWS SageMaker/Bedrock, Private Deployments\n"
                   x-fern-audiences:
                     - sdk-only
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 151 to 155, set the default value
of raw_prompting explicitly to false to prevent accidental bypass of SDK
preprocessing. Correct the brand name from "AWS Sagemaker" to "AWS SageMaker"
for proper capitalization. Additionally, update the description to include a
caution note explaining that enabling raw_prompting bypasses SDK-side
preprocessing, although server-side safety checks remain in effect.

Comment on lines +7448 to +7452
raw_prompting:
  type: boolean
  description: "When enabled, the user's prompt will be sent to the model without\nany pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
  x-fern-audiences:
    - sdk-only

🛠️ Refactor suggestion

Duplicate: apply same raw_prompting defaults and brand capitalization here

Mirror the earlier suggestions (default: false, “AWS SageMaker”) to keep the spec consistent.

                 raw_prompting:
                   type: boolean
+                  default: false
-                  description: "When enabled, the user's prompt will be sent to the model without
-any pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
+                  description: "When enabled, the user's prompt will be sent to the model without
+any SDK-side pre-processing (e.g., templating/system preambles). Server-side safety policies still apply.\n\nCompatible Deployments: Cohere Platform, Azure, AWS SageMaker/Bedrock, Private Deployments\n"
                   x-fern-audiences:
                     - sdk-only

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 7448 to 7452, the raw_prompting
defaults and brand capitalization are inconsistent with earlier parts of the
spec. Update this section to apply the same default value of false for
raw_prompting and use the brand capitalization "AWS SageMaker" to maintain
consistency throughout the specification.

Comment on lines +7496 to +7497
thinking:
  $ref: '#/components/schemas/Thinking'

💡 Verification agent

🧩 Analysis chain

thinking property: verify coverage across all relevant request variants

Good addition. Please double-check that all applicable request schemas (v1/v2, streaming/non-streaming) expose thinking consistently if that’s intended.


🏁 Script executed:

#!/bin/bash
# Find all request bodies that include `thinking:` to confirm consistent coverage.
rg -n -A 3 -B 6 $'^\\s*thinking:\\s*$' src/libs/Cohere/openapi.yaml

# Also verify any response schemas mention thinking content blocks if they are returned.
rg -n -A 4 -B 2 'thinking content block|thinking.*content block|thinking.*tool_calls' src/libs/Cohere/openapi.yaml

Length of output: 1688


thinking property: missing in other request variants

I only found thinking: referenced once (line 7496 in src/libs/Cohere/openapi.yaml). It’s not exposed in the other request schemas (e.g. v2, streaming endpoints). If you intend to offer “thinking” across all variants, please add the thinking property to each relevant request body.

• Location with existing property
– src/libs/Cohere/openapi.yaml line 7496

Please update the other request schemas accordingly.

🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around line 7496, the "thinking" property is
defined only in one request schema but is missing from other relevant request
schemas such as v2 and streaming endpoints. To fix this, identify all request
body schemas where "thinking" should be available and add the "thinking"
property definition consistently to each of those schemas to ensure uniform API
behavior.
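If rg is unavailable, the same coverage check can be sketched in Python against the spec file. The fragment below is a toy excerpt, not the real src/libs/Cohere/openapi.yaml; only the `thinking:` property name comes from this PR:

```python
import re

# Count bare `thinking:` property lines, mirroring the rg pattern above.
spec_excerpt = """\
    Chatv2Request:
      properties:
        thinking:
          $ref: '#/components/schemas/Thinking'
    ChatRequest:
      properties:
        max_tokens:
          type: integer
"""

matches = re.findall(r"^\s*thinking:\s*$", spec_excerpt, flags=re.MULTILINE)
print(f"thinking appears {len(matches)} time(s)")
```

Running this over the full spec would list every request schema that exposes the property, making gaps like the one flagged above easy to spot.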

Comment on lines +17627 to +17647
Thinking:
  required:
    - type
  type: object
  properties:
    token_budget:
      minimum: 1
      type: integer
      description: "The maximum number of tokens the model can use for thinking, which must be set to a positive integer.\nThe model will stop thinking if it reaches the thinking token budget and will proceed with the response.\n"
      x-fern-audiences:
        - public
    type:
      enum:
        - enabled
        - disabled
      type: string
      x-fern-audiences:
        - public
  description: "Thinking gives the model enhanced reasoning capabilities for complex tasks, while also providing transparency into its step-by-step thought process before it delivers its final answer.\nWhen thinking is turned on, the model creates thinking content blocks where it outputs its internal reasoning. The model will incorporate insights from this reasoning before crafting a final response.\nWhen thinking is used without tools, the API response will include both thinking and text content blocks. Meanwhile, when thinking is used alongside tools and the model makes tool calls, the API response will include the thinking content block and `tool_calls`.\n"
  x-fern-audiences:
    - public

🛠️ Refactor suggestion

Strengthen the Thinking schema: default, constraints, and minor typing polish

  • Provide a safe default (disabled).
  • Disallow extraneous properties to keep payloads predictable.
  • Add int32 format to token_budget.
  • Optional: examples to guide clients.
     Thinking:
       required:
         - type
       type: object
+      default:
+        type: disabled
+      additionalProperties: false
       properties:
         token_budget:
           minimum: 1
           type: integer
+          format: int32
           description: "The maximum number of tokens the model can use for thinking, which must be set to a positive integer.
 The model will stop thinking if it reaches the thinking token budget and will proceed with the response.
 "
           x-fern-audiences:
             - public
         type:
           enum:
             - enabled
             - disabled
           type: string
           x-fern-audiences:
             - public
       description: "Thinking gives the model enhanced reasoning capabilities for complex tasks, while also providing transparency into its step-by-step thought process before it delivers its final answer.
 When thinking is turned on, the model creates thinking content blocks where it outputs its internal reasoning. The model will incorporate insights from this reasoning before crafting a final response.
 When thinking is used without tools, the API response will include both thinking and text content blocks. Meanwhile, when thinking is used alongside tools and the model makes tool calls, the API response will include the thinking content block and `tool_calls`.
 "
       x-fern-audiences:
         - public

Note: If you want to prevent token_budget when type=disabled, consider modeling two variants via oneOf (enabled/disabled) in OpenAPI 3.1; otherwise keep server-side validation.

🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 17627 to 17647, the Thinking schema
needs improvements: set the default value to "disabled" for safety, add
constraints to disallow additional properties to ensure predictable payloads,
specify the token_budget property with int32 format for proper typing, and
optionally include examples to help clients understand usage. Consider using
oneOf to model enabled and disabled variants if you want to prevent token_budget
when type is disabled; otherwise, handle this validation on the server side.
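If the oneOf route is taken, the two-variant modeling might be sketched as follows. The ThinkingEnabled/ThinkingDisabled names are invented for illustration and descriptions are elided:

```yaml
Thinking:
  oneOf:
    - $ref: '#/components/schemas/ThinkingEnabled'
    - $ref: '#/components/schemas/ThinkingDisabled'
  discriminator:
    propertyName: type
ThinkingEnabled:
  type: object
  additionalProperties: false
  required: [type]
  properties:
    type:
      type: string
      enum: [enabled]
    token_budget:
      type: integer
      format: int32
      minimum: 1
ThinkingDisabled:
  type: object
  additionalProperties: false
  required: [type]
  properties:
    type:
      type: string
      enum: [disabled]
```

This makes token_budget structurally impossible on the disabled variant, at the cost of two extra schema definitions and reliance on OpenAPI 3.1 discriminator support in downstream code generators.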
