feat: Update OpenAPI spec with raw_prompting param, max_tokens clarification, and Thinking schema #242
Conversation
Walkthrough

The OpenAPI specification was updated to add a new boolean `raw_prompting` parameter, clarify `max_tokens` semantics, and introduce a `Thinking` schema.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant Model
    Client->>API: Send request (may include raw_prompting, max_tokens, thinking)
    API->>Model: Forward prompt (raw or preprocessed, per raw_prompting)
    Model-->>API: Generate response (may include thinking blocks)
    API-->>Client: Return response (with/without thinking content)
```
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Actionable comments posted: 5
🧹 Nitpick comments (1)
src/libs/Cohere/openapi.yaml (1)
`7423`: **LGTM: clearer max_tokens semantics**

The clarified defaults and MAX_TOKENS finish_reason behavior look good. Consider adding an example in this schema to encourage explicit values and reduce cost surprises.
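One way to act on that suggestion: a quick sketch of a client request that sets `max_tokens` explicitly. The payload shape below is illustrative only (the message text and cap value are made up, not taken from the spec).

```python
# Illustrative chat request payload with an explicit max_tokens value,
# as the review suggests encouraging (field name mirrors the spec; the
# message and the cap of 256 are invented for the example).
request = {
    "message": "Summarize this document in two sentences.",
    "max_tokens": 256,  # explicit cap: predictable cost; MAX_TOKENS finish_reason if hit
}

# A positive integer is required; zero or negative caps make no sense.
assert isinstance(request["max_tokens"], int) and request["max_tokens"] > 0
```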
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (13)

- `src/libs/Cohere/Generated/Cohere..JsonSerializerContext.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.CohereClient.Chat.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.CohereClient.Chatv2.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.ICohereClient.Chat.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.ICohereClient.Chatv2.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.JsonConverters.ThinkingType.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.JsonConverters.ThinkingTypeNullable.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.JsonSerializerContextTypes.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.ChatRequest.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.Chatv2Request.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.Thinking.Json.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.Thinking.g.cs` is excluded by `!**/generated/**`
- `src/libs/Cohere/Generated/Cohere.Models.ThinkingType.g.cs` is excluded by `!**/generated/**`
📒 Files selected for processing (1)

- `src/libs/Cohere/openapi.yaml` (5 hunks)
```yaml
      raw_prompting:
        type: boolean
        description: "When enabled, the user's prompt will be sent to the model without\nany pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
        x-fern-audiences:
          - sdk-only
```
💡 Verification agent
🧩 Analysis chain
Confirm consistency and placement of raw_prompting across endpoints
Ensure this property appears only on endpoints where SDK-side preprocessing is relevant (e.g., chat/generate) and is consistently audience-restricted.
🏁 Script executed:

```bash
#!/bin/bash
# Show all occurrences and nearby context to verify intended placement and sdk-only audience.
rg -n -A 6 -B 2 $'^\\s*raw_prompting:\\s*$' src/libs/Cohere/openapi.yaml
```

Length of output: 1590
**Inconsistent audience restriction for raw_prompting**

The `raw_prompting` property is marked as `sdk-only` in two places but is (incorrectly) exposed as `public` at lines 6529–6533. To keep it only on endpoints with SDK-side preprocessing (e.g., chat/generate), update that block in `src/libs/Cohere/openapi.yaml`:

- Lines 151–156: `x-fern-audiences: [ sdk-only ]` ✔️
- Lines 6529–6533: currently `x-fern-audiences: [ public ]` ⛔
- Lines 7448–7452: `x-fern-audiences: [ sdk-only ]` ✔️

Please change lines 6532–6533 to:

```yaml
        x-fern-audiences:
          - sdk-only
```
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 6529 to 6533, the `raw_prompting`
property is incorrectly marked with `x-fern-audiences: [ public ]`. To fix this,
change the audience restriction to `x-fern-audiences: [ sdk-only ]` by updating
lines 6532–6533 accordingly, ensuring consistency with other occurrences of
`raw_prompting`.
🛠️ Refactor suggestion

**raw_prompting: set a safe default and fix brand capitalization; clarify implications**

- Add an explicit `default: false` to avoid accidental bypass of SDK preprocessing.
- Fix "AWS Sagemaker" → "AWS SageMaker".
- Add a short caution in the description so consumers understand it bypasses SDK-side preprocessing (server-side safety still applies).

```diff
       raw_prompting:
         type: boolean
+        default: false
-        description: "When enabled, the user's prompt will be sent to the model without\nany pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
+        description: "When enabled, the user's prompt will be sent to the model without\nany SDK-side pre-processing (e.g., templating/system preambles). Server-side safety policies still apply.\n\nCompatible Deployments: Cohere Platform, Azure, AWS SageMaker/Bedrock, Private Deployments\n"
         x-fern-audiences:
           - sdk-only
```
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 151 to 155, set the default value
of raw_prompting explicitly to false to prevent accidental bypass of SDK
preprocessing. Correct the brand name from "AWS Sagemaker" to "AWS SageMaker"
for proper capitalization. Additionally, update the description to include a
caution note explaining that enabling raw_prompting bypasses SDK-side
preprocessing, although server-side safety checks remain in effect.
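As a rough illustration of what the suggested `default: false` buys on the client side, here is a sketch of SDK-side default handling. The helper `apply_raw_prompting_default` is hypothetical, not part of the generated Cohere client.

```python
# Hypothetical sketch: an SDK applying the suggested `default: false`
# for raw_prompting when the caller omits it, so SDK preprocessing is
# never bypassed by accident.
def apply_raw_prompting_default(request: dict) -> dict:
    request = dict(request)  # avoid mutating the caller's dict
    request.setdefault("raw_prompting", False)
    return request

# Omitting the field yields the safe default.
req = apply_raw_prompting_default({"message": "Hello"})
assert req["raw_prompting"] is False

# An explicit opt-in is preserved.
req2 = apply_raw_prompting_default({"message": "Hi", "raw_prompting": True})
assert req2["raw_prompting"] is True
```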
```yaml
      raw_prompting:
        type: boolean
        description: "When enabled, the user's prompt will be sent to the model without\nany pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
        x-fern-audiences:
          - sdk-only
```
🛠️ Refactor suggestion

**Duplicate: apply same raw_prompting defaults and brand capitalization here**

Mirror the earlier suggestions (`default: false`, "AWS SageMaker") to keep the spec consistent.

```diff
       raw_prompting:
         type: boolean
+        default: false
-        description: "When enabled, the user's prompt will be sent to the model without\nany pre-processing.\n\nCompatible Deployments: Cohere Platform, Azure, AWS Sagemaker/Bedrock, Private Deployments\n"
+        description: "When enabled, the user's prompt will be sent to the model without\nany SDK-side pre-processing (e.g., templating/system preambles). Server-side safety policies still apply.\n\nCompatible Deployments: Cohere Platform, Azure, AWS SageMaker/Bedrock, Private Deployments\n"
         x-fern-audiences:
           - sdk-only
```
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 7448 to 7452, the raw_prompting
defaults and brand capitalization are inconsistent with earlier parts of the
spec. Update this section to apply the same default value of false for
raw_prompting and use the brand capitalization "AWS SageMaker" to maintain
consistency throughout the specification.
```yaml
      thinking:
        $ref: '#/components/schemas/Thinking'
```
💡 Verification agent
🧩 Analysis chain
thinking property: verify coverage across all relevant request variants
Good addition. Please double-check that all applicable request schemas (v1/v2, streaming/non-streaming) expose thinking consistently if that’s intended.
🏁 Script executed:

```bash
#!/bin/bash
# Find all request bodies that include `thinking:` to confirm consistent coverage.
rg -n -A 3 -B 6 $'^\\s*thinking:\\s*$' src/libs/Cohere/openapi.yaml
# Also verify any response schemas mention thinking content blocks if they are returned.
rg -n -A 4 -B 2 'thinking content block|thinking.*content block|thinking.*tool_calls' src/libs/Cohere/openapi.yaml
```

Length of output: 1688
**thinking property: missing in other request variants**

I only found `thinking:` referenced once (line 7496 in `src/libs/Cohere/openapi.yaml`). It's not exposed in the other request schemas (e.g. v2, streaming endpoints). If you intend to offer "thinking" across all variants, please add the `thinking` property to each relevant request body.

- Location with existing property: `src/libs/Cohere/openapi.yaml`, line 7496

Please update the other request schemas accordingly.
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around line 7496, the "thinking" property is
defined only in one request schema but is missing from other relevant request
schemas such as v2 and streaming endpoints. To fix this, identify all request
body schemas where "thinking" should be available and add the "thinking"
property definition consistently to each of those schemas to ensure uniform API
behavior.
```yaml
    Thinking:
      required:
        - type
      type: object
      properties:
        token_budget:
          minimum: 1
          type: integer
          description: "The maximum number of tokens the model can use for thinking, which must be set to a positive integer.\nThe model will stop thinking if it reaches the thinking token budget and will proceed with the response.\n"
          x-fern-audiences:
            - public
        type:
          enum:
            - enabled
            - disabled
          type: string
          x-fern-audiences:
            - public
      description: "Thinking gives the model enhanced reasoning capabilities for complex tasks, while also providing transparency into its step-by-step thought process before it delivers its final answer.\nWhen thinking is turned on, the model creates thinking content blocks where it outputs its internal reasoning. The model will incorporate insights from this reasoning before crafting a final response.\nWhen thinking is used without tools, the API response will include both thinking and text content blocks. Meanwhile, when thinking is used alongside tools and the model makes tool calls, the API response will include the thinking content block and `tool_calls`.\n"
      x-fern-audiences:
        - public
```
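For a quick local sanity check of the constraints above, the schema can be mirrored by a small hand-rolled validator. This is illustrative only; a real client would rely on the generated SDK models or a JSON Schema library.

```python
# Minimal check mirroring the Thinking schema's constraints: `type` is
# required and enum-restricted, and token_budget (if present) must be a
# positive integer. The validator itself is a hypothetical sketch.
def validate_thinking(payload: dict) -> bool:
    if payload.get("type") not in ("enabled", "disabled"):
        return False  # `type` is required and must be one of the enum values
    budget = payload.get("token_budget")
    if budget is not None and (not isinstance(budget, int) or budget < 1):
        return False  # token_budget must satisfy minimum: 1
    return True

assert validate_thinking({"type": "enabled", "token_budget": 1024})
assert not validate_thinking({"type": "enabled", "token_budget": 0})
assert not validate_thinking({})  # missing required `type`
```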
🛠️ Refactor suggestion

**Strengthen the Thinking schema: default, constraints, and minor typing polish**

- Provide a safe default (disabled).
- Disallow extraneous properties to keep payloads predictable.
- Add int32 format to token_budget.
- Optional: examples to guide clients.

```diff
     Thinking:
       required:
         - type
       type: object
+      default:
+        type: disabled
+      additionalProperties: false
       properties:
         token_budget:
           minimum: 1
           type: integer
+          format: int32
           description: "The maximum number of tokens the model can use for thinking, which must be set to a positive integer.\nThe model will stop thinking if it reaches the thinking token budget and will proceed with the response.\n"
           x-fern-audiences:
             - public
         type:
           enum:
             - enabled
             - disabled
           type: string
           x-fern-audiences:
             - public
       description: "Thinking gives the model enhanced reasoning capabilities for complex tasks, while also providing transparency into its step-by-step thought process before it delivers its final answer.\nWhen thinking is turned on, the model creates thinking content blocks where it outputs its internal reasoning. The model will incorporate insights from this reasoning before crafting a final response.\nWhen thinking is used without tools, the API response will include both thinking and text content blocks. Meanwhile, when thinking is used alongside tools and the model makes tool calls, the API response will include the thinking content block and `tool_calls`.\n"
       x-fern-audiences:
         - public
```

Note: If you want to prevent token_budget when type=disabled, consider modeling two variants via oneOf (enabled/disabled) in OpenAPI 3.1; otherwise keep server-side validation.
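The oneOf idea in that note can be sketched as a validator that only accepts `token_budget` alongside `type: enabled`. The helper below is hypothetical and not the spec's wording; it just shows the behavior the two-variant modeling would enforce.

```python
# Sketch of the suggested oneOf modeling: an "enabled" variant that may
# carry token_budget, and a bare "disabled" variant that may not.
def validate_thinking_oneof(payload: dict) -> bool:
    keys = set(payload)
    if payload.get("type") == "enabled":
        # Only `type` and an optional positive-integer token_budget allowed.
        if not keys <= {"type", "token_budget"}:
            return False
        budget = payload.get("token_budget")
        return budget is None or (isinstance(budget, int) and budget >= 1)
    if payload.get("type") == "disabled":
        return keys == {"type"}  # no token_budget when thinking is disabled
    return False

assert validate_thinking_oneof({"type": "disabled"})
assert not validate_thinking_oneof({"type": "disabled", "token_budget": 64})
assert validate_thinking_oneof({"type": "enabled", "token_budget": 512})
```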
🤖 Prompt for AI Agents
In src/libs/Cohere/openapi.yaml around lines 17627 to 17647, the Thinking schema
needs improvements: set the default value to "disabled" for safety, add
constraints to disallow additional properties to ensure predictable payloads,
specify the token_budget property with int32 format for proper typing, and
optionally include examples to help clients understand usage. Consider using
oneOf to model enabled and disabled variants if you want to prevent token_budget
when type is disabled; otherwise, handle this validation on the server side.
Summary by CodeRabbit

- New Features
- Improvements