feat: Responses API support, plus a new configuration system and structured logging #142
base: master
Conversation
…se output message type
…nd state management
…ure in translation tests
…arsing and align with vscode-copilot-chat extractThinkingData, otherwise it will occasionally cause cache misses
…ing signature check and update prompt
…ing small model if no tools are used; 2. add bun idleTimeout = 0; 3. feat: compatible with Claude Code JSONL file usage error scenarios, delay closeBlockIfOpen, map the Responses API to Anthropic with tool_use support, and fix spelling errors; 4. feat: add configuration management with extra prompt handling and ensure config file creation
…is incompatible with gpt-5-mini
…ssage translation
…just runServer to set verbose level correctly
…ponses-api # Conflicts: # src/start.ts
…adjusting input token calculations and handling tool prompts
Walkthrough
Adds a new Responses API (POST /responses and POST /v1/responses), disk-backed config and per-handler loggers, a bidirectional translation layer between Anthropic and Copilot (including streaming translation and stateful function-call assembly), tokenizer refinements, route wiring, and tests for the new translation logic.
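For orientation, a minimal client call against the new endpoint might look like the sketch below (the port, model id, and payload fields are assumptions following the Responses API shape, not values taken from this PR):

```ts
// Hypothetical client call to the new endpoint; URL, model, and fields are assumed.
const res = await fetch("http://localhost:4141/v1/responses", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "gpt-4.1", // any model whose supported_endpoints includes /responses
    input: [{ role: "user", content: "Hello!" }],
    stream: false,
  }),
})
console.log(await res.json())
```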
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant Client
participant Server
participant Translator
participant Copilot as Copilot Responses API
participant Logger
rect rgb(235,245,255)
Note over Client,Server: Non-streaming /v1/responses flow
Client->>Server: POST /v1/responses (Anthropic payload)
Server->>Logger: logger.debug(request)
Server->>Translator: translateAnthropicMessagesToResponsesPayload()
Translator-->>Server: ResponsesPayload
Server->>Copilot: POST /responses (JSON)
Copilot-->>Server: ResponsesResult (JSON)
Server->>Translator: translateResponsesResultToAnthropic()
Translator-->>Server: AnthropicResponse
Server-->>Client: 200 JSON
end
rect rgb(255,245,235)
Note over Client,Server: Streaming flow (stream=true)
Client->>Server: POST /v1/responses (stream=true)
Server->>Copilot: POST /responses (stream)
Copilot-->>Server: stream of ResponseStreamEvent
loop for each event
Server->>Translator: translateResponsesStreamEvent(event, state)
Translator-->>Server: AnthropicStreamEvents[]
Server-->>Client: SSE chunk(s)
end
end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (22)
- README.md (1 hunks)
- src/lib/config.ts (1 hunks)
- src/lib/logger.ts (1 hunks)
- src/lib/paths.ts (1 hunks)
- src/lib/state.ts (1 hunks)
- src/lib/tokenizer.ts (8 hunks)
- src/routes/chat-completions/handler.ts (4 hunks)
- src/routes/messages/anthropic-types.ts (1 hunks)
- src/routes/messages/count-tokens-handler.ts (1 hunks)
- src/routes/messages/handler.ts (5 hunks)
- src/routes/messages/responses-stream-translation.ts (1 hunks)
- src/routes/messages/responses-translation.ts (1 hunks)
- src/routes/responses/handler.ts (1 hunks)
- src/routes/responses/route.ts (1 hunks)
- src/routes/responses/utils.ts (1 hunks)
- src/server.ts (2 hunks)
- src/services/copilot/create-responses.ts (1 hunks)
- src/services/copilot/get-models.ts (2 hunks)
- src/start.ts (2 hunks)
- tests/anthropic-request.test.ts (2 hunks)
- tests/responses-stream-translation.test.ts (1 hunks)
- tests/responses-translation.test.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (13)

src/server.ts (1)
- src/routes/responses/route.ts (1): responsesRoutes (7-7)

src/routes/responses/handler.ts (6)
- src/lib/logger.ts (1): createHandlerLogger (148-182)
- src/lib/rate-limit.ts (1): checkRateLimit (8-46)
- src/lib/state.ts (1): state (21-27)
- src/services/copilot/create-responses.ts (3): ResponsesPayload (8-26), createResponses (328-355), ResponsesResult (113-131)
- src/routes/responses/utils.ts (1): getResponsesRequestOptions (6-13)
- src/lib/approval.ts (1): awaitApproval (5-15)

src/lib/logger.ts (2)
- src/lib/paths.ts (1): PATHS (10-14)
- src/lib/state.ts (1): state (21-27)

tests/responses-stream-translation.test.ts (3)
- src/services/copilot/create-responses.ts (3): ResponseOutputItemAddedEvent (270-275), ResponseFunctionCallArgumentsDeltaEvent (247-253), ResponseFunctionCallArgumentsDoneEvent (255-262)
- src/routes/messages/responses-stream-translation.ts (2): createResponsesStreamState (74-82), translateResponsesStreamEvent (84-142)
- src/routes/messages/anthropic-types.ts (1): AnthropicStreamEventData (185-193)

src/routes/responses/route.ts (2)
- src/routes/responses/handler.ts (1): handleResponses (21-73)
- src/lib/error.ts (1): forwardError (15-47)

src/routes/messages/responses-translation.ts (3)
- src/routes/messages/anthropic-types.ts (13): AnthropicMessagesPayload (3-26), AnthropicMessage (82-82), AnthropicUserMessage (72-75), AnthropicAssistantMessage (77-80), AnthropicUserContentBlock (62-65), AnthropicAssistantContentBlock (67-70), AnthropicImageBlock (33-40), AnthropicThinkingBlock (56-60), AnthropicToolUseBlock (49-54), AnthropicToolResultBlock (42-47), AnthropicTextBlock (28-31), AnthropicTool (84-88), AnthropicResponse (90-112)
- src/services/copilot/create-responses.ts (20): ResponsesPayload (8-26), ResponseInputItem (89-94), ResponseInputContent (96-99), ResponseInputMessage (57-62), ResponseInputText (101-104), ResponseInputImage (106-111), ResponseInputReasoning (79-87), ResponseFunctionToolCallItem (64-70), ResponseFunctionCallOutputItem (72-77), Tool (35-35), ToolChoiceOptions (28-28), ToolChoiceFunction (30-33), ResponsesResult (113-131), ResponseOutputItem (143-146), ResponseOutputContentBlock (178-181), ResponseOutputReasoning (156-162), ResponseReasoningBlock (164-167), ResponseOutputFunctionCall (169-176), ResponseOutputText (183-187), ResponseOutputRefusal (189-192)
- src/lib/config.ts (2): getReasoningEffortForModel (92-97), getExtraPromptForModel (82-85)

src/routes/messages/handler.ts (10)
- src/lib/logger.ts (1): createHandlerLogger (148-182)
- src/lib/state.ts (1): state (21-27)
- src/routes/messages/anthropic-types.ts (1): AnthropicMessagesPayload (3-26)
- src/lib/config.ts (1): getSmallModel (87-90)
- src/routes/messages/non-stream-translation.ts (2): translateToOpenAI (29-47), translateToAnthropic (281-327)
- src/services/copilot/create-chat-completions.ts (2): createChatCompletions (8-47), ChatCompletionResponse (95-110)
- src/routes/messages/responses-translation.ts (2): translateAnthropicMessagesToResponsesPayload (48-87), translateResponsesResultToAnthropic (352-374)
- src/routes/responses/utils.ts (1): getResponsesRequestOptions (6-13)
- src/services/copilot/create-responses.ts (3): createResponses (328-355), ResponseStreamEvent (206-219), ResponsesResult (113-131)
- src/routes/messages/responses-stream-translation.ts (3): createResponsesStreamState (74-82), translateResponsesStreamEvent (84-142), buildErrorEvent (606-612)

src/routes/responses/utils.ts (1)
- src/services/copilot/create-responses.ts (2): ResponsesPayload (8-26), ResponseInputItem (89-94)

src/lib/config.ts (1)
- src/lib/paths.ts (1): PATHS (10-14)

src/routes/messages/responses-stream-translation.ts (3)
- src/services/copilot/create-responses.ts (15): ResponseStreamEvent (206-219), ResponseCreatedEvent (233-237), ResponseOutputItemAddedEvent (270-275), ResponseOutputItemDoneEvent (277-282), ResponseFunctionCallArgumentsDeltaEvent (247-253), ResponseFunctionCallArgumentsDoneEvent (255-262), ResponseTextDeltaEvent (302-309), ResponseReasoningSummaryTextDeltaEvent (284-291), ResponseReasoningSummaryTextDoneEvent (293-300), ResponseTextDoneEvent (311-318), ResponseCompletedEvent (221-225), ResponseIncompleteEvent (227-231), ResponseFailedEvent (264-268), ResponseErrorEvent (239-245), ResponsesResult (113-131)
- src/routes/messages/anthropic-types.ts (1): AnthropicStreamEventData (185-193)
- src/routes/messages/responses-translation.ts (1): translateResponsesResultToAnthropic (352-374)

src/routes/chat-completions/handler.ts (2)
- src/lib/logger.ts (1): createHandlerLogger (148-182)
- src/lib/state.ts (1): state (21-27)

src/services/copilot/create-responses.ts (3)
- src/lib/state.ts (1): state (21-27)
- src/lib/api-config.ts (2): copilotHeaders (20-37), copilotBaseUrl (16-19)
- src/lib/error.ts (1): HTTPError (6-13)

src/start.ts (2)
- src/lib/state.ts (1): state (21-27)
- src/server.ts (1): server (13-13)
🔇 Additional comments (16)
src/lib/paths.ts (2)
8-8: LGTM! The `CONFIG_PATH` constant follows the established pattern and integrates cleanly with the existing paths system.
19-19: LGTM! Ensuring the config file exists at startup is the correct approach for persistent configuration.
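As a rough illustration of the ensure-at-startup idea (a sketch with assumed paths and names; the real constants live in src/lib/paths.ts):

```ts
// Sketch: create the app data directory and an empty config file if missing.
// APP_DIR, CONFIG_PATH, and the 0o600 mode are assumptions for illustration.
import fs from "node:fs"
import path from "node:path"

const APP_DIR = path.join(process.env.HOME ?? ".", ".local", "share", "copilot-api")
const CONFIG_PATH = path.join(APP_DIR, "config.json")

export const ensureConfigFile = (): void => {
  fs.mkdirSync(APP_DIR, { recursive: true })
  if (!fs.existsSync(CONFIG_PATH)) {
    // Restrictive permissions, since the config may hold model prompts/settings.
    fs.writeFileSync(CONFIG_PATH, JSON.stringify({}, null, 2), { mode: 0o600 })
  }
}
```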
src/lib/state.ts (1)
18-18: LGTM! The `verbose` flag integrates cleanly with the new logging system and follows the existing state pattern.
src/routes/responses/route.ts (1)
1-15: LGTM! The route setup follows established patterns and includes proper error handling via `forwardError`.
src/start.ts (3)
34-34: LGTM! Setting `state.verbose` early ensures the logging system is properly configured before server startup.
117-117: Clarify the rationale for dynamic server import. The server module is now imported dynamically rather than at the top level. While this can be beneficial for deferring initialization until after setup, it also makes the dependency less obvious.
Could you clarify:
- What specific issue does the dynamic import solve?
- Are there side effects in the server module that need to be delayed?
- Does this affect error reporting during server initialization?
122-124: Verify the implications of disabling idle timeout. Setting `idleTimeout: 0` disables the idle connection timeout, which means connections can remain open indefinitely. This could potentially:
- Prevent proper resource cleanup
- Allow connections to accumulate if clients don't properly close them
- Impact server resource usage under certain conditions
Please verify:
- Is there a specific issue with Bun's default idle timeout behavior?
- Have you tested this under load with long-lived connections?
- Are there application-level timeouts to prevent resource exhaustion?
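For reference, this is roughly what the setting looks like in Bun (a sketch; the port and handler are placeholders, and the actual wiring lives in src/start.ts):

```ts
// Sketch: Bun server with the idle connection timeout disabled.
const server = Bun.serve({
  port: 4141, // assumed default port
  idleTimeout: 0, // 0 disables the idle timeout, so connections may stay open indefinitely
  fetch(_req: Request): Response {
    return new Response("ok")
  },
})
console.log(`listening on ${server.url}`)
```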
src/server.ts (2)
9-9: LGTM! The import follows the established pattern for route modules.
25-25: LGTM! Mounting the responses route at both `/responses` and `/v1/responses` (line 31) provides consistent API versioning compatibility.
tests/anthropic-request.test.ts (2)
139-139: LGTM! The test properly includes the new required `signature` field on thinking blocks, maintaining consistency with the updated `AnthropicThinkingBlock` interface.
172-172: LGTM! The test data correctly includes the signature field for thinking blocks with tool calls.
src/routes/chat-completions/handler.ts (4)
6-6: LGTM! The migration to per-handler logging improves log organization and aligns with the new structured logging system.
17-17: LGTM! Creating a dedicated logger instance for the chat-completions handler enables better log filtering and rotation.
23-23: LGTM! The logging calls are appropriately updated to use the handler-specific logger instance.
44-50: Good defensive coding with the `isNullish` check. The `isNullish` guard properly handles cases where `max_tokens` is `null` or `undefined`, preventing potential runtime issues when accessing `selectedModel?.capabilities.limits.max_output_tokens`.
src/routes/messages/anthropic-types.ts (1)
56-60: Signature field is properly handled across all translation layers and streaming. All code paths that construct `AnthropicThinkingBlock` objects correctly include the `signature` field:
- In `responses-translation.ts:389`, thinking blocks are constructed with `signature: (item.encrypted_content ?? "") + "@" + item.id`
- In `responses-stream-translation.ts:199-207`, signature deltas are properly emitted as stream events
- In `responses-translation.ts:250-251`, the signature field is correctly parsed and consumed
- Tests include the signature field with proper values
No issues found. The required field is consistently set during construction and properly propagated through all translation and streaming modules.
```ts
if (!anthropicPayload.tools || anthropicPayload.tools.length === 0) {
  anthropicPayload.model = getSmallModel()
}
```
Don't override the requested model for normal chats
Line 50 resets anthropicPayload.model to getSmallModel() whenever tools is empty. Almost every Anthropic Messages request arrives without tools, so this branch fires for genuine user traffic, silently downgrading premium model selections and preventing us from ever hitting the Responses API for models that actually support it. Net effect: users who ask for gpt-4 (or any other advanced tier) will now be served the small model instead—a clear functional regression.
Please drop this unconditional override (or gate it behind a precise warmup detector) so real requests keep the model they asked for.
```diff
- if (!anthropicPayload.tools || anthropicPayload.tools.length === 0) {
-   anthropicPayload.model = getSmallModel()
- }
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🤖 Prompt for AI Agents
In src/routes/messages/handler.ts around lines 50 to 52, the code
unconditionally sets anthropicPayload.model = getSmallModel() whenever
anthropicPayload.tools is missing or empty, which overrides user-requested
premium models; remove this unconditional override so the payload preserves the
client-requested model (or only apply getSmallModel() under a specific warmup
detector condition). Ensure no change to other logic: delete the branch that
sets the small model for empty tools (or wrap it in an explicit warmup check) so
normal chats keep their requested model.
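For readers wondering what a scoped gate could look like, here is a hypothetical sketch; `isWarmupRequest` and its heuristic are invented for illustration and are not part of this PR:

```ts
// Hypothetical warmup detector: only downgrade requests that look like
// Claude Code warmup pings instead of every tool-less request.
const isWarmupRequest = (payload: AnthropicMessagesPayload): boolean =>
  (!payload.tools || payload.tools.length === 0)
  && payload.messages.length === 1
  && payload.max_tokens <= 1 // warmup pings typically request a single token (assumed heuristic)

if (isWarmupRequest(anthropicPayload)) {
  anthropicPayload.model = getSmallModel()
}
```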
It fixes Claude Code 2.0.28 warmup requests consuming premium requests by forcing the small model when no tools are used. The Anthropic Messages API is what Claude Code uses.
@caozhiyuan, thank you for the clarification! I understand now that this is intentionally addressing the Claude Code 2.0.28 warmup issue where those requests were consuming premium model tokens unnecessarily.
Just to confirm my understanding: this logic will apply the small model to any request without tools (not only warmup requests specifically), which means regular user chats without tool definitions will also use the small model. If that's the intended behavior for your use case, then this makes sense as a pragmatic solution.
I appreciate you explaining the context—I'll note this as intentional. 👍
(\__/)
(='.'=) *noting for future*
(")_(")
✏️ Learnings added
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/messages/handler.ts:50-52
Timestamp: 2025-11-11T04:33:30.513Z
Learning: In src/routes/messages/handler.ts, forcing anthropicPayload.model to getSmallModel() when no tools are present is intentional behavior to fix Claude Code 2.0.28 warmup requests consuming premium model tokens. This applies to all requests without tools, not just warmup requests, and is an accepted design decision.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
```ts
logger.debug("Responses request payload:", JSON.stringify(payload))

const selectedModel = state.models?.data.find(
  (model) => model.id === payload.model,
)
const supportsResponses =
  selectedModel?.supported_endpoints?.includes(RESPONSES_ENDPOINT) ?? false

if (!supportsResponses) {
  return c.json(
    {
      error: {
        message:
          "This model does not support the responses endpoint. Please choose a different model.",
        type: "invalid_request_error",
      },
    },
    400,
  )
}

const { vision, initiator } = getResponsesRequestOptions(payload)

if (state.manualApprove) {
  await awaitApproval()
}

const response = await createResponses(payload, { vision, initiator })

if (isStreamingRequested(payload) && isAsyncIterable(response)) {
  logger.debug("Forwarding native Responses stream")
  return streamSSE(c, async (stream) => {
    for await (const chunk of response) {
      logger.debug("Responses stream chunk:", JSON.stringify(chunk))
      await stream.writeSSE({
        id: (chunk as { id?: string }).id,
        event: (chunk as { event?: string }).event,
        data: (chunk as { data?: string }).data ?? "",
      })
    }
  })
}

logger.debug(
  "Forwarding native Responses result:",
  JSON.stringify(response).slice(-400),
)
return c.json(response as ResponsesResult)
```
Avoid logging full request/response payloads
These logger.debug calls write the entire request payload and every streamed chunk to disk via the new per-handler logger (retained for seven days by default). That captures user prompts, tool arguments, and model outputs verbatim, which is a high-risk compliance/PII retention issue unless you have explicit consent and data-governance controls in place. Please remove or heavily redact these logs (e.g., log only metadata like model id or event types, or gate them behind a short-lived in-memory logger) before shipping.
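A possible shape for the redacted logging, assuming the handler logger accepts structured arguments (a sketch that mirrors the handler code quoted above, not a committed change):

```ts
// Sketch: log request/stream metadata without persisting user content.
logger.debug("Responses request", {
  model: payload.model,
  stream: Boolean(payload.stream),
  vision,
  initiator,
})

// ...inside the streaming loop:
logger.debug("Responses stream chunk", {
  id: (chunk as { id?: string }).id,
  event: (chunk as { event?: string }).event,
  // chunk.data intentionally omitted: it contains model output verbatim
})
```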
🤖 Prompt for AI Agents
In src/routes/responses/handler.ts around lines 25 to 72, remove or replace the
current logger.debug calls that serialize the full request payload and streamed
chunks; instead log only non-PII metadata (e.g., model id, whether
vision/initiator flags are set, event type, chunk id) and never log prompt or
chunk.data content verbatim. For the initial payload log drop
JSON.stringify(payload) and log payload.model or selectedModel.id and any
boolean flags; for streaming, log only chunk.id and chunk.event (no chunk.data)
or redact/mask it, and consider gating any full-content logging behind an
explicit opt-in short-lived in-memory flag or a secure, auditable
data-governance path before shipping. Ensure logs written here do not persist
sensitive user data and update comments/tests accordingly.
Users need to run `start -v` to enable the logger.
@caozhiyuan, thank you for the clarification! I see that the logger.debug calls only write to disk when users explicitly run with the -v flag, which significantly reduces the exposure compared to always-on logging.
That said, even opt-in verbose logging carries some residual risk:
- User awareness: Users enabling `-v` for debugging may not realize it captures full prompts, tool arguments, and model outputs for 7 days.
- Compliance context: In some jurisdictions (GDPR, CCPA), even short-term retention of user content requires explicit consent and data-governance controls, even when opt-in.
- Production incidents: Someone might enable verbose mode to troubleshoot an issue and forget the retention implications.
Suggestions to further reduce risk:
- Log metadata only in verbose mode: Even with `-v`, consider logging only `payload.model`, `vision`, `initiator`, `chunk.id`, and `chunk.event` (not `chunk.data` or full payloads).
- Add a separate flag for full-content logging: Reserve a more explicit flag (e.g., `--log-full-payloads`) with a clear warning about PII retention (sketched below).
- Document the implications: Add a note in your README or startup message that `-v` logs request/response content for 7 days.
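A minimal sketch of the separate-flag idea (the `--log-full-payloads` flag name is hypothetical, as noted above):

```ts
// Hypothetical opt-in gate for full-content logging; flag name is illustrative.
const logFullPayloads = process.argv.includes("--log-full-payloads")

const debugPayload = (label: string, payload: unknown): void => {
  if (logFullPayloads) {
    logger.debug(label, JSON.stringify(payload)) // full content: explicit opt-in only
  } else {
    logger.debug(label, { kind: typeof payload }) // metadata-only default
  }
}
```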
If verbose mode is specifically intended for debugging with full payloads and your users understand the implications, this design is reasonable. Just ensure it's documented and consider the suggestions above for production environments.
✏️ Learnings added
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/responses/handler.ts:25-72
Timestamp: 2025-11-11T04:15:40.752Z
Learning: In the copilot-api repository, debug logging that includes full request/response payloads is intentionally gated behind the `-v` (verbose) flag to reduce PII exposure. Users must explicitly enable verbose mode to activate detailed logging.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
```ts
getPayloadItems(payload).some((item) => {
  if (!("role" in item) || !item.role) {
    return true
  }
  const role = typeof item.role === "string" ? item.role.toLowerCase() : ""
  return role === "assistant"
})

export const hasVisionInput = (payload: ResponsesPayload): boolean => {
  const values = getPayloadItems(payload)
  return values.some((item) => containsVisionContent(item))
}
```
Fix initiator detection to avoid misclassifying user-driven calls.
hasAgentInitiator currently returns true for any payload item lacking a role (e.g., tool call entries) or whenever an assistant-role message appears anywhere in history. After translation, almost every request includes prior assistant/tool items, so we now send X-Initiator: agent even when the user just typed the latest message. That misreports real user traffic to Copilot and can trigger agent-only throttling or different policy handling. Please determine the initiator from the most recent message that actually has a role, ignoring items without one.
```diff
-export const hasAgentInitiator = (payload: ResponsesPayload): boolean =>
-  getPayloadItems(payload).some((item) => {
-    if (!("role" in item) || !item.role) {
-      return true
-    }
-    const role = typeof item.role === "string" ? item.role.toLowerCase() : ""
-    return role === "assistant"
-  })
+export const hasAgentInitiator = (payload: ResponsesPayload): boolean => {
+  const items = getPayloadItems(payload)
+
+  for (let index = items.length - 1; index >= 0; index -= 1) {
+    const item = items[index]
+    if (!("role" in item)) continue
+
+    const role = (item as { role?: unknown }).role
+    if (typeof role === "string" && role.length > 0) {
+      return role.toLowerCase() === "assistant"
+    }
+  }
+
+  return false
+}
```
🤖 Prompt for AI Agents
In src/routes/responses/utils.ts around lines 16 to 27, the current logic treats
any item lacking a role as an agent initiator and also looks anywhere in history
for assistant messages; update it to determine the initiator from the most
recent payload item that actually has a role: iterate the payload items in
reverse order, skip items without a "role" or with an empty role, take the first
item that has a role, normalize it to lowercase, and return true only if that
role equals "assistant" (otherwise return false); ensure you handle
string/possibly-nonstring role types safely and keep type signatures unchanged.
The current logic is consistent with the create-chat-completions.ts isAgentCall logic: https://github.com/ericc-ch/copilot-api/blob/master/src/services/copilot/create-chat-completions.ts#L21
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Fetch the create-chat-completions.ts file to examine the isAgentCall logic
cat src/services/copilot/create-chat-completions.ts
```
Length of output: 4282
@caozhiyuan, thank you for pointing out the consistency with isAgentCall in create-chat-completions.ts. You're right that both implementations use .some() to detect agent involvement in the payload.
I can see that `isAgentCall` checks:
```ts
const isAgentCall = payload.messages.some((msg) =>
  ["assistant", "tool"].includes(msg.role),
)
```
While `hasAgentInitiator` has similar logic, there are a couple of differences:
)While hasAgentInitiator has similar logic, there are a couple of differences:
- Missing role handling:
hasAgentInitiatorreturnstruefor items without arolefield, whileisAgentCalldoesn't encounter this case (all messages have roles) - Tool role:
isAgentCallchecks for both"assistant"and"tool"roles, buthasAgentInitiatoronly checks for"assistant"
Questions to clarify the intended behavior:
- Is the intent of `X-Initiator` to indicate "who sent the most recent message" or "is there any agent involvement in the conversation history"?
- Should items without a `role` field (e.g., function outputs, tool call items in the Responses API structure) be treated as agent-initiated?
- Should `hasAgentInitiator` also check for the `"tool"` role like `isAgentCall` does, for consistency?
If the current behavior (checking for any assistant/tool in the entire payload) is intentional for both endpoints, then the implementation is consistent as you noted, and the review comment's concern would apply to the broader design rather than just this specific function.
In the Responses API, function outputs and tool call items don't have a role field.
🧩 Analysis chain
🏁 Script executed:
```bash
#!/bin/bash
# Check the ResponseInputItem type definition and related types to understand which items have role fields
cat src/services/copilot/create-responses.ts | grep -A 30 "type ResponseInputItem"
```
Length of output: 696
🏁 Script executed:
```bash
#!/bin/bash
# Look for all the individual input item interfaces
ast-grep --pattern 'export interface Response$_Item {
  $$$
}'
```
Length of output: 0
🏁 Script executed:
```bash
#!/bin/bash
# Search for ResponseFunctionToolCallItem and ResponseFunctionCallOutputItem definitions
rg -nP -A 10 "interface Response(Function|FunctionCall)" src/services/copilot/create-responses.ts
```
Length of output: 1087
🏁 Script executed:
```bash
#!/bin/bash
# Get a broader view of the ResponseInputItem union and its constituents
rg -nP -B 2 -A 40 "^export type ResponseInputItem" src/services/copilot/create-responses.ts
```
Length of output: 1145
🏁 Script executed:
```bash
#!/bin/bash
# Check the ResponseInputMessage definition to confirm it has a role field
rg -nP -A 10 "^export interface ResponseInputMessage" src/services/copilot/create-responses.ts
```
Length of output: 320
@caozhiyuan, thank you for the clarification! I can now see from the type definitions that in the Responses API:
- `ResponseInputMessage` has a `role` field (user/assistant/system/developer)
- `ResponseFunctionToolCallItem` and `ResponseFunctionCallOutputItem` legitimately don't have a `role` field
- `ResponseInputReasoning` also lacks a role field
So the current logic that returns true for items without a role field will catch function calls and outputs, treating them as agent-initiated. This makes sense because the presence of tool/function interaction typically indicates agent involvement in the workflow.
To clarify the intended semantics: does the `X-Initiator` header indicate:
- Any agent involvement in the conversation history (current implementation via `.some()`), or
- Who sent the most recent conversational message (checking only the last message with a role)?
If the intent is #1 (detecting any agent involvement), then the current implementation is correct and consistent with `isAgentCall`. If it's #2 (identifying the initiator of the current turn), then we'd need to check only the most recent message-type item.
Could you confirm which semantic is intended for the Responses API?
🤔
The current logic is consistent with the create-chat-completions.ts isAgentCall logic
@caozhiyuan, understood! Since the logic is intentionally consistent with the existing isAgentCall implementation in create-chat-completions.ts, and you've confirmed this twice, I'll consider this design decision final. The current implementation maintains consistency across your codebase, which is a valid architectural choice.
Thank you for clarifying! 👍
Tip: For best results, initiate chat on the files or code changes.
🧩 Analysis chain
🏁 Script executed:
Length of output: 5433
@caozhiyuan You're absolutely correct! I apologize for the error in the sequence diagram. After reviewing the actual implementation:
Here's the corrected sequence diagram:

```mermaid
sequenceDiagram
participant Client
participant Server
participant ResponsesAPI as Copilot<br/>Responses API
participant Translator
participant Logger
rect rgb(230, 245, 255)
Note over Client,Server: /v1/responses (Native ResponsesPayload)
Client->>Server: POST /v1/responses<br/>(ResponsesPayload)
Server->>Logger: debug: request received
Server->>Server: validate payload & model capability
Server->>ResponsesAPI: POST /responses with payload
ResponsesAPI-->>Server: ResponsesResult
Server-->>Client: JSON response (ResponsesResult)
end
rect rgb(255, 240, 245)
Note over Client,Server: /v1/messages (Anthropic Format + Translation)
Client->>Server: POST /v1/messages<br/>(AnthropicMessagesPayload)
Server->>Server: shouldUseResponsesApi()?
Server->>Translator: translateAnthropicMessagesToResponsesPayload()
Translator-->>Server: ResponsesPayload
Server->>ResponsesAPI: POST /responses with payload
ResponsesAPI-->>Server: ResponsesResult
Server->>Translator: translateResponsesResultToAnthropic()
Translator-->>Server: AnthropicResponse
Server-->>Client: JSON response (AnthropicResponse)
end
rect rgb(245, 245, 230)
Note over Client,Server: Streaming (either endpoint)
Client->>Server: POST with stream=true
Server->>ResponsesAPI: fetch /responses (stream)
ResponsesAPI-->>Server: EventSource (streamed events)
loop for each ResponseStreamEvent
alt /v1/responses
Server-->>Client: SSE chunk (native format)
else /v1/messages
Server->>Translator: translateResponsesStreamEvent()
Translator-->>Server: Array<AnthropicStreamEventData>
Server-->>Client: SSE chunk (Anthropic format)
end
end
end
rect rgb(245, 230, 245)
Note over Server,Logger: Logging Path
Server->>Logger: createHandlerLogger("handler-name")
Logger->>Logger: ensure logs/ directory
Logger->>Logger: append to daily log file
Logger->>Logger: flush buffer periodically
end
```
Key differences:
Thank you for catching this! 🐰
Actionable comments posted: 1
🧹 Nitpick comments (4)
src/routes/messages/responses-stream-translation.ts (4)
84-142: Guard post‑completion events and backstop `message_start` emission
Right now `translateResponsesStreamEvent` never checks `state.messageCompleted`, and `handleResponseCompleted` assumes a prior `"response.created"` → `message_start` event. If the upstream ever emits stray events after completion/failure or omits `response.created`, we could generate malformed Anthropic streams (content without a `message_start`, or extra deltas after `message_stop`).
Consider:
- Early‑returning when a terminal state has been reached.
- Emitting a `message_start` lazily from `handleResponseCompleted` when one hasn't been sent yet.
For example:
```diff
 export const translateResponsesStreamEvent = (
   rawEvent: ResponseStreamEvent,
   state: ResponsesStreamState,
 ): Array<AnthropicStreamEventData> => {
+  if (state.messageCompleted) {
+    // Ignore any events after a terminal condition.
+    return []
+  }
   const eventType = rawEvent.type
   switch (eventType) {
```
```diff
 const handleResponseCompleted = (
   rawEvent: ResponseCompletedEvent | ResponseIncompleteEvent,
   state: ResponsesStreamState,
 ): Array<AnthropicStreamEventData> => {
   const response = rawEvent.response
   const events = new Array<AnthropicStreamEventData>()
+  if (!state.messageStartSent) {
+    events.push(...messageStart(state, response))
+  }
   closeAllOpenBlocks(state, events)
   const anthropic = translateResponsesResultToAnthropic(response)
   events.push(
```
Also applies to: 410-432, 479-506
215-243: `functionCallState` validation branch is effectively unreachable
In `handleFunctionCallArgumentsDelta`, you call `openFunctionCallBlock` before reading `state.functionCallStateByOutputIndex`:
```ts
const blockIndex = openFunctionCallBlock(state, { outputIndex, events })
const functionCallState = state.functionCallStateByOutputIndex.get(outputIndex)
if (!functionCallState) {
  return handleFunctionCallArgumentsValidationError(...)
}
```
Because `openFunctionCallBlock` always creates and registers a `FunctionCallStreamState` when one doesn't exist, `functionCallState` will never be falsy here. The "Received function call arguments delta without an open tool call block." path can't actually trigger.
If you still want to enforce "no delta before a tool‑use block" as a hard error, you likely meant to validate before opening a block. For example:
```diff
-  const blockIndex = openFunctionCallBlock(state, {
-    outputIndex,
-    events,
-  })
-
-  const functionCallState =
-    state.functionCallStateByOutputIndex.get(outputIndex)
-  if (!functionCallState) {
+  const existingState =
+    state.functionCallStateByOutputIndex.get(outputIndex)
+  if (!existingState) {
     return handleFunctionCallArgumentsValidationError(
       new FunctionCallArgumentsValidationError(
         "Received function call arguments delta without an open tool call block.",
       ),
       state,
       events,
     )
   }
+  const blockIndex = openFunctionCallBlock(state, {
+    outputIndex,
+    events,
+  })
+  const functionCallState = existingState
```
Otherwise, consider deleting the unreachable validation branch to keep the handler lean.
617-665: Assumption: one function call stream per `outputIndex`
`openFunctionCallBlock` and `FunctionCallDetails` key the function‑call stream solely by `outputIndex`:
```ts
state.functionCallStateByOutputIndex: Map<number, FunctionCallStreamState>
```
If the underlying Responses stream ever emits multiple `function_call` items for the same `output_index` (e.g., parallel tool calls inside a single output), they'll be merged into a single Anthropic `tool_use` block with one `id`/`name`, and later calls would overwrite or reuse the first block.
If the Responses service guarantees at most one function call per `output_index`, this is fine but worth documenting near `FunctionCallStreamState`. If not, consider extending the key (e.g., using `outputIndex` + `item_id` or a composite key) so each distinct call maps to its own content block.
Also applies to: 667-693
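If multiple calls per `output_index` ever need support, a composite key is one option; in this sketch, the key shape and the assumption that the added event carries an `item_id` are both illustrative:

```ts
// Sketch: key function-call stream state by outputIndex plus item id, so two
// function_call items sharing an output_index map to distinct tool_use blocks.
type FunctionCallKey = `${number}:${string}`

const keyFor = (outputIndex: number, itemId: string): FunctionCallKey =>
  `${outputIndex}:${itemId}`

const functionCallStateByKey = new Map<
  FunctionCallKey,
  { blockIndex: number; argumentChunks: Array<string> }
>()
```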
31-55: Whitespace‑run guard looks good; consider clarifying intent
The `updateWhitespaceRunState` + `handleFunctionCallArgumentsDelta` combo nicely defends against the "infinite line breaks" tool‑call bug while preserving previous state across chunks.
Note that the current implementation:
- Counts only `\r`, `\n`, and `\t` toward `MAX_CONSECUTIVE_FUNCTION_CALL_WHITESPACE`.
- Treats spaces as neutral (they neither increment nor reset the run).
If the intent is "cap consecutive line breaks/tabs" this is perfect. If instead you truly want "no more than N consecutive whitespace characters of any kind", you'd need to include `" "` (and possibly other Unicode whitespace) in the counter.
Also applies to: 215-272
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/routes/messages/responses-stream-translation.ts (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-11T04:33:30.522Z
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/messages/handler.ts:50-52
Timestamp: 2025-11-11T04:33:30.522Z
Learning: In src/routes/messages/handler.ts, forcing anthropicPayload.model to getSmallModel() when no tools are present is intentional behavior to fix Claude Code 2.0.28 warmup requests consuming premium model tokens. This applies to all requests without tools, not just warmup requests, and is an accepted design decision.
Applied to files:
src/routes/messages/responses-stream-translation.ts
🧬 Code graph analysis (1)
src/routes/messages/responses-stream-translation.ts (3)
- src/services/copilot/create-responses.ts (15): ResponseStreamEvent (206-219), ResponseCreatedEvent (233-237), ResponseOutputItemAddedEvent (270-275), ResponseOutputItemDoneEvent (277-282), ResponseFunctionCallArgumentsDeltaEvent (247-253), ResponseFunctionCallArgumentsDoneEvent (255-262), ResponseTextDeltaEvent (302-309), ResponseReasoningSummaryTextDeltaEvent (284-291), ResponseReasoningSummaryTextDoneEvent (293-300), ResponseTextDoneEvent (311-318), ResponseCompletedEvent (221-225), ResponseIncompleteEvent (227-231), ResponseFailedEvent (264-268), ResponseErrorEvent (239-245), ResponsesResult (113-131)
- src/routes/messages/anthropic-types.ts (1): AnthropicStreamEventData (185-193)
- src/routes/messages/responses-translation.ts (1): translateResponsesResultToAnthropic (352-374)
🔇 Additional comments (1)
src/routes/messages/responses-stream-translation.ts (1)
57-82: Stateful block management is clean and matches Anthropic's block model
The combination of `ResponsesStreamState`, `openTextBlockIfNeeded`, `openThinkingBlockIfNeeded`, and the text/thinking handlers yields:
- Stable `index` assignments per `(outputIndex, contentIndex)` / thinking key.
- At most one open block at a time via `closeOpenBlocks`, matching Anthropic's sequential content block semantics.
- Sensible fallbacks where `*.done` events synthesize a delta when no prior delta was seen.
Overall this is a solid, maintainable state machine for the streaming bridge.
Also applies to: 304-408, 508-572
```ts
const handleErrorEvent = (
  rawEvent: ResponseErrorEvent,
  state: ResponsesStreamState,
): Array<AnthropicStreamEventData> => {
  const message =
    typeof rawEvent.message === "string" ?
      rawEvent.message
    : "An unexpected error occurred during streaming."

  state.messageCompleted = true
  return [buildErrorEvent(message)]
}
```
🧩 Analysis chain
Ensure error events also close open content blocks
handleResponseFailed and handleFunctionCallArgumentsValidationError both call closeAllOpenBlocks, but handleErrorEvent does not. If an "error" event arrives while a text/tool/thinking block is open, Anthropic consumers will see an error without matching content_block_stop events, which can break parsers that assume blocks are well‑balanced.
Recommend aligning handleErrorEvent with the other error paths:
```diff
 const handleErrorEvent = (
   rawEvent: ResponseErrorEvent,
   state: ResponsesStreamState,
 ): Array<AnthropicStreamEventData> => {
-  const message =
-    typeof rawEvent.message === "string" ?
-      rawEvent.message
-    : "An unexpected error occurred during streaming."
-
-  state.messageCompleted = true
-  return [buildErrorEvent(message)]
+  const events: Array<AnthropicStreamEventData> = []
+  const message =
+    typeof rawEvent.message === "string"
+      ? rawEvent.message
+      : "An unexpected error occurred during streaming."
+
+  closeAllOpenBlocks(state, events)
+  state.messageCompleted = true
+  events.push(buildErrorEvent(message))
+  return events
 }
```
Also applies to: 574-604
---
🏁 Script executed:
```shell
#!/bin/bash
# Inspect the handleErrorEvent function and surrounding context
rg -n -A 20 "const handleErrorEvent" src/routes/messages/responses-stream-translation.ts
# Inspect handleResponseFailed
rg -n -A 20 "const handleResponseFailed" src/routes/messages/responses-stream-translation.ts
# Inspect handleFunctionCallArgumentsValidationError
rg -n -A 20 "const handleFunctionCallArgumentsValidationError" src/routes/messages/responses-stream-translation.ts
# Check the range 574-604 mentioned in the review
sed -n '574,604p' src/routes/messages/responses-stream-translation.ts
# Search for closeAllOpenBlocks usage
rg -n "closeAllOpenBlocks" src/routes/messages/responses-stream-translation.ts
```
Length of output: 2967
---
🏁 Script executed:
```shell
#!/bin/bash
# Find other error handlers or similar patterns after line 574
sed -n '574,650p' src/routes/messages/responses-stream-translation.ts | head -80
```
Length of output: 1938
---
🏁 Script executed:
```shell
#!/bin/bash
# Search for all functions that might be error handlers
rg -n "^const handle" src/routes/messages/responses-stream-translation.ts
# Also search for any patterns with "Error" or "error" in function names
rg -n "const.*[Ee]rror.*= \(" src/routes/messages/responses-stream-translation.ts
```
Length of output: 695
Ensure error events also close open content blocks
The review comment is correct. Verification confirms:
- `handleErrorEvent` (lines 451–462) does not call `closeAllOpenBlocks`
- `handleResponseFailed` (line 440) does call it
- `handleFunctionCallArgumentsValidationError` (line 471) does call it
Without closing open blocks, an error event arriving during an open text/tool/thinking block will leave consumers with unbalanced content_block_stop events, breaking parsers expecting well-formed streams.
The recommended fix properly aligns handleErrorEvent with the established error-handling pattern:
```diff
 const handleErrorEvent = (
   rawEvent: ResponseErrorEvent,
   state: ResponsesStreamState,
 ): Array<AnthropicStreamEventData> => {
+  const events: Array<AnthropicStreamEventData> = []
   const message =
     typeof rawEvent.message === "string" ?
       rawEvent.message
     : "An unexpected error occurred during streaming."
+  closeAllOpenBlocks(state, events)
   state.messageCompleted = true
+  events.push(buildErrorEvent(message))
+  return events
-  return [buildErrorEvent(message)]
 }
```
(Note: The review's reference to "Also applies to: 574–604" appears to be a review artifact; no handler exists in that range.)
🤖 Prompt for AI Agents
In src/routes/messages/responses-stream-translation.ts around lines 451 to 462,
handleErrorEvent sets state.messageCompleted and returns a buildErrorEvent but
does not call closeAllOpenBlocks; add a call to closeAllOpenBlocks(state) before
returning so any open content blocks are closed, then keep setting
state.messageCompleted = true and return [buildErrorEvent(message)] to match the
other error handlers' pattern.
Not calling closeAllOpenBlocks is also OK when an error event is sent to the client.
@caozhiyuan, understood! I'll note that your client implementation handles error events without requiring all content blocks to be closed first. Thank you for clarifying the intended behavior.
(\__/)
(•ㅅ•)
/>📝 Noted!
✏️ Learnings added
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/messages/responses-stream-translation.ts:451-462
Timestamp: 2025-11-24T05:50:12.345Z
Learning: In src/routes/messages/responses-stream-translation.ts, the handleErrorEvent function intentionally does not call closeAllOpenBlocks before sending an error event to the client. This is acceptable in their error handling design - clients handle error events without requiring balanced content_block_stop events.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/messages/handler.ts:50-52
Timestamp: 2025-11-11T04:33:30.522Z
Learning: In src/routes/messages/handler.ts, forcing anthropicPayload.model to getSmallModel() when no tools are present is intentional behavior to fix Claude Code 2.0.28 warmup requests consuming premium model tokens. This applies to all requests without tools, not just warmup requests, and is an accepted design decision.
Some clients, like RooCode, may send `service_tier` to the `/responses` endpoint, but Copilot does not support this field and returns an error.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/services/copilot/create-responses.ts (2)
343-347: Consider adding a timeout to the fetch request. Long-running requests without a timeout can hang indefinitely, degrading reliability in production environments.
You can add a timeout using AbortController:
```ts
const controller = new AbortController()
const timeoutId = setTimeout(() => controller.abort(), 60000) // 60s timeout

try {
  const response = await fetch(`${copilotBaseUrl(state)}/responses`, {
    method: "POST",
    headers,
    body: JSON.stringify(requestPayload),
    signal: controller.signal,
  })
  clearTimeout(timeoutId)
  // ... rest of the logic
} catch (error) {
  clearTimeout(timeoutId)
  throw error
}
```
350-350: Logging the entire response object may be verbose. The `response` object includes headers, status, and other metadata that might not be necessary for error diagnostics and could expose sensitive information.
Consider logging only essential details:
```diff
- consola.error("Failed to create responses", response)
+ consola.error("Failed to create responses", {
+   status: response.status,
+   statusText: response.statusText
+ })
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- src/services/copilot/create-responses.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/services/copilot/create-responses.ts (3)
- src/lib/state.ts (1): state (21-27)
- src/lib/api-config.ts (2): copilotHeaders (20-37), copilotBaseUrl (16-19)
- src/lib/error.ts (1): HTTPError (6-13)
```ts
}

// service_tier is not supported by github copilot
payload.service_tier = null
```
Avoid mutating the input payload.
Directly mutating payload.service_tier creates an unexpected side effect for callers who may reuse the payload object. This violates the principle of least surprise.
Consider creating a copy before modification:
```diff
 const headers: Record<string, string> = {
   ...copilotHeaders(state, vision),
   "X-Initiator": initiator,
 }

-// service_tier is not supported by github copilot
-payload.service_tier = null
+// service_tier is not supported by github copilot
+const requestPayload = { ...payload, service_tier: null }

 const response = await fetch(`${copilotBaseUrl(state)}/responses`, {
   method: "POST",
   headers,
-  body: JSON.stringify(payload),
+  body: JSON.stringify(requestPayload),
 })
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const headers: Record<string, string> = {
  ...copilotHeaders(state, vision),
  "X-Initiator": initiator,
}

// service_tier is not supported by github copilot
const requestPayload = { ...payload, service_tier: null }

const response = await fetch(`${copilotBaseUrl(state)}/responses`, {
  method: "POST",
  headers,
  body: JSON.stringify(requestPayload),
})
```
🤖 Prompt for AI Agents
In src/services/copilot/create-responses.ts around line 341, the code mutates
the incoming payload by setting payload.service_tier = null; instead create a
shallow copy of the payload (e.g., using object spread or Object.assign) and
modify the copy's service_tier instead, then use that modified copy in
subsequent logic so the original payload passed by the caller remains unchanged.
`payload.service_tier = null` avoids creating a new object; it has already been tested and works fine.
@caozhiyuan, understood! If the performance benefit from avoiding object creation is valuable here and the mutation doesn't cause issues in your use case, that's a reasonable tradeoff. Thank you for clarifying the intent.
🧠 Learnings used
Learnt from: caozhiyuan
Repo: ericc-ch/copilot-api PR: 142
File: src/routes/messages/handler.ts:50-52
Timestamp: 2025-11-11T04:33:30.522Z
Learning: In src/routes/messages/handler.ts, forcing anthropicPayload.model to getSmallModel() when no tools are present is intentional behavior to fix Claude Code 2.0.28 warmup requests consuming premium model tokens. This applies to all requests without tools, not just warmup requests, and is an accepted design decision.
This pull request introduces a new configuration system, structured logging, and support for the /v1/responses endpoint, along with improvements to model selection and request handling. The most important changes are grouped below:
Responses API Integration:
Configuration Management:
A new `src/lib/config.ts` module provides persistent application configuration, including support for model-specific prompts, reasoning effort levels, and default model selection. Configuration is stored in a new `config.json` file in the app data directory, with automatic creation and safe permissions. [1] [2]
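For illustration, the stored shape might look like the following; this is a guess at the schema based on the exported helpers (`getExtraPromptForModel`, `getReasoningEffortForModel`, `getSmallModel`), and the field names are assumptions:

```ts
// Hypothetical config shape inferred from the helpers; not the PR's actual schema.
interface AppConfig {
  smallModel: string
  extraPrompts: Record<string, string> // model id -> extra system prompt
  reasoningEfforts: Record<string, "low" | "medium" | "high"> // model id -> effort
}

const exampleConfig: AppConfig = {
  smallModel: "gpt-4.1-mini",
  extraPrompts: { "gpt-5-mini": "Keep answers terse." },
  reasoningEfforts: { "gpt-5-mini": "medium" },
}
```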
Logging Improvements:
A new `src/lib/logger.ts` provides handler-level logging, with log rotation, retention, and structured output. This logger is integrated into key request handlers for better diagnostics. [1] [2] [3] [4] [5]
Token Counting Logic:
Refined token counting in `src/lib/tokenizer.ts` to more accurately account for tool calls, array parameters, and model-specific behaviors (including GPT and Anthropic/Grok models). Added support for excluding certain schema keys and improved calculation for nested parameters. [1] [2] [3] [4] [5] [6] [7] [8]
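As a rough illustration of counting tool-definition overhead (a sketch only: the tokenizer stand-in, the excluded keys, and the tool shape are assumptions, not the PR's implementation):

```ts
// Sketch: approximate token cost of tool definitions, skipping schema keys
// some models ignore. countTokens is a crude stand-in for a real tokenizer.
const EXCLUDED_SCHEMA_KEYS = new Set(["$schema", "additionalProperties"]) // assumed

const countTokens = (text: string): number => Math.ceil(text.length / 4) // placeholder heuristic

interface ToolDef {
  name: string
  description?: string
  parameters?: unknown
}

const countToolTokens = (tools: Array<ToolDef>): number => {
  let total = 0
  for (const tool of tools) {
    total += countTokens(tool.name) + countTokens(tool.description ?? "")
    // Drop excluded keys while serializing the JSON schema.
    const params = JSON.stringify(tool.parameters ?? {}, (key, value) =>
      EXCLUDED_SCHEMA_KEYS.has(key) ? undefined : value,
    )
    total += countTokens(params)
  }
  return total
}
```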
Summary by CodeRabbit
New Features
Improvements
Tests