feat: Add GPT-5 Pro with background mode auto-resume and polling #8608
hannesrudolph wants to merge 17 commits into main
Conversation
Not sure yet. Will look at it after we get Native Tool Calling out later this week. Sorry for the delay.
…for long-running models (e.g., gpt-5-pro)
- Introduce ModelInfo.disableTimeout to opt out of request timeouts on a per-model basis
- Apply in OpenAI-compatible, Ollama, and LM Studio providers (timeout=0 when flag is true)
- Preserve global "API Request Timeout" behavior (0 still disables globally); the per-model flag takes precedence for that model
- Motivation: gpt-5-pro often requires longer runtimes; a per-model override avoids forcing a global setting that impacts all models
- Add/extend unit tests to validate provider behavior
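As a rough illustration of this commit, a provider could resolve its request timeout like this (the interface shape and helper name are assumptions for illustration, not the actual code):

```typescript
// Minimal sketch: the per-model disableTimeout flag overrides the global
// "API Request Timeout" setting for that model only.
interface ModelInfo {
	disableTimeout?: boolean // opt this model out of request timeouts
}

function resolveRequestTimeoutMs(model: ModelInfo, globalTimeoutSec: number): number {
	// Per-model flag takes precedence: 0 disables the timeout for this model.
	if (model.disableTimeout) return 0
	// A global setting of 0 still disables timeouts for all models.
	return globalTimeoutSec * 1000
}
```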
…nd non‑streaming notice
- Add GPT‑5 Pro to the model registry with:
  - contextWindow: 400k, maxTokens: 272k
  - supportsImages: true, supportsPromptCache: true, supportsVerbosity: true, supportsTemperature: false
  - reasoningEffort: high (Responses API only)
  - pricing: $15/1M input tokens, $120/1M output tokens
- Set disableTimeout: true to avoid requiring a global timeout override
- Description clarifies: this is a slow, reasoning‑focused model designed for tough problems; requests may take several minutes; it does not stream (the UI may appear idle until completion)
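A hedged sketch of what the registry entry might look like, using the values from this commit (the exact ModelInfo field names in packages/types may differ):

```typescript
// Sketch of the GPT-5 Pro entry; values are copied from the commit message.
const gpt5Pro = {
	contextWindow: 400_000,
	maxTokens: 272_000,
	supportsImages: true,
	supportsPromptCache: true,
	supportsVerbosity: true,
	supportsTemperature: false,
	reasoningEffort: "high", // Responses API only
	inputPrice: 15, // $ per 1M input tokens
	outputPrice: 120, // $ per 1M output tokens
	disableTimeout: true, // avoid requiring a global timeout override
	description:
		"Slow, reasoning-focused model for tough problems; requests may take several minutes and do not stream.",
}
```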
…-5-pro model entry (server-side timeouts). Prep for background mode approach.
Enable OpenAI Responses background mode with resilient streaming for GPT‑5 Pro and any model flagged via metadata.
Key changes:
- Background mode enablement
• Auto-enable for models with info.backgroundMode === true (e.g., gpt-5-pro-2025-10-06) defined in [packages/types/src/providers/openai.ts](packages/types/src/providers/openai.ts).
• Also respects manual override (openAiNativeBackgroundMode) from ProviderSettings/ApiHandlerOptions.
- Request shape (Responses API)
• background:true, stream:true, store:true set in [OpenAiNativeHandler.buildRequestBody()](src/api/providers/openai-native.ts:224).
- Streaming UX and status events
• New ApiStreamStatusChunk in [src/api/transform/stream.ts](src/api/transform/stream.ts) with statuses: queued, in_progress, completed, failed, canceled, reconnecting, polling (see the type sketch after this list).
• Provider emits status chunks in SDK + SSE paths via [OpenAiNativeHandler.processEvent()](src/api/providers/openai-native.ts:1100) and [OpenAiNativeHandler.handleStreamResponse()](src/api/providers/openai-native.ts:651).
• UI spinner shows background lifecycle labels in [webview-ui/src/components/chat/ChatRow.tsx](webview-ui/src/components/chat/ChatRow.tsx) using [webview-ui/src/utils/backgroundStatus.ts](webview-ui/src/utils/backgroundStatus.ts).
- Resilience: auto-resume + poll fallback
• On stream drop for background tasks, attempt an SSE resume using response.id and the last sequence_number with exponential backoff in [OpenAiNativeHandler.attemptResumeOrPoll()](src/api/providers/openai-native.ts:1215); the recovery flow is sketched after the notes below.
• If resume fails, poll GET /v1/responses/{id} every 2s until a terminal status, then synthesize the final output/usage.
• Deduplicate resumed events via resumeCutoffSequence in [handleStreamResponse()](src/api/providers/openai-native.ts:737).
- Settings (no new UI switch)
• Added optional provider settings and ApiHandlerOptions: autoResume, resumeMaxRetries, resumeBaseDelayMs, pollIntervalMs, pollMaxMinutes in [packages/types/src/provider-settings.ts](packages/types/src/provider-settings.ts) and [src/shared/api.ts](src/shared/api.ts).
- Cleanup
• Removed VS Code contributes toggle for background mode; behavior now model-driven + programmatic override.
- Tests
• Provider: coverage for background status emission, auto-resume success, resume→poll fallback, and a non-background negative case in [src/api/providers/__tests__/openai-native.spec.ts](src/api/providers/__tests__/openai-native.spec.ts).
• Usage parity validated as unchanged in [src/api/providers/__tests__/openai-native-usage.spec.ts](src/api/providers/__tests__/openai-native-usage.spec.ts).
• UI: label mapping tests for background statuses in [webview-ui/src/utils/__tests__/backgroundStatus.spec.ts](webview-ui/src/utils/__tests__/backgroundStatus.spec.ts).
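For illustration, the status chunk referenced in the streaming section above could be shaped roughly like this (type names follow the commit text; the actual definitions in src/api/transform/stream.ts may differ):

```typescript
// Sketch of the background lifecycle statuses and the new stream chunk.
type BackgroundStatus =
	| "queued"
	| "in_progress"
	| "completed"
	| "failed"
	| "canceled"
	| "reconnecting"
	| "polling"

interface ApiStreamStatusChunk {
	type: "status"
	status: BackgroundStatus
}
```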
Notes:
- Aligns with TEMP_OPENAI_BACKGROUND_TASK_DOCS.MD: background mode requires store=true; supports streaming resume via response.id + sequence_number.
- Default behavior unchanged for non-background models; no breaking changes.
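A sketch of the auto-resume/poll recovery flow described in the resilience section above (helper names, signatures, and the options shape are assumptions for illustration, not the handler's real internals):

```typescript
// Assumed stubs standing in for the handler's SSE resume and HTTP poll calls.
declare function resumeStream(responseId: string, startingAfter: number): Promise<void>
declare function pollOnce(responseId: string): Promise<{ status: string }>
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

async function attemptResumeOrPoll(
	responseId: string,
	lastSeq: number,
	opts: { resumeMaxRetries: number; resumeBaseDelayMs: number; pollIntervalMs: number },
): Promise<void> {
	// 1) Try to resume the SSE stream with exponential backoff,
	//    e.g. GET /v1/responses/{id}?stream=true&starting_after={lastSeq}.
	for (let attempt = 0; attempt < opts.resumeMaxRetries; attempt++) {
		try {
			await resumeStream(responseId, lastSeq)
			return // resumed and replayed the remaining events
		} catch {
			await sleep(opts.resumeBaseDelayMs * 2 ** attempt)
		}
	}
	// 2) Resume failed: fall back to polling GET /v1/responses/{id} until the
	//    response reaches a terminal status, after which the caller synthesizes
	//    the final output/usage from the stored response.
	for (;;) {
		const res = await pollOnce(responseId)
		if (["completed", "failed", "canceled"].includes(res.status)) return
		await sleep(opts.pollIntervalMs) // default 2s per the commit message
	}
}
```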
…description, remove duplicate test, revert gitignore
…ded dep to useMemo; test: remove duplicate GPT-5 Pro background-mode test; chore(core): remove temp debug log
…l description for clarity
…background labels; fix deps warning in ChatRow useMemo
…assify permanent vs transient errors; chore(task): remove temporary debug log
…core/task): avoid full-state refresh on each background status chunk to reduce re-renders
3a0add7 to f138361
@hannesrudolph Thank you so much for your effort. I am looking forward to using this feature soon in production :)
Feel free to give it a try in its current form. I have not tested it since I just updated it and have to run out! BBL
@hannesrudolph Hi, I've been testing this branch with Azure OpenAI. For long-running background responses, the initial request works, but the resume and poll requests build their URLs as `${baseUrl}/v1/...`. That extra `/v1` breaks base URLs that already end in a version segment. I tested the following change on `src/api/providers/openai-native.ts`:

@@
- private normalizeUsage(usage: any, model: OpenAiNativeModel): ApiStreamUsageChunk | undefined {
+ private buildResponsesUrl(path: string): string {
+ const rawBase = this.options.openAiNativeBaseUrl || "https://api.openai.com"
+ // Normalize base by trimming trailing slashes
+ const normalizedBase = rawBase.replace(/\/+$/, "")
+ // If the base already ends with a version segment (e.g. /v1), do not append another
+ const hasVersion = /\/v\d+(?:\.\d+)?$/.test(normalizedBase)
+ const baseWithVersion = hasVersion ? normalizedBase : `${normalizedBase}/v1`
+ const normalizedPath = path.startsWith("/") ? path : `/${path}`
+ return `${baseWithVersion}${normalizedPath}`
+ }
+
+ private normalizeUsage(usage: any, model: OpenAiNativeModel): ApiStreamUsageChunk | undefined {
@@
- const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
- const baseUrl = this.options.openAiNativeBaseUrl || "https://api.openai.com"
- const url = `${baseUrl}/v1/responses`
+ const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
+ const url = this.buildResponsesUrl("responses")
@@
- const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
- const baseUrl = this.options.openAiNativeBaseUrl || "https://api.openai.com"
+ const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
const resumeMaxRetries = this.options.openAiNativeBackgroundResumeMaxRetries ?? 3
const resumeBaseDelayMs = this.options.openAiNativeBackgroundResumeBaseDelayMs ?? 1000
@@
- const resumeUrl = `${baseUrl}/v1/responses/${responseId}?stream=true&starting_after=${lastSeq}`
+ const resumeUrl = this.buildResponsesUrl(
+ `responses/${responseId}?stream=true&starting_after=${lastSeq}`,
+ )
@@
- const pollRes = await fetch(`${baseUrl}/v1/responses/${responseId}`, {
+ const pollRes = await fetch(this.buildResponsesUrl(`responses/${responseId}`), {
method: "GET",
headers: {
Authorization: `Bearer ${apiKey}`,
},
Behavior-wise:
- With the default base URL (https://api.openai.com), requests are unchanged: /v1 is still appended.
- With a base URL that already ends in a version segment (e.g. an Azure OpenAI base ending in /v1), the duplicate /v1 is no longer added, so resume and poll requests succeed.
If you’d prefer, I can also open a small PR from my fork targeting this branch. After applying the change above, everything has been working great with Azure OpenAI.
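For reference, here is the proposed helper as a standalone function with example inputs, to make the before/after behavior concrete (the Azure-style base URL below is illustrative, not taken from a real deployment):

```typescript
// Standalone version of the buildResponsesUrl helper from the diff above.
function buildResponsesUrl(rawBase: string, path: string): string {
	// Normalize the base by trimming trailing slashes.
	const normalizedBase = rawBase.replace(/\/+$/, "")
	// If the base already ends with a version segment (e.g. /v1), do not append another.
	const hasVersion = /\/v\d+(?:\.\d+)?$/.test(normalizedBase)
	const baseWithVersion = hasVersion ? normalizedBase : `${normalizedBase}/v1`
	const normalizedPath = path.startsWith("/") ? path : `/${path}`
	return `${baseWithVersion}${normalizedPath}`
}

console.log(buildResponsesUrl("https://api.openai.com", "responses"))
// -> https://api.openai.com/v1/responses (default behavior unchanged)
console.log(buildResponsesUrl("https://example.openai.azure.com/openai/v1", "responses"))
// -> https://example.openai.azure.com/openai/v1/responses (no duplicated /v1)
```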
Closing for now as it is way off from the current codebase. |
Summary
Adds GPT-5 Pro model with OpenAI Responses API background mode support. Background requests can take several minutes, so this implements resilient streaming with automatic recovery.
Why
GPT-5 Pro is a slow, reasoning-focused model that can take several minutes to respond. The standard streaming approach times out or appears stuck. OpenAI's Responses API background mode is designed for these long-running requests.
What Changed
Model Addition
- `gpt-5-pro-2025-10-06` with `backgroundMode: true` flag in model metadata

Background Mode Implementation
- `background: true`, `stream: true`, `store: true` for flagged models
- Status lifecycle: `queued` → `in_progress` → `completed`/`failed`

Resilient Streaming
- Auto-resume on stream drop via `GET /v1/responses/{id}?starting_after={seq}`, with poll fallback

Files Changed
- `packages/types/src/providers/openai.ts` - Model metadata
- `src/api/providers/openai-native.ts` - Background mode logic, auto-resume, polling
- `src/core/task/Task.ts` - Status event handling
- `webview-ui/src/utils/backgroundStatus.ts` - Status label mapping
- `webview-ui/src/components/chat/*` - UI status display

Testing
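As an illustration of the label-mapping coverage, a test along these lines is plausible (assuming a vitest setup; the exported function name and import path are guesses, not copied from the real spec):

```typescript
import { describe, it, expect } from "vitest"
// Hypothetical import; the real module is webview-ui/src/utils/backgroundStatus.ts,
// and the exported function name is an assumption.
import { getBackgroundStatusLabel } from "../backgroundStatus"

describe("background status labels", () => {
	it("maps each lifecycle status to a non-empty spinner label", () => {
		for (const status of ["queued", "in_progress", "reconnecting", "polling"]) {
			expect(getBackgroundStatusLabel(status)).toBeTruthy()
		}
	})
})
```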
Important
Adds GPT-5 Pro model with background mode, implementing resilient streaming and UI updates for status tracking.
- Adds `gpt-5-pro-2025-10-06` model with `backgroundMode: true` in `openai.ts`.
- Implements background mode logic, auto-resume, and polling in `openai-native.ts`.
- Updates `ChatRow.tsx` and `ChatView.tsx` for background status display.
- Updates `ChatRow.tsx` and `ChatView.tsx` to handle the new status labels.
- Adds tests in `backgroundStatus.spec.ts` for status label mapping.
- Extends tests in `openai-native.spec.ts`.