Skip to content

[codex] upgrade voice runtime for realtime-1.5 and responses ws#136

Draft
human-bee wants to merge 1 commit intomainfrom
codex/voice-1-5-responses-ws-cutover
Draft

[codex] upgrade voice runtime for realtime-1.5 and responses ws#136
human-bee wants to merge 1 commit intomainfrom
codex/voice-1-5-responses-ws-cutover

Conversation

@human-bee
Copy link
Owner

Issue and User Impact

PRESENT voice runtime needed a production-grade upgrade for the February 2026 voice model/transport releases:

  • adaptive session-level tiering for gpt-realtime-1.5 vs gpt-realtime-mini
  • direct Responses WebSocket transport path with gpt-audio-1.5
  • preserved queue/idempotency/tool envelope correctness in tool-heavy chains

Without this, model quality/tiering intent and transport improvements were inconsistent, and WS mode introduced reliability/contract regressions.

Root Cause Analysis

Primary root causes identified during implementation/review:

  • Adaptive model selection was bypassed by resolver defaults injecting models.voiceRealtime as an explicit override.
  • WS reply lifecycle diverged from existing reply timeout controls and could drop turns on repeated transport failures.
  • Responses tool catalog/validation drifted from LiveKit tool schemas.
  • Participant-aware dedupe/idempotency logic had ordering gaps in voice and client quick-apply fallbacks.
  • Tool event telemetry could misattribute provider path/model under responses_ws.

Why This Fix Solves Root Cause

This change removes the adaptive bypass, aligns WS timeout behavior with existing reply controls, enforces schema-backed WS tool validation, preserves turns via controlled fallback after repeated WS failures, and normalizes participant-aware dedupe/idempotency before fingerprint/hash generation.

What Changed

Control Plane and Settings

  • Added/extended model control fields:
    • models.voiceRealtimePrimary
    • models.voiceRealtimeSecondary
    • knobs.voice.realtimeModelStrategy
  • Updated resolver defaults to keep models.voiceRealtime explicit-only (no implicit forcing).
  • Updated model settings guidance/reference entries and marked transport/responses model knobs as env-only coverage.

Voice Runtime

  • Added transport abstraction + new ResponsesWebSocketTransport (wss://api.openai.com/v1/responses).
  • Added adaptive realtime selection plumbing and explicit STT model wiring for multi-participant transcription.
  • Added transport-aware model/provider-path telemetry.
  • Added WS tool catalog generation from the same tool context schemas.
  • Added strict WS tool argument validation via runtime schemas.
  • Added WS failure handling with retry and fallback to realtime session after repeated transport errors (no silent turn loss).

Queue/Dedupe/Idempotency Hardening

  • Injected participant identity before voice dedupe fingerprinting and dispatch idempotency hashing.
  • Updated browser quick-apply fallback idempotency hash to include participant identity.

Tests

  • Added resolver.test.ts for explicit-vs-adaptive model default behavior.
  • Added responses-ws-transport.test.ts for WS tool loop and continuation behavior.
  • Extended existing config tests.

Backward Compatibility Analysis

  • External browser tool envelope contract remains unchanged (tool_call shapes and status contract preserved).
  • Queue-first orchestration and mutation idempotency semantics are preserved and hardened for participant-aware cases.
  • Runtime cutover remains direct to responses_ws by default, but repeated WS failures now degrade to in-process realtime reply generation for turn preservation.
  • No DB schema or migration changes.

Validation and Checks

Ran successfully:

  • npm run typecheck:agent
  • npm run lint (passes; repo has existing warnings)
  • npm test -- src/lib/agents/realtime/voice-agent/__tests__ src/lib/agents/control-plane/*.test.ts src/components/tool-dispatcher/hooks/useToolRunner.test.tsx

Reviewer Passes and Resolved Findings

Independent reviewer lanes covered:

  • correctness/regression/perf risk
  • maintainability/architecture
  • compatibility/contracts

Resolved findings included:

  • adaptive model bypass
  • WS timeout guardrail mismatch
  • WS turn-loss on repeated errors
  • schema-validation bypass in WS tool execution
  • participant-aware dedupe/idempotency ordering gaps
  • provider-path attribution drift in dedupe telemetry
  • client quick-apply participant-agnostic fallback key

Remaining Risks and Follow-ups

  • Direct Responses WS cutover remains operationally sensitive to env entitlement/network conditions.
  • Follow-up recommended: production soak benchmark for 20+ tool-call chains and reconnect contention windows.

UI Evidence

  • UI-visible change is limited to model settings/reference metadata copy and coverage labels.
  • No behavior-heavy frontend interaction changes; no before/after recording attached for this PR.

@vercel
Copy link

vercel bot commented Feb 26, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
present Building Building Preview, Comment Feb 26, 2026 9:00pm

Request Review

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant