Manual turn detection with realtime model always has unnecessary 500ms delay #926

### Describe the bug

Looking at this code snippet: https://github.com/livekit/agents-js/blob/455b5bac95b120b886ec477212696053f86976d3/agents/src/voice/audio_recognition.ts#L641-L667

I noticed that a 500ms delay is always added, even for a realtime pipeline with no STT enabled. I'm using manual turn detection and calling `commitUserTurn()` myself, and this adds a 500ms delay before the speech handle is created.

I verified that `delay()` is called even when there is no transcription.
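To make the suspected cause concrete, here is a minimal sketch of the guard condition from the linked lines, pulled out into a standalone function. The function name is hypothetical, and the sketch assumes that `lastFinalTranscriptTime` keeps an initial value of `0` when no STT ever produces a final transcript; I have not confirmed that initial value in the library.

```typescript
// Hypothetical stand-in for the guard in commitUserTurnTask; not the library's code.
function shouldWaitForFinalTranscript(
  now: number,
  lastFinalTranscriptTime: number,
  delayDuration: number = 500,
): boolean {
  return now - lastFinalTranscriptTime > delayDuration;
}

// Assumption: without STT the timestamp is never updated and stays at 0,
// so the elapsed time is the full epoch time and the guard always passes.
const neverTranscribed = 0;
console.log(shouldWaitForFinalTranscript(Date.now(), neverTranscribed)); // true

// For comparison: a final transcript received 100 ms ago skips the wait.
const current = Date.now();
console.log(shouldWaitForFinalTranscript(current, current - 100)); // false
```

If that assumption holds, every `commitUserTurn()` call in an STT-less session would take the branch that awaits `delay(delayDuration)`.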

Here's my config:
```
const session = new voice.AgentSession({
  llm: new openai.realtime.RealtimeModel({
    model: 'gpt-realtime-mini',
    turnDetection: null,
    modalities: ['audio', 'text'],
  }),
  turnDetection: 'manual',
  voiceOptions: {
    preemptiveGeneration: false,
    minEndpointingDelay: 0,
    maxEndpointingDelay: 0,
    minInterruptionDuration: 0,
    allowInterruptions: false,
  },
});

await session.start({
  agent: this.agent,
  room: this.room,
  inputOptions: {
    audioEnabled: true,
    textEnabled: false,
  },
  outputOptions: {
    audioEnabled: false,
    transcriptionEnabled: false,
  },
});
```

### Relevant log output

_No response_

### Describe your environment


  System:
    OS: macOS 14.7
    CPU: (10) arm64 Apple M1 Max
    Memory: 98.83 MB / 32.00 GB
    Shell: 5.9 - /bin/zsh
  Binaries:
    Node: 24.11.1 - ~/.nvm/versions/node/v24.11.1/bin/node
    npm: 11.6.2 - ~/.nvm/versions/node/v24.11.1/bin/npm
    pnpm: 10.25.0 - /opt/homebrew/bin/pnpm
    Watchman: 2025.11.10.00 - /opt/homebrew/bin/watchman


    "@livekit/agents": "1.0.30",
    "@livekit/agents-plugin-livekit": "1.0.30",
    "@livekit/agents-plugin-openai": "1.0.30",

### Minimal reproducible example

_No response_

### Additional information

_No response_

Referenced snippet (`agents/src/voice/audio_recognition.ts`, lines 641-667):

    const commitUserTurnTask =
      (delayDuration: number = 500) =>
      async (controller: AbortController) => {
        if (Date.now() - this.lastFinalTranscriptTime > delayDuration) {
          // flush the stt by pushing silence
          if (audioDetached && this.sampleRate !== undefined) {
            const numSamples = Math.floor(this.sampleRate * 0.5);
            const silence = new Int16Array(numSamples * 2);
            const silenceFrame = new AudioFrame(silence, this.sampleRate, 1, numSamples);
            this.silenceAudioWriter.write(silenceFrame);
          }

          // wait for the final transcript to be available
          await delay(delayDuration, { signal: controller.signal });
        }

        if (this.audioInterimTranscript) {
          // append interim transcript in case the final transcript is not ready
          this.audioTranscript = `${this.audioTranscript} ${this.audioInterimTranscript}`.trim();
        }
        this.audioInterimTranscript = '';

        const chatCtx = this.hooks.retrieveChatCtx();
        this.logger.debug('running EOU detection on commitUserTurn');
        this.runEOUDetection(chatCtx);
        this.userTurnCommitted = true;
      };
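One possible direction for a fix, sketched as standalone code rather than a patch: skip the wait entirely when no STT is configured, since no final transcript can arrive. The `sttEnabled` flag, the `CommitState` interface, and the `waitForFinalTranscript` helper are all hypothetical names for illustration; how the library actually tracks whether STT is present may differ.

```typescript
// Hypothetical state shape; the real class fields may be named differently.
interface CommitState {
  sttEnabled: boolean; // assumption: some way to tell whether an STT is attached
  lastFinalTranscriptTime: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Returns how many milliseconds were spent waiting (0 when the wait is skipped).
async function waitForFinalTranscript(
  state: CommitState,
  delayDuration: number = 500,
): Promise<number> {
  // Without STT there is nothing to flush and no final transcript to wait for.
  if (!state.sttEnabled) return 0;

  // Same guard as the original snippet: a recent final transcript skips the wait.
  if (Date.now() - state.lastFinalTranscriptTime <= delayDuration) return 0;

  await sleep(delayDuration);
  return delayDuration;
}

// Usage: a realtime-only session (no STT) commits immediately.
waitForFinalTranscript({ sttEnabled: false, lastFinalTranscriptTime: 0 }).then((waited) => {
  console.log(`waited ${waited}ms`); // waited 0ms
});
```

With the config from this report, `sttEnabled` would be `false`, so the speech handle could be created without the extra 500ms.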
