Description
Service
OpenAI
Describe the bug
I was setting up the realtime API for voice chat, but after my first question and the model's reply, the WebSocket closed and the conversation ended. (This was while using CreateServerVoiceActivityTurnDetectionOptions.)
When calling ReceiveUpdatesAsync to receive server messages, I didn't expect it to close the _clientWebSocket, but it did.
It took me a while to figure out that AsyncWebsocketMessageResultEnumerator automatically disposes of the _clientWebSocket when it finishes iterating, so ReceiveUpdatesAsync ends up terminating the WebSocket. Commenting out the dispose call in AsyncWebsocketMessageResultEnumerator gave me the expected behavior:
    public ValueTask DisposeAsync()
    {
        //_clientWebSocket?.Dispose();
        return new ValueTask(Task.CompletedTask);
    }
This allowed me to have the two-way conversation I expected. If there is another intended way to receive server events without terminating the socket, or to keep the socket alive for more than one request-response exchange, please let me know.
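The mechanism described above can be reproduced without the library. The following is a minimal sketch, not the library's implementation: FakeSocket, OwningEnumerator, Updates, and Demo are hypothetical names introduced only for illustration. It relies on one documented language behavior: await foreach calls DisposeAsync on the enumerator when the loop exits, so an enumerator that disposes a shared socket in DisposeAsync makes a second pass fail.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for ClientWebSocket: it only tracks whether it was disposed.
public sealed class FakeSocket : IDisposable
{
    public bool IsDisposed { get; private set; }

    public string Receive()
    {
        if (IsDisposed) throw new ObjectDisposedException(nameof(FakeSocket));
        return "update";
    }

    public void Dispose() => IsDisposed = true;
}

// Hypothetical enumerator that disposes the socket it was handed
// when the enumerator itself is disposed.
public sealed class OwningEnumerator : IAsyncEnumerator<string>
{
    private readonly FakeSocket _socket;
    private int _remaining = 2; // pretend each response consists of two messages

    public OwningEnumerator(FakeSocket socket) => _socket = socket;

    public string Current { get; private set; } = "";

    public ValueTask<bool> MoveNextAsync()
    {
        if (_remaining-- <= 0) return new ValueTask<bool>(false);
        Current = _socket.Receive(); // throws once the socket has been disposed
        return new ValueTask<bool>(true);
    }

    public ValueTask DisposeAsync()
    {
        _socket.Dispose(); // analogous to the call commented out in the workaround
        return default;
    }
}

// Hypothetical async sequence that hands the same socket to every new enumerator.
public sealed class Updates : IAsyncEnumerable<string>
{
    private readonly FakeSocket _socket;

    public Updates(FakeSocket socket) => _socket = socket;

    public IAsyncEnumerator<string> GetAsyncEnumerator(CancellationToken cancellationToken = default)
        => new OwningEnumerator(_socket);
}

public static class Demo
{
    // Returns true if a second pass over the same socket throws ObjectDisposedException.
    public static async Task<bool> SecondPassThrowsAsync()
    {
        var socket = new FakeSocket();

        // First pass completes; await foreach then calls DisposeAsync on the enumerator.
        await foreach (var update in new Updates(socket)) { }

        try
        {
            // Second pass creates a fresh enumerator over the already-disposed socket.
            await foreach (var update in new Updates(socket)) { }
            return false;
        }
        catch (ObjectDisposedException)
        {
            return true;
        }
    }
}
```

In this sketch, removing the _socket.Dispose() call from DisposeAsync lets the second pass complete, which mirrors the workaround above. Whether the enumerator or the session should own the socket's lifetime is a design decision for the library maintainers.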
Steps to reproduce
- Initialize a RealtimeConversationSession with server voice activity turn detection:
    var client = new RealtimeConversationClient(model: "gpt-4o-realtime-preview-2024-10-01", new(apiKey));
    CancellationTokenSource cts = new();
    var session = await client.StartConversationSessionAsync(cts.Token);
    var options = new ConversationSessionOptions()
    {
        Instructions = "<system prompt>",
        TurnDetectionOptions = ConversationTurnDetectionOptions.CreateServerVoiceActivityTurnDetectionOptions(0.5f, TimeSpan.FromMilliseconds(300), TimeSpan.FromMilliseconds(200)),
        Voice = ConversationVoice.Alloy,
        OutputAudioFormat = ConversationAudioFormat.Pcm16,
        InputTranscriptionOptions = new ConversationInputTranscriptionOptions()
        {
            Model = "whisper-1"
        }
    };
    await session.ConfigureSessionAsync(options);
- Begin sending audio through SendAudioAsync (in my case with NAudio Wave):
    waveIn.DataAvailable += (s, a) =>
    {
        using var memoryStream = new MemoryStream();
        memoryStream.Write(a.Buffer, 0, a.BytesRecorded);
        memoryStream.Position = 0;
        session.SendAudioAsync(memoryStream, token).Wait();
    };
- Begin handling server responses with ReceiveUpdatesAsync in a loop:
    while (true)
    {
        await foreach (var update in session.ReceiveUpdatesAsync(token))
        {
            //Handle received updates
        }
    }
- Make an audible request to the AI and wait for its response to complete. On the second pass through the loop, ReceiveUpdatesAsync throws a System.ObjectDisposedException:
Cannot access a disposed object.
Object name: 'System.Net.WebSockets.ClientWebSocket'.
Code snippets
No response
OS
winOS
.NET version
.NET 8 Core
Library version
2.1.0-beta.1