(Beta) Websocket closing unexpectedly on ReceiveUpdatesAsync #244

Open
@camelCase12

Description

Service

OpenAI

Describe the bug

I was setting up the Realtime API for voice chat, but found that after my first question and the model's reply, the WebSocket would close and terminate the conversation. (This was while using CreateServerVoiceActivityTurnDetectionOptions.)

I didn't expect calling ReceiveUpdatesAsync to receive server messages to close the _clientWebSocket, but it did.

It took me a while to figure out that AsyncWebsocketMessageResultEnumerator automatically disposes the _clientWebSocket when it finishes iterating, so each completed ReceiveUpdatesAsync enumeration terminates the WebSocket. Commenting out the dispose call in AsyncWebsocketMessageResultEnumerator gave me the behavior I expected:

public ValueTask DisposeAsync()
{
    //_clientWebSocket?.Dispose();
    return new ValueTask(Task.CompletedTask);
}

This allowed me to have the two-way conversation I expected. If there is an intended way to receive server events without terminating the socket, or to keep the socket alive for more than one request-response exchange, please let me know.
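To illustrate the mechanism described above, here is a minimal, self-contained sketch that does not use the library at all. FakeSocket and Updates are hypothetical stand-ins I made up for the demo. The only assumption carried over from the report is that the enumerator's DisposeAsync disposes a socket it shares with the session. The C# language behavior it relies on is that `await foreach` always calls DisposeAsync on the enumerator when the loop exits, including on `break`.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the shared socket; not the library's type.
sealed class FakeSocket : IDisposable
{
    public bool IsDisposed { get; private set; }
    public void Dispose() => IsDisposed = true;
}

// Hypothetical update stream whose enumerator disposes the socket it was handed,
// mirroring the behavior described in this report.
sealed class Updates : IAsyncEnumerable<int>
{
    private readonly FakeSocket _socket;
    public Updates(FakeSocket socket) => _socket = socket;

    public IAsyncEnumerator<int> GetAsyncEnumerator(CancellationToken ct = default)
        => new Enumerator(_socket);

    private sealed class Enumerator : IAsyncEnumerator<int>
    {
        private readonly FakeSocket _socket;
        private int _next;
        public Enumerator(FakeSocket socket) => _socket = socket;
        public int Current { get; private set; }

        public ValueTask<bool> MoveNextAsync()
        {
            // Reading from a disposed socket fails, as in the reported exception.
            if (_socket.IsDisposed) throw new ObjectDisposedException(nameof(FakeSocket));
            Current = _next++;
            return new ValueTask<bool>(true);
        }

        public ValueTask DisposeAsync()
        {
            _socket.Dispose(); // the call that was commented out in the workaround
            return default;
        }
    }
}

static class Program
{
    static async Task Main()
    {
        var socket = new FakeSocket();
        var updates = new Updates(socket);

        await foreach (var update in updates)
        {
            if (update == 2) break; // leaving the loop triggers DisposeAsync
        }
        Console.WriteLine(socket.IsDisposed); // True

        try
        {
            // A second enumeration reuses the already-disposed socket.
            await foreach (var update in updates) { }
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("second loop failed");
        }
    }
}
```

In this model, any pattern that leaves the `await foreach` and re-enters it fails on the second pass, which matches the ObjectDisposedException seen in the reproduction steps. Whether the library's enumerator should own the socket's lifetime is the open question of this issue.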

Steps to reproduce

  1. Initialize a RealtimeConversationSession with server voice activity turn detection:
var client = new RealtimeConversationClient(model: "gpt-4o-realtime-preview-2024-10-01", new(apiKey));


CancellationTokenSource cts = new();

var session = await client.StartConversationSessionAsync(cts.Token);

var options = new ConversationSessionOptions()
{
    Instructions = "<system prompt>",
    TurnDetectionOptions = ConversationTurnDetectionOptions.CreateServerVoiceActivityTurnDetectionOptions(0.5f, TimeSpan.FromMilliseconds(300), TimeSpan.FromMilliseconds(200)),
    Voice = ConversationVoice.Alloy,
    OutputAudioFormat = ConversationAudioFormat.Pcm16,
    InputTranscriptionOptions = new ConversationInputTranscriptionOptions()
    {
        Model = "whisper-1"
    }
};

await session.ConfigureSessionAsync(options);
  2. Begin sending audio through SendAudioAsync (in my case with NAudio Wave):
waveIn.DataAvailable += (s, a) =>
{
    using var memoryStream = new MemoryStream();
    memoryStream.Write(a.Buffer, 0, a.BytesRecorded);
    memoryStream.Position = 0;
    session.SendAudioAsync(memoryStream, token).Wait();
};
  3. Begin handling server responses with ReceiveUpdatesAsync in a loop:
while (true)
{
    await foreach (var update in session.ReceiveUpdatesAsync(token))
    {
        //Handle received updates
    }
}
  4. Make an audible request to the AI, and wait for its response to complete. On the second loop, ReceiveUpdatesAsync will throw a System.ObjectDisposedException:
Cannot access a disposed object.
Object name: 'System.Net.WebSockets.ClientWebSocket'.

Code snippets

No response

OS

winOS

.NET version

.NET 8 Core

Library version

2.1.0-beta.1

Labels

bug (Something isn't working)
