
Conversation


@yarns123 yarns123 commented Jun 2, 2025

  • Introduced `LogConnectionClosedWithInvalidPyload` method in `MultiModalLiveClientLoggingExtensions` to log errors for WebSocket closures due to invalid payloads.
  • Updated `MultiModalLiveClient` to log close status descriptions for invalid payloads.
  • Added `OutputAudioTranscription` and `InputAudioTranscription` properties to `BidiGenerateContentSetup`, initialized with new instances.
  • Created a new class `OutputAudioTranscription` for handling audio transcription data.

Summary by CodeRabbit

  • New Features
    • Added options for handling input and output audio transcription in content generation setup.
  • Bug Fixes
    • Improved error logging for WebSocket connections closed due to invalid payloads, providing more detailed information.


coderabbitai bot commented Jun 2, 2025

Walkthrough

A new logging method for handling invalid WebSocket payloads was introduced, along with its integration into the WebSocket client's disconnection logic. Additionally, the BidiGenerateContentSetup class was extended with two nullable properties for audio transcription, and a new OutputAudioTranscription class was added.

Changes

| File(s) | Change Summary |
| --- | --- |
| src/GenerativeAI.Live/Logging/LoggingExtensions.cs | Added `LogConnectionClosedWithInvalidPyload` logging method to handle WebSocket invalid-payload disconnections. |
| src/GenerativeAI.Live/Models/MultiModalLiveClient.cs | Integrated the new logging method in the WebSocket disconnect handler for the `InvalidPayloadData` close status. |
| src/GenerativeAI/Types/MultimodalLive/BidiGenerateContentSetup.cs | Added `OutputAudioTranscription` and `InputAudioTranscription` properties; introduced the `OutputAudioTranscription` class. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant MultiModalLiveClient
    participant Logger

    Client->>MultiModalLiveClient: ConnectAsync()
    MultiModalLiveClient-->>Client: Establish WebSocket connection
    Note over MultiModalLiveClient: WebSocket disconnects
    MultiModalLiveClient->>MultiModalLiveClient: Check close status
    alt close status is InvalidPayloadData
        MultiModalLiveClient->>Logger: LogConnectionClosedWithInvalidPyload(closeStatusDescription)
    else other close status
        MultiModalLiveClient->>Logger: Log generic connection closed
    end
    MultiModalLiveClient-->>Client: Disconnected event
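For concreteness, here is a minimal sketch of the two pieces this flow exercises, assembled from the diff hunks quoted in the review comments below (identifiers exactly as they appear in the PR, including the `Pyload` typo flagged there):

```csharp
// LoggingExtensions.cs — source-generated logging method (sketch from the PR diff)
[LoggerMessage(EventId = 113, Level = LogLevel.Error,
    Message = "WebSocket connection closed caused by invalid payload: {CloseStatusDescription}")]
public static partial void LogConnectionClosedWithInvalidPyload(
    this ILogger logger, string closeStatusDescription);

// MultiModalLiveClient.cs — disconnect handler branch (sketch from the PR diff)
else if (info.CloseStatus == WebSocketCloseStatus.InvalidPayloadData)
{
    _logger?.LogConnectionClosedWithInvalidPyload(info.CloseStatusDescription!);
}
```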

Poem

A bunny logs with careful paws,
Invalid payloads get their cause.
New properties for sound and voice,
Transcription added—what a choice!
When WebSockets close, we know just why,
With logging neat, we hop right by.
🐇


coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
src/GenerativeAI.Live/Models/MultiModalLiveClient.cs (1)

434-434: Remove redundant comment.

The comment //log info.CloseStatusDescription is redundant since the code below clearly shows what's being logged.

-                //log info.CloseStatusDescription
src/GenerativeAI/Types/MultimodalLive/BidiGenerateContentSetup.cs (1)

53-56: Implement the OutputAudioTranscription class.

The OutputAudioTranscription class is currently empty, which appears to be a placeholder. Please implement the required properties based on the API specification.

Do you want me to help generate the implementation based on the Gemini API documentation for audio transcription properties?
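If the Live API treats these configs as presence-only toggles — the empty `{}` objects in the Gemini documentation quoted later in this thread point that way — an empty class may be intentional and need only XML documentation, e.g.:

```csharp
/// <summary>
/// Opt-in configuration for transcribing the model's generated audio.
/// Presence of this object enables transcription; the API appears to
/// define no fields on it (assumption based on the docs quoted below).
/// </summary>
public class OutputAudioTranscription
{
}
```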

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7f68457 and 58590a5.

📒 Files selected for processing (3)
  • src/GenerativeAI.Live/Logging/LoggingExtensions.cs (1 hunks)
  • src/GenerativeAI.Live/Models/MultiModalLiveClient.cs (1 hunks)
  • src/GenerativeAI/Types/MultimodalLive/BidiGenerateContentSetup.cs (1 hunks)
🔇 Additional comments (1)
src/GenerativeAI/Types/MultimodalLive/BidiGenerateContentSetup.cs (1)

48-48: Consider the default initialization inconsistency.

OutputAudioTranscription is initialized with a new instance by default, while InputAudioTranscription is not. This inconsistency might be intentional, but please verify if both should have the same initialization behavior.
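To see the practical effect, a minimal, self-contained sketch (the `JsonSerializerOptions` here are an assumption; the repository may configure System.Text.Json differently):

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public class OutputAudioTranscription { }

public class BidiGenerateContentSetup
{
    [JsonPropertyName("outputAudioTranscription")]
    public OutputAudioTranscription? OutputAudioTranscription { get; set; }
        = new OutputAudioTranscription();

    [JsonPropertyName("inputAudioTranscription")]
    public OutputAudioTranscription? InputAudioTranscription { get; set; }
}

public static class Demo
{
    public static void Main()
    {
        var options = new JsonSerializerOptions
        {
            DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
        };
        // Prints {"outputAudioTranscription":{}} — a default setup silently
        // opts in to output transcription while leaving input transcription off.
        Console.WriteLine(JsonSerializer.Serialize(new BidiGenerateContentSetup(), options));
    }
}
```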

Comment on lines +51 to +52
[LoggerMessage(EventId = 113, Level = LogLevel.Error, Message = "WebSocket connection closed caused by invalid payload: {CloseStatusDescription}")]
public static partial void LogConnectionClosedWithInvalidPyload(this ILogger logger, string closeStatusDescription);

⚠️ Potential issue

Fix the typo in the method name.

The method name LogConnectionClosedWithInvalidPyload has a typo - it should be LogConnectionClosedWithInvalidPayload (missing 'a' in "Payload").

Apply this diff to fix the typo:

-    [LoggerMessage(EventId = 113, Level = LogLevel.Error, Message = "WebSocket connection closed caused by invalid payload: {CloseStatusDescription}")]
-    public static partial void LogConnectionClosedWithInvalidPyload(this ILogger logger, string closeStatusDescription);
+    [LoggerMessage(EventId = 113, Level = LogLevel.Error, Message = "WebSocket connection closed caused by invalid payload: {CloseStatusDescription}")]
+    public static partial void LogConnectionClosedWithInvalidPayload(this ILogger logger, string closeStatusDescription);
🤖 Prompt for AI Agents
In src/GenerativeAI.Live/Logging/LoggingExtensions.cs at lines 51 to 52, the
method name LogConnectionClosedWithInvalidPyload contains a typo. Rename the
method to LogConnectionClosedWithInvalidPayload by adding the missing 'a' in
"Payload" to correct the spelling.

Comment on lines +432 to +436
else if (info.CloseStatus == WebSocketCloseStatus.InvalidPayloadData)
{
    //log info.CloseStatusDescription
    _logger?.LogConnectionClosedWithInvalidPyload(info.CloseStatusDescription!);
}

💡 Verification agent

🧩 Analysis chain

Fix the typo and verify null safety.

Two issues in this code segment:

  1. Same typo as in the logging extensions: LogConnectionClosedWithInvalidPyload should be LogConnectionClosedWithInvalidPayload
  2. The null-forgiving operator (!) on CloseStatusDescription could be risky if the value is actually null

Apply this diff to fix the typo:

-                _logger?.LogConnectionClosedWithInvalidPyload(info.CloseStatusDescription!);
+                _logger?.LogConnectionClosedWithInvalidPayload(info.CloseStatusDescription!);

Verify that CloseStatusDescription is never null when CloseStatus is InvalidPayloadData:


🏁 Script executed:

#!/bin/bash
# Description: Check if CloseStatusDescription can be null for InvalidPayloadData scenarios

# Search for WebSocketCloseStatus.InvalidPayloadData usage patterns
rg -A 5 -B 5 "InvalidPayloadData"

# Look for any null checks on CloseStatusDescription
rg -A 3 -B 3 "CloseStatusDescription"

Length of output: 2680


Fix typo, handle possible null, and update extension method

Two issues found in InvalidPayloadData branch:

  • Typo in the logging call and its extension: InvalidPyload → InvalidPayload
  • Unsafe null-forgiving operator on info.CloseStatusDescription

Please update both call site and extension definition, and guard against a null description. Example diff:

In src/GenerativeAI.Live/Models/MultiModalLiveClient.cs:

-            else if (info.CloseStatus == WebSocketCloseStatus.InvalidPayloadData)
-            {
-                //log info.CloseStatusDescription
-                _logger?.LogConnectionClosedWithInvalidPyload(info.CloseStatusDescription!);
-            }
+            else if (info.CloseStatus == WebSocketCloseStatus.InvalidPayloadData)
+            {
+                // ensure description is never null
+                var description = info.CloseStatusDescription ?? "No payload description";
+                _logger?.LogConnectionClosedWithInvalidPayload(description);
+            }

In src/GenerativeAI.Live/Logging/LoggingExtensions.cs:

-    [LoggerMessage(EventId = 113, Level = LogLevel.Error,
-                   Message = "WebSocket connection closed caused by invalid payload: {CloseStatusDescription}")]
-    public static partial void LogConnectionClosedWithInvalidPyload(this ILogger logger, string closeStatusDescription);
+    [LoggerMessage(EventId = 113, Level = LogLevel.Error,
+                   Message = "WebSocket connection closed caused by invalid payload: {CloseStatusDescription}")]
+    public static partial void LogConnectionClosedWithInvalidPayload(this ILogger logger, string closeStatusDescription);
  • Rename both call and extension to “InvalidPayload”
  • Replace ! with ?? or another null‐safe default
  • (Optional) Verify whether a Disconnected event should also be raised for this case per your connection lifecycle
🤖 Prompt for AI Agents
In src/GenerativeAI.Live/Models/MultiModalLiveClient.cs around lines 432 to 436,
fix the typo in the logging call from LogConnectionClosedWithInvalidPyload to
LogConnectionClosedWithInvalidPayload. Remove the null-forgiving operator on
info.CloseStatusDescription and instead provide a null-safe fallback value
(e.g., an empty string or a default message) to avoid potential null reference
exceptions. Also update the corresponding extension method in
src/GenerativeAI.Live/Logging/LoggingExtensions.cs to rename the method
accordingly and handle null inputs safely.

Comment on lines +47 to +51
[JsonPropertyName("outputAudioTranscription")]
public OutputAudioTranscription? OutputAudioTranscription { get; set; } = new OutputAudioTranscription();

[JsonPropertyName("inputAudioTranscription")]
public OutputAudioTranscription? InputAudioTranscription { get; set; }

💡 Verification agent

🧩 Analysis chain

Reconsider the type design for input and output transcription.

Both OutputAudioTranscription and InputAudioTranscription properties use the same OutputAudioTranscription type. This seems inconsistent and potentially confusing. Consider:

  1. Creating separate types for input and output transcription if they have different structures
  2. Using a more generic type name like AudioTranscription if they are truly identical
  3. Adding XML documentation to clarify the purpose and differences between these properties

Verify the intended design by checking the API documentation or related usage:


🌐 Web query:

What are the differences between inputAudioTranscription and outputAudioTranscription in the Gemini Multimodal Live API setup configuration?

💡 Result:

Differences Between inputAudioTranscription and outputAudioTranscription in the Gemini Multimodal Live API

The Gemini Multimodal Live API provides configuration options for real-time transcription of both incoming and outgoing audio. In the session setup, two distinct configuration fields manage these behaviors: inputAudioTranscription and outputAudioTranscription. Here are their key differences:

inputAudioTranscription

  • Purpose: Enables transcription of audio input sent from the client (usually the user's speech or audio data).
  • Function: When set, the API automatically processes and transcribes the incoming audio stream into text that aligns with the language of the input audio.
  • Use Case: Useful for capturing and understanding what the user says in real time; for example, to display live captions of user speech or for further processing by the model.
  • Language Alignment: The transcription is performed in the language detected or specified for the input audio[1][5].

outputAudioTranscription

  • Purpose: Enables transcription of audio output generated by the model (the assistant's spoken responses).
  • Function: When enabled, the API provides a text transcription of the audio that the model generates as its response, aligning with the output audio's language code.
  • Use Case: Useful for displaying the assistant's spoken responses as text, offering accessibility for users who need visual confirmation, or enabling further text-based processing.
  • Language Alignment: The transcription matches the language set for the model's audio output, which can be explicitly defined in the output configuration[1][5].

Summary Table

| Feature | inputAudioTranscription | outputAudioTranscription |
| --- | --- | --- |
| Transcribes | Incoming (user) audio | Outgoing (assistant/model) audio |
| Text aligns with | Input audio language | Output audio language code |
| Use case | Capture/understand user speech in real time | Display/understand the model's spoken responses as text |
| Activated by | Including input_audio_transcription in setup | Including output_audio_transcription in setup |

Typical Use in Configuration

In a typical session configuration, both can be enabled simultaneously for full conversational visibility:

config = {
    "response_modalities": ["AUDIO"],
    "input_audio_transcription": {},
    "output_audio_transcription": {}
}
  • This setup will provide both the user's spoken input and the assistant's spoken output as synchronized text streams, facilitating live transcription on both ends[5].

Practical Example

  • When a user speaks, inputAudioTranscription provides a real-time transcript of their speech.
  • When the model replies via synthesized speech, outputAudioTranscription delivers the text equivalent of that synthesized audio.

Summary

  • inputAudioTranscription: Transcribes and exposes the user's input speech.
  • outputAudioTranscription: Transcribes and exposes the model's spoken output, strictly following the language code specified for output.

This dual transcription capability ensures clear, accessible, and auditable interactions in voice or multimodal applications using the Gemini Multimodal Live API[1][5].
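Mapped onto this PR's types, enabling both streams would presumably look like the sketch below — note that both properties currently share the OutputAudioTranscription type, which the resolution that follows addresses:

```csharp
// Hedged sketch: presence-based opt-in, mirroring the empty {} objects above.
var setup = new BidiGenerateContentSetup
{
    InputAudioTranscription = new OutputAudioTranscription(),  // shared type (see below)
    OutputAudioTranscription = new OutputAudioTranscription(),
};
```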


Refactor audio‐transcription types and property initialization

Both input and output transcription share the same (currently empty) configuration shape, so the OutputAudioTranscription type name is misleading and the two properties shouldn’t be initialized inconsistently. Please:

  • Rename the placeholder class to a more generic name, e.g.:
    /// <summary>
    /// Configuration for transcribing audio streams.
    /// </summary>
    public class AudioTranscription { }
  • Update your properties in BidiGenerateContentSetup.cs (around lines 47–51) to use this new type and remove the default instantiation:
    - [JsonPropertyName("outputAudioTranscription")]
    - public OutputAudioTranscription? OutputAudioTranscription { get; set; } = new OutputAudioTranscription();
    + /// <summary>
    + /// Transcribes the assistant’s outgoing audio into text.
    + /// </summary>
    + [JsonPropertyName("outputAudioTranscription")]
    + public AudioTranscription? OutputAudioTranscription { get; set; }
    
    - [JsonPropertyName("inputAudioTranscription")]
    - public OutputAudioTranscription? InputAudioTranscription { get; set; }
    + /// <summary>
    + /// Transcribes the user’s incoming audio into text.
    + /// </summary>
    + [JsonPropertyName("inputAudioTranscription")]
    + public AudioTranscription? InputAudioTranscription { get; set; }
  • Add XML documentation on both the class and its properties to clarify their distinct roles.

These changes will make the intent clear, avoid confusion over type reuse, and enforce explicit configuration for each transcription stream.

🤖 Prompt for AI Agents
In src/GenerativeAI/Types/MultimodalLive/BidiGenerateContentSetup.cs around
lines 47 to 51, rename the OutputAudioTranscription class to a more generic name
like AudioTranscription to reflect its shared configuration role. Update both
InputAudioTranscription and OutputAudioTranscription properties to use this new
AudioTranscription type and remove the default initialization on
OutputAudioTranscription to keep consistency. Add XML documentation comments on
the AudioTranscription class and on both properties to clearly describe their
distinct purposes for input and output audio transcription configurations.
