feat: voice mode tts endpoint #7294
base: main
Conversation
…ove code readability and maintainability
📝 (chat-input.tsx): Add functionality to set voice assistant active state when showAudioInput is true
📝 (voice-assistant.tsx): Add functionality to set voice assistant active state and scroll to bottom when closing audio input
📝 (chat-view.tsx): Update ChatView component to consider sidebarOpen and isVoiceAssistantActive states
📝 (voiceStore.ts): Add isVoiceAssistantActive state and setIsVoiceAssistantActive function to voice store
📝 (index.ts, voice.types.ts): Update types to include sidebarOpen prop in chatViewProps and isVoiceAssistantActive state in VoiceStoreType
…ty to chat input component
🔧 (voice-button.tsx): Update voice button to set new session close voice assistant state
🔧 (sidebar-open-view.tsx): Update sidebar open view to set new session close voice assistant state
🔧 (voiceStore.ts, voice.types.ts): Add new session close voice assistant state and setter to voice store and types
…unction to clean up code and improve readability
CodSpeed Performance Report: Merging #7294 will improve performance by 58.55%.
Benchmarks breakdown
…-tts`) To optimize the provided `get_tts_config` function for better runtime performance, particularly in cases where `session_id` is not `None` but frequently not yet in the `tts_config_cache`, we can eliminate some redundant lookups and minimize dictionary access. Here is the optimized version.

Optimizations:
1. **Cache lookup**: Instead of checking for the key's existence and then retrieving or setting it, we attempt to retrieve the value directly inside a try-except block. This uses a single dictionary lookup in the common case where the session ID is found in the cache.
2. **Exception handling**: A `KeyError` handler in the `except` block covers the case where the session ID is not found in the cache.

This approach is faster due to fewer dictionary operations in the normal flow: the function performs fewer dictionary lookups under typical usage, leading to improved performance.
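As a rough illustration of the try/except pattern described above, here is a minimal sketch, assuming the module-level `tts_config_cache` dict and the `TTSConfig` class shown in the generated tests further down; the exact body merged in the PR may differ:

```python
# Hypothetical sketch of the EAFP-style lookup described above; not the actual PR diff.
def get_tts_config(session_id: str, openai_key: str) -> "TTSConfig":
    if session_id is None:
        raise ValueError("session_id cannot be None")
    try:
        # Single dictionary lookup on the common cache-hit path.
        return tts_config_cache[session_id]
    except KeyError:
        # Cache miss: build the config once and memoize it.
        config = TTSConfig(session_id, openai_key)
        tts_config_cache[session_id] = config
        return config
```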
…ter in speech creation function
🐛 (use-start-conversation.ts): update WebSocket URL to use flow_tts endpoint and add support for audio language and input audio transcription model in WebSocket session update configuration
…-tts`) To optimize the `get_tts_config` function for speed, we can reduce the number of lookups on the `tts_config_cache` dictionary and ensure that instance creation is kept minimal. Here is the revised version.

In this optimized version:
1. Removed the unnecessary variable `msg`.
2. Reduced the lookups on `tts_config_cache` by using `dict.get()`, which avoids an explicit membership check before accessing the value.
3. This approach still lazily initializes the `TTSConfig` and ensures thread safety by directly working on the cache after a single not-found check.

By minimizing the dictionary lookups and being more direct in our conditional checks, we should achieve a slight performance improvement, especially when cache misses are not very frequent.
```python
if session_id not in tts_config_cache:
    tts_config_cache[session_id] = TTSConfig(session_id, openai_key)
return tts_config_cache[session_id]
```
⚡️ Codeflash found 22% (0.22x) speedup for `get_tts_config`
⏱️ Runtime: 1.04 milliseconds → 857 microseconds (best of 5 runs)
📝 Explanation and details
To optimize the `get_tts_config` function for speed, we can reduce the number of lookups on the `tts_config_cache` dictionary and ensure that instance creation is kept minimal. Here is the revised version.

In this optimized version:

- Removed the unnecessary variable `msg`.
- Reduced the lookups on `tts_config_cache` by using `dict.get()`, which avoids an explicit membership check before accessing the value.
- This approach still lazily initializes the `TTSConfig` and ensures thread safety by directly working on the cache after a single not-found check.

By minimizing the dictionary lookups and being more direct in our conditional checks, we should achieve a slight performance improvement, especially when cache misses are not very frequent.
✅ Correctness verification report:
| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 6 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | undefined |
🌀 Generated Regression Tests Details
```python
from typing import Any

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import get_tts_config
from openai import OpenAI


# function to test
class TTSConfig:
    def __init__(self, session_id: str, openai_key: str):
        self.session_id = session_id
        self.barge_in_enabled = False
        self.default_tts_session = {
            "type": "transcription_session.update",
            "session": {
                "input_audio_format": "pcm16",
                "input_audio_transcription": {
                    "model": "gpt-4o-mini-transcribe",
                    "language": "en",
                },
                "turn_detection": {
                    "type": "server_vad",
                    "threshold": 0.5,  # Placeholder value
                    "prefix_padding_ms": 300,  # Placeholder value
                    "silence_duration_ms": 500,  # Placeholder value
                },
                "input_audio_noise_reduction": {"type": "near_field"},
                "include": [],
            },
        }
        self.tts_session: dict[str, Any] = {}
        self.oai_client = OpenAI(api_key=openai_key)

    def get_session_dict(self):
        """Return a copy of the default session dictionary with current settings."""
        return dict(self.default_tts_session)

    def get_openai_client(self):
        return self.oai_client


tts_config_cache: dict[str, TTSConfig] = {}

from langflow.api.v1.voice_mode import get_tts_config


# unit tests
def test_session_id_none():
    # Test with session_id as None
    with pytest.raises(ValueError, match="session_id cannot be None"):
        get_tts_config(None, "key1")


def test_non_string_session_id():
    # Test with non-string session_id
    with pytest.raises(TypeError):
        get_tts_config(123, "key1")
    with pytest.raises(TypeError):
        get_tts_config(["list"], "key1")


def test_non_string_openai_key():
    # Test with non-string openai_key
    with pytest.raises(TypeError):
        get_tts_config("session1", 123)
    with pytest.raises(TypeError):
        get_tts_config("session1", ["list"])
```

```python
from typing import Any

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import get_tts_config

# function to test
from openai import OpenAI

# Define constants used in the TTSConfig class
SILENCE_THRESHOLD = 0.5
PREFIX_PADDING_MS = 300
SILENCE_DURATION_MS = 700


class TTSConfig:
    def __init__(self, session_id: str, openai_key: str):
        self.session_id = session_id
        self.barge_in_enabled = False
        self.default_tts_session = {
            "type": "transcription_session.update",
            "session": {
                "input_audio_format": "pcm16",
                "input_audio_transcription": {
                    "model": "gpt-4o-mini-transcribe",
                    # "prompt": "expect words in english",
                    "language": "en",
                },
                "turn_detection": {
                    "type": "server_vad",
                    "threshold": SILENCE_THRESHOLD,
                    "prefix_padding_ms": PREFIX_PADDING_MS,
                    "silence_duration_ms": SILENCE_DURATION_MS,
                },
                "input_audio_noise_reduction": {"type": "near_field"},
                "include": [],
            },
        }
        self.tts_session: dict[str, Any] = {}
        self.oai_client = OpenAI(api_key=openai_key)

    def get_session_dict(self):
        """Return a copy of the default session dictionary with current settings."""
        return dict(self.default_tts_session)

    def get_openai_client(self):
        return self.oai_client


tts_config_cache: dict[str, TTSConfig] = {}

from langflow.api.v1.voice_mode import get_tts_config


# unit tests
# Valid Inputs
def test_none_session_id_raises_value_error():
    with pytest.raises(ValueError, match="session_id cannot be None"):
        get_tts_config(None, "valid_key_123")
```
To test or edit this optimization locally: `git merge codeflash/optimize-pr7294-2025-03-27T14.00.20`
```python
if session_id is None:
    raise ValueError("session_id cannot be None")
# Use get with a default to reduce dictionary lookups
config = tts_config_cache.get(session_id)
if config is None:
    config = TTSConfig(session_id, openai_key)
    tts_config_cache[session_id] = config
return config
```
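For illustration only, a hypothetical call sequence showing the memoization behavior of the snippet above (the key string is a placeholder, not a real credential):

```python
first = get_tts_config("session-1", "sk-placeholder")   # cache miss: constructs a TTSConfig
second = get_tts_config("session-1", "sk-placeholder")  # cache hit: returns the cached instance
assert first is second
```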
…fig class for voice mode customization
♻️ (use-start-conversation.ts): refactor code to use "transcription_session.update" type and update session attributes based on audioSettings and audioLanguage variables
…-mode-tts`) We can optimize the given `create_event_logger` function for better runtime performance and memory usage:
1. Use the `nonlocal` keyword for state variables to avoid dictionary key access overhead.
2. Simplify the event count increment using the `+=` operator, initializing it to 0 by default.

Key changes:
- Replaced the `state` dictionary with local variables `last_event_type` and `event_count`.
- Used `nonlocal` to modify `last_event_type` and `event_count` from the inner function.
- Simplified event count initialization and increment.
```python
    Args:
        session_id: The session ID to include in log messages
    """
    state = {"last_event_type": None, "event_count": 0}
```
⚡️ Codeflash found 69% (0.69x) speedup for `create_event_logger`
⏱️ Runtime: 35.4 microseconds → 21.0 microseconds (best of 19 runs)
📝 Explanation and details
We can optimize the given `create_event_logger` function for better runtime performance and memory usage:

- Use the `nonlocal` keyword for state variables to avoid dictionary key access overhead.
- Simplify the event count increment using the `+=` operator, initializing it to 0 by default.

Key changes:

- Replaced the `state` dictionary with local variables `last_event_type` and `event_count`.
- Used `nonlocal` to modify `last_event_type` and `event_count` from the inner function.
- Simplified event count initialization and increment.
✅ Correctness verification report:
| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 14 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | undefined |
🌀 Generated Regression Tests Details
```python
from unittest.mock import patch

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import create_event_logger

# function to test
from langflow.logging import logger


# unit tests
# Basic Functionality
def test_mixed_event_types():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        log_event({"type": "event1"}, "Client → OpenAI")
        log_event({"type": "event2"}, "Client → OpenAI")
        log_event({"type": "event1"}, "Client → OpenAI")


# Edge Cases
def test_empty_event_dictionary():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(KeyError):
            log_event({}, "Client → OpenAI")


def test_missing_event_type_key():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(KeyError):
            log_event({"data": "value"}, "Client → OpenAI")


def test_none_as_event():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(TypeError):
            log_event(None, "Client → OpenAI")


def test_non_dict_event():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(TypeError):
            log_event("event", "Client → OpenAI")


# Large Scale Test Cases
def test_high_volume_of_events():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        for i in range(1000):
            log_event({"type": f"event{i}"}, "Client → OpenAI")


def test_high_volume_of_same_events():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        for i in range(1000):
            log_event({"type": "event1"}, "Client → OpenAI")


# Direction Variations
def test_different_session_ids():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event1 = codeflash_output
        codeflash_output = create_event_logger("session2"); log_event2 = codeflash_output
        log_event1({"type": "event1"}, "Client → OpenAI")
        log_event2({"type": "event1"}, "Client → OpenAI")


# State Persistence
def test_state_reset():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        log_event({"type": "event1"}, "Client → OpenAI")
        log_event({"type": "event2"}, "Client → OpenAI")
        log_event({"type": "event1"}, "Client → OpenAI")


# Invalid Inputs
```

```python
from unittest.mock import call, patch

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import create_event_logger

# function to test
from langflow.logging import logger


# unit tests
# Basic Functionality
def test_empty_event_dictionary():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    with pytest.raises(KeyError):
        event_logger({}, "Client → OpenAI")


def test_missing_event_type():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    with pytest.raises(KeyError):
        event_logger({"data": "value"}, "Client → OpenAI")


def test_null_event():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    with pytest.raises(TypeError):
        event_logger(None, "Client → OpenAI")


# Event Deduplication
def test_count_increment_for_same_event_type():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    event_logger({"type": "event1"}, "Client → OpenAI")
    event_logger({"type": "event1"}, "Client → OpenAI")


def test_count_reset_on_different_event_type():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    event_logger({"type": "event1"}, "Client → OpenAI")
    event_logger({"type": "event2"}, "Client → OpenAI")


# Session ID Inclusion
def test_high_volume_of_events():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    for i in range(1000):
        event_logger({"type": f"event{i % 10}"}, "Client → OpenAI")


def test_high_frequency_of_type_changes():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    for i in range(1000):
        event_logger({"type": f"event{i % 2}"}, "Client → OpenAI")


# Direction Handling
```
To test or edit this optimization locally: `git merge codeflash/optimize-pr7294-2025-03-27T15.17.27`
```python
last_event_type = None
event_count = 0

def log_event(event: dict, direction: str) -> None:
    """Log WebSocket events with deduplication and counting.

    Args:
        event: The event dictionary to log
        direction: The direction of the event (e.g., "Client → OpenAI")
    """
    nonlocal last_event_type, event_count
    event_type = event.get("type")
    logger.debug(f"Event (session - {session_id}): {direction} {event_type}")
    if event_type != last_event_type:
        last_event_type = event_type
        event_count = 0
    event_count += 1
```
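A hypothetical usage sketch of the closure above; the `create_event_logger` and `log_event` signatures match the diff, but the event type strings are made up for illustration:

```python
log_event = create_event_logger("session-1")
log_event({"type": "response.audio.delta"}, "OpenAI → Client")  # new type: count resets, then increments
log_event({"type": "response.audio.delta"}, "OpenAI → Client")  # same type: count keeps incrementing
log_event({"type": "response.done"}, "OpenAI → Client")         # type change: count resets again
```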
…-tts`) To optimize the `get_tts_config` function, we need to ensure that our cache lookups and assignments are as efficient as possible. Here is a more optimized version of the code.

Changes:
1. Replaced the check `if session_id not in tts_config_cache:` with the cache lookup `tts_config = tts_config_cache.get(session_id)` to avoid redundant dictionary lookups and make the code cleaner.
2. Assigned `TTSConfig` to a variable only when necessary, which keeps the function's logic straightforward and efficient.

These changes minimize the number of dictionary lookups and streamline the cache retrieval and assignment process.
…Modal component for managing audio playback state
🔧 (voice-assistant.tsx): pass isPlayingRef prop to VoiceAssistant component for controlling audio playback state
…(`voice-mode-tts`) In the provided code, the primary focus should be on optimizing the `VoiceConfig` class and its methods for more efficient memory usage and faster execution. To achieve this, we can streamline the initialization process and minimize repeated computations. Here is an updated version of the `VoiceConfig` class with optimizations.

Key optimizations:
1. **Static method for default session**: A static method `_create_default_session` generates the default session dictionary. The dictionary is defined once rather than being rebuilt inside every instance, which saves memory and processing time.
2. **Use of `.copy()`**: Instead of creating a new dictionary from scratch in `get_session_dict`, the `copy()` method returns a new dictionary based on the `default_openai_realtime_session` template, reusing the same template and avoiding unnecessary computation.
3. **Minimized default attribute initialization**: Avoiding instantiation of certain attributes unless necessary reduces initial memory usage.

These changes streamline initialization and attribute access in the `VoiceConfig` class for better performance and more efficient memory usage.
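The `VoiceConfig` diff itself is collapsed on this page, so the following is only a minimal sketch of the described pattern, assuming session-template fields borrowed from the `TTSConfig` classes above; all details other than the names `_create_default_session`, `default_openai_realtime_session`, and `get_session_dict` are illustrative:

```python
from typing import Any


class VoiceConfig:
    """Sketch of the memory optimization described above (not the actual PR code)."""

    @staticmethod
    def _create_default_session() -> dict[str, Any]:
        # The session template is defined in exactly one place.
        return {
            "type": "transcription_session.update",
            "session": {
                "input_audio_format": "pcm16",
                "input_audio_transcription": {
                    "model": "gpt-4o-mini-transcribe",
                    "language": "en",
                },
            },
        }

    def __init__(self, session_id: str):
        self.session_id = session_id

    def get_session_dict(self) -> dict[str, Any]:
        # Shallow copy of the shared class-level template instead of
        # rebuilding the nested dict on every call.
        return VoiceConfig.default_openai_realtime_session.copy()


# Built once at import time and shared by all instances.
VoiceConfig.default_openai_realtime_session = VoiceConfig._create_default_session()
```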
⚡️ Codeflash found optimizations for this PR: 📄 14% (0.14x) speedup for …
…e-mode-tts`) To optimize this program, we can reduce unnecessary data creation and improve memory usage. `numpy` itself is already highly optimized for performance, but we should make sure we use its capabilities as efficiently as possible.

Changes made:
1. Passed `order='C'` to the `astype` call so the conversion is done in C-style memory order, which is generally faster for contiguous arrays.
2. Added the `copy=False` argument, which prevents creating a new array when the dtype conversion can be done in place, saving memory and time.

This should result in a more memory-efficient conversion process while maintaining the same functionality and data integrity.
```python
def pcm16_to_float_array(pcm_data):
    values = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32)
```
⚡️ Codeflash found 72% (0.72x) speedup for `pcm16_to_float_array`
⏱️ Runtime: 3.82 milliseconds → 2.22 milliseconds (best of 202 runs)
📝 Explanation and details
To optimize this program, we can reduce unnecessary data creation and improve memory usage. `numpy` itself is already highly optimized for performance, but let's ensure that we are making the most efficient use of its capabilities.

Changes made:

- Passed `order="C"` to the `astype` call so the conversion is done in C-style memory order, which is generally faster for contiguous arrays.
- Added the `copy=False` argument, which prevents creating a new array when the dtype conversion can be done in place, saving memory and time.

This should result in a more memory-efficient conversion process while maintaining the same functionality and data integrity.
✅ Correctness verification report:
| Test | Status |
| --- | --- |
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 28 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | undefined |
🌀 Generated Regression Tests Details
```python
import numpy as np

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import pcm16_to_float_array


# unit tests
def test_basic_functionality():
    # Standard PCM Data
    pcm_data = b'\x00\x00\xff\x7f\x01\x80'  # [0, 32767, -32767]
    expected = np.array([0, 32767, -32767], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_edge_empty_buffer():
    # Empty Buffer
    pcm_data = b''
    expected = np.array([], dtype=np.float32)
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_edge_single_value():
    # Single Value
    pcm_data = b'\x00\x00'  # [0]
    expected = np.array([0], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_edge_max_positive_value():
    # Maximum Positive Value
    pcm_data = b'\xff\x7f'  # [32767]
    expected = np.array([32767], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_edge_max_negative_value():
    # Maximum Negative Value
    pcm_data = b'\x00\x80'  # [-32768]
    expected = np.array([-32768], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_boundary_alternating_extremes():
    # Alternating Extremes
    pcm_data = b'\xff\x7f\x00\x80\xff\x7f\x00\x80'  # [32767, -32768, 32767, -32768]
    expected = np.array([32767, -32768, 32767, -32768], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_large_scale_large_buffer():
    # Large Buffer
    pcm_data = b'\x00\x00' * 1000000  # Array of one million zeros
    expected = np.zeros(1000000, dtype=np.float32)
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_random_data():
    # Random PCM Data
    np.random.seed(0)  # Seed for reproducibility
    random_pcm = np.random.randint(-32768, 32767, size=100, dtype=np.int16)
    pcm_data = random_pcm.tobytes()
    expected = random_pcm.astype(np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_special_all_zeros():
    # All Zeros
    pcm_data = b'\x00\x00\x00\x00\x00\x00'  # [0, 0, 0]
    expected = np.array([0, 0, 0], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_special_all_ones():
    # All Ones
    pcm_data = b'\x01\x00\x01\x00\x01\x00'  # [1, 1, 1]
    expected = np.array([1, 1, 1], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_invalid_non_pcm_data():
    # Non-PCM Data
    pcm_data = b'\x01\x02\x03\x04'  # Random bytes
    expected = np.array([513, 1027], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


def test_mixed_values():
    # Mixed Positive and Negative Values
    pcm_data = b'\x00\x00\xff\x7f\x00\x80\x01\x00'  # [0, 32767, -32768, 1]
    expected = np.array([0, 32767, -32768, 1], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output


# Run the tests
if __name__ == "__main__":
    pytest.main()


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
```

```python
import numpy as np

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import pcm16_to_float_array


# unit tests
def test_basic_functionality():
    # Typical PCM Data
    pcm_data = b'\x00\x00\xff\x7f\x00\x80\x00@\xff\xbf'  # [0, 32767, -32768, 16384, -16384]
    expected = np.array([0, 32767, -32768, 16384, -16384], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)
    # Repeating Pattern
    pcm_data = b'\xe8\x03\x18\xfc'  # [1000, -1000]
    expected = np.array([1000, -1000], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)


def test_edge_cases():
    # Empty Buffer
    pcm_data = b''
    expected = np.array([], dtype=np.float32)
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)
    # Single Value
    pcm_data = b'\x00\x00'  # [0]
    expected = np.array([0], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)
    # Minimum and Maximum Values
    pcm_data = b'\x00\x80\xff\x7f'  # [-32768, 32767]
    expected = np.array([-32768, 32767], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)


def test_invalid_inputs():
    # Non-Bytes Input
    with pytest.raises(TypeError):
        pcm16_to_float_array("not a byte buffer")
    # Odd-Length Buffer
    with pytest.raises(ValueError):
        pcm16_to_float_array(b'\x00\x00\x00')


def test_large_scale():
    # Large Buffer
    pcm_data = np.random.randint(-32768, 32767, size=1000000, dtype=np.int16).tobytes()
    expected = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)


def test_special_values():
    # All Zeros
    pcm_data = b'\x00\x00' * 100
    expected = np.zeros(100, dtype=np.float32)
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)
    # All Ones
    pcm_data = b'\x01\x00' * 100  # [1, 1, 1, ..., 1]
    expected = np.ones(100, dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)
    # Alternating Extremes
    pcm_data = b'\x00\x80\xff\x7f' * 50  # [-32768, 32767, -32768, 32767, ...]
    expected = np.array([-32768, 32767] * 50, dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)


def test_boundary_conditions():
    # Near Zero Values
    pcm_data = b'\x01\x00\xff\xff'  # [1, -1]
    expected = np.array([1, -1], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)


def test_real_world_data():
    # Audio Snippet
    pcm_data = b'\x00\x00\xff\x7f\x00\x80\x00@\xff\xbf'  # [0, 32767, -32768, 16384, -16384]
    expected = np.array([0, 32767, -32768, 16384, -16384], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)


def test_endianness():
    # Little-Endian and Big-Endian
    pcm_data_le = b'\x01\x00\x00\x80'  # [1, -32768] in little-endian
    pcm_data_be = b'\x00\x01\x80\x00'  # [1, -32768] in big-endian
    expected_le = np.array([1, -32768], dtype=np.float32) / 32768.0
    expected_be = np.array([256, -32768], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data_le), expected_le)
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data_be), expected_be)


def test_mixed_values():
    # Mixed Positive and Negative Values
    pcm_data = b'\x00\x00\xff\x7f\x00\x80'  # [0, 32767, -32768]
    expected = np.array([0, 32767, -32768], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)


# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
```
To test or edit this optimization locally: `git merge codeflash/optimize-pr7294-2025-03-27T21.14.20`
```python
# Use the same buffer for float conversion to save memory
values = np.frombuffer(pcm_data, dtype=np.int16)
return values.astype(np.float32, order="C", copy=False) / 32768.0
```
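As a quick sanity check mirroring the generated tests above (a sketch for illustration, not part of the PR):

```python
import numpy as np

# Three sample int16 values round-tripped through the optimized conversion.
pcm = np.array([0, 32767, -32768], dtype=np.int16).tobytes()
floats = pcm16_to_float_array(pcm)
np.testing.assert_array_almost_equal(
    floats, np.array([0, 32767, -32768], dtype=np.float32) / 32768.0
)
```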
…-mode-tts`) To optimize the `create_event_logger` function, we can make a few changes to improve performance, such as reducing the number of dictionary accesses and avoiding repeated calculations. Instead of repeatedly accessing dictionary keys, we can store their values in local variables. Here is the optimized version.

Changes made:
1. Replaced the nested `state` dictionary with local variables `last_event_type` and `event_count`; using local variables is faster than accessing dictionary items.
2. Changed `state["event_count"] = int(state["event_count"]) + 1` to `event_count += 1`, which is more direct and eliminates the redundant conversion to `int`.

These changes should provide better performance by minimizing dictionary accesses and simplifying operations.
lgtm
This pull request introduces several changes to the voice mode API, including the addition of new WebSocket endpoints, improvements to event logging, and updates to the frontend to support session-based interactions. The most important changes are summarized below:
Backend Changes:
- Added the `OpenAI` import to `voice_mode.py` to enable integration with OpenAI's API.
- Added a `create_event_logger` function for logging WebSocket events with deduplication and counting.
- Added the `TTSConfig` class and related functions.
- Replaced the `log_event` function in `process_vad_audio` with the new `create_event_logger` function.

Frontend Changes:
- Added `session_id` to `MessagesQueryParams` in `use-get-messages-polling.ts` to support session-based interactions.
- Updated the `SessionSelector` component to use `setNewSessionCloseVoiceAssistant` from `voiceStore` to manage voice assistant sessions.
- Updated `ChatViewWrapper` to pass the `sidebarOpen` prop to `ChatView` and adjust layout based on sidebar state.