feat: voice mode tts endpoint #7294

Open · wants to merge 24 commits into main from voice-mode-tts

Conversation

@phact phact commented Mar 27, 2025

This pull request introduces several changes to the voice mode API, including the addition of new WebSocket endpoints, improvements to event logging, and updates to the frontend to support session-based interactions. The most important changes are summarized below:

Backend Changes:

  • Added OpenAI import to voice_mode.py to enable integration with OpenAI's API.
  • Introduced a new create_event_logger function for logging WebSocket events with deduplication and counting.
  • Created a new WebSocket endpoint for text-to-speech (TTS) interaction, including the TTSConfig class and related functions.
  • Replaced the existing log_event function in process_vad_audio with the new create_event_logger function.

Frontend Changes:

  • Added session_id to MessagesQueryParams in use-get-messages-polling.ts to support session-based interactions.
  • Updated SessionSelector component to use setNewSessionCloseVoiceAssistant from voiceStore to manage voice assistant sessions.
  • Modified ChatViewWrapper to pass sidebarOpen prop to ChatView and adjust layout based on sidebar state.

Cristhianzl and others added 8 commits March 26, 2025 13:11
…ove code readability and maintainability

📝 (chat-input.tsx): Add functionality to set voice assistant active state when showAudioInput is true
📝 (voice-assistant.tsx): Add functionality to set voice assistant active state and scroll to bottom when closing audio input
📝 (chat-view.tsx): Update ChatView component to consider sidebarOpen and isVoiceAssistantActive states
📝 (voiceStore.ts): Add isVoiceAssistantActive state and setIsVoiceAssistantActive function to voice store
📝 (index.ts, voice.types.ts): Update types to include sidebarOpen prop in chatViewProps and isVoiceAssistantActive state in VoiceStoreType
…ty to chat input component

🔧 (voice-button.tsx): Update voice button to set new session close voice assistant state
🔧 (sidebar-open-view.tsx): Update sidebar open view to set new session close voice assistant state
🔧 (voiceStore.ts, voice.types.ts): Add new session close voice assistant state and setter to voice store and types
…unction to clean up code and improve readability
@phact phact requested a review from Cristhianzl March 27, 2025 05:10
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Mar 27, 2025
codspeed-hq bot commented Mar 27, 2025

CodSpeed Performance Report

Merging #7294 will improve performance by 58.55%

Comparing voice-mode-tts (8f275e1) with main (e82b23f)

Summary

⚡ 2 improvements
✅ 17 untouched benchmarks

Benchmarks breakdown

Benchmark                        BASE      HEAD     Change
test_build_flow_invalid_job_id   12.4 ms   8 ms     +54.83%
test_cancel_nonexistent_build    12.2 ms   7.7 ms   +58.55%

codeflash-ai bot added a commit that referenced this pull request Mar 27, 2025
…-tts`)

To optimize the provided `get_tts_config` function for better runtime performance, particularly in cases where `session_id` is not `None` but is often missing from `tts_config_cache`, we can eliminate redundant lookups and minimize dictionary access.

Here’s the optimized version.

### Optimizations
1. **Cache lookup optimization**: Instead of checking for key existence and then retrieving or setting the value, we attempt the retrieval directly in a try-except block. This performs a single dictionary lookup in the common case where the session ID is already in the cache.
2. **Exception handling**: A `KeyError` handler in the `except` block covers the case where the session ID is not found in the cache. This keeps the normal flow fast because it performs fewer dictionary operations.

This results in the function performing fewer dictionary lookups under typical usage, leading to improved performance.
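
The committed code block isn't captured above; as a rough sketch of the try/except pattern the message describes (the `TTSConfig` class and `tts_config_cache` dictionary are the ones from voice_mode.py, quoted in the test listings below — only the body shown here is illustrative):

def get_tts_config(session_id: str, openai_key: str) -> TTSConfig:
    if session_id is None:
        raise ValueError("session_id cannot be None")
    try:
        # Hot path: a single dictionary lookup when the session is cached.
        return tts_config_cache[session_id]
    except KeyError:
        # Cache miss: build the config once and store it for this session.
        config = TTSConfig(session_id, openai_key)
        tts_config_cache[session_id] = config
        return config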
…ter in speech creation function

🐛 (use-start-conversation.ts): update WebSocket URL to use flow_tts endpoint and add support for audio language and input audio transcription model in WebSocket session update configuration
@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Mar 27, 2025
codeflash-ai bot added a commit that referenced this pull request Mar 27, 2025
…-tts`)

To optimize the `get_tts_config` function for speed, we can reduce the number of lookups on the `tts_config_cache` dictionary and keep instance creation minimal. Here is the revised version.

In this optimized version:
1. Removed the unnecessary variable `msg`.
2. Reduced lookups on `tts_config_cache` by using `dict.get()`, which avoids an explicit membership check before accessing the value.
3. Still lazily initializes the `TTSConfig`, writing directly to the cache after a single not-found check.

By minimizing dictionary lookups and keeping the conditional checks direct, we should achieve a slight performance improvement, especially when cache misses are infrequent.

    if session_id not in tts_config_cache:
        tts_config_cache[session_id] = TTSConfig(session_id, openai_key)
    return tts_config_cache[session_id]
⚡️Codeflash found 22% (0.22x) speedup for get_tts_config

⏱️ Runtime: 1.04 milliseconds → 857 microseconds (best of 5 runs)

📝 Explanation and details

To optimize the get_tts_config function for speed, we can reduce the number of lookups on the tts_config_cache dictionary and keep instance creation minimal. Here is the revised version.

In this optimized version:

  1. Removed the unnecessary variable msg.
  2. Reduced lookups on tts_config_cache by using dict.get(), which avoids an explicit membership check before accessing the value.
  3. Still lazily initializes the TTSConfig, writing directly to the cache after a single not-found check.

By minimizing dictionary lookups and keeping the conditional checks direct, we should achieve a slight performance improvement, especially when cache misses are infrequent.

Correctness verification report:

Test                            Status
⚙️ Existing Unit Tests          🔘 None Found
🌀 Generated Regression Tests   6 Passed
⏪ Replay Tests                 🔘 None Found
🔎 Concolic Coverage Tests      🔘 None Found
📊 Tests Coverage               undefined
🌀 Generated Regression Tests Details
from typing import Any

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import get_tts_config
from openai import OpenAI


# function to test
class TTSConfig:
    def __init__(self, session_id: str, openai_key: str):
        self.session_id = session_id
        self.barge_in_enabled = False

        self.default_tts_session = {
            "type": "transcription_session.update",
            "session": {
                "input_audio_format": "pcm16",
                "input_audio_transcription": {
                    "model": "gpt-4o-mini-transcribe",
                    "language": "en",
                },
                "turn_detection": {
                    "type": "server_vad",
                    "threshold": 0.5,  # Placeholder value
                    "prefix_padding_ms": 300,  # Placeholder value
                    "silence_duration_ms": 500,  # Placeholder value
                },
                "input_audio_noise_reduction": {"type": "near_field"},
                "include": [],
            },
        }

        self.tts_session: dict[str, Any] = {}
        self.oai_client = OpenAI(api_key=openai_key)

    def get_session_dict(self):
        """Return a copy of the default session dictionary with current settings."""
        return dict(self.default_tts_session)

    def get_openai_client(self):
        return self.oai_client

tts_config_cache: dict[str, TTSConfig] = {}
from langflow.api.v1.voice_mode import get_tts_config


# unit tests

def test_session_id_none():
    # Test with session_id as None
    with pytest.raises(ValueError, match="session_id cannot be None"):
        get_tts_config(None, "key1")






def test_non_string_session_id():
    # Test with non-string session_id
    with pytest.raises(TypeError):
        get_tts_config(123, "key1")
    with pytest.raises(TypeError):
        get_tts_config(["list"], "key1")

def test_non_string_openai_key():
    # Test with non-string openai_key
    with pytest.raises(TypeError):
        get_tts_config("session1", 123)
    with pytest.raises(TypeError):
        get_tts_config("session1", ["list"])








from typing import Any

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import get_tts_config
# function to test
from openai import OpenAI

# Define constants used in the TTSConfig class
SILENCE_THRESHOLD = 0.5
PREFIX_PADDING_MS = 300
SILENCE_DURATION_MS = 700

class TTSConfig:
    def __init__(self, session_id: str, openai_key: str):
        self.session_id = session_id
        self.barge_in_enabled = False

        self.default_tts_session = {
            "type": "transcription_session.update",
            "session": {
                "input_audio_format": "pcm16",
                "input_audio_transcription": {
                    "model": "gpt-4o-mini-transcribe",
                    # "prompt": "expect words in english",
                    "language": "en",
                },
                "turn_detection": {
                    "type": "server_vad",
                    "threshold": SILENCE_THRESHOLD,
                    "prefix_padding_ms": PREFIX_PADDING_MS,
                    "silence_duration_ms": SILENCE_DURATION_MS,
                },
                "input_audio_noise_reduction": {"type": "near_field"},
                "include": [],
            },
        }

        self.tts_session: dict[str, Any] = {}
        self.oai_client = OpenAI(api_key=openai_key)

    def get_session_dict(self):
        """Return a copy of the default session dictionary with current settings."""
        return dict(self.default_tts_session)

    def get_openai_client(self):
        return self.oai_client

tts_config_cache: dict[str, TTSConfig] = {}
from langflow.api.v1.voice_mode import get_tts_config

# unit tests

# Valid Inputs

def test_none_session_id_raises_value_error():
    with pytest.raises(ValueError, match="session_id cannot be None"):
        get_tts_config(None, "valid_key_123")

To test or edit this optimization locally, run: git merge codeflash/optimize-pr7294-2025-03-27T14.00.20

Suggested change
-    return tts_config_cache[session_id]
+        raise ValueError("session_id cannot be None")
+    # Use get with a default to reduce dictionary lookups
+    config = tts_config_cache.get(session_id)
+    if config is None:
+        config = TTSConfig(session_id, openai_key)
+        tts_config_cache[session_id] = config
+    return config

async def flow_tts_websocket_no_session(

…fig class for voice mode customization

♻️ (use-start-conversation.ts): refactor code to use "transcription_session.update" type and update session attributes based on audioSettings and audioLanguage variables
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Mar 27, 2025
codeflash-ai bot added a commit that referenced this pull request Mar 27, 2025
…-mode-tts`)

We can optimize the given `create_event_logger` function for better runtime performance and memory usage:

1. Use the `nonlocal` keyword for state variables to avoid dictionary key access overhead.
2. Simplify the event count increment using the `+=` operator, initializing the count to 0 by default.

Key Changes:
- Replaced the `state` dictionary with local variables `last_event_type` and `event_count`.
- Used `nonlocal` to modify `last_event_type` and `event_count` from the inner function.
- Simplified event count initialization and incrementing.
    Args:
        session_id: The session ID to include in log messages
    """
    state = {"last_event_type": None, "event_count": 0}

⚡️Codeflash found 69% (0.69x) speedup for create_event_logger

⏱️ Runtime: 35.4 microseconds → 21.0 microseconds (best of 19 runs)

📝 Explanation and details

We can optimize the given create_event_logger function for better runtime performance and memory usage:

  1. Use the nonlocal keyword for state variables to avoid dictionary key access overhead.
  2. Simplify the event count increment using the += operator, initializing the count to 0 by default.

Key Changes:

  • Replaced the state dictionary with local variables last_event_type and event_count.
  • Used nonlocal to modify last_event_type and event_count from the inner function.
  • Simplified event count initialization and incrementing.

Correctness verification report:

Test                            Status
⚙️ Existing Unit Tests          🔘 None Found
🌀 Generated Regression Tests   14 Passed
⏪ Replay Tests                 🔘 None Found
🔎 Concolic Coverage Tests      🔘 None Found
📊 Tests Coverage               undefined
🌀 Generated Regression Tests Details
from unittest.mock import patch

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import create_event_logger
# function to test
from langflow.logging import logger

# unit tests

# Basic Functionality



def test_mixed_event_types():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        log_event({"type": "event1"}, "Client → OpenAI")
        log_event({"type": "event2"}, "Client → OpenAI")
        log_event({"type": "event1"}, "Client → OpenAI")

# Edge Cases
def test_empty_event_dictionary():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(KeyError):
            log_event({}, "Client → OpenAI")

def test_missing_event_type_key():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(KeyError):
            log_event({"data": "value"}, "Client → OpenAI")

def test_none_as_event():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(TypeError):
            log_event(None, "Client → OpenAI")

def test_non_dict_event():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        with pytest.raises(TypeError):
            log_event("event", "Client → OpenAI")

# Large Scale Test Cases
def test_high_volume_of_events():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        for i in range(1000):
            log_event({"type": f"event{i}"}, "Client → OpenAI")

def test_high_volume_of_same_events():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        for i in range(1000):
            log_event({"type": "event1"}, "Client → OpenAI")

# Direction Variations

def test_different_session_ids():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event1 = codeflash_output
        codeflash_output = create_event_logger("session2"); log_event2 = codeflash_output
        log_event1({"type": "event1"}, "Client → OpenAI")
        log_event2({"type": "event1"}, "Client → OpenAI")

# State Persistence
def test_state_reset():
    with patch.object(logger, 'debug') as mock_debug:
        codeflash_output = create_event_logger("session1"); log_event = codeflash_output
        log_event({"type": "event1"}, "Client → OpenAI")
        log_event({"type": "event2"}, "Client → OpenAI")
        log_event({"type": "event1"}, "Client → OpenAI")

# Invalid Inputs



from unittest.mock import call, patch

# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import create_event_logger
# function to test
from langflow.logging import logger

# unit tests

# Basic Functionality



def test_empty_event_dictionary():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    with pytest.raises(KeyError):
        event_logger({}, "Client → OpenAI")

def test_missing_event_type():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    with pytest.raises(KeyError):
        event_logger({"data": "value"}, "Client → OpenAI")

def test_null_event():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    with pytest.raises(TypeError):
        event_logger(None, "Client → OpenAI")

# Event Deduplication



def test_count_increment_for_same_event_type():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    event_logger({"type": "event1"}, "Client → OpenAI")
    event_logger({"type": "event1"}, "Client → OpenAI")

def test_count_reset_on_different_event_type():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    event_logger({"type": "event1"}, "Client → OpenAI")
    event_logger({"type": "event2"}, "Client → OpenAI")

# Session ID Inclusion


def test_high_volume_of_events():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    for i in range(1000):
        event_logger({"type": f"event{i % 10}"}, "Client → OpenAI")

def test_high_frequency_of_type_changes():
    logger_mock = patch('langflow.logging.logger').start()
    codeflash_output = create_event_logger("session1"); event_logger = codeflash_output
    for i in range(1000):
        event_logger({"type": f"event{i % 2}"}, "Client → OpenAI")

# Direction Handling

To test or edit this optimization locally, run: git merge codeflash/optimize-pr7294-2025-03-27T15.17.27

Suggested change
-    state = {"last_event_type": None, "event_count": 0}
+    last_event_type = None
+    event_count = 0
+
+    def log_event(event: dict, direction: str) -> None:
+        """Log WebSocket events with deduplication and counting.
+
+        Args:
+            event: The event dictionary to log
+            direction: The direction of the event (e.g., "Client → OpenAI")
+        """
+        nonlocal last_event_type, event_count
+        event_type = event.get("type")
+        logger.debug(f"Event (session - {session_id}): {direction} {event_type}")
+        if event_type != last_event_type:
+            last_event_type = event_type
+            event_count = 0
+        event_count += 1

codeflash-ai bot added a commit that referenced this pull request Mar 27, 2025
…-tts`)

To optimize the `get_tts_config` function, we need to make the cache lookups and assignments as efficient as possible. Here is a more optimized version of the code.

### Changes
1. Replaced the `if session_id not in tts_config_cache:` membership check with a single cache lookup, `tts_config = tts_config_cache.get(session_id)`, avoiding redundant dictionary lookups and making the code cleaner.
2. Assigned a new `TTSConfig` to the variable only when necessary, keeping the function's logic straightforward and efficient.

These changes minimize the number of dictionary lookups and streamline the cache retrieval and assignment process.
…Modal component for managing audio playback state

🔧 (voice-assistant.tsx): pass isPlayingRef prop to VoiceAssistant component for controlling audio playback state
codeflash-ai bot added a commit that referenced this pull request Mar 27, 2025
…(`voice-mode-tts`)

In the provided code, the primary focus is optimizing the `VoiceConfig` class and its methods for efficient memory usage and faster execution. To achieve this, we can streamline the initialization process and minimize repeated computations. Here’s an updated version of the `VoiceConfig` class with optimizations.

### Key Optimizations

1. **Static method for the default session**: A static method `_create_default_session` generates the default session dictionary, so it is defined once rather than rebuilt for each instance, saving memory and processing time.
2. **Use of `.copy()`**: Instead of building a new dictionary from scratch in `get_session_dict`, the `copy()` method returns a new dictionary based on the `default_openai_realtime_session` template, reusing the template and avoiding unnecessary computation.
3. **Minimized default attribute initialization**: Certain attributes are not instantiated unless necessary, reducing initial memory usage.

These changes ensure that initialization and attribute access in the `VoiceConfig` class are streamlined for better performance and efficient memory usage.
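
The committed code isn't shown in this view; here is a minimal sketch of the pattern described above. The method and attribute names (`_create_default_session`, `get_session_dict`, `default_openai_realtime_session`) come from the commit message, while the session fields are placeholders only:

class VoiceConfig:
    # Shared template, built once instead of being rebuilt per instance.
    _default_session: dict | None = None

    def __init__(self, session_id: str):
        self.session_id = session_id
        if VoiceConfig._default_session is None:
            VoiceConfig._default_session = self._create_default_session()
        self.default_openai_realtime_session = VoiceConfig._default_session

    @staticmethod
    def _create_default_session() -> dict:
        # Placeholder fields; the real dictionary lives in voice_mode.py.
        return {
            "type": "session.update",
            "session": {"input_audio_format": "pcm16"},
        }

    def get_session_dict(self) -> dict:
        # Shallow copy so callers can mutate their copy without
        # touching the shared template.
        return self.default_openai_realtime_session.copy()

One caveat with this pattern: copy() is shallow, so nested dictionaries inside the template are still shared between callers.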
codeflash-ai bot commented Mar 27, 2025

⚡️ Codeflash found optimizations for this PR

📄 14% (0.14x) speedup for VoiceConfig.get_session_dict in src/backend/base/langflow/api/v1/voice_mode.py

⏱️ Runtime: 578 microseconds → 508 microseconds (best of 151 runs)

I created a new dependent PR with the suggested changes. Please review:

If you approve, it will be merged into this PR (branch voice-mode-tts).

codeflash-ai bot added a commit that referenced this pull request Mar 27, 2025
…e-mode-tts`)

To optimize this program, we can reduce unnecessary data creation and improve memory usage. `numpy` itself is already highly optimized for performance, but we can make the most efficient use of its capabilities.

### Changes Made
1. Passed `order='C'` to the `astype` call so the conversion runs in C-style memory order, which is generally faster for contiguous arrays.
2. Added the `copy=False` argument, which prevents creating a new array when the dtype conversion can be done in place, saving memory and time.

This should result in a more memory-efficient conversion process while maintaining the same functionality and data integrity.
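
The commit's code block isn't shown here; reconstructed from the suggested change quoted further down, the optimized function amounts to this sketch:

import numpy as np

def pcm16_to_float_array(pcm_data):
    # View the raw PCM16 buffer as int16 samples (native byte order, no copy).
    values = np.frombuffer(pcm_data, dtype=np.int16)
    # order="C" keeps the output contiguous; copy=False lets numpy skip the
    # copy when no conversion is needed (int16 -> float32 still allocates).
    # Normalize to the float range [-1.0, 1.0).
    return values.astype(np.float32, order="C", copy=False) / 32768.0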


def pcm16_to_float_array(pcm_data):
    values = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32)

⚡️Codeflash found 72% (0.72x) speedup for pcm16_to_float_array

⏱️ Runtime: 3.82 milliseconds → 2.22 milliseconds (best of 202 runs)

📝 Explanation and details

To optimize this program, we can reduce unnecessary data creation and improve memory usage. numpy itself is already highly optimized for performance, but we can make the most efficient use of its capabilities.

Changes Made:

  1. Passed order='C' to the astype call so the conversion runs in C-style memory order, which is generally faster for contiguous arrays.
  2. Added the copy=False argument, which prevents creating a new array when the dtype conversion can be done in place, saving memory and time.

This should result in a more memory-efficient conversion process while maintaining the same functionality and data integrity.

Correctness verification report:

Test                            Status
⚙️ Existing Unit Tests          🔘 None Found
🌀 Generated Regression Tests   28 Passed
⏪ Replay Tests                 🔘 None Found
🔎 Concolic Coverage Tests      🔘 None Found
📊 Tests Coverage               undefined
🌀 Generated Regression Tests Details
import numpy as np
# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import pcm16_to_float_array

# unit tests

def test_basic_functionality():
    # Standard PCM Data
    pcm_data = b'\x00\x00\xff\x7f\x01\x80'  # [0, 32767, -32767]
    expected = np.array([0, 32767, -32767], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_edge_empty_buffer():
    # Empty Buffer
    pcm_data = b''
    expected = np.array([], dtype=np.float32)
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_edge_single_value():
    # Single Value
    pcm_data = b'\x00\x00'  # [0]
    expected = np.array([0], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_edge_max_positive_value():
    # Maximum Positive Value
    pcm_data = b'\xff\x7f'  # [32767]
    expected = np.array([32767], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_edge_max_negative_value():
    # Maximum Negative Value
    pcm_data = b'\x00\x80'  # [-32768]
    expected = np.array([-32768], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_boundary_alternating_extremes():
    # Alternating Extremes
    pcm_data = b'\xff\x7f\x00\x80\xff\x7f\x00\x80'  # [32767, -32768, 32767, -32768]
    expected = np.array([32767, -32768, 32767, -32768], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_large_scale_large_buffer():
    # Large Buffer
    pcm_data = b'\x00\x00' * 1000000  # Array of one million zeros
    expected = np.zeros(1000000, dtype=np.float32)
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_random_data():
    # Random PCM Data
    np.random.seed(0)  # Seed for reproducibility
    random_pcm = np.random.randint(-32768, 32767, size=100, dtype=np.int16)
    pcm_data = random_pcm.tobytes()
    expected = random_pcm.astype(np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_special_all_zeros():
    # All Zeros
    pcm_data = b'\x00\x00\x00\x00\x00\x00'  # [0, 0, 0]
    expected = np.array([0, 0, 0], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_special_all_ones():
    # All Ones
    pcm_data = b'\x01\x00\x01\x00\x01\x00'  # [1, 1, 1]
    expected = np.array([1, 1, 1], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_invalid_non_pcm_data():
    # Non-PCM Data
    pcm_data = b'\x01\x02\x03\x04'  # Random bytes
    expected = np.array([513, 1027], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

def test_mixed_values():
    # Mixed Positive and Negative Values
    pcm_data = b'\x00\x00\xff\x7f\x00\x80\x01\x00'  # [0, 32767, -32768, 1]
    expected = np.array([0, 32767, -32768, 1], dtype=np.float32) / 32768.0
    codeflash_output = pcm16_to_float_array(pcm_data); result = codeflash_output

# Run the tests
if __name__ == "__main__":
    pytest.main()
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

import numpy as np
# imports
import pytest  # used for our unit tests
from langflow.api.v1.voice_mode import pcm16_to_float_array

# unit tests

def test_basic_functionality():
    # Typical PCM Data
    pcm_data = b'\x00\x00\xff\x7f\x00\x80\x00@\xff\xbf'  # [0, 32767, -32768, 16384, -16384]
    expected = np.array([0, 32767, -32768, 16384, -16384], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

    # Repeating Pattern
    pcm_data = b'\xe8\x03\x18\xfc'  # [1000, -1000]
    expected = np.array([1000, -1000], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

def test_edge_cases():
    # Empty Buffer
    pcm_data = b''
    expected = np.array([], dtype=np.float32)
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

    # Single Value
    pcm_data = b'\x00\x00'  # [0]
    expected = np.array([0], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

    # Minimum and Maximum Values
    pcm_data = b'\x00\x80\xff\x7f'  # [-32768, 32767]
    expected = np.array([-32768, 32767], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

def test_invalid_inputs():
    # Non-Bytes Input
    with pytest.raises(TypeError):
        pcm16_to_float_array("not a byte buffer")

    # Odd-Length Buffer
    with pytest.raises(ValueError):
        pcm16_to_float_array(b'\x00\x00\x00')

def test_large_scale():
    # Large Buffer
    pcm_data = np.random.randint(-32768, 32767, size=1000000, dtype=np.int16).tobytes()
    expected = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

def test_special_values():
    # All Zeros
    pcm_data = b'\x00\x00' * 100
    expected = np.zeros(100, dtype=np.float32)
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

    # All Ones
    pcm_data = b'\x01\x00' * 100  # [1, 1, 1, ..., 1]
    expected = np.ones(100, dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

    # Alternating Extremes
    pcm_data = b'\x00\x80\xff\x7f' * 50  # [-32768, 32767, -32768, 32767, ...]
    expected = np.array([-32768, 32767] * 50, dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

def test_boundary_conditions():
    # Near Zero Values
    pcm_data = b'\x01\x00\xff\xff'  # [1, -1]
    expected = np.array([1, -1], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

def test_real_world_data():
    # Audio Snippet
    pcm_data = b'\x00\x00\xff\x7f\x00\x80\x00@\xff\xbf'  # [0, 32767, -32768, 16384, -16384]
    expected = np.array([0, 32767, -32768, 16384, -16384], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)

def test_endianness():
    # Little-Endian and Big-Endian
    pcm_data_le = b'\x01\x00\x00\x80'  # [1, -32768] in little-endian
    pcm_data_be = b'\x00\x01\x80\x00'  # [1, -32768] in big-endian
    expected_le = np.array([1, -32768], dtype=np.float32) / 32768.0
    expected_be = np.array([256, -32768], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data_le), expected_le)
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data_be), expected_be)

def test_mixed_values():
    # Mixed Positive and Negative Values
    pcm_data = b'\x00\x00\xff\x7f\x00\x80'  # [0, 32767, -32768]
    expected = np.array([0, 32767, -32768], dtype=np.float32) / 32768.0
    np.testing.assert_array_almost_equal(pcm16_to_float_array(pcm_data), expected)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To test or edit this optimization locally, run: git merge codeflash/optimize-pr7294-2025-03-27T21.14.20

Suggested change
-    values = np.frombuffer(pcm_data, dtype=np.int16).astype(np.float32)
+    # Use the same buffer for float conversion to save memory
+    values = np.frombuffer(pcm_data, dtype=np.int16)
+    return values.astype(np.float32, order="C", copy=False) / 32768.0

codeflash-ai bot added a commit that referenced this pull request Mar 27, 2025
…-mode-tts`)

To optimize the `create_event_logger` function, we can make a few changes to improve performance, such as reducing the number of dictionary accesses and avoiding repeated work. Instead of repeatedly accessing dictionary keys, we can keep their values in local variables.

Here is the optimized version.

Changes made:
1. Replaced the nested `state` dictionary with local variables `last_event_type` and `event_count`; local variable access is faster than dictionary item access.
2. Changed `state["event_count"] = int(state["event_count"]) + 1` to `event_count += 1`, which is more direct and eliminates the redundant conversion to `int`.

These changes should provide better performance by minimizing dictionary accesses and simplifying operations.
@Cristhianzl Cristhianzl left a comment

lgtm

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 28, 2025
@langflow-ai langflow-ai deleted a comment from codeflash-ai bot Mar 28, 2025
@Cristhianzl Cristhianzl enabled auto-merge March 29, 2025 02:08
@Cristhianzl Cristhianzl disabled auto-merge March 29, 2025 02:08
Labels
enhancement New feature or request lgtm This PR has been approved by a maintainer size:XL This PR changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants