Conversation

@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 18% (0.18x) speedup for OpenAIWhisperAudioTranscriptionConfig.transform_audio_transcription_request in litellm/llms/openai/transcriptions/whisper_transformation.py

⏱️ Runtime : 167 microseconds → 141 microseconds (best of 238 runs)

📝 Explanation and details

The optimized code achieves an 18% speedup by eliminating expensive dictionary operations and reducing redundant key lookups:

**Key Optimizations:**

1. **Eliminated dictionary unpacking overhead**: The original `{"model": model, "file": audio_file, **optional_params}` creates a new dictionary and performs an expensive merge operation. The optimized version uses `optional_params.copy()` followed by direct key assignment, which is significantly faster.

2. **Reduced dictionary lookups**: The original code performed multiple lookups on `data["response_format"]` within the conditional check. The optimized version calls `data.get("response_format")` once, stores the result, and uses the membership test `in ("text", "json")`, which is more efficient than separate equality checks.

3. **Streamlined conditional logic**: Instead of checking `"response_format" not in data or (data["response_format"] == "text" or data["response_format"] == "json")`, the optimized version uses a single `get()` call and a cleaner conditional structure. (A before/after sketch follows this list.)
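A minimal before/after sketch of the pattern described above. This is illustrative only; the function names are hypothetical and the exact body of `whisper_transformation.py` may differ:

```python
# Illustrative sketch only: names and structure are assumptions,
# not the literal contents of whisper_transformation.py.

def transform_original(model, audio_file, optional_params):
    # Original pattern: a dict literal plus ** unpacking builds a new
    # dict and merges every optional param into it.
    data = {"model": model, "file": audio_file, **optional_params}
    if "response_format" not in data or (
        data["response_format"] == "text" or data["response_format"] == "json"
    ):
        data["response_format"] = "verbose_json"
    return data

def transform_optimized(model, audio_file, optional_params):
    # Optimized pattern: copy once, assign the two fixed keys directly,
    # and look up response_format a single time.
    data = optional_params.copy()
    data["model"] = model
    data["file"] = audio_file
    response_format = data.get("response_format")
    if response_format is None or response_format in ("text", "json"):
        data["response_format"] = "verbose_json"
    return data
```

One behavioral nuance of this sketch: with `**optional_params` unpacked last, same-named keys in `optional_params` win over the explicit `model`/`file` values, while copy-then-assign gives the explicit arguments precedence. The generated regression tests below include exactly this conflicting-key case.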

**Performance Impact by Test Case:**

- **Large parameter sets**: Shows dramatic improvements (up to 195% faster for 1000+ parameters) because the optimization eliminates the expensive dictionary merge operation that scales poorly with parameter count.
- **Simple cases**: Shows modest gains (1-5% faster) due to reduced lookup overhead.
- **Empty parameter cases**: Sometimes slightly slower (~8% in some cases) due to the overhead of the `copy()` call when there's nothing to copy, but this is outweighed by gains in typical usage patterns.

The optimization is particularly effective for scenarios with many optional parameters, which is common in ML API configurations.
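For a rough sense of the dict-merge vs. copy-and-assign gap in isolation, here is a small standalone `timeit` sketch (an assumption-laden illustration, not the harness Codeflash used; absolute numbers will vary by machine):

```python
import timeit

# Simulate a large optional_params dict, mirroring the 1000-parameter test case.
params = {f"param_{i}": i for i in range(1000)}

def merge():
    # Builds a 2-key dict, then merges 1000 keys into it via ** unpacking.
    return {"model": "whisper-1", "file": None, **params}

def copy_assign():
    # Copies the 1000-key dict once, then assigns the 2 fixed keys.
    data = params.copy()
    data["model"] = "whisper-1"
    data["file"] = None
    return data

print("merge:      ", timeit.timeit(merge, number=10_000))
print("copy+assign:", timeit.timeit(copy_assign, number=10_000))
```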

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 172 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 2 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime

```python
import pytest  # used for our unit tests
from litellm.llms.openai.transcriptions.whisper_transformation import \
    OpenAIWhisperAudioTranscriptionConfig


# function to test
class AudioTranscriptionRequestData:
    """Dummy class for test purposes, simulating the real return type."""
    def __init__(self, data):
        self.data = data

class FileTypes:
    """Dummy class for test purposes, simulating file types."""
    def __init__(self, filename, content):
        self.filename = filename
        self.content = content

class BaseAudioTranscriptionConfig:
    """Base class for test purposes."""
    pass
from litellm.llms.openai.transcriptions.whisper_transformation import \
    OpenAIWhisperAudioTranscriptionConfig

# unit tests

@pytest.fixture
def config():
    """Fixture to provide a fresh config instance for each test."""
    return OpenAIWhisperAudioTranscriptionConfig()

@pytest.fixture
def dummy_file():
    """Fixture to provide a dummy FileTypes instance."""
    return FileTypes(filename="audio.wav", content=b"dummydata")

# 1. Basic Test Cases

def test_basic_text_response_format(config, dummy_file):
    # Test with response_format explicitly set to 'text'
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"response_format": "text"},
        litellm_params={}
    ); result = codeflash_output # 2.17μs -> 2.13μs (2.26% faster)

def test_basic_json_response_format(config, dummy_file):
    # Test with response_format explicitly set to 'json'
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"response_format": "json"},
        litellm_params={}
    ); result = codeflash_output # 2.20μs -> 2.17μs (1.15% faster)

def test_basic_verbose_json_response_format(config, dummy_file):
    # Test with response_format set to 'verbose_json'
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"response_format": "verbose_json"},
        litellm_params={}
    ); result = codeflash_output # 2.00μs -> 1.93μs (3.58% faster)

def test_basic_no_response_format(config, dummy_file):
    # Test with no response_format provided
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.92μs -> 2.04μs (6.21% slower)

def test_basic_other_params(config, dummy_file):
    # Test with additional parameters
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"language": "en", "temperature": 0.5},
        litellm_params={}
    ); result = codeflash_output # 1.86μs -> 2.03μs (8.41% slower)

# 2. Edge Test Cases

def test_edge_empty_model(config, dummy_file):
    # Test with empty model string
    codeflash_output = config.transform_audio_transcription_request(
        model="",
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.76μs -> 1.93μs (9.11% slower)

def test_edge_none_model(config, dummy_file):
    # Test with model set to None
    codeflash_output = config.transform_audio_transcription_request(
        model=None,
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.81μs -> 1.78μs (1.40% faster)

def test_edge_empty_optional_params(config, dummy_file):
    # Test with empty optional_params
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.76μs -> 1.92μs (8.07% slower)


def test_edge_response_format_case_sensitive(config, dummy_file):
    # Test with response_format in different case
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"response_format": "Text"},
        litellm_params={}
    ); result = codeflash_output # 2.81μs -> 2.78μs (1.19% faster)

def test_edge_response_format_unexpected_value(config, dummy_file):
    # Test with response_format set to an unexpected value
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"response_format": "xml"},
        litellm_params={}
    ); result = codeflash_output # 2.12μs -> 2.17μs (1.98% slower)

def test_edge_file_none(config):
    # Test with audio_file set to None
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=None,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.88μs -> 1.95μs (3.74% slower)

def test_edge_file_unusual_type(config):
    # Test with audio_file as a string (not FileTypes)
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file="not_a_filetype",
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.84μs -> 1.85μs (0.702% slower)

def test_edge_optional_params_overwrite_model(config, dummy_file):
    # Test if optional_params can overwrite model
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"model": "other-model"},
        litellm_params={}
    ); result = codeflash_output # 2.05μs -> 2.11μs (2.75% slower)

def test_edge_optional_params_with_none_values(config, dummy_file):
    # Test with optional_params containing None values
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"language": None},
        litellm_params={}
    ); result = codeflash_output # 1.93μs -> 2.04μs (5.50% slower)

# 3. Large Scale Test Cases

def test_large_scale_many_optional_params(config, dummy_file):
    # Test with many optional parameters (up to 1000)
    many_params = {f"param_{i}": i for i in range(1000)}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params=many_params,
        litellm_params={}
    ); result = codeflash_output # 15.4μs -> 5.23μs (195% faster)
    # All params should be present (assumes the merged params are exposed
    # as a dict via `result.data`, the shape modeled by the dummy
    # AudioTranscriptionRequestData class above)
    for i in range(1000):
        assert result.data[f"param_{i}"] == i

def test_large_scale_large_file_object(config):
    # Test with a large file object (simulate with large bytes)
    large_content = b"x" * 1000000  # 1MB
    large_file = FileTypes(filename="large_audio.wav", content=large_content)
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=large_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 2.19μs -> 2.25μs (2.49% slower)

def test_large_scale_long_model_name(config, dummy_file):
    # Test with a very long model name
    long_model = "whisper-" + "x" * 500
    codeflash_output = config.transform_audio_transcription_request(
        model=long_model,
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.87μs -> 1.94μs (3.60% slower)

def test_large_scale_multiple_calls(config, dummy_file):
    # Test performance and determinism with 100 calls
    for i in range(100):
        codeflash_output = config.transform_audio_transcription_request(
            model=f"whisper-{i}",
            audio_file=dummy_file,
            optional_params={"response_format": "text"},
            litellm_params={}
        ); result = codeflash_output # 59.9μs -> 59.1μs (1.37% faster)

def test_large_scale_optional_params_with_large_strings(config, dummy_file):
    # Test with optional_params containing large strings
    large_string = "a" * 10000
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={"large_param": large_string},
        litellm_params={}
    ); result = codeflash_output # 1.66μs -> 1.67μs (1.02% slower)

# Additional edge: litellm_params is ignored, but should not affect output
def test_litellm_params_ignored(config, dummy_file):
    # Test that litellm_params does not affect output
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={},
        litellm_params={"irrelevant": 123}
    ); result = codeflash_output # 1.59μs -> 1.78μs (10.8% slower)

# Edge: optional_params is not a dict
def test_edge_optional_params_not_dict(config, dummy_file):
    # Test with optional_params as a list (should raise TypeError)
    with pytest.raises(TypeError):
        config.transform_audio_transcription_request(
            model="whisper-1",
            audio_file=dummy_file,
            optional_params=["not", "a", "dict"],
            litellm_params={}
        ) # 2.58μs -> 1.71μs (50.8% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
import pytest  # used for our unit tests
from litellm.llms.openai.transcriptions.whisper_transformation import \
    OpenAIWhisperAudioTranscriptionConfig


# Dummy classes to simulate dependencies for the function
class FileTypes:
    def __init__(self, filename, content):
        self.filename = filename
        self.content = content

class AudioTranscriptionRequestData:
    def __init__(self, data):
        self.data = data

class BaseAudioTranscriptionConfig:
    pass
from litellm.llms.openai.transcriptions.whisper_transformation import \
    OpenAIWhisperAudioTranscriptionConfig

# unit tests

# Helper fixture to create a dummy FileTypes object
@pytest.fixture
def dummy_file():
    return FileTypes(filename="audio.wav", content=b"dummy audio data")

# Basic Test Cases

def test_basic_minimal_request(dummy_file):
    """
    Basic: Test with minimal required arguments, no optional params.
    Should set response_format to 'verbose_json'.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.94μs -> 2.13μs (8.83% slower)

def test_basic_with_optional_params(dummy_file):
    """
    Basic: Test with optional parameters included.
    Should merge optional params and set response_format to 'verbose_json'.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {"language": "en", "temperature": 0.5}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-2",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 1.79μs -> 1.89μs (5.20% slower)

def test_basic_response_format_text(dummy_file):
    """
    Basic: If optional_params includes response_format='text', it should be overwritten to 'verbose_json'.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {"response_format": "text"}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-3",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 2.04μs -> 1.99μs (2.11% faster)

def test_basic_response_format_json(dummy_file):
    """
    Basic: If optional_params includes response_format='json', it should be overwritten to 'verbose_json'.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {"response_format": "json"}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-4",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 2.23μs -> 2.12μs (5.62% faster)

def test_basic_response_format_verbose_json(dummy_file):
    """
    Basic: If optional_params includes response_format='verbose_json', it should remain unchanged.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {"response_format": "verbose_json"}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-5",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 2.06μs -> 1.96μs (4.94% faster)

# Edge Test Cases

def test_edge_empty_model(dummy_file):
    """
    Edge: Model name is empty string.
    Should still set response_format to 'verbose_json'.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    codeflash_output = config.transform_audio_transcription_request(
        model="",
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.90μs -> 1.88μs (1.39% faster)


def test_edge_file_is_none():
    """
    Edge: audio_file is None.
    Should still set response_format to 'verbose_json' and file should be None.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=None,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 2.50μs -> 2.65μs (5.63% slower)

def test_edge_response_format_other(dummy_file):
    """
    Edge: response_format is set to an unknown value (not text/json/verbose_json).
    Should remain unchanged.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {"response_format": "other_format"}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 2.22μs -> 2.14μs (3.50% faster)

def test_edge_optional_params_with_conflicting_keys(dummy_file):
    """
    Edge: optional_params contains 'model' and 'file', which should overwrite the originals.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    file2 = FileTypes(filename="audio2.wav", content=b"other audio data")
    optional = {"model": "conflict-model", "file": file2}
    codeflash_output = config.transform_audio_transcription_request(
        model="original-model",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 1.88μs -> 1.96μs (3.79% slower)

def test_edge_optional_params_with_nonstring_keys(dummy_file):
    """
    Edge: optional_params contains non-string keys.
    Should still merge them.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {42: "answer", (1,2): "tuplekey"}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 2.00μs -> 1.99μs (0.351% faster)

def test_edge_large_optional_params(dummy_file):
    """
    Edge: optional_params contains many keys.
    Should merge all and set response_format correctly.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {f"key{i}": i for i in range(50)}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 2.91μs -> 2.57μs (13.3% faster)
    for i in range(50):
        # assumes merged params are exposed as a dict via `result.data`
        assert result.data[f"key{i}"] == i

# Large Scale Test Cases

def test_large_scale_many_optional_params(dummy_file):
    """
    Large Scale: optional_params contains 999 keys.
    Should merge all and set response_format correctly.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    optional = {f"param{i}": i for i in range(999)}
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=dummy_file,
        optional_params=optional,
        litellm_params={}
    ); result = codeflash_output # 15.4μs -> 5.42μs (185% faster)
    for i in range(999):
        # assumes merged params are exposed as a dict via `result.data`
        assert result.data[f"param{i}"] == i

def test_large_scale_long_model_name(dummy_file):
    """
    Large Scale: Model name is a very long string.
    Should still work and set response_format correctly.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    long_model = "whisper-" + "x" * 500
    codeflash_output = config.transform_audio_transcription_request(
        model=long_model,
        audio_file=dummy_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.86μs -> 1.95μs (4.42% slower)

def test_large_scale_large_file_object():
    """
    Large Scale: FileTypes object with large content.
    Should still set response_format correctly.
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    large_content = b"a" * 1024 * 500  # 500KB
    large_file = FileTypes(filename="large.wav", content=large_content)
    codeflash_output = config.transform_audio_transcription_request(
        model="whisper-1",
        audio_file=large_file,
        optional_params={},
        litellm_params={}
    ); result = codeflash_output # 1.93μs -> 2.07μs (6.58% slower)

def test_large_scale_combined(dummy_file):
    """
    Large Scale: All parameters large (long model, large optional_params, large file).
    """
    config = OpenAIWhisperAudioTranscriptionConfig()
    long_model = "whisper-" + "y" * 500
    large_optional = {f"opt{i}": i for i in range(500)}
    large_file = FileTypes(filename="huge.wav", content=b"x" * 1024 * 250)
    codeflash_output = config.transform_audio_transcription_request(
        model=long_model,
        audio_file=large_file,
        optional_params=large_optional,
        litellm_params={}
    ); result = codeflash_output # 8.77μs -> 3.57μs (146% faster)
    for i in range(500):
        # assumes merged params are exposed as a dict via `result.data`
        assert result.data[f"opt{i}"] == i
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from litellm.llms.openai.transcriptions.whisper_transformation import OpenAIWhisperAudioTranscriptionConfig

def test_OpenAIWhisperAudioTranscriptionConfig_transform_audio_transcription_request():
    OpenAIWhisperAudioTranscriptionConfig.transform_audio_transcription_request(OpenAIWhisperAudioTranscriptionConfig(), '', b'', {}, {})
```

🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| `codeflash_concolic_zbim32de/tmpc94afrui/test_concolic_coverage.py::test_OpenAIWhisperAudioTranscriptionConfig_transform_audio_transcription_request` | 1.72μs | 1.75μs | -1.77% ⚠️ |

To edit these changes, run `git checkout codeflash/optimize-OpenAIWhisperAudioTranscriptionConfig.transform_audio_transcription_request-mhdmy675` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 16:25
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Oct 30, 2025