
Conversation

@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 6% (0.06x) speedup for _transform_prompt in litellm/llms/openai/completion/utils.py

⏱️ Runtime : 1.13 milliseconds → 1.07 milliseconds (best of 79 runs)

📝 Explanation and details

The optimized code achieves a 5% speedup through three key improvements:

1. Optimized is_tokens_or_list_of_tokens function:

  • Added early exits for non-list and empty inputs (not isinstance(value, list), not value) to avoid expensive all() scans on invalid values
  • Inspects the first element to determine the homogeneous element type, so the rest of the list is checked against a single type instead of redundantly against both
  • This shows a significant improvement in profiler results (749μs vs 797μs total time); a sketch of the reworked check follows
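A minimal sketch of the reworked check, assuming the two accepted shapes implied above (a list of ints, or a list of lists of ints); the actual litellm implementation may differ in details such as how it treats an empty list:

def is_tokens_or_list_of_tokens(value) -> bool:
    # Early exits: non-lists and (assumption) empty lists skip the all() scans.
    if not isinstance(value, list) or not value:
        return False
    first = value[0]
    if isinstance(first, int):
        # First element says "tokens": confirm the rest are ints too.
        return all(isinstance(v, int) for v in value)
    if isinstance(first, list):
        # First element says "list of token lists": confirm each sublist.
        return all(
            isinstance(v, list) and all(isinstance(t, int) for t in v)
            for v in value
        )
    return False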

2. Replaced string concatenation with list joining in convert_content_list_to_str:

  • Changed from texts += text_content (O(n²) complexity) to collecting in a list and using "".join(text_parts) (O(n) complexity)
  • Eliminated unnecessary variable initialization (texts = "")
  • Added an early return for empty content, reducing per-call overhead; a sketch of the rewritten helper follows
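A hedged sketch of the join-based rewrite, using the OpenAI-style message shape exercised by the tests below ("content" is a string or a list of {"text": ...} parts); the real helper in litellm handles more message variants:

def convert_content_list_to_str(message: dict) -> str:
    content = message.get("content")
    if not content:
        return ""  # early return: None/empty content exits immediately
    if isinstance(content, str):
        return content
    # Collect parts and join once: O(n) total, versus O(n^2) copying from
    # repeated texts += text_content.
    text_parts = []
    for item in content:
        text = item.get("text")
        if text:
            text_parts.append(text)
    return "".join(text_parts)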

3. Used list comprehension instead of manual loop in _transform_prompt:

  • Replaced the manual for loop with prompt_str_list = [convert_content_list_to_str(...) for m in messages]
  • Removed redundant try/except block that just re-raised exceptions
  • Eliminated unnecessary string initialization and concatenation in the single-message case; the restructured function is sketched below
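Putting the pieces together, a hedged sketch of the restructured function, with the return shape inferred from the tests below (a single message collapses to its content, multiple messages become a list of strings):

from typing import Any, Dict, List, Union

def _transform_prompt(messages: List[Dict[str, Any]]) -> Union[str, List[Any]]:
    if len(messages) == 1:
        content = messages[0].get("content")
        # Token prompts ([1, 2, 3] or [[1, 2], [3, 4]]) pass through unchanged.
        if is_tokens_or_list_of_tokens(content):
            return content
        # String or list-of-dict content collapses to one string, with no
        # throwaway "" initialization or concatenation.
        return convert_content_list_to_str(messages[0])
    # Multiple messages: one string per message, no try/except that only re-raises.
    return [convert_content_list_to_str(m) for m in messages]

Under this sketch, _transform_prompt([{"role": "user", "content": [{"text": "Hi"}]}]) returns "Hi", while two such messages return ["Hi", "Hi"].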

Performance benefits are most notable for:

  • Test cases with content as list of dicts (18-73% faster) - benefits from improved string joining
  • Large-scale operations (15-21% faster on 1000+ messages) - benefits from list comprehension efficiency
  • Token validation scenarios - benefits from optimized type checking

The optimizations particularly excel when processing structured content (lists of text dictionaries) and large message volumes, which are common in LLM applications.
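The string-joining claim is easy to check in isolation. A minimal, self-contained timing sketch (illustrative only; absolute numbers vary by machine, and CPython can sometimes optimize in-place str += when only one reference exists, so join is the reliably O(n) form):

import timeit

parts = ["x"] * 10_000

def concat_loop():
    # Repeated concatenation: worst case O(n^2) copying.
    s = ""
    for p in parts:
        s += p
    return s

def join_once():
    # Single join: one pass, one allocation.
    return "".join(parts)

print("concat:", timeit.timeit(concat_loop, number=100))
print("join:  ", timeit.timeit(join_once, number=100))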

Correctness verification report:

Test                            Status
⚙️ Existing Unit Tests           🔘 None Found
🌀 Generated Regression Tests    41 Passed
⏪ Replay Tests                  🔘 None Found
🔎 Concolic Coverage Tests       1 Passed
📊 Tests Coverage                87.5%
🌀 Generated Regression Tests and Runtime
from typing import Any, Dict, List, Union, cast

# imports
import pytest
from litellm.llms.openai.completion.utils import _transform_prompt

# --- Unit tests ---

# Basic Test Cases

def test_single_message_with_string_content():
    # Single message, content is a string
    messages = [{"role": "user", "content": "Hello, world!"}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.54μs -> 1.33μs (15.4% faster)

def test_single_message_with_content_list_of_dicts():
    # Single message, content is a list of dicts with 'text'
    messages = [{"role": "user", "content": [{"text": "Hello, "}, {"text": "world!"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 4.13μs -> 2.48μs (66.7% faster)

def test_multiple_messages_with_string_content():
    # Multiple messages, each content is a string
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hi there!"}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.08μs -> 2.10μs (0.572% slower)

def test_single_message_with_content_list_of_tokens():
    # Single message, content is a list of ints (tokens)
    messages = [{"role": "user", "content": [1, 2, 3, 4]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.99μs -> 2.26μs (12.1% slower)

def test_single_message_with_content_list_of_list_of_tokens():
    # Single message, content is a list of lists of ints (list of tokens)
    messages = [{"role": "user", "content": [[1, 2], [3, 4]]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 3.42μs -> 2.81μs (21.6% faster)

def test_multiple_messages_with_content_list_of_dicts():
    # Multiple messages, each content is a list of dicts
    messages = [
        {"role": "system", "content": [{"text": "System prompt."}]},
        {"role": "user", "content": [{"text": "User prompt."}]}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.14μs -> 2.51μs (14.7% slower)



def test_message_with_empty_content():
    # Single message, content is empty string
    messages = [{"role": "user", "content": ""}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.63μs -> 1.46μs (11.8% faster)

def test_message_with_none_content():
    # Single message, content is None
    messages = [{"role": "user", "content": None}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.24μs -> 1.12μs (10.4% faster)

def test_message_with_content_list_of_empty_dicts():
    # Single message, content is a list of empty dicts
    messages = [{"role": "user", "content": [{}, {}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 4.14μs -> 2.39μs (73.5% faster)

def test_message_with_content_list_of_dicts_missing_text():
    # Single message, content is a list of dicts missing 'text'
    messages = [{"role": "user", "content": [{"foo": "bar"}, {"baz": "qux"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 3.35μs -> 2.23μs (50.1% faster)


def test_multiple_messages_with_mixed_content_types():
    # Multiple messages, some with string, some with list of dicts, some with None
    messages = [
        {"role": "system", "content": "System prompt."},
        {"role": "user", "content": [{"text": "User prompt."}]},
        {"role": "assistant", "content": None}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.98μs -> 3.18μs (6.36% slower)



def test_message_with_content_list_of_dicts_some_missing_text():
    # Single message, content is a list of dicts, some with 'text', some without
    messages = [{"role": "user", "content": [{"text": "Hello"}, {}, {"text": "World"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 4.70μs -> 3.14μs (49.6% faster)

def test_multiple_messages_some_empty_content():
    # Multiple messages, some with empty content
    messages = [
        {"role": "system", "content": ""},
        {"role": "user", "content": [{"text": "User prompt."}]},
        {"role": "assistant", "content": ""}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.34μs -> 2.62μs (10.4% slower)


def test_large_single_message_content_string():
    # Single message, content is a large string
    large_text = "a" * 1000
    messages = [{"role": "user", "content": large_text}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.02μs -> 1.72μs (17.0% faster)

def test_large_single_message_content_list_of_tokens():
    # Single message, content is a large list of tokens
    tokens = list(range(1000))
    messages = [{"role": "user", "content": tokens}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 30.4μs -> 30.2μs (0.888% faster)

def test_large_single_message_content_list_of_dicts():
    # Single message, content is a list of 1000 dicts with text
    content_list = [{"text": str(i)} for i in range(1000)]
    messages = [{"role": "user", "content": content_list}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 69.1μs -> 58.4μs (18.2% faster)
    expected = "".join(str(i) for i in range(1000))

def test_large_multiple_messages_with_string_content():
    # 1000 messages, each content is a string
    messages = [{"role": "user", "content": f"msg{i}"} for i in range(1000)]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 178μs -> 147μs (21.1% faster)

def test_large_multiple_messages_with_content_list_of_dicts():
    # 1000 messages, each content is a list of dicts with text
    messages = [{"role": "user", "content": [{"text": f"msg{i}"}]} for i in range(1000)]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 206μs -> 227μs (9.15% slower)

def test_large_multiple_messages_with_mixed_content_types():
    # 1000 messages, alternating string and list of dicts
    messages = []
    for i in range(1000):
        if i % 2 == 0:
            messages.append({"role": "user", "content": f"msg{i}"})
        else:
            messages.append({"role": "user", "content": [{"text": f"msg{i}"}]})
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 193μs -> 190μs (1.54% faster)
    expected = [f"msg{i}" for i in range(1000)]
    assert result == expected  # one joined string per message
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Any, Dict, List, Union

# imports
import pytest  # used for our unit tests
from litellm.llms.openai.completion.utils import _transform_prompt

# unit tests

# ------------------------ BASIC TEST CASES ------------------------

def test_single_message_with_string_content():
    # Basic: single message, content is a string
    messages = [{"role": "user", "content": "Hello, world!"}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.81μs -> 1.54μs (17.5% faster)

def test_single_message_with_content_list_of_text_dicts():
    # Basic: single message, content is a list of dicts with 'text' keys
    messages = [{"role": "user", "content": [{"text": "Hello"}, {"text": ", world!"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 4.23μs -> 2.53μs (67.2% faster)

def test_multiple_messages_with_string_content():
    # Basic: multiple messages, each with string content
    messages = [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi there!"}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.93μs -> 1.95μs (1.43% slower)

def test_multiple_messages_with_content_list_of_text_dicts():
    # Basic: multiple messages, each with content as list of dicts
    messages = [
        {"role": "user", "content": [{"text": "Hello"}]},
        {"role": "assistant", "content": [{"text": "Hi"}, {"text": " there!"}]}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.29μs -> 2.74μs (16.5% slower)

def test_single_message_with_content_list_of_tokens():
    # Basic: single message, content is a list of tokens (ints)
    messages = [{"role": "user", "content": [1, 2, 3, 4]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.22μs -> 2.21μs (0.361% faster)

def test_single_message_with_content_list_of_list_of_tokens():
    # Basic: single message, content is a list of lists of tokens (ints)
    messages = [{"role": "user", "content": [[1, 2], [3, 4]]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 3.68μs -> 2.78μs (32.3% faster)

# ------------------------ EDGE TEST CASES ------------------------


def test_message_with_no_content_key():
    # Edge: message with no 'content' key returns empty string
    messages = [{"role": "user"}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.53μs -> 1.45μs (5.65% faster)

def test_message_with_content_none():
    # Edge: message with content=None returns empty string
    messages = [{"role": "user", "content": None}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.22μs -> 1.16μs (4.56% faster)

def test_message_with_content_empty_list():
    # Edge: message with content=[]
    messages = [{"role": "user", "content": []}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 1.24μs -> 1.15μs (8.17% faster)

def test_message_with_content_list_of_dicts_missing_text_key():
    # Edge: message with content as list of dicts missing 'text' key
    messages = [{"role": "user", "content": [{"not_text": "abc"}, {"text": "def"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 4.11μs -> 2.50μs (64.2% faster)

def test_message_with_content_list_of_dicts_text_is_none():
    # Edge: message with content as list of dicts, 'text' is None
    messages = [{"role": "user", "content": [{"text": None}, {"text": "abc"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 3.36μs -> 2.25μs (49.2% faster)

def test_multiple_messages_mixed_content_types():
    # Edge: multiple messages, mixed content types (str, list of dicts)
    messages = [
        {"role": "user", "content": "Hi"},
        {"role": "assistant", "content": [{"text": "There"}]},
        {"role": "user", "content": [{"text": "!"}]}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.49μs -> 2.86μs (12.9% slower)


def test_message_with_content_list_of_dicts_some_text_missing():
    # Edge: message with content as list of dicts, some missing 'text'
    messages = [{"role": "user", "content": [{"text": "a"}, {}, {"text": "b"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 4.56μs -> 3.07μs (48.5% faster)

def test_message_with_content_list_of_dicts_text_is_empty_string():
    # Edge: message with content as list of dicts, 'text' is empty string
    messages = [{"role": "user", "content": [{"text": ""}, {"text": "abc"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 3.52μs -> 2.41μs (45.9% faster)

def test_multiple_messages_with_content_none():
    # Edge: multiple messages, some with content=None
    messages = [
        {"role": "user", "content": None},
        {"role": "assistant", "content": "Hi"},
        {"role": "user", "content": None}
    ]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 2.11μs -> 2.25μs (6.48% slower)

def test_message_with_content_list_of_dicts_text_is_falsey():
    # Edge: message with content as list of dicts, 'text' is falsey (0, False)
    messages = [{"role": "user", "content": [{"text": 0}, {"text": False}, {"text": "abc"}]}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 3.57μs -> 2.37μs (50.8% faster)

# ------------------------ LARGE SCALE TEST CASES ------------------------

def test_large_number_of_messages_with_string_content():
    # Large Scale: 1000 messages, each with unique string content
    messages = [{"role": "user", "content": f"msg{i}"} for i in range(1000)]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 180μs -> 150μs (20.1% faster)
    for i in range(1000):
        assert result[i] == f"msg{i}"

def test_large_number_of_messages_with_content_list_of_dicts():
    # Large Scale: 500 messages, each with content as list of dicts
    messages = [{"role": "user", "content": [{"text": f"msg{i}"}]} for i in range(500)]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 104μs -> 114μs (8.28% slower)
    for i in range(500):
        assert result[i] == f"msg{i}"

def test_single_message_with_large_content_list_of_tokens():
    # Large Scale: single message, content is a list of 1000 tokens
    tokens = list(range(1000))
    messages = [{"role": "user", "content": tokens}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 30.2μs -> 30.5μs (0.831% slower)

def test_single_message_with_large_content_list_of_list_of_tokens():
    # Large Scale: single message, content is a list of 100 lists of tokens
    tokens = [[i, i + 1] for i in range(0, 200, 2)]
    messages = [{"role": "user", "content": tokens}]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 22.9μs -> 21.6μs (5.76% faster)
    for sublist in result:
        assert isinstance(sublist, list)  # token sublists pass through unchanged

def test_large_number_of_messages_with_mixed_content_types():
    # Large Scale: 100 messages, alternating string and list-of-dict content
    messages = []
    for i in range(100):
        if i % 2 == 0:
            messages.append({"role": "user", "content": f"str{i}"})
        else:
            messages.append({"role": "user", "content": [{"text": f"txt{i}"}]})
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 22.2μs -> 21.8μs (1.79% faster)
    for i in range(100):
        expected = f"str{i}" if i % 2 == 0 else f"txt{i}"
        assert result[i] == expected

def test_large_number_of_messages_with_content_none():
    # Large Scale: 100 messages, all with content=None
    messages = [{"role": "user", "content": None} for _ in range(100)]
    codeflash_output = _transform_prompt(messages); result = codeflash_output # 13.6μs -> 11.8μs (15.1% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from litellm.llms.openai.completion.utils import _transform_prompt

def test__transform_prompt():
    _transform_prompt([])
🔎 Concolic Coverage Tests and Runtime
Test File::Test Function                                                                     Original ⏱️   Optimized ⏱️   Speedup
codeflash_concolic_kt42dg31/tmpj9r77li5/test_concolic_coverage.py::test__transform_prompt   645ns         952ns          -32.2%⚠️

To edit these changes, run git checkout codeflash/optimize-_transform_prompt-mhde6s6c and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 12:20
@codeflash-ai codeflash-ai bot added labels Oct 30, 2025: ⚡️ codeflash (Optimization PR opened by Codeflash AI), 🎯 Quality: High (Optimization Quality according to Codeflash)