@codeflash-ai codeflash-ai bot commented Oct 30, 2025

📄 26% (0.26x) speedup for OpenAIGuardrailBase.get_user_prompt in litellm/proxy/guardrails/guardrail_hooks/openai/base.py

⏱️ Runtime : 459 microseconds → 364 microseconds (best of 92 runs)

📝 Explanation and details

The optimized code achieves a 26% speedup through two key optimizations:

1. Efficient String Concatenation in convert_content_list_to_str:

  • Original: Used repeated string concatenation (texts += text_content) which is O(n²) due to Python string immutability - each concatenation creates a new string object
  • Optimized: Uses list accumulation (text_parts.append()) followed by "".join() which is O(n) - strings are joined in a single operation
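As a rough illustration of the pattern described above (the helper name and the block shape are taken from this description and the tests below; this is a sketch, not the actual litellm implementation), the O(n) version looks like:

```python
from typing import Any, Dict, List

def convert_content_list_to_str(content: List[Dict[str, Any]]) -> str:
    """Sketch: accumulate text parts in a list, then join once (O(n)),
    instead of repeated `texts += text_content` (O(n^2))."""
    text_parts: List[str] = []
    for block in content:
        text = block.get("text")
        if text:  # skip missing keys, None, and empty-string text values
            text_parts.append(text)
    return "".join(text_parts)
```

Because `"".join()` allocates the result string once, the cost is linear in the total text length rather than quadratic in the number of blocks.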

2. Improved Message Processing in get_user_prompt:

  • Original: Reversed the entire message list, appended to a new list, then reversed again - multiple O(n) operations plus manual string concatenation in a loop
  • Optimized: Uses direct list slicing (messages[start:end]) to extract the user message block in one operation, then processes with list comprehension and single join
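A minimal sketch of the slicing approach (inferred from this description and the regression tests below; not the actual implementation) scans backwards over the trailing user block, slices it out, and joins the non-empty contents once:

```python
from typing import Dict, List, Optional

def get_user_prompt(messages: List[Dict]) -> Optional[str]:
    """Sketch: extract the trailing block of user messages with one slice
    and join their non-blank contents with newlines."""
    end = len(messages)
    start = end
    # Walk backwards while the trailing messages have role == "user"
    while start > 0 and messages[start - 1].get("role") == "user":
        start -= 1
    if start == end:
        return None  # last message is not from the user
    parts = []
    for msg in messages[start:end]:
        content = msg.get("content")
        if isinstance(content, list):
            # Flatten list-of-blocks content into a single string
            content = "".join(
                b.get("text") for b in content
                if isinstance(b, dict) and b.get("text")
            )
        if isinstance(content, str) and content.strip():
            parts.append(content)
    return "\n".join(parts) if parts else None
```

The slice `messages[start:end]` replaces the reverse-append-reverse dance with a single O(k) copy of just the trailing user block.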

Performance Benefits by Test Case:

  • Large consecutive user messages: Up to 47% faster (test with 500 user messages) - the O(n) vs O(n²) string joining makes the biggest difference here
  • No user messages at end: Up to 67% faster - the optimized approach stops scanning earlier and avoids unnecessary list operations
  • Mixed content types: 10-20% faster on average - benefits from both cleaner conditionals and efficient string handling
  • Empty message lists: 304% faster - minimal overhead from streamlined logic

The optimizations are particularly effective for scenarios with multiple user messages or complex content structures, where string concatenation overhead was most significant in the original implementation.

Correctness verification report:

Test Status
⚙️ Existing Unit Tests 🔘 None Found
🌀 Generated Regression Tests 100 Passed
⏪ Replay Tests 🔘 None Found
🔎 Concolic Coverage Tests 🔘 None Found
📊 Tests Coverage 100.0%
🌀 Generated Regression Tests and Runtime
from typing import Dict, List, Optional, Union

# imports
import pytest  # used for our unit tests
from litellm.proxy.guardrails.guardrail_hooks.openai.base import \
    OpenAIGuardrailBase

# unit tests

@pytest.fixture
def guardrail():
    """Fixture to provide a fresh OpenAIGuardrailBase instance for each test."""
    return OpenAIGuardrailBase()

# 1. Basic Test Cases

def test_single_user_message(guardrail):
    """Test a single user message returns its content."""
    messages = [{"role": "user", "content": "Hello, world!"}]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.65μs -> 3.48μs (4.88% faster)
    assert codeflash_output == "Hello, world!"

def test_multiple_user_and_assistant_messages(guardrail):
    """Test extraction of last user message after assistant reply."""
    messages = [
        {"role": "user", "content": "Hi!"},
        {"role": "assistant", "content": "Hello!"},
        {"role": "user", "content": "How are you?"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.62μs -> 3.26μs (11.0% faster)
    assert codeflash_output == "How are you?"

def test_consecutive_user_messages_at_end(guardrail):
    """Test concatenation of consecutive user messages at the end."""
    messages = [
        {"role": "system", "content": "You are helpful."},
        {"role": "user", "content": "First."},
        {"role": "assistant", "content": "Hi."},
        {"role": "user", "content": "Second."},
        {"role": "user", "content": "Third."},
    ]
    # Should join "Second." and "Third." with newline
    codeflash_output = guardrail.get_user_prompt(messages) # 4.25μs -> 3.58μs (18.8% faster)
    assert codeflash_output == "Second.\nThird."

def test_no_user_messages(guardrail):
    """Test returns None if there are no user messages."""
    messages = [
        {"role": "system", "content": "You are helpful."},
        {"role": "assistant", "content": "Hi."},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.32μs -> 1.51μs (53.6% faster)
    assert codeflash_output is None

def test_user_message_with_content_as_list(guardrail):
    """Test user message with content as a list of dicts with 'text' keys."""
    messages = [
        {"role": "user", "content": [{"text": "Hello, "}, {"text": "world!"}]}
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.91μs -> 3.80μs (2.71% faster)
    assert codeflash_output == "Hello, world!"

def test_consecutive_user_messages_with_list_content(guardrail):
    """Test multiple user messages at the end, with list content."""
    messages = [
        {"role": "assistant", "content": "Hi!"},
        {"role": "user", "content": [{"text": "Line 1."}]},
        {"role": "user", "content": [{"text": "Line 2."}]},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 4.48μs -> 4.08μs (9.75% faster)
    assert codeflash_output == "Line 1.\nLine 2."

def test_user_message_with_empty_content(guardrail):
    """Test user message with empty string content returns None."""
    messages = [{"role": "user", "content": ""}]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.10μs -> 3.27μs (5.35% slower)
    assert codeflash_output is None

def test_user_message_with_none_content(guardrail):
    """Test user message with None content returns None."""
    messages = [{"role": "user", "content": None}]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.20μs -> 3.12μs (2.56% faster)
    assert codeflash_output is None

def test_user_message_with_list_content_and_empty_texts(guardrail):
    """Test user message with content as list of dicts with empty/None text."""
    messages = [
        {"role": "user", "content": [{"text": ""}, {"text": None}, {"text": "foo"}]}
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.83μs -> 3.54μs (8.20% faster)
    assert codeflash_output == "foo"

def test_user_message_with_content_list_missing_text_key(guardrail):
    """Test user message with content as list of dicts missing 'text' key."""
    messages = [
        {"role": "user", "content": [{"foo": "bar"}, {"text": "baz"}]}
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.75μs -> 3.40μs (10.6% faster)
    assert codeflash_output == "baz"


def test_user_message_with_content_list_all_empty(guardrail):
    """Test user message with content as list, all empty/None text."""
    messages = [
        {"role": "user", "content": [{"text": ""}, {"text": None}]}
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 4.82μs -> 4.44μs (8.66% faster)
    assert codeflash_output is None

# 2. Edge Test Cases

def test_empty_messages_list(guardrail):
    """Test empty messages list returns None."""
    codeflash_output = guardrail.get_user_prompt([]) # 2.06μs -> 510ns (304% faster)
    assert codeflash_output is None

def test_messages_with_only_non_user_roles(guardrail):
    """Test messages with only system/assistant roles returns None."""
    messages = [
        {"role": "system", "content": "You are helpful."},
        {"role": "assistant", "content": "Hello!"},
        {"role": "system", "content": "Another system message."}
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.50μs -> 1.62μs (54.7% faster)
    assert codeflash_output is None

def test_user_message_with_whitespace_content(guardrail):
    """Test user message with whitespace content returns None (after strip)."""
    messages = [{"role": "user", "content": "   \n   "}]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.80μs -> 3.48μs (9.28% faster)
    assert codeflash_output is None

def test_multiple_consecutive_user_messages_with_varied_content(guardrail):
    """Test consecutive user messages, some with empty/None/whitespace content."""
    messages = [
        {"role": "assistant", "content": "Hi!"},
        {"role": "user", "content": ""},
        {"role": "user", "content": None},
        {"role": "user", "content": "  "},
        {"role": "user", "content": "Final user message."}
    ]
    # Only "Final user message." should be returned
    codeflash_output = guardrail.get_user_prompt(messages) # 5.22μs -> 4.28μs (22.0% faster)
    assert codeflash_output == "Final user message."


def test_user_message_with_content_as_list_of_empty_dicts(guardrail):
    """Test user message with content as list of empty dicts."""
    messages = [
        {"role": "user", "content": [{}, {}]}
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 4.72μs -> 4.43μs (6.45% faster)
    assert codeflash_output is None



def test_message_with_missing_role_key(guardrail):
    """Test message dict missing 'role' key is ignored."""
    messages = [
        {"role": "assistant", "content": "Hi!"},
        {"content": "Should be ignored."},
        {"role": "user", "content": "Valid user message."},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 4.95μs -> 4.19μs (18.1% faster)
    assert codeflash_output == "Valid user message."

def test_message_with_missing_content_key(guardrail):
    """Test message dict missing 'content' key is treated as None."""
    messages = [
        {"role": "user"},
        {"role": "user", "content": "Hello!"},
    ]
    # Only "Hello!" should be included
    codeflash_output = guardrail.get_user_prompt(messages) # 4.62μs -> 4.07μs (13.6% faster)
    assert codeflash_output == "Hello!"

def test_message_with_content_none_and_other_valid_user(guardrail):
    """Test user message with None content, followed by valid user message."""
    messages = [
        {"role": "user", "content": None},
        {"role": "user", "content": "Next user message."}
    ]
    # Only "Next user message." should be returned
    codeflash_output = guardrail.get_user_prompt(messages) # 4.19μs -> 3.74μs (11.9% faster)
    assert codeflash_output == "Next user message."

# 3. Large Scale Test Cases

def test_large_number_of_messages(guardrail):
    """Test function with a large number of messages (500), last 10 are user messages."""
    messages = []
    # Add 490 alternating system/assistant messages
    for i in range(245):
        messages.append({"role": "system", "content": f"System {i}"})
        messages.append({"role": "assistant", "content": f"Assistant {i}"})
    # Add 10 user messages at the end
    for i in range(10):
        messages.append({"role": "user", "content": f"User message {i}"})
    expected = "\n".join(f"User message {i}" for i in range(10))
    codeflash_output = guardrail.get_user_prompt(messages) # 6.93μs -> 5.71μs (21.4% faster)
    assert codeflash_output == expected

def test_large_consecutive_user_messages_with_varied_content(guardrail):
    """Test function with 100 consecutive user messages with mixed content types."""
    messages = []
    # Add 100 user messages with alternating content types
    for i in range(100):
        if i % 3 == 0:
            content = f"Text {i}"
        elif i % 3 == 1:
            content = [{"text": f"ListText {i}"}]
        else:
            content = ""
        messages.append({"role": "user", "content": content})
    # Only non-empty, non-blank messages should be included
    expected_lines = []
    for i in range(100):
        if i % 3 == 0:
            expected_lines.append(f"Text {i}")
        elif i % 3 == 1:
            expected_lines.append(f"ListText {i}")
        # skip empty string
    expected = "\n".join(expected_lines)
    codeflash_output = guardrail.get_user_prompt(messages) # 29.6μs -> 22.9μs (29.4% faster)
    assert codeflash_output == expected

def test_large_messages_with_no_user_at_end(guardrail):
    """Test large messages list where last messages are not user role."""
    messages = []
    for i in range(300):
        messages.append({"role": "user", "content": f"User {i}"})
    # Add 10 assistant messages at the end
    for i in range(10):
        messages.append({"role": "assistant", "content": f"Assistant {i}"})
    codeflash_output = guardrail.get_user_prompt(messages) # 2.53μs -> 1.56μs (62.6% faster)
    assert codeflash_output is None

def test_large_messages_with_sparse_user_blocks(guardrail):
    """Test large message list with user blocks separated by non-user messages."""
    messages = []
    # Block 1: 5 user messages
    for i in range(5):
        messages.append({"role": "user", "content": f"Block1-{i}"})
    # 10 assistant messages
    for i in range(10):
        messages.append({"role": "assistant", "content": f"Assistant {i}"})
    # Block 2: 7 user messages
    for i in range(7):
        messages.append({"role": "user", "content": f"Block2-{i}"})
    # 10 system messages
    for i in range(10):
        messages.append({"role": "system", "content": f"System {i}"})
    # Block 3: 3 user messages (should be returned)
    for i in range(3):
        messages.append({"role": "user", "content": f"Block3-{i}"})
    expected = "\n".join(f"Block3-{i}" for i in range(3))
    codeflash_output = guardrail.get_user_prompt(messages) # 4.66μs -> 3.54μs (31.6% faster)
    assert codeflash_output == expected

def test_large_user_messages_with_content_list(guardrail):
    """Test large number of user messages with content as list of dicts."""
    messages = []
    for i in range(50):
        content = [{"text": f"TextA-{i}"}, {"text": f"TextB-{i}"}]
        messages.append({"role": "user", "content": content})
    expected = "\n".join(f"TextA-{i}TextB-{i}" for i in range(50))
    codeflash_output = guardrail.get_user_prompt(messages) # 22.9μs -> 19.4μs (18.5% faster)
    assert codeflash_output == expected

def test_large_user_messages_with_mixed_valid_and_invalid_content(guardrail):
    """Test large number of user messages with some invalid content."""
    messages = []
    for i in range(25):
        if i % 2 == 0:
            messages.append({"role": "user", "content": f"Valid {i}"})
        else:
            messages.append({"role": "user", "content": None})
    expected = "\n".join(f"Valid {i}" for i in range(0, 25, 2))
    codeflash_output = guardrail.get_user_prompt(messages) # 9.57μs -> 7.61μs (25.7% faster)
    assert codeflash_output == expected

def test_large_user_messages_all_empty_content(guardrail):
    """Test large number of user messages all with empty content."""
    messages = [{"role": "user", "content": ""} for _ in range(100)]
    codeflash_output = guardrail.get_user_prompt(messages) # 19.0μs -> 19.2μs (1.42% slower)
    assert codeflash_output is None

def test_large_user_messages_with_whitespace_content(guardrail):
    """Test large number of user messages all with whitespace content."""
    messages = [{"role": "user", "content": "   "} for _ in range(100)]
    codeflash_output = guardrail.get_user_prompt(messages) # 28.3μs -> 19.5μs (45.1% faster)
    assert codeflash_output is None
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import List, Optional, Union

# imports
import pytest  # used for our unit tests
from litellm.proxy.guardrails.guardrail_hooks.openai.base import \
    OpenAIGuardrailBase

# unit tests

@pytest.fixture
def guardrail():
    # Fixture to create an instance of OpenAIGuardrailBase for reuse
    return OpenAIGuardrailBase()

# --- Basic Test Cases ---

def test_single_user_message_string_content(guardrail):
    # Simple case: one user message, content is a string
    messages = [{"role": "user", "content": "Hello, how are you?"}]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.92μs -> 3.54μs (10.6% faster)
    assert codeflash_output == "Hello, how are you?"

def test_single_user_message_list_content(guardrail):
    # One user message, content is a list of dicts with 'text'
    messages = [{"role": "user", "content": [{"text": "Hello"}, {"text": ", world!"}]}]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.90μs -> 3.75μs (3.97% faster)
    assert codeflash_output == "Hello, world!"

def test_multiple_messages_last_user_only(guardrail):
    # Last message is user, previous is assistant
    messages = [
        {"role": "user", "content": "First"},
        {"role": "assistant", "content": "Hi!"},
        {"role": "user", "content": "Second"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 3.70μs -> 3.09μs (19.8% faster)
    assert codeflash_output == "Second"

def test_multiple_consecutive_user_messages(guardrail):
    # Last two messages are user, should be concatenated
    messages = [
        {"role": "assistant", "content": "Hi!"},
        {"role": "user", "content": "First"},
        {"role": "user", "content": "Second"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 4.21μs -> 3.74μs (12.4% faster)
    assert codeflash_output == "First\nSecond"

def test_no_user_messages(guardrail):
    # No user messages present
    messages = [
        {"role": "assistant", "content": "Hi!"},
        {"role": "system", "content": "Welcome"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.33μs -> 1.48μs (57.0% faster)
    assert codeflash_output is None

def test_empty_messages_list(guardrail):
    # Empty input
    messages = []
    codeflash_output = guardrail.get_user_prompt(messages) # 1.85μs -> 458ns (304% faster)
    assert codeflash_output is None

# --- Edge Test Cases ---

def test_user_message_with_empty_string_content(guardrail):
    # User message with empty string content
    messages = [
        {"role": "user", "content": ""},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.35μs -> 1.64μs (43.4% faster)
    assert codeflash_output is None

def test_user_message_with_none_content(guardrail):
    # User message with None content
    messages = [
        {"role": "user", "content": None},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.24μs -> 1.55μs (44.3% faster)
    assert codeflash_output is None

def test_user_message_with_empty_list_content(guardrail):
    # User message with empty list content
    messages = [
        {"role": "user", "content": []},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.24μs -> 1.37μs (63.2% faster)
    assert codeflash_output is None

def test_user_message_with_list_content_missing_text_key(guardrail):
    # List content dict missing 'text' key
    messages = [
        {"role": "user", "content": [{"not_text": "abc"}]},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.23μs -> 1.46μs (52.0% faster)
    assert codeflash_output is None

def test_user_message_with_list_content_some_missing_text_key(guardrail):
    # Mixed list, some dicts missing 'text'
    messages = [
        {"role": "user", "content": [{"text": "A"}, {"not_text": "B"}, {"text": "C"}]},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.27μs -> 1.50μs (50.9% faster)
    assert codeflash_output is None

def test_user_message_with_whitespace_content(guardrail):
    # User message with whitespace-only content
    messages = [
        {"role": "user", "content": "   "},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.37μs -> 1.55μs (53.0% faster)
    assert codeflash_output is None

def test_user_message_with_mixed_content_types(guardrail):
    # User message with content as int, float, dict, etc.
    messages = [
        {"role": "user", "content": 123},
        {"role": "user", "content": 4.56},
        {"role": "user", "content": {"text": "abc"}},
        {"role": "assistant", "content": "Hi!"},
    ]
    # The final message is from the assistant, so there is no trailing user
    # block; none of the contents are valid strings or lists either way.
    codeflash_output = guardrail.get_user_prompt(messages) # 2.25μs -> 1.42μs (58.8% faster)
    assert codeflash_output is None

def test_user_message_with_content_list_and_empty_texts(guardrail):
    # List content with empty 'text' values
    messages = [
        {"role": "user", "content": [{"text": ""}, {"text": None}, {"text": "X"}]},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.34μs -> 1.48μs (58.5% faster)
    assert codeflash_output is None

def test_non_user_message_at_end(guardrail):
    # Last message is not user, should not include previous user messages
    messages = [
        {"role": "user", "content": "First"},
        {"role": "user", "content": "Second"},
        {"role": "assistant", "content": "Hi!"},
    ]
    codeflash_output = guardrail.get_user_prompt(messages) # 2.24μs -> 1.40μs (59.4% faster)
    assert codeflash_output is None

def test_consecutive_user_messages_with_varied_content(guardrail):
    # Several user messages, some with string, some with list, some empty
    messages = [
        {"role": "user", "content": "First"},
        {"role": "user", "content": [{"text": "Second"}, {"text": "Third"}]},
        {"role": "user", "content": ""},
        {"role": "assistant", "content": "Hi!"},
    ]
    # The final message is from the assistant, so there is no trailing user block
    codeflash_output = guardrail.get_user_prompt(messages) # 2.26μs -> 1.43μs (57.4% faster)
    assert codeflash_output is None

def test_consecutive_user_messages_with_mixed_valid_and_invalid_content(guardrail):
    # Last two user messages, one valid, one invalid
    messages = [
        {"role": "assistant", "content": "Hi!"},
        {"role": "user", "content": ""},
        {"role": "user", "content": "Second"},
    ]
    # Only the last two user messages are considered, but one is empty
    codeflash_output = guardrail.get_user_prompt(messages) # 4.50μs -> 3.84μs (17.3% faster)
    assert codeflash_output == "Second"

# --- Large Scale Test Cases ---

def test_large_number_of_messages_last_user_block(guardrail):
    # 1000 messages, last 3 are user messages
    messages = []
    for i in range(997):
        messages.append({"role": "assistant", "content": f"A{i}"})
    messages.extend([
        {"role": "user", "content": "User998"},
        {"role": "user", "content": "User999"},
        {"role": "user", "content": "User1000"},
    ])
    expected = "User998\nUser999\nUser1000"
    codeflash_output = guardrail.get_user_prompt(messages) # 5.64μs -> 4.51μs (25.0% faster)
    assert codeflash_output == expected

def test_large_number_of_messages_no_user_at_end(guardrail):
    # 1000 messages, last is assistant
    messages = []
    for i in range(999):
        messages.append({"role": "user", "content": f"User{i}"})
    messages.append({"role": "assistant", "content": "Final"})
    codeflash_output = guardrail.get_user_prompt(messages) # 2.65μs -> 1.58μs (67.3% faster)
    assert codeflash_output is None

def test_large_number_of_consecutive_user_messages(guardrail):
    # 500 consecutive user messages at the end
    messages = []
    for i in range(500):
        messages.append({"role": "assistant", "content": f"A{i}"})
    for i in range(500):
        messages.append({"role": "user", "content": f"User{i}"})
    expected = "\n".join([f"User{i}" for i in range(500)])
    codeflash_output = guardrail.get_user_prompt(messages) # 122μs -> 83.1μs (47.7% faster)
    assert codeflash_output == expected

def test_large_number_of_user_messages_with_list_content(guardrail):
    # 100 user messages, each with list content
    messages = []
    for i in range(900):
        messages.append({"role": "assistant", "content": f"A{i}"})
    for i in range(100):
        messages.append({"role": "user", "content": [{"text": f"U{i}A"}, {"text": f"U{i}B"}]})
    expected = "\n".join([f"U{i}AU{i}B" for i in range(100)])
    codeflash_output = guardrail.get_user_prompt(messages) # 40.0μs -> 35.8μs (11.8% faster)
    assert codeflash_output == expected

def test_large_number_of_user_messages_with_empty_and_valid_content(guardrail):
    # 100 user messages, half empty, half valid
    messages = []
    for i in range(900):
        messages.append({"role": "assistant", "content": f"A{i}"})
    for i in range(50):
        messages.append({"role": "user", "content": ""})
    for i in range(50):
        messages.append({"role": "user", "content": f"User{i}"})
    expected = "\n".join([f"User{i}" for i in range(50)])
    codeflash_output = guardrail.get_user_prompt(messages) # 25.1μs -> 20.6μs (22.2% faster)
    assert codeflash_output == expected

def test_large_number_of_user_messages_all_empty(guardrail):
    # 100 user messages, all empty
    messages = []
    for i in range(900):
        messages.append({"role": "assistant", "content": f"A{i}"})
    for i in range(100):
        messages.append({"role": "user", "content": ""})
    codeflash_output = guardrail.get_user_prompt(messages) # 19.6μs -> 20.3μs (3.52% slower)
    assert codeflash_output is None
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-OpenAIGuardrailBase.get_user_prompt-mhdd7ac6` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 30, 2025 11:52
@codeflash-ai codeflash-ai bot added ⚡️ codeflash Optimization PR opened by Codeflash AI 🎯 Quality: High Optimization Quality according to Codeflash labels Oct 30, 2025