⚡️ Speed up function `_get_response_headers` by 12% #166

codeflash-ai · 2025-10-30T15:31:09Z

📄 12% (0.12x) speedup for `_get_response_headers` in `litellm/litellm_core_utils/exception_mapping_utils.py`

⏱️ Runtime : 45.6 microseconds → 40.5 microseconds (best of 72 runs)
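
The percentages in this PR appear to be computed relative to the new runtime, i.e. (old - new) / new. This is inferred from the numbers reported here, not from Codeflash documentation, and the helper name `relative_speedup` is hypothetical. Under that assumption the overall figure works out to about 12.6%, which the title reports as 12%.

```python
def relative_speedup(before: float, after: float) -> float:
    """Return (before - after) / after; a negative value means the new code is slower."""
    return (before - after) / after


# Overall runtime from this PR, in microseconds.
print(f"{relative_speedup(45.6, 40.5):.1%}")  # 12.6%

# Two per-test timings from the generated tests, in nanoseconds.
print(f"{relative_speedup(1410, 954):.1%}")   # 47.8%  (1.41μs -> 954ns)
print(f"{relative_speedup(720, 834):.1%}")    # -13.7% (720ns -> 834ns, i.e. slower)
```

Note that the per-test timings are rounded for display, so recomputing from them can differ from the reported percentage in the last digit.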

📝 Explanation and details

The optimized code achieves a 12% speedup through strategic restructuring and early returns that reduce unnecessary attribute lookups and assignments.

Key optimizations:

1. **Early return pattern**: Instead of always assigning to `_response_headers` and checking it later, the code now returns immediately when `error_response.headers` is found, eliminating subsequent unnecessary checks and assignments.
2. **Reduced redundant operations**: The original code performed multiple conditional checks on `_response_headers` even after finding headers. The optimized version uses early returns to skip these redundant operations.
3. **Streamlined control flow**: By restructuring the nested conditions and eliminating the intermediate variable assignment for `error_response.headers`, the code path is more direct when headers are found in the response object.
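
The restructuring described above might look roughly like the following. This is a sketch reconstructed from the description and from the lookup order the generated tests exercise (`headers`, then `response.headers`, then `litellm_response_headers`); it is not copied from the litellm source. The function names `get_headers_before` and `get_headers_after` are hypothetical, and the `try`/`except` is an assumption based on the tests that call the function on objects whose properties raise.

```python
from types import SimpleNamespace
from typing import Any, Optional


def get_headers_before(exc: Any) -> Optional[Any]:
    """Assumed pre-optimization shape: assign first, re-check the variable later."""
    headers = None
    try:
        headers = getattr(exc, "headers", None)
        response = getattr(exc, "response", None)
        if not headers and response:
            headers = getattr(response, "headers", None)
        if not headers:
            headers = getattr(exc, "litellm_response_headers", None)
    except Exception:
        return None
    return headers


def get_headers_after(exc: Any) -> Optional[Any]:
    """Assumed post-optimization shape: return as soon as headers are found."""
    try:
        headers = getattr(exc, "headers", None)
        if headers:
            return headers
        response = getattr(exc, "response", None)
        if response:
            headers = getattr(response, "headers", None)
            if headers:
                return headers
        return getattr(exc, "litellm_response_headers", None)
    except Exception:
        return None


# Headers live on the response object: both shapes return the same mapping.
exc = SimpleNamespace(headers=None, response=SimpleNamespace(headers={"retry-after": "1"}))
assert get_headers_before(exc) == get_headers_after(exc) == {"retry-after": "1"}
```

In this sketch the early return also means `response` is never read once `headers` is truthy, so a `response` property that raises would no longer turn the result into `None`. Whether the real change shares that edge case is not shown in this PR.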

Performance impact by test case type:

- **Best improvements (40-70% faster)**: Cases where headers are found in `response.headers` benefit most from the early return pattern, avoiding the final `litellm_response_headers` lookup.
- **Moderate improvements (20-35% faster)**: Cases with direct `headers` attributes benefit from reduced variable assignments.
- **Minimal impact**: Cases requiring fallback to `litellm_response_headers` see smaller gains since they still need to traverse the full logic path.

The optimization maintains identical functionality while reducing the average number of operations per call, particularly benefiting the common case where headers are found in the response object.
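
To see how the lookup path affects timing on a given machine, a small `timeit` harness along these lines could be used. It is a sketch only: `lookup_headers` is a hypothetical stand-in with the early-return shape, not the litellm function, and `CASES` and `time_cases` are illustrative names. At sub-microsecond scale the numbers are noisy, so differences of a few percent should be read with caution.

```python
import timeit
from types import SimpleNamespace
from typing import Any, Callable, Dict, Optional


def lookup_headers(exc: Any) -> Optional[Any]:
    """Hypothetical stand-in: headers, then response.headers, then the fallback."""
    headers = getattr(exc, "headers", None)
    if headers:
        return headers
    response = getattr(exc, "response", None)
    if response:
        headers = getattr(response, "headers", None)
        if headers:
            return headers
    return getattr(exc, "litellm_response_headers", None)


# One object per case type named in the breakdown above.
CASES: Dict[str, Any] = {
    "direct headers": SimpleNamespace(headers={"x": "1"}),
    "response.headers": SimpleNamespace(
        headers=None, response=SimpleNamespace(headers={"x": "1"})
    ),
    "litellm_response_headers": SimpleNamespace(
        headers=None, response=None, litellm_response_headers={"x": "1"}
    ),
    "no headers": SimpleNamespace(),
}


def time_cases(
    fn: Callable[[Any], Any], cases: Dict[str, Any], number: int = 100_000
) -> Dict[str, float]:
    """Return the best-of-5 time per call, in nanoseconds, for each case."""
    results: Dict[str, float] = {}
    for name, exc in cases.items():
        best = min(timeit.repeat(lambda: fn(exc), number=number, repeat=5))
        results[name] = best / number * 1e9
    return results


if __name__ == "__main__":
    for name, ns in time_cases(lookup_headers, CASES).items():
        print(f"{name:28s} {ns:8.1f} ns/call")
```

Passing a second implementation to `time_cases` with the same `CASES` gives a per-case comparison similar in spirit to the annotations in the generated tests.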

✅ Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 37 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

from typing import Optional

import httpx
# imports
import pytest  # used for our unit tests
from litellm.litellm_core_utils.exception_mapping_utils import \
    _get_response_headers

# unit tests

# ----------- BASIC TEST CASES -----------

def test_headers_direct_attribute():
    # Test: Exception has a 'headers' attribute directly
    headers = httpx.Headers({"X-Test": "value"})
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = headers
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.53μs -> 1.25μs (22.5% faster)

def test_headers_from_response_attribute():
    # Test: Exception has a 'response' attribute with 'headers'
    headers = httpx.Headers({"X-Response": "response-value"})
    class Response:
        pass
    response = Response()
    response.headers = headers
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.response = response
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.41μs -> 954ns (47.8% faster)

def test_headers_from_litellm_response_headers():
    # Test: Exception has 'litellm_response_headers' attribute
    headers = httpx.Headers({"X-Litellm": "litellm-value"})
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.litellm_response_headers = headers
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.06μs -> 1.03μs (2.62% faster)

def test_no_headers_anywhere_returns_none():
    # Test: Exception has no relevant headers attributes
    class MyException(Exception):
        pass
    exc = MyException("error")
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.09μs -> 1.16μs (6.12% slower)

# ----------- EDGE TEST CASES -----------

def test_headers_is_none_and_response_headers_present():
    # Test: 'headers' is None, but 'response.headers' is present
    headers = httpx.Headers({"X-Edge": "edge-value"})
    class Response:
        pass
    response = Response()
    response.headers = headers
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = None
    exc.response = response
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.16μs -> 755ns (53.6% faster)

def test_response_is_none_and_litellm_response_headers_present():
    # Test: 'response' is None, but 'litellm_response_headers' is present
    headers = httpx.Headers({"X-Litellm": "litellm-value"})
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.response = None
    exc.litellm_response_headers = headers
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 973ns -> 936ns (3.95% faster)

def test_all_headers_none_returns_none():
    # Test: All possible header attributes are None
    class Response:
        pass
    response = Response()
    response.headers = None
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = None
    exc.response = response
    exc.litellm_response_headers = None
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 873ns -> 774ns (12.8% faster)

def test_headers_is_not_httpx_headers_type():
    # Test: 'headers' is present but not an httpx.Headers object
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = {"X-Test": "not-httpx-headers"}
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 865ns -> 571ns (51.5% faster)

def test_response_attribute_missing():
    # Test: Exception does not have a 'response' attribute at all
    headers = httpx.Headers({"X-Test": "value"})
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = None
    exc.litellm_response_headers = headers
    # No 'response' attribute
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 960ns -> 922ns (4.12% faster)

def test_headers_attribute_raises_exception():
    # Test: Accessing 'headers' raises an exception
    class MyException(Exception):
        @property
        def headers(self):
            raise RuntimeError("Access denied")
    exc = MyException("error")
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.65μs -> 1.75μs (5.83% slower)

def test_response_headers_raises_exception():
    # Test: Accessing 'response.headers' raises an exception
    class Response:
        @property
        def headers(self):
            raise RuntimeError("Access denied")
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = None
    exc.response = Response()
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.56μs -> 1.69μs (7.92% slower)

# ----------- LARGE SCALE TEST CASES -----------

def test_large_number_of_headers_direct():
    # Test: Exception has a large number of headers directly
    large_headers_dict = {f"X-Header-{i}": f"value-{i}" for i in range(1000)}
    headers = httpx.Headers(large_headers_dict)
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = headers
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.60μs -> 1.30μs (23.1% faster)

def test_large_number_of_headers_in_response():
    # Test: Exception has a large number of headers in response
    large_headers_dict = {f"X-Response-{i}": f"response-{i}" for i in range(1000)}
    headers = httpx.Headers(large_headers_dict)
    class Response:
        pass
    response = Response()
    response.headers = headers
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = None
    exc.response = response
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.39μs -> 823ns (68.5% faster)

def test_large_number_of_headers_in_litellm_response_headers():
    # Test: Exception has a large number of headers in litellm_response_headers
    large_headers_dict = {f"X-Litellm-{i}": f"litellm-{i}" for i in range(1000)}
    headers = httpx.Headers(large_headers_dict)
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = None
    exc.response = None
    exc.litellm_response_headers = headers
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 815ns -> 748ns (8.96% faster)

def test_performance_large_headers(monkeypatch):
    # Test: Performance with large headers, ensure no crash and reasonable speed
    import time
    large_headers_dict = {f"X-Perf-{i}": f"perf-{i}" for i in range(1000)}
    headers = httpx.Headers(large_headers_dict)
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = headers
    start = time.time()
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.52μs -> 1.20μs (26.6% faster)
    end = time.time()

# ----------- MISCELLANEOUS CASES -----------

def test_headers_is_empty_httpx_headers():
    # Test: headers is an empty httpx.Headers object
    headers = httpx.Headers({})
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = headers
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.50μs -> 1.61μs (6.96% slower)

def test_headers_is_empty_dict():
    # Test: headers is an empty dict
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = {}
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 1.01μs -> 1.14μs (11.8% slower)

def test_headers_is_string():
    # Test: headers is a string (unusual, but possible)
    class MyException(Exception):
        pass
    exc = MyException("error")
    exc.headers = "not-a-header-object"
    codeflash_output = _get_response_headers(exc); result = codeflash_output # 867ns -> 638ns (35.9% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from typing import Optional

import httpx
# imports
import pytest
from litellm.litellm_core_utils.exception_mapping_utils import \
    _get_response_headers

# unit tests

# 1. Basic Test Cases

def test_headers_directly_on_exception():
    # Exception has 'headers' attribute directly
    headers = httpx.Headers({"X-Test": "foo", "Y-Test": "bar"})
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = headers
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.67μs -> 1.22μs (36.4% faster)

def test_headers_on_response_of_exception():
    # Exception has 'response' attribute with 'headers'
    headers = httpx.Headers({"A": "1"})
    class FakeResponse:
        pass
    resp = FakeResponse()
    resp.headers = headers
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.response = resp
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.43μs -> 1.02μs (40.0% faster)

def test_litellm_response_headers_fallback():
    # Exception has 'litellm_response_headers' attribute only
    headers = httpx.Headers({"Z": "last"})
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.litellm_response_headers = headers
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.03μs -> 1.06μs (2.82% slower)

def test_no_headers_anywhere_returns_none():
    # Exception has no headers anywhere
    class MyException(Exception):
        pass
    ex = MyException("error")
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.04μs -> 1.09μs (4.66% slower)

# 2. Edge Test Cases

def test_headers_is_none_on_exception_and_response_and_litellm():
    # All header attributes are present but set to None
    class FakeResponse:
        pass
    resp = FakeResponse()
    resp.headers = None
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = None
    ex.response = resp
    ex.litellm_response_headers = None
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 872ns -> 833ns (4.68% faster)

def test_headers_on_response_but_not_on_exception():
    # Exception.headers is None, but response.headers exists
    headers = httpx.Headers({"foo": "bar"})
    class FakeResponse:
        pass
    resp = FakeResponse()
    resp.headers = headers
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = None
    ex.response = resp
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.16μs -> 786ns (47.6% faster)

def test_headers_on_litellm_only():
    # Only litellm_response_headers is set
    headers = httpx.Headers({"L": "LITELLM"})
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.litellm_response_headers = headers
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.08μs -> 1.07μs (1.03% faster)

def test_headers_on_exception_and_response_both():
    # Both exception.headers and response.headers are present, should prefer exception.headers
    headers1 = httpx.Headers({"A": "B"})
    headers2 = httpx.Headers({"C": "D"})
    class FakeResponse:
        pass
    resp = FakeResponse()
    resp.headers = headers2
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = headers1
    ex.response = resp
    ex.litellm_response_headers = httpx.Headers({"E": "F"})
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.13μs -> 1.05μs (7.54% faster)

def test_headers_on_response_and_litellm_only():
    # exception.headers is None, response.headers is None, litellm_response_headers set
    headers = httpx.Headers({"Q": "R"})
    class FakeResponse:
        pass
    resp = FakeResponse()
    resp.headers = None
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = None
    ex.response = resp
    ex.litellm_response_headers = headers
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 851ns -> 743ns (14.5% faster)

def test_exception_without_any_attributes():
    # Exception has no extra attributes
    ex = Exception("plain")
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 720ns -> 834ns (13.7% slower)

def test_non_exception_input_returns_none():
    # Passing a non-exception object (should still work, as getattr is used)
    class Dummy:
        pass
    dummy = Dummy()
    codeflash_output = _get_response_headers(dummy); result = codeflash_output # 1.08μs -> 988ns (9.11% faster)

def test_headers_attribute_raises_exception():
    # headers attribute raises exception on access
    class MyException(Exception):
        @property
        def headers(self):
            raise ValueError("fail")
    ex = MyException("error")
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.66μs -> 1.58μs (4.86% faster)

def test_response_attribute_raises_exception():
    # response attribute raises exception on access
    class MyException(Exception):
        @property
        def headers(self):
            return None
        @property
        def response(self):
            raise ValueError("fail")
    ex = MyException("error")
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.69μs -> 1.59μs (6.42% faster)

def test_litellm_attribute_raises_exception():
    # litellm_response_headers attribute raises exception on access
    class MyException(Exception):
        @property
        def headers(self):
            return None
        @property
        def response(self):
            return None
        @property
        def litellm_response_headers(self):
            raise ValueError("fail")
    ex = MyException("error")
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.87μs -> 1.75μs (6.80% faster)

# 3. Large Scale Test Cases

def test_large_number_of_headers_direct():
    # Exception.headers contains a large number of headers
    headers_dict = {f"X-Key-{i}": str(i) for i in range(1000)}
    headers = httpx.Headers(headers_dict)
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = headers
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.58μs -> 1.28μs (24.2% faster)
    for i in range(0, 1000, 100):  # spot check some keys
        assert result[f"X-Key-{i}"] == str(i)

def test_large_number_of_headers_on_response():
    # Exception.response.headers contains a large number of headers
    headers_dict = {f"Y-Key-{i}": str(i) for i in range(1000)}
    headers = httpx.Headers(headers_dict)
    class FakeResponse:
        pass
    resp = FakeResponse()
    resp.headers = headers
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = None
    ex.response = resp
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.40μs -> 814ns (72.2% faster)
    for i in range(0, 1000, 100):  # spot check some keys
        assert result[f"Y-Key-{i}"] == str(i)

def test_large_number_of_headers_on_litellm():
    # Exception.litellm_response_headers contains a large number of headers
    headers_dict = {f"Z-Key-{i}": str(i) for i in range(1000)}
    headers = httpx.Headers(headers_dict)
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.litellm_response_headers = headers
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.14μs -> 1.17μs (2.91% slower)
    for i in range(0, 1000, 100):  # spot check some keys
        assert result[f"Z-Key-{i}"] == str(i)

def test_large_scale_no_headers_anywhere():
    # Exception with many unrelated attributes, but no headers
    class MyException(Exception):
        pass
    ex = MyException("error")
    # Add many unrelated attributes
    for i in range(1000):
        setattr(ex, f"attr_{i}", i)
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 1.45μs -> 1.48μs (2.17% slower)

def test_large_scale_headers_and_fallback():
    # Exception with headers=None, response.headers=None, then litellm_response_headers set
    class FakeResponse:
        pass
    resp = FakeResponse()
    resp.headers = None
    class MyException(Exception):
        pass
    ex = MyException("error")
    ex.headers = None
    ex.response = resp
    headers_dict = {f"L-Key-{i}": str(i) for i in range(1000)}
    headers = httpx.Headers(headers_dict)
    ex.litellm_response_headers = headers
    codeflash_output = _get_response_headers(ex); result = codeflash_output # 926ns -> 931ns (0.537% slower)
    for i in range(0, 1000, 200):  # spot check some keys
        assert result[f"L-Key-{i}"] == str(i)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
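
The comment above says the captured output is compared between the original and optimized code. How Codeflash performs that comparison is not described in this PR; the following is a generic sketch of the idea, with hypothetical helper names (`capture`, `assert_same_behavior`) and toy functions standing in for the two implementations.

```python
from typing import Any, Callable, Iterable, Tuple


def capture(fn: Callable[[Any], Any], arg: Any) -> Tuple[str, Any]:
    """Run fn(arg) and record either its return value or the exception type."""
    try:
        return ("returned", fn(arg))
    except Exception as err:
        return ("raised", type(err))


def assert_same_behavior(
    original: Callable[[Any], Any],
    optimized: Callable[[Any], Any],
    inputs: Iterable[Any],
) -> None:
    """Fail if the two implementations differ on any input."""
    for value in inputs:
        expected = capture(original, value)
        actual = capture(optimized, value)
        assert expected == actual, f"mismatch for {value!r}: {expected} != {actual}"


# Toy stand-ins: a loop and a closed-form version of the same computation.
def triangular_loop(n: int) -> int:
    return sum(range(n + 1))


def triangular_formula(n: int) -> int:
    return n * (n + 1) // 2


assert_same_behavior(triangular_loop, triangular_formula, range(0, 11))
```

Recording the exception type as well as the return value means a refactor that changes which inputs raise is caught, not only one that changes results.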

To edit these changes, run `git checkout codeflash/optimize-_get_response_headers-mhdl0gzc` and push.


codeflash-ai bot requested a review from mashraf-222 October 30, 2025 15:31

codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: High" (Optimization Quality according to Codeflash) labels Oct 30, 2025
