@codeflash-ai codeflash-ai bot commented Oct 29, 2025

📄 7% (0.07x) speedup for URL.include_query_params in starlette/datastructures.py

⏱️ Runtime: 9.61 milliseconds → 8.97 milliseconds (best of 186 runs)

📝 Explanation and details

The optimized code achieves a 7% speedup through several key performance improvements:

Multi-dict operations optimization: The most significant bottleneck was MultiDict.update(), which filtered existing items with an expensive list comprehension. The optimization replaces the per-item check k not in value.keys() (43% of the time) with a lookup in a precomputed set(value._dict), giving O(1) membership testing instead of O(n), and flattens all input arguments upfront to avoid repeated MultiDict instantiations.
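
The filtering idea described above can be illustrated with a minimal standalone sketch. It works on plain lists of pairs rather than Starlette's classes, and the name `merge_items` is hypothetical, chosen only for this example; it is not the code from the PR.

```python
from __future__ import annotations

from typing import Any


def merge_items(
    existing: list[tuple[str, Any]], updates: dict[str, Any]
) -> list[tuple[str, Any]]:
    """Drop existing pairs whose key is being overwritten, then append the updates."""
    # Hypothetical helper: build the set of overwritten keys once, so each
    # membership check inside the comprehension is O(1) on average.
    overwritten = set(updates)
    kept = [(key, value) for key, value in existing if key not in overwritten]
    return kept + list(updates.items())


merged = merge_items([("a", "1"), ("a", "2"), ("b", "3")], {"a": "9", "c": "4"})
print(merged)  # [('b', '3'), ('a', '9'), ('c', '4')]
```

Both existing `a` entries are removed because `a` is overwritten, which matches the replace-on-update behaviour that the generated tests with duplicate keys describe.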

Reduced object creation overhead: In ImmutableMultiDict.__init__, the original code created unnecessary intermediate ImmutableMultiDict instances when both args and kwargs were present. The optimized version processes everything as a flat list first, eliminating redundant object creation.
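
As a rough illustration of processing everything as a flat list first, the sketch below collects positional and keyword inputs into one list of pairs without building any intermediate multi-dict. The function name `flatten_items` and its accepted input shapes are assumptions made for this example, not Starlette's actual constructor logic.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping
from typing import Any


def flatten_items(
    *args: Mapping[str, Any] | Iterable[tuple[str, Any]], **kwargs: Any
) -> list[tuple[str, Any]]:
    """Collect positional and keyword inputs into one flat list of pairs."""
    items: list[tuple[str, Any]] = []
    for value in args:
        if isinstance(value, Mapping):
            # Mappings contribute their (key, value) pairs in insertion order.
            items.extend(value.items())
        else:
            # Anything else is assumed to already be an iterable of pairs.
            items.extend(value)
    # Keyword arguments are appended last, again without an intermediate object.
    items.extend(kwargs.items())
    return items


print(flatten_items({"a": "1"}, [("a", "2")], b="3"))
# [('a', '1'), ('a', '2'), ('b', '3')]
```

A single pass like this allocates one list regardless of how the inputs are split between args and kwargs.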

Micro-optimizations in hot paths:

  • multi_items() now returns self._list directly when it's already a list, avoiding the list() copy (50% time reduction in this method)
  • include_query_params() uses a generator expression instead of dict comprehension for parameter conversion, reducing intermediate dict creation
  • String operations in replace() use direct rfind() calls instead of rpartition() for slightly better performance
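
For readers unfamiliar with the method being optimized, here is a standard-library approximation of the behaviour the generated tests exercise: new values are converted to strings, they replace any existing entries with the same key, and the rest of the URL is preserved. The function `with_query_params` is a hypothetical stand-in written with `urllib.parse` only; it is not Starlette's implementation and may differ from it in edge cases.

```python
from __future__ import annotations

from typing import Any
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_query_params(url: str, **params: Any) -> str:
    """Return url with params added, overriding existing keys of the same name."""
    parts = urlsplit(url)
    # Convert values lazily with a generator, then materialise the pairs once.
    new_items = list((key, str(value)) for key, value in params.items())
    overwritten = {key for key, _ in new_items}
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in overwritten
    ]
    return urlunsplit(parts._replace(query=urlencode(kept + new_items)))


print(with_query_params("http://example.com/path?foo=bar&x=1", foo="baz", n=2))
# http://example.com/path?x=1&foo=baz&n=2
```

The fragment and path are carried through `urlsplit`/`urlunsplit` untouched, so a URL such as `http://example.com/path#section` keeps `#section` after the new query string.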

Test case performance gains: The optimizations are most effective for:

  • Large-scale operations (500+ parameters): 7-10% faster
  • Multi-parameter updates: 3-7% faster
  • URLs with existing query strings: 2-5% faster

The optimizations particularly benefit scenarios with many query parameters or frequent URL manipulations, making them ideal for web frameworks handling multiple request parameters.

Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 329 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | ✅ 7 Passed |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from __future__ import annotations

from collections.abc import Iterable, Iterator, KeysView, Mapping
from typing import Any, TypeVar, cast
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# imports
import pytest
from starlette.datastructures import URL

# unit tests for include_query_params

# ---------------- BASIC TEST CASES ----------------

def test_add_single_param_to_empty_query():
    # Adding a single param to a URL with no query string
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(foo="bar"); new_url = codeflash_output # 31.8μs -> 31.0μs (2.33% faster)

def test_add_multiple_params_to_empty_query():
    # Adding multiple params to a URL with no query string
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(a="1", b="2"); new_url = codeflash_output # 26.7μs -> 25.9μs (3.19% faster)

def test_add_param_to_existing_query():
    # Adding a param to a URL that already has a query string
    url = URL("http://example.com/path?foo=bar")
    codeflash_output = url.include_query_params(baz="qux"); new_url = codeflash_output # 34.5μs -> 33.7μs (2.36% faster)

def test_override_existing_param():
    # Overriding an existing param
    url = URL("http://example.com/path?foo=bar")
    codeflash_output = url.include_query_params(foo="baz"); new_url = codeflash_output # 26.2μs -> 25.6μs (2.30% faster)

def test_add_param_with_int_value():
    # Adding a param with an int value (should be converted to str)
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(num=123); new_url = codeflash_output # 23.1μs -> 23.0μs (0.574% faster)

def test_add_param_with_bool_value():
    # Adding a param with a bool value (should be converted to str)
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(flag=True); new_url = codeflash_output # 23.7μs -> 23.0μs (3.06% faster)

def test_add_param_with_special_chars():
    # Adding a param with special characters (should be URL-encoded)
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(q="hello world"); new_url = codeflash_output # 24.7μs -> 23.7μs (4.25% faster)

def test_add_param_with_blank_value():
    # Adding a param with a blank value
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(empty=""); new_url = codeflash_output # 22.4μs -> 21.9μs (2.50% faster)

def test_add_param_to_url_with_fragment():
    # Adding a param to a URL with a fragment
    url = URL("http://example.com/path#section")
    codeflash_output = url.include_query_params(foo="bar"); new_url = codeflash_output # 29.5μs -> 29.3μs (0.854% faster)

# ---------------- EDGE TEST CASES ----------------

def test_param_with_unicode_characters():
    # Unicode should be encoded in the query
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(name="Jöhn Dœ"); new_url = codeflash_output # 34.3μs -> 34.3μs (0.099% faster)

def test_override_multiple_existing_params():
    # Overriding multiple params at once
    url = URL("http://example.com/path?a=1&b=2")
    codeflash_output = url.include_query_params(a="x", b="y"); new_url = codeflash_output # 35.3μs -> 34.7μs (1.79% faster)

def test_preserve_blank_query_values():
    # Blank query values should be preserved
    url = URL("http://example.com/path?a=&b=2")
    codeflash_output = url.include_query_params(c="3"); new_url = codeflash_output # 35.8μs -> 34.6μs (3.61% faster)

def test_empty_kwargs_returns_same_url():
    # If no kwargs, the URL should not change
    url = URL("http://example.com/path?foo=bar")
    codeflash_output = url.include_query_params(); new_url = codeflash_output # 24.6μs -> 24.8μs (1.03% slower)


def test_param_value_is_none():
    # None values should be converted to 'None'
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(foo=None); new_url = codeflash_output # 32.2μs -> 30.7μs (4.75% faster)

def test_param_with_reserved_chars():
    # Reserved URL chars in param name/value should be encoded
    url = URL("http://example.com/path")
    codeflash_output = url.include_query_params(q="a&b=c"); new_url = codeflash_output # 34.6μs -> 33.6μs (2.84% faster)

def test_url_with_multiple_same_key_params():
    # Only the last value for a key is kept (since MultiDict replaces)
    url = URL("http://example.com/path?a=1&a=2")
    codeflash_output = url.include_query_params(a="3"); new_url = codeflash_output # 35.7μs -> 35.1μs (1.63% faster)

def test_url_with_no_path():
    # URL with no path, just domain
    url = URL("http://example.com")
    codeflash_output = url.include_query_params(foo="bar"); new_url = codeflash_output # 28.8μs -> 28.0μs (2.96% faster)

def test_url_with_port_and_params():
    # URL with port number and params
    url = URL("http://example.com:8080/path?x=1")
    codeflash_output = url.include_query_params(y="2"); new_url = codeflash_output # 34.1μs -> 32.1μs (6.20% faster)

def test_url_with_empty_query_string():
    # URL with empty query string (trailing '?')
    url = URL("http://example.com/path?")
    codeflash_output = url.include_query_params(foo="bar"); new_url = codeflash_output # 28.7μs -> 28.3μs (1.35% faster)

# ---------------- LARGE SCALE TEST CASES ----------------

def test_add_100_params_to_empty_query():
    # Adding 100 params to a URL with no query string
    url = URL("http://example.com/path")
    params = {f"key{i}": f"value{i}" for i in range(100)}
    codeflash_output = url.include_query_params(**params); new_url = codeflash_output # 175μs -> 161μs (8.39% faster)
    query = str(new_url).split("?", 1)[1]
    pairs = dict(pair.split("=") for pair in query.split("&"))
    # All keys/values should be present
    for i in range(100):
        pass

def test_override_100_existing_params():
    # Overriding 100 existing params
    params = {f"key{i}": f"old{i}" for i in range(100)}
    url = URL("http://example.com/path?" + urlencode(params))
    new_params = {f"key{i}": f"new{i}" for i in range(100)}
    codeflash_output = url.include_query_params(**new_params); new_url = codeflash_output # 256μs -> 240μs (6.49% faster)
    query = str(new_url).split("?", 1)[1]
    pairs = dict(pair.split("=") for pair in query.split("&"))
    for i in range(100):
        pass

def test_add_params_to_url_with_long_query():
    # Add params to a URL with a long query string (500 keys)
    base_params = {f"k{i}": f"v{i}" for i in range(500)}
    url = URL("http://example.com/path?" + urlencode(base_params))
    codeflash_output = url.include_query_params(extra="yes"); new_url = codeflash_output # 984μs -> 927μs (6.10% faster)
    query = str(new_url).split("?", 1)[1]
    pairs = dict(pair.split("=") for pair in query.split("&"))
    for i in range(500):
        pass

def test_add_params_with_large_values():
    # Add params with large values (value length 500)
    url = URL("http://example.com/path")
    long_value = "x" * 500
    codeflash_output = url.include_query_params(big=long_value); new_url = codeflash_output # 28.3μs -> 27.8μs (2.00% faster)

def test_add_params_with_large_keys():
    # Add params with large key names (key length 100)
    url = URL("http://example.com/path")
    long_key = "k" * 100
    codeflash_output = url.include_query_params(**{long_key: "val"}); new_url = codeflash_output # 25.0μs -> 24.3μs (3.08% faster)

def test_performance_with_999_params():
    # Add 999 params, check that all are present
    url = URL("http://example.com/path")
    params = {f"p{i}": str(i) for i in range(999)}
    codeflash_output = url.include_query_params(**params); new_url = codeflash_output # 1.45ms -> 1.31ms (9.97% faster)
    query = str(new_url).split("?", 1)[1]
    pairs = dict(pair.split("=") for pair in query.split("&"))
    for i in range(999):
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse

# imports
import pytest  # used for our unit tests
from starlette.datastructures import URL

# unit tests

# 1. Basic Test Cases

def test_basic_add_param_to_empty_url():
    # Adding a single query param to a URL with no query
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(foo="bar"); new_url = codeflash_output # 33.4μs -> 32.7μs (2.16% faster)

def test_basic_add_multiple_params():
    # Adding multiple params
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(a="1", b="2"); new_url = codeflash_output # 27.5μs -> 26.6μs (3.56% faster)

def test_basic_update_existing_param():
    # Updating an existing param
    url = URL("https://example.com/path?foo=bar")
    codeflash_output = url.include_query_params(foo="baz"); new_url = codeflash_output # 33.7μs -> 33.1μs (1.54% faster)

def test_basic_preserve_existing_and_add_new():
    # Preserving existing params and adding new ones
    url = URL("https://example.com/path?foo=bar")
    codeflash_output = url.include_query_params(bar="baz"); new_url = codeflash_output # 28.6μs -> 27.4μs (4.38% faster)
    # Order may vary, but both params must be present
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))

def test_basic_overwrite_and_add():
    # Overwrite one param, add another
    url = URL("https://example.com/path?x=1&y=2")
    codeflash_output = url.include_query_params(x="10", z="30"); new_url = codeflash_output # 37.3μs -> 34.9μs (6.68% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))

# 2. Edge Test Cases

def test_edge_empty_param_value():
    # Add param with empty value
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(empty=""); new_url = codeflash_output # 23.1μs -> 22.4μs (3.19% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs, keep_blank_values=True))

def test_edge_param_with_special_characters():
    # Add param with special chars
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(q="hello world&foo=bar"); new_url = codeflash_output # 35.3μs -> 35.0μs (0.839% faster)

def test_edge_existing_query_with_blank_value():
    # Existing blank value, update it
    url = URL("https://example.com/path?foo=")
    codeflash_output = url.include_query_params(foo="bar"); new_url = codeflash_output # 31.5μs -> 30.4μs (3.52% faster)

def test_edge_param_numeric_and_bool():
    # Add numeric and boolean values
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(num=42, flag=True); new_url = codeflash_output # 26.2μs -> 25.4μs (3.12% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))

def test_edge_param_none_value():
    # Add param with None value (should be converted to "None")
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(none_val=None); new_url = codeflash_output # 23.2μs -> 22.9μs (1.07% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))

def test_edge_param_unicode_characters():
    # Unicode characters in param
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(u="üñîçødë"); new_url = codeflash_output # 33.0μs -> 32.4μs (2.12% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))

def test_edge_existing_multiple_params_update_one():
    # Multiple existing params, update one
    url = URL("https://example.com/path?a=1&b=2&c=3")
    codeflash_output = url.include_query_params(b="20"); new_url = codeflash_output # 37.0μs -> 35.2μs (4.90% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))


def test_edge_param_value_is_list():
    # Param value is a list, should be converted to str
    url = URL("https://example.com/path")
    codeflash_output = url.include_query_params(mylist=[1,2,3]); new_url = codeflash_output # 40.9μs -> 39.6μs (3.49% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))

def test_edge_existing_query_with_duplicate_keys():
    # Existing query with duplicate keys, only last is kept
    url = URL("https://example.com/path?a=1&a=2")
    codeflash_output = url.include_query_params(a="3"); new_url = codeflash_output # 35.6μs -> 34.5μs (3.36% faster)
    qs = urlparse(str(new_url)).query
    params = dict(parse_qsl(qs))

def test_edge_update_preserves_path_and_scheme():
    # Update query, path and scheme must be preserved
    url = URL("http://host:8080/some/path?foo=bar")
    codeflash_output = url.include_query_params(foo="baz", new="yes"); new_url = codeflash_output # 35.2μs -> 33.6μs (4.75% faster)
    parsed = urlparse(str(new_url))
    params = dict(parse_qsl(parsed.query))

def test_edge_empty_url():
    # Edge case: empty URL
    url = URL("")
    codeflash_output = url.include_query_params(a="1"); new_url = codeflash_output # 23.4μs -> 23.1μs (1.24% faster)

def test_edge_url_with_fragment():
    # URL with fragment, fragment should be preserved
    url = URL("https://example.com/path?foo=bar#section1")
    codeflash_output = url.include_query_params(foo="baz"); new_url = codeflash_output # 32.2μs -> 31.2μs (3.46% faster)
    parsed = urlparse(str(new_url))
    params = dict(parse_qsl(parsed.query))

# 3. Large Scale Test Cases

def test_large_number_of_params():
    # Add 500 params to a URL
    url = URL("https://example.com/")
    params = {f"key{i}": f"value{i}" for i in range(500)}
    codeflash_output = url.include_query_params(**params); new_url = codeflash_output # 761μs -> 701μs (8.55% faster)
    qs = urlparse(str(new_url)).query
    parsed_params = dict(parse_qsl(qs))
    for i in range(500):
        pass

def test_large_existing_and_new_params():
    # Start with 500, update 250, add 250 new
    base_params = {f"p{i}": f"v{i}" for i in range(500)}
    url = URL("https://example.com/?" + urlencode(base_params))
    update_params = {f"p{i}": f"new{i}" for i in range(250)}
    add_params = {f"q{i}": f"add{i}" for i in range(250)}
    all_params = dict(base_params)
    all_params.update(update_params)
    all_params.update(add_params)
    codeflash_output = url.include_query_params(**update_params, **add_params); new_url = codeflash_output # 1.43ms -> 1.33ms (7.58% faster)
    qs = urlparse(str(new_url)).query
    parsed_params = dict(parse_qsl(qs))
    for i in range(250):
        pass
    for i in range(250, 500):
        pass

def test_large_params_with_long_values():
    # Add 100 params with long string values
    long_str = "x" * 100
    url = URL("https://example.com/")
    params = {f"k{i}": long_str for i in range(100)}
    codeflash_output = url.include_query_params(**params); new_url = codeflash_output # 211μs -> 197μs (7.06% faster)
    qs = urlparse(str(new_url)).query
    parsed_params = dict(parse_qsl(qs))
    for i in range(100):
        pass

def test_large_update_of_existing_long_query():
    # Update 100 params in a URL with 1000 params
    base_params = {f"p{i}": f"v{i}" for i in range(1000)}
    url = URL("https://example.com/?" + urlencode(base_params))
    update_params = {f"p{i}": f"u{i}" for i in range(100, 200)}
    codeflash_output = url.include_query_params(**update_params); new_url = codeflash_output # 1.99ms -> 1.85ms (7.56% faster)
    qs = urlparse(str(new_url)).query
    parsed_params = dict(parse_qsl(qs))
    for i in range(100, 200):
        pass
    for i in range(0, 100):
        pass
    for i in range(200, 1000):
        pass

def test_large_url_with_existing_fragment_and_params():
    # Large query string with fragment, update and add params
    base_params = {f"a{i}": f"b{i}" for i in range(500)}
    url = URL("https://example.com/path?" + urlencode(base_params) + "#frag")
    update_params = {f"a{i}": f"c{i}" for i in range(100)}
    add_params = {f"new{i}": f"val{i}" for i in range(100)}
    codeflash_output = url.include_query_params(**update_params, **add_params); new_url = codeflash_output # 1.18ms -> 1.09ms (7.39% faster)
    parsed = urlparse(str(new_url))
    qs = parsed.query
    parsed_params = dict(parse_qsl(qs))
    for i in range(100):
        pass
    for i in range(100, 500):
        pass
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from starlette.datastructures import URL

def test_URL_include_query_params():
    URL.include_query_params(URL(url='', scope=None))
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_xzaz2m9_/tmpai5sccoa/test_concolic_coverage.py::test_URL_include_query_params | 30.5μs | 30.9μs | -1.42%⚠️ |

To edit these changes, git checkout codeflash/optimize-URL.include_query_params-mhbqbwkw and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 29, 2025 08:24
@codeflash-ai codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: Medium" (Optimization Quality according to Codeflash) labels Oct 29, 2025