@codeflash-ai codeflash-ai bot commented Oct 29, 2025

📄 30% (0.30x) speedup for BackgroundColorVisualizationBlockV1.getAnnotator in inference/core/workflows/core_steps/visualizations/background_color/v1.py

⏱️ Runtime: 3.02 milliseconds → 2.33 milliseconds (best of 174 runs)

📝 Explanation and details

The optimized code achieves a 29% speedup through two key optimizations:

1. Precomputed Color Name Set (str_to_color function):

  • What: Replaced hasattr(sv.Color, color.upper()) with a precomputed set _COLOR_NAMES containing all valid color names (see the sketch below)
  • Why faster: The original code performed expensive attribute lookup (hasattr) and string conversion (upper()) on every call. The optimized version does a single set membership check against a precomputed set, which is O(1) and much faster than reflection-based attribute checking
  • Impact: Named color tests show 42-47% speedups, with the largest gains on invalid color names (122% faster) since set lookup fails immediately
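
A minimal sketch of this pattern, assuming supervision exposes its named colors as uppercase class attributes on sv.Color (the exact construction of _COLOR_NAMES in the optimized code may differ, and rgb(...)/bgr(...) parsing is omitted here):

```python
import supervision as sv

# Built once at import time instead of reflecting on every call.
# Illustrative construction: collect the uppercase attributes of sv.Color
# (WHITE, BLACK, BLUE, ...), i.e. the named color constants.
_COLOR_NAMES = {name for name in dir(sv.Color) if name.isupper()}


def str_to_color(color: str) -> sv.Color:
    if color.startswith("#"):
        return sv.Color.from_hex(color)
    if color.upper() in _COLOR_NAMES:  # O(1) set membership, no hasattr()
        return getattr(sv.Color, color.upper())
    # rgb(...)/bgr(...) handling omitted in this sketch
    raise ValueError(f"Invalid color: {color}")
```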

2. Tuple-based Cache Keys (getAnnotator method):

  • What: Changed the cache key from "_".join(map(str, [color, opacity])) to a simple tuple (color, opacity) (see the sketch below)
  • Why faster: Eliminates string conversion (map(str, ...)) and string concatenation ("_".join) operations. Tuples are immutable and natively hashable, making them ideal dict keys with minimal overhead
  • Impact: Cache operations are 99-112% faster on cache hits, with 12-28% improvements on cache misses
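
A sketch of the tuple-keyed cache, using the BackgroundColorAnnotator stub from the generated tests below as a stand-in for the real annotator class (the cache attribute name is illustrative):

```python
def getAnnotator(self, color: str, opacity: float):
    key = (color, opacity)  # hashable tuple: no str() conversion, no "_".join()
    if key not in self.annotatorCache:
        self.annotatorCache[key] = BackgroundColorAnnotator(
            color=str_to_color(color),
            opacity=opacity,
        )
    return self.annotatorCache[key]
```

Because the tuple preserves the original types, 0.5 and "0.5" remain distinct keys, whereas a joined string representation would collapse them into the same entry.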

Test Case Performance:

  • Named colors: Biggest gains (42-47%) due to the precomputed set optimization
  • Cache hits: Massive improvements (99-112%) from tuple keys
  • Large scale tests: 18-128% faster, showing the optimizations scale well
  • Error cases: 29-129% faster since invalid names fail quickly in set lookup

The optimizations are particularly effective for workloads with frequent color validation and cache reuse.

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 2355 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |
🌀 Generated Regression Tests and Runtime
from abc import ABC

# imports
import pytest
from inference.core.workflows.core_steps.visualizations.background_color.v1 import \
    BackgroundColorVisualizationBlockV1


# Minimal stub for supervision.annotators.base.BaseAnnotator
class BaseAnnotator:
    pass

# Minimal stub for BackgroundColorAnnotator
class BackgroundColorAnnotator(BaseAnnotator):
    def __init__(self, color, opacity):
        self.color = color
        self.opacity = opacity

# Minimal stub for VisualizationBlock
class VisualizationBlock:
    def __init__(self, *args, **kwargs):
        pass

class PredictionsVisualizationBlock(VisualizationBlock, ABC):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
from inference.core.workflows.core_steps.visualizations.background_color.v1 import \
    BackgroundColorVisualizationBlockV1

# --- Unit Tests ---

@pytest.fixture
def block():
    # Provide a fresh block for each test
    return BackgroundColorVisualizationBlockV1()

# ------------------ BASIC TEST CASES ------------------

def test_hex_color_basic(block):
    # Basic: Hex color, standard opacity
    codeflash_output = block.getAnnotator("#FF0000", 0.5); annotator = codeflash_output # 7.50μs -> 6.13μs (22.2% faster)

def test_rgb_color_basic(block):
    # Basic: rgb color string
    codeflash_output = block.getAnnotator("rgb(10,20,30)", 1.0); annotator = codeflash_output # 5.42μs -> 4.83μs (12.3% faster)

def test_bgr_color_basic(block):
    # Basic: bgr color string
    codeflash_output = block.getAnnotator("bgr(30,20,10)", 0.75); annotator = codeflash_output # 5.91μs -> 4.63μs (27.6% faster)

def test_named_color_basic(block):
    # Basic: Named color (case insensitive)
    codeflash_output = block.getAnnotator("white", 0.3); annotator = codeflash_output # 12.6μs -> 8.58μs (46.9% faster)

def test_named_color_uppercase(block):
    # Basic: Named color (uppercase)
    codeflash_output = block.getAnnotator("BLACK", 0.8); annotator = codeflash_output # 11.3μs -> 7.97μs (42.3% faster)

def test_named_color_mixed_case(block):
    # Basic: Named color (mixed case)
    codeflash_output = block.getAnnotator("Blue", 0.6); annotator = codeflash_output # 11.5μs -> 7.90μs (45.0% faster)

# ------------------ EDGE TEST CASES ------------------

def test_invalid_color_format(block):
    # Edge: Invalid color string should raise ValueError
    with pytest.raises(ValueError):
        block.getAnnotator("notacolor", 0.5) # 4.37μs -> 1.96μs (122% faster)


def test_invalid_rgb_values(block):
    # Edge: RGB string with wrong number of values
    with pytest.raises(ValueError):
        block.getAnnotator("rgb(10,20)", 0.5) # 7.26μs -> 4.92μs (47.4% faster)

def test_invalid_bgr_values(block):
    # Edge: BGR string with wrong number of values
    with pytest.raises(ValueError):
        block.getAnnotator("bgr(10,20)", 0.5) # 5.50μs -> 3.67μs (49.6% faster)

def test_rgb_non_integer_values(block):
    # Edge: RGB string with non-integer values
    with pytest.raises(ValueError):
        block.getAnnotator("rgb(a,b,c)", 0.5) # 5.72μs -> 4.03μs (41.9% faster)

def test_bgr_non_integer_values(block):
    # Edge: BGR string with non-integer values
    with pytest.raises(ValueError):
        block.getAnnotator("bgr(a,b,c)", 0.5) # 5.50μs -> 3.77μs (46.0% faster)

def test_opacity_zero(block):
    # Edge: Opacity at lower bound
    codeflash_output = block.getAnnotator("#00FF00", 0.0); annotator = codeflash_output # 11.5μs -> 10.1μs (13.9% faster)

def test_opacity_one(block):
    # Edge: Opacity at upper bound
    codeflash_output = block.getAnnotator("#00FF00", 1.0); annotator = codeflash_output # 8.30μs -> 7.12μs (16.5% faster)

def test_opacity_negative(block):
    # Edge: Negative opacity (should be accepted as per code, but test for correct assignment)
    codeflash_output = block.getAnnotator("#00FF00", -0.5); annotator = codeflash_output # 8.62μs -> 6.74μs (27.9% faster)

def test_opacity_above_one(block):
    # Edge: Opacity above 1 (should be accepted as per code, but test for correct assignment)
    codeflash_output = block.getAnnotator("#00FF00", 1.5); annotator = codeflash_output # 8.14μs -> 6.57μs (23.9% faster)

def test_cache_is_used(block):
    # Edge: Cache should return same object for same key
    codeflash_output = block.getAnnotator("#123456", 0.7); a1 = codeflash_output # 8.36μs -> 6.51μs (28.4% faster)
    codeflash_output = block.getAnnotator("#123456", 0.7); a2 = codeflash_output # 1.08μs -> 541ns (99.4% faster)

def test_cache_different_opacity(block):
    # Edge: Different opacity should give different object
    codeflash_output = block.getAnnotator("#123456", 0.7); a1 = codeflash_output # 7.66μs -> 6.32μs (21.1% faster)
    codeflash_output = block.getAnnotator("#123456", 0.8); a2 = codeflash_output # 3.72μs -> 3.15μs (18.2% faster)

def test_cache_different_color(block):
    # Edge: Different color should give different object
    codeflash_output = block.getAnnotator("#123456", 0.7); a1 = codeflash_output # 7.13μs -> 5.96μs (19.7% faster)
    codeflash_output = block.getAnnotator("#654321", 0.7); a2 = codeflash_output # 3.58μs -> 3.12μs (14.4% faster)

def test_cache_key_collision(block):
    # Edge: Test that "rgb(1,2,3)_0.5" and "rgb(1,2,3)_0.5" are the same key
    codeflash_output = block.getAnnotator("rgb(1,2,3)", 0.5); a1 = codeflash_output # 5.90μs -> 4.77μs (23.5% faster)
    codeflash_output = block.getAnnotator("rgb(1,2,3)", 0.5); a2 = codeflash_output # 1.03μs -> 506ns (103% faster)

def test_cache_key_uniqueness(block):
    # Edge: Different color string representations should be unique
    codeflash_output = block.getAnnotator("#010203", 0.5); a1 = codeflash_output # 7.71μs -> 6.02μs (27.9% faster)
    codeflash_output = block.getAnnotator("rgb(1,2,3)", 0.5); a2 = codeflash_output # 3.76μs -> 3.35μs (12.0% faster)

# ------------------ LARGE SCALE TEST CASES ------------------

def test_many_unique_annotators(block):
    # Large Scale: Create many unique annotators, ensure all are unique
    colors = [f"#%02X%02X%02X" % (i, i, i) for i in range(100)]
    opacities = [i / 100 for i in range(100)]
    annotators = []
    for i in range(100):
        annotators.append(block.getAnnotator(colors[i], opacities[i])) # 245μs -> 207μs (18.4% faster)

def test_cache_memory_efficiency(block):
    # Large Scale: Repeated calls for same color/opacity do not increase cache size
    color = "#ABCDEF"
    opacity = 0.25
    for _ in range(200):
        block.getAnnotator(color, opacity) # 109μs -> 48.6μs (124% faster)

def test_large_number_of_colors_and_opacities(block):
    # Large Scale: Mix colors and opacities, ensure cache size matches
    colors = [f"rgb({i},{i+1},{i+2})" for i in range(50)]
    opacities = [round(0.01 * i, 2) for i in range(20)]
    for color in colors:
        for opacity in opacities:
            block.getAnnotator(color, opacity)

def test_performance_under_load(block):
    # Large Scale: Ensure function completes quickly for 500 unique calls
    import time
    colors = [f"#%02X%02X%02X" % (i, i, i) for i in range(500)]
    start = time.time()
    for i, color in enumerate(colors):
        block.getAnnotator(color, 0.1 * (i % 10))
    elapsed = time.time() - start
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from abc import ABC

# imports
import pytest
from inference.core.workflows.core_steps.visualizations.background_color.v1 import \
    BackgroundColorVisualizationBlockV1


# Stub for sv.annotators.base.BaseAnnotator
class BaseAnnotator:
    pass

# Stub for BackgroundColorAnnotator
class BackgroundColorAnnotator(BaseAnnotator):
    def __init__(self, color, opacity):
        self.color = color
        self.opacity = opacity

# --- Minimal base class stub ---

class VisualizationBlock:
    def __init__(self, *args, **kwargs):
        pass


class PredictionsVisualizationBlock(VisualizationBlock, ABC):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
from inference.core.workflows.core_steps.visualizations.background_color.v1 import \
    BackgroundColorVisualizationBlockV1

# --- Unit tests ---

# ----------- BASIC TEST CASES -----------

def test_basic_hex_color():
    # Test with a valid hex color
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("#FF0000", 0.5); annotator = codeflash_output # 12.3μs -> 10.3μs (19.7% faster)

def test_basic_rgb_color():
    # Test with a valid rgb color string
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("rgb(10,20,30)", 0.7); annotator = codeflash_output # 6.76μs -> 5.37μs (25.8% faster)

def test_basic_bgr_color():
    # Test with a valid bgr color string
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("bgr(1,2,3)", 1.0); annotator = codeflash_output # 5.79μs -> 4.78μs (20.9% faster)

def test_basic_named_color():
    # Test with a valid named color
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("white", 0.25); annotator = codeflash_output # 14.1μs -> 9.90μs (42.6% faster)

def test_basic_named_color_case_insensitive():
    # Named color should work case-insensitively
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("Black", 0.9); annotator = codeflash_output # 11.7μs -> 8.20μs (43.0% faster)

def test_basic_cache_reuse():
    # Should reuse annotator from cache for same key
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("#00FF00", 0.8); a1 = codeflash_output # 7.76μs -> 6.59μs (17.8% faster)
    codeflash_output = block.getAnnotator("#00FF00", 0.8); a2 = codeflash_output # 1.02μs -> 480ns (112% faster)

def test_basic_cache_different_opacity():
    # Different opacity should create a new annotator
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("#00FF00", 0.8); a1 = codeflash_output # 7.44μs -> 6.21μs (19.7% faster)
    codeflash_output = block.getAnnotator("#00FF00", 0.9); a2 = codeflash_output # 3.53μs -> 3.06μs (15.5% faster)

def test_basic_cache_different_color():
    # Different color should create a new annotator
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("#00FF00", 0.8); a1 = codeflash_output # 6.91μs -> 5.87μs (17.8% faster)
    codeflash_output = block.getAnnotator("#0000FF", 0.8); a2 = codeflash_output # 3.57μs -> 3.00μs (18.9% faster)

# ----------- EDGE TEST CASES -----------

def test_edge_invalid_hex():
    # Invalid hex string should raise ValueError
    block = BackgroundColorVisualizationBlockV1()
    with pytest.raises(ValueError):
        block.getAnnotator("#GGHHII", 0.5) # 4.62μs -> 2.97μs (55.3% faster)

def test_edge_invalid_rgb():
    # Invalid rgb string (wrong format)
    block = BackgroundColorVisualizationBlockV1()
    with pytest.raises(ValueError):
        block.getAnnotator("rgb(10,20)", 0.5) # 5.12μs -> 4.03μs (27.3% faster)

def test_edge_invalid_bgr():
    # Invalid bgr string (not enough values)
    block = BackgroundColorVisualizationBlockV1()
    with pytest.raises(ValueError):
        block.getAnnotator("bgr(1,2)", 0.5) # 4.67μs -> 3.35μs (39.4% faster)

def test_edge_unknown_named_color():
    # Unknown named color should raise ValueError
    block = BackgroundColorVisualizationBlockV1()
    with pytest.raises(ValueError):
        block.getAnnotator("purple", 0.5) # 4.19μs -> 1.83μs (129% faster)

def test_edge_empty_color_string():
    # Empty color string should raise ValueError
    block = BackgroundColorVisualizationBlockV1()
    with pytest.raises(ValueError):
        block.getAnnotator("", 0.5) # 3.77μs -> 1.79μs (111% faster)

def test_edge_negative_opacity():
    # Negative opacity is technically allowed by the function, but let's check it is passed through
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("#000000", -0.1); annotator = codeflash_output # 10.3μs -> 8.80μs (17.1% faster)

def test_edge_opacity_zero():
    # Opacity of zero should be accepted
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("rgb(1,2,3)", 0.0); annotator = codeflash_output # 5.59μs -> 5.10μs (9.63% faster)

def test_edge_opacity_one():
    # Opacity of one should be accepted
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("rgb(1,2,3)", 1.0); annotator = codeflash_output # 5.09μs -> 4.50μs (12.9% faster)

def test_edge_float_color_string():
    # Color strings must be valid, floats in color string should fail
    block = BackgroundColorVisualizationBlockV1()
    with pytest.raises(ValueError):
        block.getAnnotator("rgb(1.0,2.0,3.0)", 0.5) # 5.69μs -> 4.20μs (35.5% faster)

def test_edge_str_opacity():
    # Opacity is stringified for the original cache key but passed as a float to BackgroundColorAnnotator
    block = BackgroundColorVisualizationBlockV1()
    codeflash_output = block.getAnnotator("#123456", 0.75); annotator = codeflash_output # 9.79μs -> 8.01μs (22.2% faster)

# ----------- LARGE SCALE TEST CASES -----------

def test_large_scale_many_unique_annotators():
    # Create many unique annotators to test cache scaling
    block = BackgroundColorVisualizationBlockV1()
    colors = ["#%02X%02X%02X" % (i, i, i) for i in range(0, 100, 10)]
    opacities = [round(0.1 * i, 2) for i in range(10)]
    annotators = set()
    for color in colors:
        for opacity in opacities:
            codeflash_output = block.getAnnotator(color, opacity); a = codeflash_output
            annotators.add(a)
            # Check that cache key is unique
            key = "_".join([color, str(opacity)])

def test_large_scale_cache_memory_efficiency():
    # Ensure cache does not grow for repeated requests
    block = BackgroundColorVisualizationBlockV1()
    color = "#ABCDEF"
    opacity = 0.5
    for _ in range(500):
        codeflash_output = block.getAnnotator(color, opacity); a = codeflash_output # 254μs -> 111μs (128% faster)

def test_large_scale_mixed_color_formats():
    # Mix hex, rgb, bgr, named colors in a large batch
    block = BackgroundColorVisualizationBlockV1()
    color_formats = [
        "#%02X%02X%02X" % (i, i, i) for i in range(0, 100, 20)
    ] + [
        "rgb(%d,%d,%d)" % (i, i+1, i+2) for i in range(0, 100, 20)
    ] + [
        "bgr(%d,%d,%d)" % (i, i+1, i+2) for i in range(0, 100, 20)
    ] + ["white", "black", "blue"]
    opacities = [0.1, 0.5, 1.0]
    annotators = []
    for color in color_formats:
        for opacity in opacities:
            codeflash_output = block.getAnnotator(color, opacity); a = codeflash_output
            annotators.append(a)

def test_large_scale_key_collision():
    # Test that keys are unique even if color and opacity as strings could collide
    block = BackgroundColorVisualizationBlockV1()
    color1 = "#010203"
    color2 = "#0102" + "03"
    opacity1 = 0.5
    opacity2 = 0.50
    codeflash_output = block.getAnnotator(color1, opacity1); a1 = codeflash_output # 7.12μs -> 6.00μs (18.6% faster)
    codeflash_output = block.getAnnotator(color2, opacity2); a2 = codeflash_output # 928ns -> 464ns (100% faster)
    # Even if color2 is constructed differently, its string representation is the same
    if color1 == color2 and opacity1 == opacity2:
        pass
    else:
        pass

def test_large_scale_cache_eviction_not_supported():
    # The cache should not evict items, so after many entries, all should remain
    block = BackgroundColorVisualizationBlockV1()
    for i in range(50):
        color = "#%02X%02X%02X" % (i, i, i)
        opacity = round(i / 50, 2)
        block.getAnnotator(color, opacity) # 126μs -> 106μs (18.7% faster)
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, `git checkout codeflash/optimize-BackgroundColorVisualizationBlockV1.getAnnotator-mhbwkm49` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 29, 2025 11:19
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Oct 29, 2025