⚡️ Speed up method `PathDeviationAnalyticsBlockV2.run` by 17% #634

codeflash-ai · 2025-10-29T11:51:54Z

📄 17% (0.17x) speedup for `PathDeviationAnalyticsBlockV2.run` in `inference/core/workflows/core_steps/analytics/path_deviation/v2.py`

⏱️ Runtime : 69.1 microseconds → 59.1 microseconds (best of 40 runs)

📝 Explanation and details

The optimized code achieves a 17% speedup through several key performance optimizations:

**1. Reduced Dictionary Lookups**

- Caches `object_paths[video_id]` as `object_paths_video` to avoid repeated dictionary lookups in the detection loop
- Pre-stores `PATH_DEVIATION_KEY_IN_SV_DETECTIONS` as `output_key`, so the loop reads a local variable instead of repeating a global name lookup
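The lookup caching described above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the helper `record_anchors`, the dict-of-dicts shape of `object_paths`, and the placeholder value stored under `output_key` are all hypothetical. Only the names `object_paths`, `video_id`, `object_paths_video`, `output_key`, and the constant come from the description.

```python
from collections import defaultdict

# Stand-in for the module-level constant named in the description.
PATH_DEVIATION_KEY_IN_SV_DETECTIONS = "path_deviation"


def record_anchors(object_paths, video_id, tracker_ids, anchors):
    """Append each anchor to its tracker's path (hypothetical helper)."""
    # One dictionary lookup before the loop instead of one per detection.
    object_paths_video = object_paths[video_id]
    # Bind the global constant to a local name once.
    output_key = PATH_DEVIATION_KEY_IN_SV_DETECTIONS

    results = []
    for tracker_id, anchor in zip(tracker_ids, anchors):
        path = object_paths_video.setdefault(tracker_id, [])
        path.append(anchor)
        # Placeholder value: the real block would store a path deviation here.
        results.append({"tracker_id": tracker_id, output_key: len(path)})
    return results


object_paths = defaultdict(dict)
out = record_anchors(object_paths, "video1", [1, 2, 1], [(0, 0), (5, 5), (1, 1)])
```

Because `object_paths_video` refers to the same inner dictionary, updates made through it remain visible through `object_paths[video_id]`.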

**2. Memory-Efficient Array Construction**

- Replaces `np.array(obj_path)` with `np.fromiter(obj_path, dtype=np.float64).reshape(-1, 2)` for faster conversion from list of tuples to numpy array
- Uses `np.ascontiguousarray()` to ensure C-contiguous memory layout for faster access patterns during computation
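A minimal sketch of this conversion, assuming `obj_path` is a list of `(x, y)` tuples. With a scalar dtype, `np.fromiter` consumes a flat iterable of scalars, so this sketch flattens the tuples first with `itertools.chain.from_iterable`. Whether the PR flattens the same way is not shown in this description, and the helper name `path_to_array` is hypothetical.

```python
from itertools import chain

import numpy as np


def path_to_array(obj_path):
    """Convert a list of (x, y) tuples to a C-contiguous float64 array of shape (n, 2)."""
    # Flatten [(x0, y0), (x1, y1), ...] into x0, y0, x1, y1, ... for np.fromiter.
    flat = np.fromiter(chain.from_iterable(obj_path), dtype=np.float64)
    # reshape(-1, 2) restores one row per point; ascontiguousarray returns the
    # input unchanged when it is already C-contiguous.
    return np.ascontiguousarray(flat.reshape(-1, 2))


points = path_to_array([(0, 0), (1.5, 2), (3, 4)])
```

Any gain over `np.array(obj_path)` depends on path length and NumPy version, so it is worth measuring on representative inputs.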

**3. Optimized Distance Matrix Operations**

- Changes from `np.ones(...) * -1` to `np.full(..., -1.0)`, which fills the matrix in a single step instead of allocating an array of ones and then multiplying it
- Ensures consistent `np.float64` dtype throughout to avoid type conversion overhead
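The two initialization styles can be compared side by side. The matrix sizes below are illustrative only, and the variable names are assumptions rather than names taken from the PR.

```python
import numpy as np

# Illustrative sizes: one row per object-path point, one column per reference point.
rows, cols = 3, 4

# Before: allocate a matrix of ones, then multiply, which creates a second array.
dist_before = np.ones((rows, cols)) * -1

# After: allocate once and fill with -1.0 directly, with an explicit dtype.
dist_after = np.full((rows, cols), -1.0, dtype=np.float64)
```

Both arrays hold the same values and dtype. The second form simply skips the intermediate array.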

**4. Inlined Critical Path Operations**

- Inlines the Euclidean distance calculation within `_compute_distance()` to eliminate function call overhead in the hot recursive path
- Replaces the `min()` call with explicit comparisons to avoid the overhead of the Python builtin
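To make the inlining concrete, here is a sketch of a memoized recursive discrete Fréchet distance with the Euclidean distance and the three-way minimum written out by hand. It follows the standard textbook recurrence and is an assumption about the general shape of the computation. The signature of `_compute_distance` and the wrapper `frechet_distance` are hypothetical, not copied from `v2.py`.

```python
import math

import numpy as np


def _compute_distance(dist_matrix, i, j, path1, path2):
    """Memoized recursive step of the discrete Fréchet distance."""
    cached = dist_matrix[i, j]
    if cached > -1.0:
        return cached

    # Inlined Euclidean distance between path1[i] and path2[j].
    dx = path1[i, 0] - path2[j, 0]
    dy = path1[i, 1] - path2[j, 1]
    dist = math.sqrt(dx * dx + dy * dy)

    if i == 0 and j == 0:
        value = dist
    elif j == 0:
        prev = _compute_distance(dist_matrix, i - 1, 0, path1, path2)
        value = prev if prev > dist else dist
    elif i == 0:
        prev = _compute_distance(dist_matrix, 0, j - 1, path1, path2)
        value = prev if prev > dist else dist
    else:
        a = _compute_distance(dist_matrix, i - 1, j, path1, path2)
        b = _compute_distance(dist_matrix, i - 1, j - 1, path1, path2)
        c = _compute_distance(dist_matrix, i, j - 1, path1, path2)
        # Explicit comparisons instead of min(a, b, c).
        smallest = a
        if b < smallest:
            smallest = b
        if c < smallest:
            smallest = c
        value = smallest if smallest > dist else dist

    dist_matrix[i, j] = value
    return value


def frechet_distance(path1, path2):
    """Hypothetical wrapper: expects non-empty float arrays of shape (n, 2)."""
    dist_matrix = np.full((len(path1), len(path2)), -1.0, dtype=np.float64)
    last_i, last_j = len(path1) - 1, len(path2) - 1
    return float(_compute_distance(dist_matrix, last_i, last_j, path1, path2))


line_a = np.array([(0, 0), (1, 0), (2, 0)], dtype=np.float64)
line_b = np.array([(0, 1), (1, 1), (2, 1)], dtype=np.float64)
parallel = frechet_distance(line_a, line_b)
```

Two parallel lines one unit apart give a distance of 1.0, and identical paths give 0.0. The recursion depth grows with path length, so very long paths would need an iterative formulation.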

**5. Enhanced Edge Case Handling**

- Adds an early return of `float("inf")` for empty paths to prevent unnecessary computation
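The guard can be sketched as a thin wrapper. Where the check actually sits in `v2.py` is not shown in this description, so `guarded_distance` and the stand-in metric `endpoint_distance` are hypothetical names used only for illustration.

```python
import math


def guarded_distance(path1, path2, distance_fn):
    """Return inf when either path is empty; otherwise delegate to distance_fn."""
    if len(path1) == 0 or len(path2) == 0:
        return float("inf")
    return distance_fn(path1, path2)


def endpoint_distance(path1, path2):
    """Stand-in metric for illustration only: distance between the last points."""
    return math.dist(path1[-1], path2[-1])


empty_result = guarded_distance([], [(0, 0)], endpoint_distance)
normal_result = guarded_distance([(3, 4)], [(0, 0)], endpoint_distance)
```

Returning `float("inf")` keeps the result numeric, so callers that compare deviations against a threshold treat an empty path as maximally deviant rather than failing on an index error.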

The optimizations are particularly effective for workloads with many tracked objects (as seen in test cases with multiple detections), where the reduced dictionary lookups and memory-efficient array operations compound. The 17% improvement comes primarily from eliminating repeated lookups and optimizing the memory-intensive Fréchet distance computation.

✅ Correctness verification report:

| Test | Status |
|---|---|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | ✅ 30 Passed |
| ⏪ Replay Tests | 🔘 None Found |
| 🔎 Concolic Coverage Tests | 🔘 None Found |
| 📊 Tests Coverage | 100.0% |

🌀 Generated Regression Tests and Runtime

import numpy as np
# imports
import pytest
from inference.core.workflows.core_steps.analytics.path_deviation.v2 import \
    PathDeviationAnalyticsBlockV2

# --- Mocks and minimal stubs for dependencies ---

# Constants for keys
PATH_DEVIATION_KEY_IN_SV_DETECTIONS = "path_deviation"
OUTPUT_KEY = "path_deviation_detections"

# Minimal Detections class to simulate supervision.Detections
class MockDetection(dict):
    """A dict subclass so per-detection data can be read and assigned by key."""
    # The inherited dict.__getitem__ and dict.__setitem__ already provide
    # key-based access, so no overrides are needed.
    pass

class MockDetections:
    def __init__(self, tracker_id, anchors):
        # tracker_id: list of int or str
        # anchors: list of (float, float)
        self.tracker_id = tracker_id
        self._anchors = anchors
        self._dets = [MockDetection() for _ in tracker_id]
    def get_anchors_coordinates(self, anchor):
        # anchor param is ignored in this mock
        return self._anchors
    def __getitem__(self, idx):
        return self._dets[idx]
    def __len__(self):
        return len(self.tracker_id)
    @staticmethod
    def merge(detections):
        # For simplicity, return a new MockDetections with merged tracker_ids and anchors
        merged = MockDetections(
            [d.get('tracker_id', i) for i, d in enumerate(detections)],
            [d.get('anchor', (0, 0)) for d in detections]
        )
        merged._dets = detections
        return merged

# Minimal WorkflowImageData and metadata
class MockVideoMetadata:
    def __init__(self, video_identifier):
        self.video_identifier = video_identifier

class MockWorkflowImageData:
    def __init__(self, video_identifier):
        self.video_metadata = MockVideoMetadata(video_identifier)

# --- Unit tests ---

# 1. Basic Test Cases





def test_empty_reference_path():
    """Edge case: empty reference path should raise or return inf."""
    block = PathDeviationAnalyticsBlockV2()
    dets = MockDetections([1], [(1, 2)])
    image = MockWorkflowImageData("video5")
    ref_path = []
    # Should raise IndexError or ValueError due to empty path
    with pytest.raises(Exception):
        block.run(dets, image, "center", ref_path) # 26.6μs -> 12.5μs (112% faster)

def test_empty_detections():
    """Edge case: no detections, should return empty output."""
    block = PathDeviationAnalyticsBlockV2()
    dets = MockDetections([], [])
    image = MockWorkflowImageData("video6")
    ref_path = [(0, 0)]
    result = block.run(dets, image, "center", ref_path)  # 19.1μs -> 21.8μs (12.4% slower)
    output = result[OUTPUT_KEY]









#------------------------------------------------
import numpy as np
# imports
import pytest
from inference.core.workflows.core_steps.analytics.path_deviation.v2 import \
    PathDeviationAnalyticsBlockV2

# --- Minimal stubs/mocks for external dependencies ---

# Simulate the PATH_DEVIATION_KEY_IN_SV_DETECTIONS constant
PATH_DEVIATION_KEY_IN_SV_DETECTIONS = "path_deviation"

# Minimal WorkflowImageData stub
class WorkflowImageData:
    def __init__(self, video_identifier="video1"):
        class Meta:
            pass
        self.video_metadata = Meta()
        self.video_metadata.video_identifier = video_identifier

# Minimal Detections stub
class Detection(dict):
    # Inherit from dict to allow key assignment
    pass

class Detections:
    def __init__(self, tracker_id=None, anchors=None):
        # tracker_id: list of int/str or None
        # anchors: list of (float, float)
        self.tracker_id = tracker_id
        self._anchors = anchors or []
        self._detections = [Detection() for _ in (tracker_id or [])]

    def __getitem__(self, idx):
        return self._detections[idx]

    def __len__(self):
        return len(self._detections)

    def get_anchors_coordinates(self, anchor):
        # Always return self._anchors
        return self._anchors

    @staticmethod
    def merge(detections):
        # Return a Detections instance with merged detections
        merged = Detections()
        merged._detections = detections
        merged.tracker_id = [d.get("tracker_id", i) for i, d in enumerate(detections)]
        merged._anchors = [d.get("anchor", (0, 0)) for d in detections]
        return merged

# --- The function to test: PathDeviationAnalyticsBlockV2.run ---

OUTPUT_KEY = "path_deviation_detections"

# --- Unit tests for PathDeviationAnalyticsBlockV2.run ---

# 1. BASIC TEST CASES




def test_no_tracker_id_raises():
    """Test that run raises ValueError if tracker_id is None."""
    block = PathDeviationAnalyticsBlockV2()
    detections = Detections(tracker_id=None, anchors=None)
    image = WorkflowImageData("vidD")
    reference_path = [(0, 0)]
    with pytest.raises(ValueError):
        block.run(detections, image, "center", reference_path) # 1.79μs -> 1.53μs (17.0% faster)

def test_empty_anchors_and_reference_path():
    """Test with empty anchors and empty reference path."""
    block = PathDeviationAnalyticsBlockV2()
    tracker_id = []
    anchors = []
    detections = Detections(tracker_id=tracker_id, anchors=anchors)
    image = WorkflowImageData("vidE")
    reference_path = []
    # Should not raise, but output should be empty
    result = block.run(detections, image, "center", reference_path)  # 21.6μs -> 23.1μs (6.74% slower)
    out = result[OUTPUT_KEY]

To edit these changes, `git checkout codeflash/optimize-PathDeviationAnalyticsBlockV2.run-mhbxqnde` and push.


codeflash-ai bot requested a review from mashraf-222 October 29, 2025 11:51

codeflash-ai bot added the "⚡️ codeflash" (Optimization PR opened by Codeflash AI) and "🎯 Quality: Medium" (Optimization Quality according to Codeflash) labels on Oct 29, 2025
