Improve performance of Arr::dot method - 300x in some cases #55495

Merged
merged 1 commit into from
Apr 21, 2025

cyppe
Contributor

@cyppe cyppe commented Apr 21, 2025

Arr::dot Method Performance Optimization

Overview

This PR optimizes the Arr::dot() method by replacing the array_merge()-based recursive implementation with a recursive closure that writes directly into a single shared results array. The change significantly improves performance when flattening large nested arrays, which particularly benefits Laravel's validator when it processes large datasets.

Optimization Details

The original implementation called array_merge() at every level of recursion. Each call copies the results accumulated so far, which can lead to O(n²) behavior for large arrays:

public static function dot($array, $prepend = '')
{
    $results = [];

    foreach ($array as $key => $value) {
        if (is_array($value) && ! empty($value)) {
            $results = array_merge($results, static::dot($value, $prepend.$key.'.'));
        } else {
            $results[$prepend.$key] = $value;
        }
    }

    return $results;
}

The optimized implementation uses a nested closure that captures $results by reference, so each leaf value is written straight into the final array and the expensive per-level array_merge() calls are avoided:

public static function dot($array, $prepend = '')
{
    $results = [];

    $flatten = function ($data, $prefix) use (&$results, &$flatten): void {
        foreach ($data as $key => $value) {
            $newKey = $prefix.$key;

            if (is_array($value) && ! empty($value)) {
                $flatten($value, $newKey.'.');
            } else {
                $results[$newKey] = $value;
            }
        }
    };

    $flatten($array, $prepend);

    return $results;
}
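As a quick sanity check, both versions can be run side by side on a small input. The sketch below copies the two method bodies into standalone functions; the names `dotMerge` and `dotClosure` and the sample data are illustrative assumptions, not part of the framework. It only shows agreement on this one string-keyed input, including the case where an empty array is kept as a leaf value.

```php
<?php

// Standalone copy of the original method body (hypothetical name).
function dotMerge(array $array, string $prepend = ''): array
{
    $results = [];

    foreach ($array as $key => $value) {
        if (is_array($value) && ! empty($value)) {
            // Every merge copies whatever has been accumulated so far.
            $results = array_merge($results, dotMerge($value, $prepend.$key.'.'));
        } else {
            $results[$prepend.$key] = $value;
        }
    }

    return $results;
}

// Standalone copy of the optimized method body (hypothetical name).
function dotClosure(array $array, string $prepend = ''): array
{
    $results = [];

    // The closure recurses through itself and writes into the shared $results.
    $flatten = function ($data, $prefix) use (&$results, &$flatten): void {
        foreach ($data as $key => $value) {
            $newKey = $prefix.$key;

            if (is_array($value) && ! empty($value)) {
                $flatten($value, $newKey.'.');
            } else {
                $results[$newKey] = $value;
            }
        }
    };

    $flatten($array, $prepend);

    return $results;
}

// Illustrative input: nested lists, a nested map, and one empty array.
$sample = [
    'items' => [
        ['id' => 0, 'tags' => [], 'address' => ['city' => 'City 0']],
        ['id' => 1, 'tags' => ['a'], 'address' => ['city' => 'City 1']],
    ],
];

$flat = dotClosure($sample);
$same = dotMerge($sample) === $flat; // strict: same keys, values, and order

var_dump($same);
print_r($flat);
```

Note that the empty `tags` array is stored under `items.0.tags` rather than being dropped, because the `! empty($value)` check sends it to the else branch in both versions.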

Benchmark Results

Benchmarks were performed on arrays of different sizes, simulating real-world validation scenarios. Each benchmark ran 5 iterations and recorded the average execution time.

Testing Environment

  • PHP Version: 8.3.20
  • Laravel Framework Version: v12.9.2
  • Machine: MacBook Pro (M-series chip)

Test Data Structure

The benchmark used nested arrays similar to those processed by Laravel's validator:

[
    'items' => [
        [
            'id' => 0,
            'name' => 'Name 0',
            'email' => 'email0@example.com',
            'date' => '2023-05-20',
            'another_sub_array' => [
                'street' => 'Street 0',
                'city' => 'City 0',
                'country' => 'Country 0',
            ],
        ],
        // ... more items
    ]
]
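The benchmark script itself is not included here, so the following is only a sketch of how data in this shape could be generated. The helper name `makeBenchmarkData` is an assumption made for illustration.

```php
<?php

// Hypothetical helper: builds $count items shaped like the structure above.
function makeBenchmarkData(int $count): array
{
    $items = [];

    for ($i = 0; $i < $count; $i++) {
        $items[] = [
            'id' => $i,
            'name' => "Name {$i}",
            'email' => "email{$i}@example.com",
            'date' => '2023-05-20',
            'another_sub_array' => [
                'street' => "Street {$i}",
                'city' => "City {$i}",
                'country' => "Country {$i}",
            ],
        ];
    }

    return ['items' => $items];
}

$data = makeBenchmarkData(500);

echo count($data['items']), " items generated\n";
```

Each item contributes seven leaf values once flattened (four scalars plus three in the nested array), so the number of keys produced by dot() grows linearly with the item count.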

Performance Comparison

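| Array Size (items) | Original Implementation | Optimized Implementation | Improvement | Speedup Factor |
|--------------------|-------------------------|--------------------------|-------------|----------------|
| 500                | 5.2261 ms               | 0.4425 ms                | 91.5%       | 11.8x          |
| 1,000              | 20.2319 ms              | 0.6762 ms                | 96.7%       | 30x            |
| 5,000              | 556.6993 ms             | 3.6952 ms                | 99.3%       | 150x           |
| 10,000             | 3,591.2605 ms           | 10.6625 ms               | 99.7%       | 337x           |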
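The two derived columns follow directly from the measured times: improvement is the fraction of time saved, and the speedup factor is the ratio of the two times. A small sketch of that arithmetic, using a hypothetical helper named `compareTimings`, applied to the 500-item row:

```php
<?php

// Hypothetical helper: derives the "Improvement" and "Speedup Factor" columns.
function compareTimings(float $originalMs, float $optimizedMs): array
{
    return [
        // Share of the original time that was saved, in percent.
        'improvement_pct' => round((1 - $optimizedMs / $originalMs) * 100, 1),
        // How many times faster the optimized version ran.
        'speedup' => round($originalMs / $optimizedMs, 1),
    ];
}

$row = compareTimings(5.2261, 0.4425);

printf("%.1f%% improvement, %.1fx speedup\n", $row['improvement_pct'], $row['speedup']);
// prints "91.5% improvement, 11.8x speedup"
```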

Memory Usage

Memory consumption remained consistent between the original and optimized implementations, showing no additional memory overhead.

| Array Size (items) | Memory Usage |
|--------------------|--------------|
| 500                | 0.41 MB      |
| 1,000              | 0.65 MB      |
| 5,000              | 4.21 MB      |
| 10,000             | 8.43 MB      |

Impact on Large Datasets

The performance improvement is particularly significant for large datasets. As demonstrated in the benchmark, processing a 10,000-item array improved from over 3.5 seconds to about 10.7 milliseconds, an improvement of 99.7%.

This optimization addresses the performance issues reported in #49375, where users experienced significant slowdowns when validating large datasets with wildcard rules.

Testing Methodology

  1. A benchmark script was created to compare both implementations using the same input data
  2. Each implementation was tested with arrays of increasing size (500, 1,000, 5,000, and 10,000 items)
  3. Each test ran 5 iterations to ensure consistent results
  4. The script measured execution time and memory usage for both implementations
  5. All core Laravel tests continue to pass with the optimized implementation
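The timing part of this methodology can be sketched roughly as follows. This is not the actual benchmark script: the helper name `averageMs` is an assumption, and the workload shown is a placeholder standing in for a call to each dot() implementation.

```php
<?php

// Hypothetical helper: average wall-clock time of $fn over $iterations runs, in ms.
function averageMs(callable $fn, int $iterations = 5): float
{
    $totalNs = 0;

    for ($i = 0; $i < $iterations; $i++) {
        $start = hrtime(true);              // monotonic clock, in nanoseconds
        $fn();
        $totalNs += hrtime(true) - $start;
    }

    // Convert the average from nanoseconds to milliseconds.
    return $totalNs / $iterations / 1e6;
}

// Placeholder workload; a real run would call each dot() implementation
// on the same generated input instead.
$input = range(1, 10000);
$ms = averageMs(fn () => array_sum($input));

printf("average over 5 runs: %.4f ms\n", $ms);
printf("peak memory: %.2f MB\n", memory_get_peak_usage() / 1024 / 1024);
```

Using `hrtime(true)` rather than `microtime(true)` keeps the measurement on a monotonic clock, so system clock adjustments do not skew short timings.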

Conclusion

The optimized Arr::dot() method provides substantial performance improvements for large arrays while maintaining identical functionality. This optimization benefits any Laravel feature that uses dot notation to work with nested arrays, particularly the validator when processing large datasets.

The changes are non-breaking and fully backward compatible with the existing API.
