@not-matthias not-matthias commented Nov 12, 2025

Big performance improvements, and the gain grows with the number of benchmarks and entries:

before:

Timer precision: 20 ns
go_runner                 fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ bench_go_runner        16.68 s       │ 16.68 s       │ 16.68 s       │ 16.68 s       │ 1       │ 1
╰─ bench_collect_results                │               │               │               │         │
   ├─ 100000                            │               │               │               │         │
   │  ├─ 5                38.24 ms      │ 47.53 ms      │ 40.12 ms      │ 40.45 ms      │ 92      │ 92
   │  ├─ 10               77.19 ms      │ 85.46 ms      │ 79.63 ms      │ 80.08 ms      │ 47      │ 47
   │  ╰─ 25               195.7 ms      │ 206.9 ms      │ 200.1 ms      │ 199.9 ms      │ 19      │ 19
   ├─ 500000                            │               │               │               │         │
   │  ├─ 5                198.4 ms      │ 217 ms        │ 203.4 ms      │ 204.7 ms      │ 19      │ 19
   │  ├─ 10               401 ms        │ 412.7 ms      │ 403.7 ms      │ 405.3 ms      │ 10      │ 10
   │  ╰─ 25               1.003 s       │ 1.025 s       │ 1.011 s       │ 1.013 s       │ 4       │ 4
   ├─ 1000000                           │               │               │               │         │
   │  ├─ 5                412.4 ms      │ 431 ms        │ 421.6 ms      │ 422.5 ms      │ 9       │ 9
   │  ├─ 10               810.2 ms      │ 838.9 ms      │ 821.7 ms      │ 823.2 ms      │ 5       │ 5
   │  ╰─ 25               2.061 s       │ 2.063 s       │ 2.062 s       │ 2.062 s       │ 2       │ 2
   ├─ 5000000                           │               │               │               │         │
   │  ├─ 5                2.43 s        │ 2.47 s        │ 2.45 s        │ 2.45 s        │ 2       │ 2
   │  ├─ 10               4.804 s       │ 4.804 s       │ 4.804 s       │ 4.804 s       │ 1       │ 1
   │  ╰─ 25               12.09 s       │ 12.09 s       │ 12.09 s       │ 12.09 s       │ 1       │ 1
   ╰─ 10000000                          │               │               │               │         │
      ├─ 5                4.771 s       │ 4.771 s       │ 4.771 s       │ 4.771 s       │ 1       │ 1
      ├─ 10               9.558 s       │ 9.558 s       │ 9.558 s       │ 9.558 s       │ 1       │ 1
      ╰─ 25               23.96 s       │ 23.96 s       │ 23.96 s       │ 23.96 s       │ 1       │ 1

after:

Timer precision: 20 ns
go_runner                 fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ bench_go_runner        12.16 s       │ 12.16 s       │ 12.16 s       │ 12.16 s       │ 1       │ 1
╰─ bench_collect_results                │               │               │               │         │
   ├─ 100000                            │               │               │               │         │
   │  ├─ 5                9.125 ms      │ 16.77 ms      │ 10.94 ms      │ 11.31 ms      │ 100     │ 100
   │  ├─ 10               13.01 ms      │ 17.25 ms      │ 14.44 ms      │ 14.62 ms      │ 100     │ 100
   │  ╰─ 25               27.97 ms      │ 47.19 ms      │ 31.05 ms      │ 31.63 ms      │ 49      │ 49
   ├─ 500000                            │               │               │               │         │
   │  ├─ 5                45.61 ms      │ 66.81 ms      │ 51.07 ms      │ 51.11 ms      │ 35      │ 35
   │  ├─ 10               66.21 ms      │ 102.9 ms      │ 71.06 ms      │ 72.31 ms      │ 20      │ 20
   │  ╰─ 25               133.9 ms      │ 150.4 ms      │ 144.2 ms      │ 143.6 ms      │ 9       │ 9
   ├─ 1000000                           │               │               │               │         │
   │  ├─ 5                90.82 ms      │ 110.8 ms      │ 96.57 ms      │ 98.19 ms      │ 20      │ 20
   │  ├─ 10               132.7 ms      │ 150.8 ms      │ 141 ms        │ 142.3 ms      │ 11      │ 11
   │  ╰─ 25               290.6 ms      │ 301.3 ms      │ 295.7 ms      │ 295.3 ms      │ 5       │ 5
   ├─ 5000000                           │               │               │               │         │
   │  ├─ 5                549.9 ms      │ 584.4 ms      │ 557.2 ms      │ 562.2 ms      │ 4       │ 4
   │  ├─ 10               724.1 ms      │ 728.1 ms      │ 726.1 ms      │ 726.1 ms      │ 2       │ 2
   │  ╰─ 25               1.762 s       │ 1.762 s       │ 1.762 s       │ 1.762 s       │ 1       │ 1
   ╰─ 10000000                          │               │               │               │         │
      ├─ 5                1.072 s       │ 1.076 s       │ 1.074 s       │ 1.074 s       │ 2       │ 2
      ├─ 10               2.73 s        │ 2.73 s        │ 2.73 s        │ 2.73 s        │ 1       │ 1
      ╰─ 25               3.315 s       │ 3.315 s       │ 3.315 s       │ 3.315 s       │ 1       │ 1

These are the results from before I added the file_count parameter (which I added to observe how performance scales with more files): https://codspeed.io/CodSpeedHQ/codspeed-go/runs/compare/6914aa55ba7672787291aa34..6914ad113449e87ac0ad578e

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| bench_collect_results[10000000] | 21.7 s | 2.4 s | ×9 |
| bench_collect_results[1000000] | 1,844 ms | 217.1 ms | ×8.5 |
| bench_collect_results[100000] | 186.3 ms | 19.3 ms | ×9.7 |
| bench_collect_results[5000000] | 11 s | 1.2 s | ×9.2 |
| bench_collect_results[500000] | 928.6 ms | 102.4 ms | ×9.1 |
| bench_go_runner | 73.1 s | 24.3 s | ×3 |
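
The Change column in the comparison above is consistent with BASE divided by HEAD, rounded to one decimal place. A minimal sketch of that arithmetic follows; the helper names `speedup` and `round1` are made up for illustration, and the values are copied from the table after converting everything to milliseconds.

```rust
/// Speedup factor: how many times faster HEAD is than BASE.
fn speedup(base_ms: f64, head_ms: f64) -> f64 {
    base_ms / head_ms
}

/// Rounds to one decimal place, the precision shown in the table.
fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}

fn main() {
    // (benchmark, BASE in ms, HEAD in ms), copied from the comparison table.
    let rows = [
        ("bench_collect_results[10000000]", 21_700.0, 2_400.0),
        ("bench_collect_results[1000000]", 1_844.0, 217.1),
        ("bench_collect_results[100000]", 186.3, 19.3),
        ("bench_collect_results[5000000]", 11_000.0, 1_200.0),
        ("bench_collect_results[500000]", 928.6, 102.4),
        ("bench_go_runner", 73_100.0, 24_300.0),
    ];
    for (name, base, head) in rows {
        println!("{name}: x{:.1}", round1(speedup(base, head)));
    }
}
```

The collect step speeds up by roughly the same factor at every input size, while the end-to-end `bench_go_runner` gains less because it includes work outside result collection.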

@not-matthias not-matthias force-pushed the cod-1666-codspeed-go-is-unable-to-run-benchmarks-on-go-modules-with branch from d785401 to 1679125 Compare November 12, 2025 15:14
@not-matthias not-matthias force-pushed the cod-1669-codspeed-go-execution-takes-a-long-time branch from 0f1f993 to 1bef8cb Compare November 12, 2025 15:14

codspeed-hq bot commented Nov 12, 2025

CodSpeed Performance Report

Merging #33 will degrade performance by 25%

Comparing cod-1669-codspeed-go-execution-takes-a-long-time (71ef596) with main (1679125)

Summary

⚡ 1 improvement
❌ 1 regression
✅ 23 untouched
🆕 15 new
⏩ 5 skipped [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| BenchmarkLargeSetupInLoop | 24 ns | 32 ns | -25% |
| 🆕 bench_collect_results[100000, 10] | N/A | 19.7 ms | N/A |
| 🆕 bench_collect_results[100000, 25] | N/A | 40.2 ms | N/A |
| 🆕 bench_collect_results[100000, 5] | N/A | 18.1 ms | N/A |
| 🆕 bench_collect_results[1000000, 10] | N/A | 218.3 ms | N/A |
| 🆕 bench_collect_results[1000000, 25] | N/A | 445.5 ms | N/A |
| 🆕 bench_collect_results[1000000, 5] | N/A | 190.9 ms | N/A |
| 🆕 bench_collect_results[10000000, 10] | N/A | 2.4 s | N/A |
| 🆕 bench_collect_results[10000000, 25] | N/A | 5.2 s | N/A |
| 🆕 bench_collect_results[10000000, 5] | N/A | 2 s | N/A |
| 🆕 bench_collect_results[500000, 10] | N/A | 104.5 ms | N/A |
| 🆕 bench_collect_results[500000, 25] | N/A | 228.9 ms | N/A |
| 🆕 bench_collect_results[500000, 5] | N/A | 90.1 ms | N/A |
| 🆕 bench_collect_results[5000000, 10] | N/A | 1.2 s | N/A |
| 🆕 bench_collect_results[5000000, 25] | N/A | 2.6 s | N/A |
| 🆕 bench_collect_results[5000000, 5] | N/A | 1 s | N/A |
| bench_go_runner | 73.1 s | 26.4 s | ×2.8 |
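
The -25% reported for BenchmarkLargeSetupInLoop can look odd next to a 24 ns to 32 ns change, which is a 33% increase in time. One formula that reproduces the reported figure is BASE / HEAD - 1. This is inferred from the numbers in the breakdown, not taken from CodSpeed's documentation, and both function names below are hypothetical.

```rust
/// Signed change as a percentage, computed as (BASE / HEAD - 1) * 100.
/// Assumption: inferred from the report's numbers, not a documented formula.
fn change_percent(base: f64, head: f64) -> f64 {
    (base / head - 1.0) * 100.0
}

/// Relative increase in time, (HEAD - BASE) / BASE * 100, shown for contrast.
fn time_increase_percent(base: f64, head: f64) -> f64 {
    (head - base) / base * 100.0
}

fn main() {
    // BenchmarkLargeSetupInLoop: 24 ns on BASE, 32 ns on HEAD.
    println!("change: {:.0}%", change_percent(24.0, 32.0));
    println!("time increase: {:.1}%", time_increase_percent(24.0, 32.0));
}
```

Under this reading the percentage and the multiplier are the same quantity shown two ways: bench_go_runner at 73.1 s against 26.4 s gives a ratio of about 2.8, which the report prints as ×2.8.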

Footnotes

  1. 5 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them on CodSpeed to remove them from the performance reports.

@not-matthias not-matthias changed the base branch from cod-1666-codspeed-go-is-unable-to-run-benchmarks-on-go-modules-with to main November 12, 2025 15:36
@not-matthias not-matthias requested a review from Copilot November 12, 2025 15:37

Copilot AI left a comment

Pull Request Overview

This PR optimizes walltime result parsing by introducing parallel processing via the rayon crate and refactoring data structures to use borrowed slices instead of owned vectors.

Key changes:

  • Adds rayon dependency for parallel iteration across benchmark data processing
  • Changes WalltimeBenchmark::from_runtime_data signature to accept slice references (&[u64]) instead of owned vectors
  • Refactors RawResult::parse_folder to combine parsing and conversion into a single pipeline with parallel processing
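
The two ideas in the key changes (borrowing `&[u64]` instead of taking owned vectors, and processing benchmarks in parallel) can be sketched as follows. This is not the PR's code: `Summary`, `summarize`, and `summarize_all` are hypothetical names, and the sketch swaps in `std::thread::scope` from the standard library in place of rayon so that it compiles without external crates. With rayon, the loop would typically be a `par_iter()` chain running on a work-stealing thread pool rather than one thread per benchmark.

```rust
use std::thread;

/// Hypothetical per-benchmark summary; not the PR's actual type.
#[derive(Debug, PartialEq)]
struct Summary {
    total_ns: u64,
    rounds: usize,
}

/// Accepts a borrowed slice, so the caller keeps ownership and no samples are copied.
fn summarize(times_ns: &[u64]) -> Summary {
    Summary {
        total_ns: times_ns.iter().sum(),
        rounds: times_ns.len(),
    }
}

/// Summarizes every benchmark's samples concurrently, one scoped thread each.
/// Scoped threads may borrow `benchmarks` because they are joined before returning.
fn summarize_all(benchmarks: &[Vec<u64>]) -> Vec<Summary> {
    thread::scope(|s| {
        // Spawn all workers first; collecting eagerly keeps them running in parallel.
        let handles: Vec<_> = benchmarks
            .iter()
            .map(|samples| s.spawn(move || summarize(samples)))
            .collect();
        // Join in spawn order so results line up with the input order.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let data: Vec<Vec<u64>> = vec![vec![10, 20, 30], vec![5, 5], vec![]];
    let summaries = summarize_all(&data);
    println!("{summaries:?}");
}
```

Passing slices matters here because the workers only read the samples: each thread borrows its input instead of requiring a clone or a move of the whole vector.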

Reviewed Changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| go-runner/Cargo.toml | Adds rayon dependency for parallel processing |
| Cargo.lock | Updates lock file with rayon dependency tree |
| go-runner/src/results/walltime_results.rs | Introduces parallel iterators for sum/filter operations and changes function signature to use slices |
| go-runner/src/results/raw_result.rs | Refactors parse_folder to use parallel processing and combines parsing with benchmark conversion |
| go-runner/src/lib.rs | Updates imports and simplifies collect_walltime_results to use refactored API |

@not-matthias not-matthias force-pushed the cod-1669-codspeed-go-execution-takes-a-long-time branch from cb4b981 to 4b2789b Compare November 12, 2025 16:08
@not-matthias not-matthias force-pushed the cod-1669-codspeed-go-execution-takes-a-long-time branch from df0ed28 to bbcaab5 Compare November 12, 2025 17:14

@GuillaumeLagrange GuillaumeLagrange left a comment

olgtm

@not-matthias not-matthias force-pushed the cod-1669-codspeed-go-execution-takes-a-long-time branch from d635326 to cc77d35 Compare November 13, 2025 11:59
@not-matthias not-matthias force-pushed the cod-1669-codspeed-go-execution-takes-a-long-time branch from cc77d35 to 71ef596 Compare November 13, 2025 12:02
@not-matthias not-matthias merged commit 71ef596 into main Nov 13, 2025
23 of 26 checks passed
@not-matthias not-matthias deleted the cod-1669-codspeed-go-execution-takes-a-long-time branch November 13, 2025 12:31