feat[scan]: gpu scan #6199
Performance Regression: -29.9%
⚠️ Unknown Walltime execution environment detected
Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.
For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.
⚡ 7 improved benchmarks
❌ 11 regressed benchmarks
✅ 1143 untouched benchmarks
🆕 18 new benchmarks
⏩ 1323 skipped benchmarks [1]
⚠️ Please fix the performance issues or acknowledge them on CodSpeed.
Performance Changes
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 🆕 | WallTime | 10M_90pct[10000000] | N/A | 200.7 µs | N/A |
| 🆕 | WallTime | 1M_10pct[100000] | N/A | 21.8 µs | N/A |
| 🆕 | WallTime | 1M_10pct[100000] | N/A | 46.7 µs | N/A |
| 🆕 | WallTime | 1M_90pct[1000000] | N/A | 29.2 µs | N/A |
| 🆕 | WallTime | 10M_90pct[10000000] | N/A | 367.1 µs | N/A |
| ❌ | WallTime | u32_values_u8_codes[10M] | 133.7 µs | 170.9 µs | -21.75% |
| 🆕 | WallTime | 1M_50pct[500000] | N/A | 22.8 µs | N/A |
| 🆕 | WallTime | 10M_10pct[1000000] | N/A | 222.9 µs | N/A |
| 🆕 | WallTime | 10M_10pct[1000000] | N/A | 218.7 µs | N/A |
| 🆕 | WallTime | 10M_90pct[10000000] | N/A | 368.3 µs | N/A |
| 🆕 | WallTime | 10M_50pct[5000000] | N/A | 157.9 µs | N/A |
| 🆕 | WallTime | 1M_10pct[100000] | N/A | 47.3 µs | N/A |
| 🆕 | WallTime | 10M_50pct[5000000] | N/A | 282.8 µs | N/A |
| 🆕 | WallTime | 10M_10pct[1000000] | N/A | 137 µs | N/A |
| 🆕 | WallTime | 1M_90pct[1000000] | N/A | 56.1 µs | N/A |
| 🆕 | WallTime | 10M_50pct[5000000] | N/A | 280.4 µs | N/A |
| 🆕 | WallTime | 1M_50pct[500000] | N/A | 52 µs | N/A |
| 🆕 | WallTime | 1M_90pct[1000000] | N/A | 58.6 µs | N/A |
| 🆕 | WallTime | 1M_50pct[500000] | N/A | 51.6 µs | N/A |
| ❌ | Simulation | canonical_into_non_nullable[(10000, 1, 0.1)] | 48 µs | 57.2 µs | -16.06% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
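For readers checking the Efficiency column by hand, here is a minimal sketch of how the two regressed rows appear to be derived. The formula `base / head - 1` is an assumption inferred from the displayed numbers, not something this report states, and `relative_change` is a hypothetical helper name.

```python
def relative_change(base_us: float, head_us: float) -> float:
    """Percent change in speed; negative when HEAD is slower than BASE.

    Assumed formula (inferred from the table, not confirmed by the report):
    base / head - 1, expressed as a percentage.
    """
    return (base_us / head_us - 1.0) * 100.0


# BASE and HEAD timings in microseconds, copied from the two regressed rows.
rows = {
    "u32_values_u8_codes[10M]": (133.7, 170.9),
    "canonical_into_non_nullable[(10000, 1, 0.1)]": (48.0, 57.2),
}

for name, (base, head) in rows.items():
    print(f"{name}: {relative_change(base, head):+.2f}%")
```

Under that assumption the results land within a few hundredths of a percentage point of the reported -21.75% and -16.06%. The residual gap is plausibly because the table shows rounded timings.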
Comparing ji/gpu-scan-2 (d1da424) with develop (68130ce) [2]
Footnotes
1. 1323 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.
2. No successful run was found on develop (1e401b2) during the generation of this report, so 68130ce was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.