fix: benchmark black box handling #5558

0ax1 · 2025-11-27T12:01:30Z

No description provided.

0ax1 · 2025-11-27T12:02:16Z

Not sure if this is the reason for perf swings in slice_arrow_buffer but black-boxing seems reasonable in any case.

codspeed-hq · 2025-11-27T12:06:58Z

CodSpeed Performance Report

Merging #5558 will degrade performances by 34.24%

_{Comparing ad/black-box-inputs-for-slice_arrow_buffer (a96e962) with develop (563dc61)}

Summary

⚡ 38 improvements
❌ 11 regressions
✅ 1429 untouched
⏩ 256 skipped¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

	Benchmark	`BASE`	`HEAD`	Change
⚡	`append_arrow_buffer[1024]`	11.4 µs	9.4 µs	+22.29%
⚡	`append_arrow_buffer[128]`	1.7 µs	1.4 µs	+21.89%
⚡	`append_arrow_buffer[16384]`	178.2 µs	145.7 µs	+22.3%
⚡	`append_arrow_buffer[2048]`	22.6 µs	18.4 µs	+22.29%
⚡	`append_arrow_buffer[65536]`	711.8 µs	582 µs	+22.3%
⚡	`bitwise_and_arrow_buffer[1024]`	3.9 µs	3.4 µs	+13.79%
⚡	`bitwise_and_arrow_buffer[2048]`	4.4 µs	4 µs	+11.85%
⚡	`bitwise_and_vortex_buffer[1024]`	5.9 µs	4.1 µs	+44.64%
⚡	`bitwise_and_vortex_buffer[128]`	5.4 µs	4.5 µs	+20.37%
⚡	`bitwise_and_vortex_buffer[16384]`	14.8 µs	13.4 µs	+10.07%
⚡	`bitwise_and_vortex_buffer[2048]`	6.4 µs	4.6 µs	+37.94%
⚡	`bitwise_not_vortex_buffer_mut[1024]`	1,525.6 ns	623.6 ns	×2.4
⚡	`bitwise_not_vortex_buffer_mut[128]`	1,373.9 ns	501.1 ns	×2.7
⚡	`bitwise_not_vortex_buffer_mut[16384]`	4.4 µs	3.1 µs	+41.97%
⚡	`bitwise_not_vortex_buffer_mut[2048]`	1,690.6 ns	817.8 ns	×2.1
⚡	`bitwise_not_vortex_buffer_mut[65536]`	12.4 µs	11 µs	+11.88%
⚡	`bitwise_or_arrow_buffer[1024]`	4 µs	3.4 µs	+15.36%
⚡	`bitwise_or_arrow_buffer[128]`	5.7 µs	5.2 µs	+10.2%
⚡	`bitwise_or_vortex_buffer[1024]`	5.9 µs	4.1 µs	+42.9%
⚡	`bitwise_or_vortex_buffer[128]`	5.3 µs	4.5 µs	+18.94%
...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

256 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

codecov · 2025-11-27T12:14:31Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.48%. Comparing base (563dc61) to head (a96e962).
⚠️ Report is 1 commits behind head on develop.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>

joseph-isaacs · 2025-11-27T13:29:55Z

vortex-array/benches/scalar_subtract.rs

    bencher.with_inputs(|| &chunked).bench_refs(|chunked| {
-        let array = vortex_array::compute::sub_scalar(*chunked, to_subtract.into()).unwrap();
-        divan::black_box(array);
+        vortex_array::compute::sub_scalar(*chunked, to_subtract.into()).unwrap()


Claude finished @joseph-isaacs's task —— View job

Code Review: Benchmark Black Box Handling

I've reviewed the changes to benchmark black box handling across the 5 modified files. Here's my analysis:

Summary of Changes

This PR systematically removes explicit divan::black_box() calls from benchmark outputs and instead relies on Divan's implicit black-boxing behavior through bench_refs() and bench_values(). The key pattern is:

Before:

bencher.bench_refs(|x| { let result = expensive_operation(x); divan::black_box(result); });

After:

bencher.bench_refs(|x| expensive_operation(x));

Analysis

✅ Correct Approach

The changes are generally correct. According to Divan's documentation, bench_refs() and bench_values() already black-box their return values automatically. The explicit black_box() calls on outputs were redundant.

✅ Input Handling Improvement

The more significant improvement is moving loop parameters into with_inputs():

// Before - length captured from outer scope, harder to optimize .with_inputs(|| buffer) .bench_refs(|buffer| { for idx in 0..length { ... } // length from closure capture }) // After - length as explicit input, properly black-boxed .with_inputs(|| (buffer, length)) .bench_refs(|(buffer, length)| { for idx in 0..*length { ... } // length dereferenced from tuple })

This ensures loop bounds are treated as benchmark inputs and prevents the compiler from constant-folding based on knowing length at compile time.

Issues & Concerns

⚠️ Performance Regression Mystery

The CodSpeed report shows 11 regressions despite 31 improvements. The most notable regression is in from_iter[Arrow<BooleanBuffer>, 128] (-31.48%). This is puzzling because:

The from_iter benchmark itself wasn't modified in the diff

The regression might indicate measurement instability or compiler optimization differences

No explicit black-boxing was removed from from_iter - it never had any

🔍 Specific File Review

vortex-array/benches/scalar_subtract.rs (Line 38):

bencher.with_inputs(|| &chunked).bench_refs(|chunked| { vortex_array::compute::sub_scalar(*chunked, to_subtract.into()).unwrap() });

✅ Correct - The result is now implicitly black-boxed by bench_refs()

vortex-buffer/benches/vortex_bitbuffer.rs (slice_arrow_buffer):
The changes to slice_arrow_buffer at lines 147-154 are particularly relevant given the PR title mentions this benchmark:

// After bencher .with_inputs(|| (&buffer, length / 2)) .bench_refs(|(buffer, mid)| { let mid = *mid; buffer.0.slice(mid / 2, mid / 2) });

✅ Much better - Now mid is properly an input parameter rather than being captured from the outer scope. This should make measurements more stable.

Recommendations

Monitor the regressions: The from_iter regression seems unrelated to the changes and might indicate benchmark instability. Consider running benchmarks multiple times to verify stability.

Consider one more optimization: In several places, you're still capturing length from the outer scope for buffer creation:

let buffer = BitBuffer::from_iter((0..length).map(|i| i % 2 == 0)); bencher.with_inputs(|| (&buffer, length / 2))

This is fine since the buffer creation happens outside the benchmark timing.

Documentation: Consider adding a comment explaining Divan's implicit black-boxing behavior so future contributors don't re-add explicit black_box() calls.

Verdict

LGTM with monitoring recommended ✅

The changes correctly remove redundant black-boxing and improve input handling. The improvements in slice_arrow_buffer (the PR's focus) look good with +18.05% performance gains. The unrelated regressions warrant investigation but aren't necessarily caused by these changes.

0ax1 added the changelog/fix A bug fix label Nov 27, 2025

0ax1 force-pushed the ad/black-box-inputs-for-slice_arrow_buffer branch from 7e7aacc to 32a44e2 Compare November 27, 2025 12:17

0ax1 changed the title ~~fix: black box all inputs for slice_arrow_buffer~~ fix: black box input handling Nov 27, 2025

0ax1 requested a review from joseph-isaacs November 27, 2025 12:19

fix: black box input handling

66b4b51

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>

0ax1 force-pushed the ad/black-box-inputs-for-slice_arrow_buffer branch from 32a44e2 to 66b4b51 Compare November 27, 2025 12:23

0ax1 changed the title ~~fix: black box input handling~~ fix: benchmark black box handling Nov 27, 2025

0ax1 enabled auto-merge (squash) November 27, 2025 13:03

hoist

a96e962

Signed-off-by: Alexander Droste <alexander.droste@protonmail.com>

joseph-isaacs reviewed Nov 27, 2025

View reviewed changes

joseph-isaacs approved these changes Nov 27, 2025

View reviewed changes

0ax1 disabled auto-merge November 27, 2025 13:35

0ax1 enabled auto-merge (squash) November 27, 2025 13:36

0ax1 merged commit e3922f2 into develop Nov 27, 2025
50 of 51 checks passed

0ax1 deleted the ad/black-box-inputs-for-slice_arrow_buffer branch November 27, 2025 13:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: benchmark black box handling #5558

fix: benchmark black box handling #5558

Uh oh!

0ax1 commented Nov 27, 2025

Uh oh!

0ax1 commented Nov 27, 2025

Uh oh!

codspeed-hq bot commented Nov 27, 2025 •

edited

Loading

Uh oh!

codecov bot commented Nov 27, 2025 •

edited

Loading

Uh oh!

joseph-isaacs Nov 27, 2025

Uh oh!

claude bot Nov 27, 2025 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

fix: benchmark black box handling #5558

fix: benchmark black box handling #5558

Uh oh!

Conversation

0ax1 commented Nov 27, 2025

Uh oh!

0ax1 commented Nov 27, 2025

Uh oh!

codspeed-hq bot commented Nov 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

CodSpeed Performance Report

Merging #5558 will degrade performances by 34.24%

Summary

Benchmarks breakdown

Footnotes

Uh oh!

codecov bot commented Nov 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

joseph-isaacs Nov 27, 2025

Choose a reason for hiding this comment

Uh oh!

claude bot Nov 27, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Code Review: Benchmark Black Box Handling

Summary of Changes

Analysis

Issues & Concerns

Recommendations

Verdict

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codspeed-hq bot commented Nov 27, 2025 •

edited

Loading

codecov bot commented Nov 27, 2025 •

edited

Loading

claude bot Nov 27, 2025 •

edited

Loading