-
Notifications
You must be signed in to change notification settings - Fork 130
fix: benchmark black box handling #5558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Not sure if this is the reason for perf swings in |
CodSpeed Performance ReportMerging #5558 will degrade performances by 34.24%Comparing Summary
Benchmarks breakdown
Footnotes
|
Codecov Report✅ All modified and coverable lines are covered by tests. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
7e7aacc to
32a44e2
Compare
slice_arrow_bufferSigned-off-by: Alexander Droste <alexander.droste@protonmail.com>
32a44e2 to
66b4b51
Compare
| bencher.with_inputs(|| &chunked).bench_refs(|chunked| { | ||
| let array = vortex_array::compute::sub_scalar(*chunked, to_subtract.into()).unwrap(); | ||
| divan::black_box(array); | ||
| vortex_array::compute::sub_scalar(*chunked, to_subtract.into()).unwrap() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Claude finished @joseph-isaacs's task —— View job
Code Review: Benchmark Black Box Handling
I've reviewed the changes to benchmark black box handling across the 5 modified files. Here's my analysis:
Summary of Changes
This PR systematically removes explicit divan::black_box() calls from benchmark outputs and instead relies on Divan's implicit black-boxing behavior through bench_refs() and bench_values(). The key pattern is:
Before:
bencher.bench_refs(|x| {
let result = expensive_operation(x);
divan::black_box(result);
});After:
bencher.bench_refs(|x| expensive_operation(x));Analysis
✅ Correct Approach
The changes are generally correct. According to Divan's documentation, bench_refs() and bench_values() already black-box their return values automatically. The explicit black_box() calls on outputs were redundant.
✅ Input Handling Improvement
The more significant improvement is moving loop parameters into with_inputs():
// Before - length captured from outer scope, harder to optimize
.with_inputs(|| buffer)
.bench_refs(|buffer| {
for idx in 0..length { ... } // length from closure capture
})
// After - length as explicit input, properly black-boxed
.with_inputs(|| (buffer, length))
.bench_refs(|(buffer, length)| {
for idx in 0..*length { ... } // length dereferenced from tuple
})This ensures loop bounds are treated as benchmark inputs and prevents the compiler from constant-folding based on knowing length at compile time.
Issues & Concerns
The CodSpeed report shows 11 regressions despite 31 improvements. The most notable regression is in from_iter[Arrow<BooleanBuffer>, 128] (-31.48%). This is puzzling because:
- The
from_iterbenchmark itself wasn't modified in the diff - The regression might indicate measurement instability or compiler optimization differences
- No explicit black-boxing was removed from
from_iter- it never had any
🔍 Specific File Review
vortex-array/benches/scalar_subtract.rs (Line 38):
bencher.with_inputs(|| &chunked).bench_refs(|chunked| {
vortex_array::compute::sub_scalar(*chunked, to_subtract.into()).unwrap()
});✅ Correct - The result is now implicitly black-boxed by bench_refs()
vortex-buffer/benches/vortex_bitbuffer.rs (slice_arrow_buffer):
The changes to slice_arrow_buffer at lines 147-154 are particularly relevant given the PR title mentions this benchmark:
// After
bencher
.with_inputs(|| (&buffer, length / 2))
.bench_refs(|(buffer, mid)| {
let mid = *mid;
buffer.0.slice(mid / 2, mid / 2)
});✅ Much better - Now mid is properly an input parameter rather than being captured from the outer scope. This should make measurements more stable.
Recommendations
-
Monitor the regressions: The
from_iterregression seems unrelated to the changes and might indicate benchmark instability. Consider running benchmarks multiple times to verify stability. -
Consider one more optimization: In several places, you're still capturing
lengthfrom the outer scope for buffer creation:let buffer = BitBuffer::from_iter((0..length).map(|i| i % 2 == 0)); bencher.with_inputs(|| (&buffer, length / 2))
This is fine since the buffer creation happens outside the benchmark timing.
-
Documentation: Consider adding a comment explaining Divan's implicit black-boxing behavior so future contributors don't re-add explicit
black_box()calls.
Verdict
LGTM with monitoring recommended ✅
The changes correctly remove redundant black-boxing and improve input handling. The improvements in slice_arrow_buffer (the PR's focus) look good with +18.05% performance gains. The unrelated regressions warrant investigation but aren't necessarily caused by these changes.
No description provided.