Good iterations-per-second experiments include a control job to factor out common setup code that is present in all other jobs. Here is a benchmark without a control:
```crystal
require "benchmark"

x = [] of Int32
y = [] of Int32

Benchmark.ips do |b|
  b.report("push") do
    x = (0..99).to_a
    y = (100..999).to_a
    y.each { |v| x.push(v) }
  end

  b.report("concat") do
    x = (0..99).to_a
    y = (100..999).to_a
    x.concat(y)
  end
end
```
```
  push 187.13k (  5.34µs) (± 0.73%)  20.7kB/op   1.60× slower
concat 300.22k (  3.33µs) (± 0.61%)  16.0kB/op        fastest
```
It would be incorrect to conclude that `concat` is 60% faster than `push` and allocates 23% less memory, because both jobs include code that initializes `x` and `y`. Let's add a control job:
```crystal
b.report("control") do
  x = (0..99).to_a
  y = (100..999).to_a
end
```
```
control 328.60k (  3.04µs) (± 1.53%)  12.0kB/op        fastest
   push 184.54k (  5.42µs) (± 0.98%)  20.7kB/op   1.78× slower
 concat 284.15k (  3.52µs) (± 0.60%)  16.0kB/op   1.16× slower
```
We can see that `push` and `concat` take 5.42 - 3.04 = 2.38µs and 3.52 - 3.04 = 0.48µs to run respectively, so `concat` is actually 396% faster than `push`; they also allocate 8.7kB and 4.0kB of memory per operation respectively, so `concat` actually allocates 54% less memory. The adjusted IPS figures and the speed ratios relative to the fastest job can be calculated similarly. There is probably a way to calculate the combined standard deviation as well.
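A minimal sketch of that subtraction, plugging in the figures from the run above:

```crystal
# Mean times per operation (µs) and allocations (kB/op) from the run above.
control_time = 3.04
push_time    = 5.42
concat_time  = 3.52

# Net cost of each job once the shared setup is factored out.
push_net   = push_time - control_time   # ≈ 2.38 µs
concat_net = concat_time - control_time # ≈ 0.48 µs

puts (push_net / concat_net).round(2) # ≈ 4.96, i.e. concat is ~396% faster

# Same subtraction for allocations.
push_mem   = 20.7 - 12.0 # ≈ 8.7 kB
concat_mem = 16.0 - 12.0 # ≈ 4.0 kB

puts (1 - concat_mem / push_mem).round(2) # ≈ 0.54, i.e. ~54% less memory
```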
I always do the above calculations manually, but it would be nice to see `Benchmark.ips` incorporate this kind of functionality. One way is to allow `Benchmark::IPS::Job#report(label, &action)` to mark a job as the control, so that the control is excluded from the report output (why are there two overloads of the same method...?) and used only to compute the other items' statistics. Another way is:
```crystal
require "benchmark"

x = [] of Int32
y = [] of Int32

Benchmark.ips do |b|
  b.before_each do
    x = (0..99).to_a
    y = (100..999).to_a
  end

  b.report("push") do
    y.each { |v| x.push(v) }
  end

  b.report("concat") do
    x.concat(y)
  end
end
```
This is more in line with `Spec` and requires no special calculation, but now every benchmark has to run the `#before_each` block and the item's own block alternately, and I wonder if that would have an adverse effect on the iteration times themselves.
(Smart readers may notice that `y` can be turned into a constant, but benchmarks may opt to seed `y` with random values on every run.)
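For what it's worth, the first option might end up looking something like the following. This is purely a hypothetical sketch of a proposed interface; a `control: true` argument does not exist in `Benchmark::IPS` today:

```crystal
Benchmark.ips do |b|
  # Hypothetical flag: this job's mean time and allocations would be
  # subtracted from every other job's figures, and the job itself
  # would be left out of the "fastest / slower" comparison.
  b.report("control", control: true) do
    x = (0..99).to_a
    y = (100..999).to_a
  end

  b.report("push") do
    x = (0..99).to_a
    y = (100..999).to_a
    y.each { |v| x.push(v) }
  end
end
```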