The following code compares two functions for summing over a vector:
{-# LANGUAGE BangPatterns #-}
import Control.DeepSeq
import Criterion
import Criterion.Main
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Generic as VG
import qualified Data.Vector.Fusion.Stream as Stream
sumV :: (VG.Vector v a, Num a) => v a -> a
sumV = Stream.foldl' (+) 0 . VG.stream
main = do
  let v10 = VU.fromList [0..9] :: VU.Vector Double
  deepseq v10 $ return ()
  defaultMain
    [ bench "sumV" $ nf sumV v10
    ]
But suppose I change the last few lines to the following:
  defaultMain
    [ bench "sumV" $ nf sumV v10
    , bench "VU.sum" $ nf VU.sum v10 -- Added this line
    ]
This, surprisingly, affects the runtime of the sumV benchmark, making it about 20% faster. Similarly, if we remove the sumV benchmark and leave only the VU.sum benchmark, the VU.sum benchmark becomes about 20% slower. Tests were run with the patched criterion-1.0.0.2 I sent, on ghc-7.8.3 with the -O2 -fllvm flags.
What's going on is that GHC generates different Core for the sumV and VU.sum benchmarks depending on whether the other benchmark is present. Essentially, when both are present, the code they have in common is factored out into a single function that both benchmarks call. This happens to make both benchmarks faster.
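One way to investigate this is sketched below. Each benchmark gets its own monomorphic NOINLINE wrapper, and the simplified Core is dumped to a file so the two configurations can be compared. This is a sketch under assumptions, not a verified fix: the wrapper names sumVDouble and vuSumDouble are hypothetical, sumV is written with VG.foldl' as a stand-in for the stream-based definition above, and GHC may still share code between the two wrappers.

```haskell
{-# OPTIONS_GHC -O2 -ddump-simpl -dsuppress-all -ddump-to-file #-}
-- The flags above write the simplified Core to a .dump-simpl file, so the
-- code generated with and without the second benchmark can be diffed.

import Criterion.Main (bench, defaultMain, nf)
import qualified Data.Vector.Generic as VG
import qualified Data.Vector.Unboxed as VU

-- Stand-in for the sumV in this report (assumption: a strict left fold
-- via VG.foldl' instead of the internal stream module).
sumV :: (VG.Vector v a, Num a) => v a -> a
sumV = VG.foldl' (+) 0
{-# INLINE sumV #-}

-- Hypothetical wrapper: a separate top-level binding for the sumV benchmark.
sumVDouble :: VU.Vector Double -> Double
sumVDouble = sumV
{-# NOINLINE sumVDouble #-}

-- Hypothetical wrapper: a separate top-level binding for the VU.sum benchmark.
vuSumDouble :: VU.Vector Double -> Double
vuSumDouble = VU.sum
{-# NOINLINE vuSumDouble #-}

main :: IO ()
main = do
  let v10 = VU.fromList [0 .. 9] :: VU.Vector Double
  -- An unboxed vector is fully evaluated once it is in weak head normal form.
  v10 `seq` defaultMain
    [ bench "sumV"   $ nf sumVDouble v10
    , bench "VU.sum" $ nf vuSumDouble v10
    ]
```

Compiling this once with both benchmarks and once with a single benchmark, then diffing the two dump files, should show whether the bodies of sumVDouble and vuSumDouble still change with the contents of the benchmark list.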
I'm not sure if this should be considered a "proper bug," but it confused me for an hour or so. It's something that criterion users (especially those running really small benchmarks) should probably be aware of.