A benchmark's runtime can depend on the presence/absence of other benchmarks #60

@mikeizbicki

The following code compares two functions for summing over a vector:

{-# LANGUAGE BangPatterns #-}

import Control.DeepSeq
import Criterion
import Criterion.Main
import qualified Data.Vector.Unboxed as VU
import qualified Data.Vector.Generic as VG
import qualified Data.Vector.Fusion.Stream as Stream

sumV :: (VG.Vector v a, Num a) => v a -> a
sumV = Stream.foldl' (+) 0 . VG.stream

main = do   
    let v10 = VU.fromList [0..9]  :: VU.Vector Double
    deepseq v10 $ return ()

    defaultMain 
        [ bench "sumV"   $ nf sumV v10
        ]

But suppose I change the last few lines to the following:

    defaultMain 
        [ bench "sumV"   $ nf sumV v10
        , bench "VU.sum" $ nf VU.sum v10    -- Added this line
        ]

Surprisingly, this changes the runtime of the sumV benchmark: it becomes about 20% faster. Conversely, if I remove the sumV benchmark and keep only the VU.sum benchmark, VU.sum becomes about 20% slower than it was with both present. Tests were run with the patched criterion-1.0.0.2 I sent, on ghc-7.8.3 with the -O2 -fllvm flags.

What's going on is that GHC generates different Core for the sumV and VU.sum benchmarks depending on whether the other benchmark is present. Essentially, when both are present, the common bits are factored out into a shared function that both benchmarks call. This happens to make both benchmarks faster.
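One way to check this is to dump the optimized Core for both variants and diff the two dumps. The following is a minimal sketch, not a definitive recipe: the file name Bench.hs and the CPP flag WITH_VU_SUM are made up for illustration, and it assumes the same package versions as above (in vector 0.11 and later, Data.Vector.Fusion.Stream was replaced by Data.Vector.Fusion.Bundle, so the import would need adjusting there). The dump file is typically written next to the source as Bench.dump-simpl, but that location can vary with the build setup.

```haskell
{-# LANGUAGE CPP #-}
-- Bench.hs: the reproduction from this report, with the second benchmark
-- placed behind a CPP flag. WITH_VU_SUM is a hypothetical name for this sketch.
--
-- Build both variants and keep the optimized Core of each:
--   ghc -O2 -fllvm -fforce-recomp -ddump-simpl -dsuppress-all -ddump-to-file Bench.hs
--   mv Bench.dump-simpl only-sumV.core
--   ghc -O2 -fllvm -fforce-recomp -DWITH_VU_SUM -ddump-simpl -dsuppress-all -ddump-to-file Bench.hs
--   diff only-sumV.core Bench.dump-simpl

import Control.DeepSeq (deepseq)
import Criterion.Main (bench, defaultMain, nf)
import qualified Data.Vector.Fusion.Stream as Stream
import qualified Data.Vector.Generic as VG
import qualified Data.Vector.Unboxed as VU

sumV :: (VG.Vector v a, Num a) => v a -> a
sumV = Stream.foldl' (+) 0 . VG.stream

main :: IO ()
main = do
    let v10 = VU.fromList [0..9] :: VU.Vector Double
    -- Force the input vector before any benchmark runs.
    deepseq v10 $ return ()
    defaultMain
        [ bench "sumV" $ nf sumV v10
#ifdef WITH_VU_SUM
        , bench "VU.sum" $ nf VU.sum v10
#endif
        ]
```

If the Core for the sumV benchmark differs between the two dumps, the timing difference comes from the optimizer rather than from criterion's measurement.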

I'm not sure whether this should be considered a "proper bug," but it confused me for an hour or so. It's something that criterion users (especially those running very small benchmarks) should probably be aware of.
