CSV parse and README update
zypeh committed Jun 12, 2022
1 parent ac65ef4 commit f5e71d7
Showing 7 changed files with 151 additions and 1 deletion.
1 change: 1 addition & 0 deletions .gitignore
@@ -0,0 +1 @@
dist-newstyle
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -0,0 +1,5 @@
# Revision history for nika

## 0.1.0.0 -- YYYY-mm-dd

* First version. Released on an unsuspecting world.
27 changes: 26 additions & 1 deletion README.md
@@ -2,4 +2,29 @@

<h3 align="center">Continuous benchmarking for performance regression</h3>

<img align="center" src="https://miro.medium.com/max/1200/1*AqDkZGzbxf_ygqCGm0MlEQ.jpeg" />

## Problem statement

We want to measure the performance of a program and integrate that measurement into the CI infrastructure. We have been trying to find a way to visualise the program's performance as precisely as possible.

## Solution

We treat the performance of the program as the mean of its execution time (aka wall time). To do this, we first build a CLI tool that visualises the wall time as a bell curve. (** TODO: we will elaborate on this **).

The program needs to append each measured time to a CSV file, from which we calculate the confidence interval. If it exceeds the __predefined__ critical value, we raise warnings for CI to react to.

Because the program runs in a noisy environment (imagine a non-dedicated CI server...), the wall time data will have a relatively large variance. We need to calculate the variance as well as the mean, and from them the __confidence interval__.
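
As a rough illustration of the calculation described above, here is a minimal sketch that uses only `base`. The names `mean`, `sampleVariance` and `confidenceInterval` are hypothetical and not part of nika yet. It assumes a normal approximation, so the caller passes a critical value such as `z = 1.96` for roughly 95% confidence; with very few samples, a Student's t critical value would probably be more appropriate.

```haskell
-- Hypothetical helpers (not part of nika yet): a sketch of the statistics
-- mentioned above, under a normal-approximation assumption.

-- | Sample mean; Nothing for empty input.
mean :: [Double] -> Maybe Double
mean [] = Nothing
mean xs = Just (sum xs / fromIntegral (length xs))

-- | Unbiased sample variance (divides by n - 1); needs at least two samples.
sampleVariance :: [Double] -> Maybe Double
sampleVariance xs
  | n < 2 = Nothing
  | otherwise = Just (sum [(x - m) ^ (2 :: Int) | x <- xs] / fromIntegral (n - 1))
  where
    n = length xs
    m = sum xs / fromIntegral n

-- | Confidence interval for the mean: mean +/- z * sqrt (variance / n).
-- The critical value z is chosen by the caller (assumption: 1.96 for ~95%).
confidenceInterval :: Double -> [Double] -> Maybe (Double, Double)
confidenceInterval z xs = do
  m <- mean xs
  v <- sampleVariance xs
  let halfWidth = z * sqrt (v / fromIntegral (length xs))
  pure (m - halfWidth, m + halfWidth)
```

Loaded in GHCi, `confidenceInterval 1.96 [100, 104]` gives the interval for the two `feature` samples in `tests/perf.csv`. How the two branches' intervals should be compared against the critical value is still open.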


## Mission and non mission

* This is not a microbenchmarking library; if you want one, you can find something like criterion for both Rust and Haskell.
* This is not a Haskell library; it is a language-agnostic tool.
* This is going to be a fun project.

## TODO
* Graph drawing
* Profiling?
* SoMeThInG LiKe the rustc perf bot? (calculate the CPU instructions, cache misses)
* SoMeThIng lIkE the zig perf page? (visualisation instead of showing data)
67 changes: 67 additions & 0 deletions app/CsvParser.hs
@@ -0,0 +1,67 @@
{-# language OverloadedStrings #-}
{-# language ScopedTypeVariables #-}

module CsvParser where

import Data.ByteString (ByteString)
import Data.Vector (Vector)

import qualified Data.ByteString.Lazy as ByteStringLazy
import qualified Data.Csv as Csv
import qualified Data.Vector as Vector
import Control.Monad (when)

data PerfData = PerfData
  { _branchName :: ByteString
  , _wallTime :: Int
  } deriving (Eq, Show)

data ComparisonBranch = ComparisonBranch (ByteString, ByteString) deriving (Show)

data CsvError
  = TooLittleInput
  | TooLittleDifferentBranchNames
  | BranchTooLittleInput ByteString
  | MoreThanTwoBranches

instance Show CsvError where
  show TooLittleInput = "There are too little data to calculate the mean of the wall time."
  show TooLittleDifferentBranchNames = "There are less than 2 different branch names in the csv file."
  show (BranchTooLittleInput branchName) = "There are too little data to calculate the mean of the wall time for branch: " ++ show branchName
  show MoreThanTwoBranches = "There are more than 2 different branch names in the csv file."

instance Csv.FromNamedRecord PerfData where
  parseNamedRecord r = PerfData <$> r Csv..: "branchName" <*> r Csv..: "wallTime"

main1 :: IO ()
main1 = do
  csvData <- ByteStringLazy.readFile "perf.csv"
  case Csv.decodeByName csvData of
    Left err -> putStrLn err
    Right (_, v :: Vector PerfData) -> print (checkCsv v)

numOfSample :: Int
numOfSample = 2

checkCsv :: Vector PerfData -> Either CsvError (ComparisonBranch, Vector PerfData)
checkCsv xs = if Vector.length xs < numOfSample * 2 then Left TooLittleInput else checkIfThirdBranchNameExist xs
  where
    checkIfThirdBranchNameExist :: Vector PerfData -> Either CsvError (ComparisonBranch, Vector PerfData)
    checkIfThirdBranchNameExist ys = do
      let branchNames = fmap _branchName ys
      -- Safe: the length check in checkCsv guarantees a non-empty vector.
      let firstBranchName = Vector.head branchNames
      case Vector.find (/= firstBranchName) branchNames of
        Nothing -> Left TooLittleDifferentBranchNames
        Just secondBranchName -> case Vector.find (\bn -> (bn /= firstBranchName) && (bn /= secondBranchName)) branchNames of
          Just _ -> Left MoreThanTwoBranches
          Nothing -> do
            -- One flag per branch, in order: [first is sufficient, second is sufficient].
            let branchSamples = checkBranchInputIsSufficient numOfSample ys <$> [firstBranchName, secondBranchName]
            case branchSamples of
              -- Report the branch that is actually short of samples.
              [True, False] -> Left $ BranchTooLittleInput secondBranchName
              [False, True] -> Left $ BranchTooLittleInput firstBranchName
              [False, False] -> Left $ BranchTooLittleInput firstBranchName
              [True, True] -> Right (ComparisonBranch (firstBranchName, secondBranchName), ys)
              -- The list above always has exactly two elements.
              _ -> error "checkCsv: unreachable, branchSamples always has two elements"

checkBranchInputIsSufficient :: Int -> Vector PerfData -> ByteString -> Bool
checkBranchInputIsSufficient n xs branchName = Vector.length (Vector.filter (\x -> _branchName x == branchName) xs) >= n
6 changes: 6 additions & 0 deletions app/Main.hs
@@ -0,0 +1,6 @@
module Main where

import CsvParser ( main1 )

main :: IO ()
main = main1
41 changes: 41 additions & 0 deletions nika.cabal
@@ -0,0 +1,41 @@
cabal-version: 2.4
name: nika
version: 0.1.0.0

-- A short (one-line) description of the package.
-- synopsis:

-- A longer description of the package.
-- description:

-- A URL where users can report bugs.
-- bug-reports:

-- The license under which the package is released.
-- license:
author: zypeh
maintainer: zypeh@users.noreply.github.com

-- A copyright notice.
-- copyright:
-- category:
extra-source-files:
    CHANGELOG.md
    README.md

executable nika
    main-is:          Main.hs

    -- Modules included in this executable, other than Main.
    other-modules:
        CsvParser

    -- LANGUAGE extensions used by modules in this package.
    -- other-extensions:
    build-depends:    base ^>=4.16.0.0
                    , bytestring
                    , cassava
                    , optparse-applicative
                    , vector
    hs-source-dirs:   app
    default-language: Haskell2010
5 changes: 5 additions & 0 deletions tests/perf.csv
@@ -0,0 +1,5 @@
branchName,wallTime
feature,100
main,200
feature,104
main,201
