@kwxm (Contributor) commented on Feb 4, 2022:

This adds a sizes-and-budgets option to the nofib-exe command which prints out the sizes and budget requirements of the standard nofib benchmarks. This might be useful for evaluating the end-to-end performance of the Plutus compiler, although we should really do SCP-2275. It takes about 10-15 seconds to run on my laptop. The output looks like this:

$ cabal exec nofib-exe sizes-and-budgets
Script                     Size     CPU budget      Memory budget
-----------------------------------------------------------------
clausify/F1                5190     52200152258       174669548
clausify/F2                5190     64038012612       214125276
clausify/F3                5190    175970661101       588207506
clausify/F4                5190    276195627284       911723736
clausify/F5                5190   1127149641516      3771317880
knights/4x4                3669    123600843812       389897530
knights/6x6                3669    364411045942      1170799002
knights/8x8                3669    617528054149      1991766816
primes/05digits            2141     60166853948       124576525
primes/08digits            2141    110484348568       221530395
primes/10digits            2141    155008971902       305421029
primes/20digits            2141    311608230455       619618016
primes/30digits            2141    455094629786       904715821
primes/40digits            2141    610120624302      1214364547
primes/50digits            2141    598553596442      1164707493
queens4x4/bt               3127     19730440366        62692642
queens4x4/bm               3127     27486821791        86938518
queens4x4/bjbt1            3127     25057170139        80397052
queens4x4/bjbt2            3127     26893372938        86377662
queens4x4/fc               3127     69621054671       226697350
queens5x5/bt               3127    263250282408       828489180
queens5x5/bm               3127    316745014099       998447360
queens5x5/bjbt1            3127    315504023354      1002195392
queens5x5/bjbt2            3127    335731305476      1068332804
queens5x5/fc               3127    892876360204      2904782050

There's also a script called nofib-compare in plutus-benchmark which will compare the outputs of two runs. Here's a comparison of the results for this branch against the results for the UPLC simplifier branch:

$ ./plutus-benchmark/nofib-compare info1 info2
Script                     Size         CPU budget    Memory budget
-------------------------------------------------------------------
clausify/F1               -9.4%           -7.3%           -7.3%
clausify/F2               -9.4%           -7.5%           -7.5%
clausify/F3               -9.4%           -7.5%           -7.6%
clausify/F4               -9.4%           -9.8%          -10.0%
clausify/F5               -9.4%           -7.3%           -7.3%
knights/4x4              -10.1%          -14.8%          -15.8%
knights/6x6              -10.1%          -16.6%          -17.3%
knights/8x8              -10.1%          -17.1%          -17.8%
primes/05digits          -16.6%           -4.5%           -7.3%
primes/08digits          -16.6%           -4.0%           -6.7%
primes/10digits          -16.6%           -3.9%           -6.6%
primes/20digits          -16.6%           -3.7%           -6.2%
primes/30digits          -16.6%           -3.5%           -6.0%
primes/40digits          -16.6%           -3.5%           -6.0%
primes/50digits          -16.6%           -3.5%           -6.1%
queens4x4/bt             -13.4%          -12.5%          -13.2%
queens4x4/bm             -13.4%          -12.1%          -12.9%
queens4x4/bjbt1          -13.4%          -12.8%          -13.4%
queens4x4/bjbt2          -13.4%          -12.8%          -13.4%
queens4x4/fc             -13.4%          -13.1%          -13.5%
queens5x5/bt             -13.4%          -12.4%          -13.2%
queens5x5/bm             -13.4%          -12.1%          -12.9%
queens5x5/bjbt1          -13.4%          -12.6%          -13.3%
queens5x5/bjbt2          -13.4%          -12.6%          -13.3%
queens5x5/fc             -13.4%          -13.0%          -13.4%

It only shows the changes, because including the raw figures from the input files would make the table extremely wide. I haven't made any attempt to automate this, but presumably we could do so if it's useful.
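For reference, the comparison is just a relative change per column. Here's a minimal Haskell sketch of the calculation (not the actual nofib-compare script; the 4702 in the comment is a made-up "new" size used purely for illustration):

-- Relative change between an old and a new figure, as a percentage.
percentChange :: Double -> Double -> Double
percentChange old new = (new - old) / old * 100

-- Render to one decimal place, with an explicit sign for increases,
-- e.g. formatChange (percentChange 5190 4702) == "-9.4%".
formatChange :: Double -> String
formatChange p =
    (if p >= 0 then "+" else "")
    ++ show (fromIntegral (round (p * 10) :: Integer) / 10 :: Double)
    ++ "%"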

Pre-submit checklist:

  • Branch
    • Tests are provided (if possible)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
    • Relevant tickets are mentioned in commit messages
    • Formatting, materialized Nix files, PNG optimization, etc. are updated
  • PR
    • (For external contributions) Corresponding issue exists and is linked in the description
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested

@kwxm requested review from bezirg and michaelpj on February 4, 2022 at 00:36
@michaelpj (Contributor) commented:
So I was thinking "why do we need two scripts for this, can't we get better output?" and it turns out that criterion can export results as CSV if you use the --csv flag. So maybe we should write a CSV-comparing script and use it for both...

@michaelpj (Contributor) left a review:
Fine, except I do think it might be nice to start the glorious future of outputting CSV today. We depend on cassava elsewhere, it's pretty easy to use.
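For what it's worth, reading criterion's --csv output with cassava is only a few lines. A rough sketch (the "Name" and "Mean" field names are the ones criterion emits, as shown in the excerpt further down; everything else here is made up for illustration):

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as BSL
import           Data.Csv             (FromNamedRecord (..), decodeByName, (.:))
import qualified Data.Vector          as V

-- One row of criterion's CSV output; we only care about the name and the mean.
data BenchRow = BenchRow { benchName :: String, benchMean :: Double }

instance FromNamedRecord BenchRow where
    parseNamedRecord r = BenchRow <$> r .: "Name" <*> r .: "Mean"

-- Read a results file and return (benchmark name, mean time) pairs.
readMeans :: FilePath -> IO [(String, Double)]
readMeans path = do
    bytes <- BSL.readFile path
    case decodeByName bytes of
        Left err        -> fail err
        Right (_, rows) -> pure [ (benchName b, benchMean b) | b <- V.toList rows ]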

++ "You'll probably want to redirect the output to a file.")


-- Copied pretty much directly from plutus-tx/testlib/PlutusTx/Test.hs
@michaelpj (Contributor) commented:
argh, need to centralize these :(

printSizesAndBudgets :: IO ()
printSizesAndBudgets = do
  -- The applied programs to measure, which are the same as the ones in the benchmarks.
  -- We can't put all of these in one list because the 'a's in 'CompiledCode a' are different.
@michaelpj (Contributor) commented:
You could do it with an existential, but maybe not worth it.
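In case it's useful later, a minimal sketch of the existential version (assuming the usual CompiledCode type from PlutusTx.Code; the names here are made up):

{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes               #-}

import PlutusTx.Code (CompiledCode)

-- Pair each applied program with its name, hiding the result type 'a' so that
-- programs with different result types can live in a single list.
data NamedProgram = forall a. NamedProgram String (CompiledCode a)

-- Anything that only looks at the erased program (size, budgets, ...) can then
-- be mapped uniformly over such a list.
withProgram :: (forall a. CompiledCode a -> r) -> NamedProgram -> (String, r)
withProgram f (NamedProgram name code) = (name, f code)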

@kwxm (Contributor, Author) commented on Feb 4, 2022:

So maybe we should write a CSV-comparing script and use it for both...

This script and the benchmarking one do different things though. In the benchmarking script you've only got one item of data per benchmark (the time), but here we've got three. I did consider producing three different tables (size, CPU budget, memory budget) and then comparing them one by one, but that'd make it hard to see all the information about a single benchmark.

The CSV output from Criterion has lots of irrelevant stuff in it too. Here's an excerpt from the benchmark results for the builtins:

Name,Mean,MeanLB,MeanUB,Stddev,StddevLB,StddevUB
...
MultiplyInteger/ExMemory 29/ExMemory 11,1.7389750741689113e-6,1.7013633475641947e-6,1.7952985778693707e-6,1.466484816942319e-7,9.950858949034676e-8,2.1763453434231392e-7
MultiplyInteger/ExMemory 29/ExMemory 13,1.7100947810020254e-6,1.6574549873775534e-6,1.7807547206801181e-6,2.0495712031084943e-7,1.6080988529009743e-7,2.861485873603396e-7
MultiplyInteger/ExMemory 29/ExMemory 15,1.7272383735857425e-6,1.7244582331404776e-6,1.7323589279935075e-6,1.1724564396369767e-8,6.991366518903856e-9,2.0849524837797534e-8
MultiplyInteger/ExMemory 29/ExMemory 17,1.9333776451946716e-6,1.873866207628954e-6,2.0109144514208065e-6,2.2141608798988463e-7,1.7851280336270343e-7,2.8756898185597617e-7

I'm not convinced that we can process that uniformly with the output from this PR.

Also, maybe we want to process time figures from execution benchmarks differently to make them human-readable. The bench-compare script has some code to do that, but we don't want to do it for script sizes (maybe we could: 1.127T or 1127G would be a lot more readable than 1127149641516).
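Something like this would do for the SI-suffix rendering (just a sketch, not the actual bench-compare code):

-- Render a large figure with an SI-style suffix, keeping two decimal places,
-- e.g. humanise 1127149641516 == "1.13T".
humanise :: Integer -> String
humanise n = go (fromIntegral n :: Double) ["", "K", "M", "G", "T", "P"]
  where
    go x (suffix : rest)
      | abs x < 1000 || null rest = showRounded x ++ suffix
      | otherwise                 = go (x / 1000) rest
    go x [] = show x  -- unreachable with the list above, but keeps the function total
    showRounded x = show (fromIntegral (round (x * 100) :: Integer) / 100 :: Double)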

@michaelpj (Contributor) commented:
Okay, I guess I won't be fussy about it. I just don't like proliferating these scripts too much and I wish we could simplify things somehow...

@kwxm (Contributor, Author) commented on Feb 4, 2022:

Seems to be stuck in CI.

@michaelpj (Contributor) commented:
CI seems stuck.

@michaelpj (Contributor) commented:
I think this is safe, though.

@michaelpj merged commit 0ec4d6b into master on Feb 4, 2022
MaximilianAlgehed pushed a commit to Quviq/plutus that referenced this pull request Mar 3, 2022
* Add command to nofib-exe to print size and budget info for each benchmark

* Update script

* Realign header

* Update comment

* Update comment

* Remove accidental imports

* updateMaterialized

* Some awk reformatting
@kwxm deleted the kwxm/nofib-size-info branch on March 22, 2022 at 14:12