apr19.md
24 additions & 49 deletions
@@ -412,79 +412,54 @@ infrastructure can be more generally useful in the wider community. We hope
412
412
that by sharing our experiences with benchmarking OCaml code we can help people
413
413
quickly realise good benchmarking experiments and avoid non-trivial pitfalls.
414
414
415
-
<!-- Footnotes themselves at the bottom. -->
416
415
## Notes
417
416
418
-
[^1]:
419
-
[http://bench.ocamllabs.io](http://bench.ocamllabs.io) which presents operf-micro benchmark experiments and [http://bench2.ocamllabs.io](http://bench2.ocamllabs.io) which presents sandmark based benchmark experiments.
417
+
[^1]: [http://bench.ocamllabs.io](http://bench.ocamllabs.io) which presents operf-micro benchmark experiments and [http://bench2.ocamllabs.io](http://bench2.ocamllabs.io) which presents sandmark based benchmark experiments.
[https://hal.inria.fr/hal-01245844/document](https://hal.inria.fr/hal-01245844/document) - “Operf: Benchmarking the OCaml Compiler”, Chambart et al
424
421
425
-
[^3]:
426
-
Code for core_bench [https://github.com/janestreet/core_bench](https://github.com/janestreet/core_bench) and blog post describing core_bench [https://blog.janestreet.com/core_bench-micro-benchmarking-for-ocaml/](https://blog.janestreet.com/core_bench-micro-benchmarking-for-ocaml/)
422
+
[^3]: Code for core_bench [https://github.com/janestreet/core_bench](https://github.com/janestreet/core_bench) and blog post describing core_bench [https://blog.janestreet.com/core_bench-micro-benchmarking-for-ocaml/](https://blog.janestreet.com/core_bench-micro-benchmarking-for-ocaml/)
427
423
428
-
[^4]:
429
-
Haskell code for criterion [https://github.com/bos/criterion](https://github.com/bos/criterion) and tutorial on using criterion [http://www.serpentine.com/criterion/tutorial.html](http://www.serpentine.com/criterion/tutorial.html)
424
+
[^4]: Haskell code for criterion [https://github.com/bos/criterion](https://github.com/bos/criterion) and tutorial on using criterion [http://www.serpentine.com/criterion/tutorial.html](http://www.serpentine.com/criterion/tutorial.html)
430
425
431
-
[^5]:
432
-
Code and documentation for operf-macro [https://github.com/OCamlPro/operf-macro](https://github.com/OCamlPro/operf-macro)
426
+
[^5]: Code and documentation for operf-macro [https://github.com/OCamlPro/operf-macro](https://github.com/OCamlPro/operf-macro)
433
427
434
-
[^6]:
435
-
Code for sandmark [https://github.com/ocamllabs/sandmark](https://github.com/ocamllabs/sandmark)
428
+
[^6]: Code for sandmark [https://github.com/ocamllabs/sandmark](https://github.com/ocamllabs/sandmark)
436
429
437
-
[^7]:
438
-
CPython continuous benchmarking site [https://speed.python.org/](https://speed.python.org/)
430
+
[^7]: CPython continuous benchmarking site [https://speed.python.org/](https://speed.python.org/)
439
431
440
-
[^8]:
441
-
PyPy continuous benchmarking site [http://speed.pypy.org/](http://speed.pypy.org/)
432
+
[^8]: PyPy continuous benchmarking site [http://speed.pypy.org/](http://speed.pypy.org/)
442
433
443
-
[^9]:
444
-
Python Performance Benchmark Suite [https://pyperformance.readthedocs.io/](https://pyperformance.readthedocs.io/)
434
+
[^9]: Python Performance Benchmark Suite [https://pyperformance.readthedocs.io/](https://pyperformance.readthedocs.io/)
445
435
446
-
[^10]:
447
-
Codespeed web app for visualization of performance data [https://github.com/tobami/codespeed](https://github.com/tobami/codespeed)
436
+
[^10]: Codespeed web app for visualization of performance data [https://github.com/tobami/codespeed](https://github.com/tobami/codespeed)
448
437
449
-
[^11]:
450
-
LNT software [http://llvm.org/docs/lnt/](http://llvm.org/docs/lnt/) and the performance site [https://lnt.llvm.org/](https://lnt.llvm.org/)
438
+
[^11]: LNT software [http://llvm.org/docs/lnt/](http://llvm.org/docs/lnt/) and the performance site [https://lnt.llvm.org/](https://lnt.llvm.org/)
451
439
452
-
[^12]:
453
-
Description of Haskell issues with hard performance continuous integration tests [https://ghc.haskell.org/trac/ghc/wiki/Performance/Tests](https://ghc.haskell.org/trac/ghc/wiki/Performance/Tests)
440
+
[^12]: Description of Haskell issues with hard performance continuous integration tests [https://ghc.haskell.org/trac/ghc/wiki/Performance/Tests](https://ghc.haskell.org/trac/ghc/wiki/Performance/Tests)
454
441
455
-
[^13]:
456
-
‘2017 summer of code’ work on improving Haskell performance integration tests [https://github.com/jared-w/HSOC2017/blob/master/Proposal.pdf](https://github.com/jared-w/HSOC2017/blob/master/Proposal.pdf)
442
+
[^13]: ‘2017 summer of code’ work on improving Haskell performance integration tests [https://github.com/jared-w/HSOC2017/blob/master/Proposal.pdf](https://github.com/jared-w/HSOC2017/blob/master/Proposal.pdf)
457
443
458
-
[^14]:
459
-
Live tracking site [https://perf.rust-lang.org/](https://perf.rust-lang.org/) and code for it [https://github.com/rust-lang-nursery/rustc-perf](https://github.com/rust-lang-nursery/rustc-perf)
444
+
[^14]: Live tracking site [https://perf.rust-lang.org/](https://perf.rust-lang.org/) and code for it [https://github.com/rust-lang-nursery/rustc-perf](https://github.com/rust-lang-nursery/rustc-perf)
460
445
461
-
[^15]:
462
-
More information here: [http://kcsrk.info/multicore/ocaml/benchmarks/2018/09/13/1543-multicore-ci/](http://kcsrk.info/multicore/ocaml/benchmarks/2018/09/13/1543-multicore-ci/)
446
+
[^15]: More information here: [http://kcsrk.info/multicore/ocaml/benchmarks/2018/09/13/1543-multicore-ci/](http://kcsrk.info/multicore/ocaml/benchmarks/2018/09/13/1543-multicore-ci/)
463
447
464
-
[^16]:
465
-
A nice description of why first-parent helps is here [http://www.davidchudzicki.com/posts/first-parent/](http://www.davidchudzicki.com/posts/first-parent/)
448
+
[^16]: A nice description of why first-parent helps is here [http://www.davidchudzicki.com/posts/first-parent/](http://www.davidchudzicki.com/posts/first-parent/)
466
449
467
-
[^17]:
468
-
For more details please see [https://github.com/ocaml-bench/ocaml_bench_scripts/#notes-on-hardware-and-os-settings-for-linux-benchmarking](https://github.com/ocaml-bench/ocaml_bench_scripts/#notes-on-hardware-and-os-settings-for-linux-benchmarking)
450
+
[^17]: For more details please see [https://github.com/ocaml-bench/ocaml_bench_scripts/#notes-on-hardware-and-os-settings-for-linux-benchmarking](https://github.com/ocaml-bench/ocaml_bench_scripts/#notes-on-hardware-and-os-settings-for-linux-benchmarking)
469
451
470
-
[^18]:
471
-
There is some academic literature in this direction; for example “Rigorous benchmarking in reasonable time”, Kalibera et al [https://dl.acm.org/citation.cfm?id=2464160](https://dl.acm.org/citation.cfm?id=2464160) and “STABILIZER: statistically sound performance evaluation”, Curtsinger et al, [https://dl.acm.org/citation.cfm?id=2451141](https://dl.acm.org/citation.cfm?id=2451141). However, randomization approaches need to take care that the layout randomization distribution captures the relevant features of layout randomization that real user binaries will see from ASLR, link-order or dynamic library linking.
452
+
[^18]: There is some academic literature in this direction; for example “Rigorous benchmarking in reasonable time”, Kalibera et al [https://dl.acm.org/citation.cfm?id=2464160](https://dl.acm.org/citation.cfm?id=2464160) and “STABILIZER: statistically sound performance evaluation”, Curtsinger et al, [https://dl.acm.org/citation.cfm?id=2451141](https://dl.acm.org/citation.cfm?id=2451141). However, randomization approaches need to take care that the layout randomization distribution captures the relevant features of layout randomization that real user binaries will see from ASLR, link-order or dynamic library linking.
472
453
473
-
[^19]:
474
-
We feel that there must be some memory latency bottlenecks in the sandmark macro benchmarking suite, but we have yet to deep-dive and investigate a performance instability due to noise in memory latency for fetching instructions to execute. That is the performance may be bottlenecked on instruction memory fetch, but we haven’t seen instruction memory fetch latency being very noisy between benchmark runs in our setup.
454
+
[^19]: We feel that there must be some memory latency bottlenecks in the sandmark macro benchmarking suite, but we have yet to deep-dive and investigate a performance instability due to noise in memory latency for fetching instructions to execute. That is the performance may be bottlenecked on instruction memory fetch, but we haven’t seen instruction memory fetch latency being very noisy between benchmark runs in our setup.
475
455
476
-
[^20]:
477
-
Google’s AutoFDO [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45290.pdf](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45290.pdf), Facebook’s HFSort [https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf](https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf) and Facebook’s Bolt [https://github.com/facebookincubator/BOLT](https://github.com/facebookincubator/BOLT) are recent examples that report improvements in some deployed data-centre workloads.
456
+
[^20]: Google’s AutoFDO [https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45290.pdf](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45290.pdf), Facebook’s HFSort [https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf](https://research.fb.com/wp-content/uploads/2017/01/cgo2017-hfsort-final1.pdf) and Facebook’s Bolt [https://github.com/facebookincubator/BOLT](https://github.com/facebookincubator/BOLT) are recent examples that report improvements in some deployed data-centre workloads.
478
457
479
-
[^21]:
480
-
The effects are not limited to x86 as it has also been observed on ARM Cortex-A53 and Cortex-A57 processors with LLVM [https://www.youtube.com/watch?v=COmfRpnujF8](https://www.youtube.com/watch?v=COmfRpnujF8)
458
+
[^21]: The effects are not limited to x86 as it has also been observed on ARM Cortex-A53 and Cortex-A57 processors with LLVM [https://www.youtube.com/watch?v=COmfRpnujF8](https://www.youtube.com/watch?v=COmfRpnujF8)
481
459
482
-
[^22]:
483
-
The OCaml example is here [https://github.com/ocaml-bench/ocaml_bench_scripts/tree/master/stability_example](https://github.com/ocaml-bench/ocaml_bench_scripts/tree/master/stability_example), for the super curious there are yet more C++ examples to be had (although not necessarily the same underlying micro mechanism and all dependent on processor) [https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues](https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues)
460
+
[^22]: The OCaml example is here [https://github.com/ocaml-bench/ocaml_bench_scripts/tree/master/stability_example](https://github.com/ocaml-bench/ocaml_bench_scripts/tree/master/stability_example), for the super curious there are yet more C++ examples to be had (although not necessarily the same underlying micro mechanism and all dependent on processor) [https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues](https://dendibakh.github.io/blog/2018/01/18/Code_alignment_issues)
484
461
485
-
[^23]:
486
-
The Q&A at LLVM dev meeting videos [https://www.youtube.com/watch?v=IX16gcX4vDQ](https://www.youtube.com/watch?v=IX16gcX4vDQ) and [https://www.youtube.com/watch?v=COmfRpnujF8](https://www.youtube.com/watch?v=COmfRpnujF8) are examples of discussion around the area. The general area of code layout is also a problem area in the academic community “Producing wrong data without doing anything obviously wrong!”, Mytkowicz et al [https://dl.acm.org/citation.cfm?id=1508275](https://dl.acm.org/citation.cfm?id=1508275)
462
+
[^23]: The Q&A at LLVM dev meeting videos [https://www.youtube.com/watch?v=IX16gcX4vDQ](https://www.youtube.com/watch?v=IX16gcX4vDQ) and [https://www.youtube.com/watch?v=COmfRpnujF8](https://www.youtube.com/watch?v=COmfRpnujF8) are examples of discussion around the area. The general area of code layout is also a problem area in the academic community “Producing wrong data without doing anything obviously wrong!”, Mytkowicz et al [https://dl.acm.org/citation.cfm?id=1508275](https://dl.acm.org/citation.cfm?id=1508275)
487
463
488
-
[^24]:
489
-
For more on LLVM alignment options, see [https://dendibakh.github.io/blog/2018/01/25/Code_alignment_options_in_llvm](https://dendibakh.github.io/blog/2018/01/25/Code_alignment_options_in_llvm)
464
+
[^24]: For more on LLVM alignment options, see [https://dendibakh.github.io/blog/2018/01/25/Code_alignment_options_in_llvm](https://dendibakh.github.io/blog/2018/01/25/Code_alignment_options_in_llvm)