Description
Hi!
Recently I checked Profile-Guided Optimization (PGO) improvements on multiple projects. The results are available here. According to the tests, PGO can help to achieve better performance in many cases for many applications including compilers like Clang, GCC, Rustc, LDC, Go, etc. I think trying to optimize Shady/VCC with PGO can be a good idea.
I can suggest the following action points:
- Perform PGO benchmarks on Shady. If it shows improvements - add a note to the documentation about possible improvements in Shady's performance with PGO.
- Providing an easier way (e.g. a build option) to build scripts with PGO can be helpful for the end-users and maintainers since they will be able to optimize Shady according to their workloads.
- Optimize pre-built Shady binaries (if any).
Testing Post-Link Optimization (PLO) techniques (like LLVM BOLT) would be interesting (Clang and Rustc already use BOLT as an addition to PGO) but I recommend starting from the usual PGO.
Here are some examples of how PGO optimization is integrated into other projects:
- Rustc: a CI script for the multi-stage build
- GCC:
- Clang: Docs
- Python:
- Go: Bash script
- V8: Bazel flag
- ChakraCore: Scripts
- Chromium: Script
- Firefox: Docs
- PHP - Makefile command and old Centminmod scripts
- MySQL: CMake script
- YugabyteDB: GitHub commit
- FoundationDB: Script
- Zstd: Makefile
- Foot: Scripts
- Windows Terminal: GitHub PR
- Pydantic-core: GitHub PR
- file.d: GitHub PR
- OceanBase: CMake flag
Here are some examples of how PGO-related documentation could look in the project:
- ClickHouse: https://clickhouse.com/docs/en/operations/optimizing-performance/profile-guided-optimization
- Databend: https://databend.rs/doc/contributing/pgo
- Vector: https://vector.dev/docs/administration/tuning/pgo/
- Nebula: https://docs.nebula-graph.io/3.5.0/8.service-tuning/enable_autofdo_for_nebulagraph/
- GCC: Official docs, section "Building with profile feedback" (even AutoFDO build is supported)
- Clang:
- tsv-utils: https://github.com/eBay/tsv-utils/blob/master/docs/BuildingWithLTO.md
Below are listed some LLVM BOLT results:
- Rustc:
- CPython: GitHub PR
- YDB: GitHub comment
- Clang:
- LDC: GitHub comment
- HHVM, Proxygen and others: Facebook paper
- NodeJS: Blog
- Chromium: Blog
- MySQL, MongoDB, memcached, Verilator: Paper
I would be happy to answer your questions about PGO and PLO.
I understand that the project is in the early development stages so features could have a bigger priority than performance. Anyway, in this case, you can treat the issue just like an idea for future improvements.
Activity