[Build] Add LTO, PGO, and -march=native CMake options #11025

Open
taherbert wants to merge 1 commit into simulationcraft:midnight from taherbert:cmake-build-optimizations

@taherbert commented Feb 23, 2026

The legacy engine/Makefile supports LTO, PGO, and -march=native but the CMake build has no equivalent. This adds them as opt-in options.

New options (all OFF by default — existing builds unaffected):

  • SC_LTO — thin LTO on Clang, full LTO on GCC
  • SC_MARCH_NATIVE — tune for the host CPU
  • SC_PGO_GENERATE / SC_PGO_USE — profile-guided optimization (Clang and GCC)

Bumps cmake_minimum_required to 3.13 for target_link_options().

Also adds scripts/benchmark.sh which builds all three variants and prints a comparison table.
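The contents of scripts/benchmark.sh are not reproduced in this description. As a rough illustration of the "best of N runs" timing that such a comparison relies on, here is a minimal bash sketch. The helper name `best_of` and the stand-in `sleep` command are assumptions for illustration only, not taken from the script.

```shell
#!/usr/bin/env bash
# best_of N CMD [ARGS...]
# Hypothetical helper: run CMD N times and print the fastest wall-clock time in seconds.
best_of() {
  local runs=$1 best="" t i
  shift
  for ((i = 0; i < runs; i++)); do
    # TIMEFORMAT=%R makes bash's `time` keyword print only the real (wall-clock) seconds.
    # The command's own output is discarded; only the timing line is captured.
    t=$( TIMEFORMAT=%R; { time "$@" >/dev/null 2>&1; } 2>&1 )
    # Keep the smallest time seen so far (awk does the floating-point comparison).
    if [[ -z $best ]] || awk -v a="$t" -v b="$best" 'BEGIN { exit !(a < b) }'; then
      best=$t
    fi
  done
  printf '%s\n' "$best"
}

# Stand-in workload; a real comparison would pass the simulator invocation instead.
best_of 3 sleep 0.1
```

Taking the minimum rather than the mean reduces the influence of one-off scheduling noise, which matters when the runs are well under a second.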

Results (Apple Silicon M4 Max, AppleClang 17, best of 3 runs, 1000 iterations, 4 threads):

Variant                       Time(s)  Speedup
Release (baseline)               0.64    1.00x
LTO + march=native               0.57    1.12x
PGO + LTO + march=native         0.55    1.16x
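
The Speedup column is the baseline time divided by the variant time. A small sketch that recomputes it from the reported times (the `speedup` helper is a hypothetical name used only for this illustration):

```shell
#!/bin/sh
# speedup BASELINE_SECONDS VARIANT_SECONDS
# Prints baseline / variant, rounded to two decimals, e.g. "1.12x".
speedup() {
  awk -v b="$1" -v t="$2" 'BEGIN { printf "%.2fx\n", b / t }'
}

speedup 0.64 0.64   # Release (baseline)        -> 1.00x
speedup 0.64 0.57   # LTO + march=native        -> 1.12x
speedup 0.64 0.55   # PGO + LTO + march=native  -> 1.16x
```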

-ffast-math was also tested. It made the simulation 3x slower, so it is excluded.

Usage:

cmake -B build -DCMAKE_BUILD_TYPE=Release -DSC_LTO=ON -DSC_MARCH_NATIVE=ON

Or run scripts/benchmark.sh for the full comparison including PGO.
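
For readers who want to try PGO by hand, the general shape is a two-phase build: an instrumented build, a training run, then a rebuild that consumes the recorded profile. The sketch below uses only the option names listed above. Everything else is an assumption: reusing a single `build` directory, where the profile data ends up, and whether a merge step is needed are not specified here, so scripts/benchmark.sh should be treated as the reference. By default the script only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch of a manual two-phase PGO build (assumptions noted inline).
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

pgo_build() {
  # Phase 1: instrumented build that records execution profiles.
  run cmake -B build -DCMAKE_BUILD_TYPE=Release -DSC_PGO_GENERATE=ON
  run cmake --build build

  # Training step (not shown): run a representative simulation with the
  # instrumented binary. With Clang, the raw profiles typically need to be
  # merged with llvm-profdata before they can be used; how this PR expects
  # that to happen is an assumption left open here.

  # Phase 2: reconfigure the same directory to consume the profiles.
  run cmake -B build -DCMAKE_BUILD_TYPE=Release -DSC_PGO_GENERATE=OFF \
    -DSC_PGO_USE=ON -DSC_LTO=ON -DSC_MARCH_NATIVE=ON
  run cmake --build build
}

pgo_build
```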

Note: Let me know if you'd prefer that I remove the benchmarking script; I included it because I thought it would be helpful for verification.

The legacy engine/Makefile supports LTO, PGO, and -march=native but
the CMake build has no equivalent. This adds opt-in CMake options that
bring those capabilities to cmake users.

New options (all OFF by default, existing builds unaffected):
- SC_LTO: thin LTO on Clang, full LTO on GCC
- SC_MARCH_NATIVE: tune for the host CPU
- SC_PGO_GENERATE / SC_PGO_USE: LLVM/GCC profile-guided optimization

Bumps cmake_minimum_required to 3.13 for target_link_options().

Also adds scripts/benchmark.sh which automates a three-way comparison
(baseline vs LTO+march vs PGO+LTO+march) and prints a results table.

Tested on Apple Silicon (M4 Max, AppleClang 17):

Variant                       Time(s)  Speedup
Release (baseline)               0.64    1.00x
LTO + march=native               0.57    1.12x
PGO + LTO + march=native         0.55    1.16x

(best of 3 runs, 1000 iterations, 4 threads, MID1_Priest_Shadow)
@taherbert changed the title from "Add LTO, PGO, and -march=native CMake options" to "[Build] Add LTO, PGO, and -march=native CMake options" Feb 23, 2026