
Add CMake tryrun cache for macOS ARM64 to speed up configure #124046

Draft
steveisok wants to merge 2 commits into dotnet:main from steveisok:why-not-use-tryrun-osx

Conversation

@steveisok
Member

Add a pre-configured CMake cache file for macOS ARM64 (Apple Silicon) that eliminates redundant feature detection checks during the configure phase.

Performance improvement:

  • CMake configure time: 105s → 12s (89% faster)
  • Full clean build (clr+libs): 9:51 → 7:36 (18% faster)

The build currently runs CMake configuration 3 times (coreclr, native libs, host), with 597 total checks of which 395 are duplicated across configurations. The cache file pre-populates known results for macOS ARM64, similar to the existing tryrun.browser.cmake for WebAssembly builds.

Valid for:

  • macOS 14.0+ (Sonoma and later)
  • Xcode 15.0+ / AppleClang 15.0+
  • Architecture: arm64 (Apple Silicon)

To disable if issues arise (e.g., after Xcode upgrade):
export CLR_CMAKE_SKIP_PLATFORM_CACHE=1

@steveisok steveisok requested review from a team and Copilot February 5, 2026 14:09
@github-actions github-actions bot added the needs-area-label An area label is needed to ensure this gets routed to the appropriate area owners label Feb 5, 2026
@steveisok
Member Author

steveisok commented Feb 5, 2026

@dotnet/runtime-infrastructure @jkotas @am11 I have this in draft as I think we need to make sure we strike the right balance in terms of caching and to make sure there isn't a great deal of friction using and updating this.

I do believe this is the right path for most of our configurations. If you're all on board, easiest to start with one and we'll methodically go through the rest.

Contributor

Copilot AI left a comment


Pull request overview

This PR adds a CMake tryrun cache file for macOS ARM64 builds to significantly reduce CMake configure time. The change introduces a pre-configured cache file (tryrun.osx-arm64.cmake) containing 337 feature detection results that are known for macOS 14.0+ with Xcode 15.0+, eliminating redundant checks during the build configuration phase. The cache is automatically applied for macOS ARM64 builds and can be disabled via the CLR_CMAKE_SKIP_PLATFORM_CACHE=1 environment variable.

Changes:

  • Added a new CMake cache file with pre-populated feature detection results for macOS ARM64 platform
  • Modified the build system to automatically load this cache file when building for macOS ARM64
  • Implemented an opt-out mechanism for users who encounter issues

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| eng/native/tryrun.osx-arm64.cmake | New CMake cache file containing 337 pre-configured feature detection results for macOS ARM64 (Sonoma 14.0+, Xcode 15.0+) to eliminate redundant checks during configuration |
| eng/native/gen-buildsys.sh | Added logic to automatically load the macOS ARM64 cache file when target_os is "osx" and host_arch is "arm64", with opt-out via CLR_CMAKE_SKIP_PLATFORM_CACHE=1 environment variable |

Comment on lines +88 to +93
# Use platform-specific tryrun cache to speed up CMake configure (opt-out via CLR_CMAKE_SKIP_PLATFORM_CACHE=1)
if [[ "$CLR_CMAKE_SKIP_PLATFORM_CACHE" != "1" ]]; then
    if [[ "$target_os" == "osx" && "$host_arch" == "arm64" && -f "$scriptroot/tryrun.osx-arm64.cmake" ]]; then
        cmake_extra_defines="-C $scriptroot/tryrun.osx-arm64.cmake $cmake_extra_defines"
    fi
fi

Copilot AI Feb 5, 2026


When cross-compiling to macOS ARM64, both the new platform-specific cache (line 91) and the cross-compilation cache (line 75) will be loaded. Since tryrun.osx-arm64.cmake is prepended, CMake processes it first, but values can be overridden by the later tryrun.cmake.

There's a specific value discrepancy: HAVE_SHM_OPEN_THAT_WORKS_WELL_ENOUGH_WITH_MMAP_EXITCODE is set to 255 in tryrun.osx-arm64.cmake (line 233) but 1 in tryrun.cmake for Darwin ARM64 (line 75 context). In cross-compile scenarios, tryrun.cmake's value of 1 will override. However, for native macOS ARM64 builds, the value of 255 from tryrun.osx-arm64.cmake will be used.

Additionally, tryrun.osx-arm64.cmake is missing several _EXITCODE variables that are defined in tryrun.cmake for Darwin (HAVE_BROKEN_FIFO_SELECT_EXITCODE, HAVE_CLOCK_MONOTONIC_COARSE_EXITCODE, HAVE_CLOCK_GETTIME_NSEC_NP_EXITCODE, HAVE_MMAP_DEV_ZERO_EXITCODE, HAVE_SCHED_GETCPU_EXITCODE, MMAP_ANON_IGNORES_PROTECTION_EXITCODE, SEM_INIT_MODIFIES_ERRNO_EXITCODE).

Consider adding a check to skip loading tryrun.osx-arm64.cmake when CROSSCOMPILE is set to avoid any potential conflicts, or verify that these value differences are intentional and correct for the different build scenarios.
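
The guard suggested above could look roughly like the following. This is a minimal sketch rather than the PR's code: `should_use_platform_cache` is a hypothetical helper, and it assumes the cross-compile state is available as a "1"/"0" flag, in the way the comment refers to CROSSCOMPILE.

```shell
# Hypothetical helper (not part of gen-buildsys.sh): succeed only when the
# platform cache should be passed to CMake.
#   $1 = target OS, $2 = host architecture,
#   $3 = cross-compile flag ("1" when cross-compiling; assumed to mirror CROSSCOMPILE),
#   $4 = opt-out flag ("1" when CLR_CMAKE_SKIP_PLATFORM_CACHE=1)
should_use_platform_cache() {
    # Opt-out and cross-compilation both disable the platform cache.
    if [ "$4" = "1" ] || [ "$3" = "1" ]; then
        return 1
    fi
    # Otherwise the cache applies only to native macOS ARM64 builds.
    [ "$1" = "osx" ] && [ "$2" = "arm64" ]
}

# Example call using the environment variables named in the thread.
if should_use_platform_cache "osx" "arm64" "${CROSSCOMPILE:-0}" "${CLR_CMAKE_SKIP_PLATFORM_CACHE:-0}"; then
    echo "would pass -C tryrun.osx-arm64.cmake"
else
    echo "would skip the platform cache"
fi
```

With this shape, a cross-compile never loads both cache files, so the HAVE_SHM_OPEN_THAT_WORKS_WELL_ENOUGH_WITH_MMAP_EXITCODE discrepancy cannot be resolved by load order.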

Comment on lines +1 to +4
# CMake pre-configured cache for macOS ARM64 (Apple Silicon) native builds
#
# This file caches the results of CMake feature detection checks to significantly
# speed up the CMake configure phase for macOS ARM64 builds.

Copilot AI Feb 5, 2026


The comment states this is for "macOS ARM64 (Apple Silicon) native builds" but the file can also be loaded during cross-compilation to macOS ARM64 (when CROSSCOMPILE=1 is set). Consider updating the comment to clarify that it applies to both native and cross-compilation scenarios, or alternatively, modify the loading logic in gen-buildsys.sh to skip this cache file when CROSSCOMPILE=1 to avoid the potential conflicts noted in the other comment.

Suggested change
- # CMake pre-configured cache for macOS ARM64 (Apple Silicon) native builds
- #
- # This file caches the results of CMake feature detection checks to significantly
- # speed up the CMake configure phase for macOS ARM64 builds.
+ # CMake pre-configured cache for macOS ARM64 (Apple Silicon) builds
+ #
+ # This file caches the results of CMake feature detection checks to significantly
+ # speed up the CMake configure phase when targeting macOS ARM64 (both native
+ # builds on Apple Silicon and cross-compilation with CROSSCOMPILE=1).

@jkotas
Member

jkotas commented Feb 5, 2026

I think we need to have some sort of detection when the checked in configs get out of sync before adding more of them.

We seem to have bugs due to messed up checked in configs #123950

set(COMPILER_SUPPORTS_W_RESERVED_IDENTIFIER 1 CACHE INTERNAL "")
set(FNO_LTO_AVAILABLE 1 CACHE INTERNAL "")
set_cache_value(HAS_POSIX_SEMAPHORES_EXITCODE 1)
set(HAS_POSIX_SEMAPHORES "" CACHE INTERNAL "")
Member


We may want to check how many of these are actually needed. I checked a few and immediately discovered a cluster of dead code: #124049

@steveisok
Member Author

> I think we need to have some sort of detection when the checked in configs get out of sync before adding more of them.
>
> We seem to have bugs due to messed up checked in configs #123950

I think the kinds of checks we need to do are likely to be different per target. With wasm, it's when we bump emscripten. What do you think is appropriate here? Xcode major/minor versions? OS version?

@steveisok
Member Author

> I think the kinds of checks we need to do are likely to be different per target. With wasm, it's when we bump emscripten. What do you think is appropriate here? Xcode major/minor versions? OS version?

I think we have a min version the cache is valid for and a check that we don't get too far ahead (2 major versions?). That way regeneration would likely occur when we bump above the min version on our CI machines.

Do we want to warn or error when we violate the cache checks?

@am11
Member

am11 commented Feb 6, 2026

Caching the introspection result is fine for CI-like environments, which are deterministic. But for the dev inner loop, we can't predict what the user's machine has installed. I honestly don't think saving a few seconds on non-critical use cases is worth the hassle, and it also begs questions like "why only macOS?".

@jkotas
Member

jkotas commented Feb 6, 2026

> What do you think is appropriate here? Xcode major/minor versions? OS version?

I do not know what the best practices around this are.

Copilot highlighted some of the discrepancies with what's getting added here in https://github.com/dotnet/runtime/pull/124046/files#r2769414845

@steveisok
Member Author

steveisok commented Feb 6, 2026

> Caching the introspection result is fine for CI-like environments, which are deterministic. But for the dev inner loop, we can't predict what the user's machine has installed. I honestly don't think saving a few seconds on non-critical use cases is worth the hassle, and it also begs questions like "why only macOS?".

I think we have to try because the various build time savings (this plus ninja to start) are hard to ignore. I think around 18% for caching is about what you'd find in most cases (in addition to the ~8% ninja boost). Those savings add up.

> "why only macOS?"

I want to apply this approach everywhere that makes sense. Starting with one allows us to keep the discussion focused and then we can go down the line.

@steveisok
Member Author

steveisok commented Feb 6, 2026

> Copilot highlighted some of the discrepancies with what's getting added here in https://github.com/dotnet/runtime/pull/124046/files#r2769414845

Nice! I wondered if there was a difference between when we cross build on CI and when we just run. Apple does a fairly decent job of making it appear seamless, so this is easy to miss.

> I do not know what the best practices around this are.

I double checked with Copilot and here's what it suggests (I think I pretty much agree).

Summary: Sync Detection for macOS ARM64 CMake Cache

Approach: Track minimum Xcode version, warn on mismatch

  1. Add version tracking to the cache file: set(TRYRUN_OSX_ARM64_MIN_XCODE_VERSION "15" CACHE INTERNAL "Minimum Xcode version this cache is valid for")
  2. Detection logic in gen-buildsys.sh:
     - Detect current Xcode major version (via xcodebuild -version or clang version)
     - Warn if older than minimum → cache may reference features that don't exist
     - Silent if newer → expected for local dev, cache is conservative
     - Optional: warn if 2+ major versions ahead → gentle reminder to consider regenerating
  3. Regeneration trigger:
     - When CI updates its Xcode image, bump the minimum version and regenerate the cache
     - This is a manual process tied to infrastructure updates (similar to emscripten bumps for WASM)

Why this works:

  - CI runs at the minimum version → perfect match
  - Local devs run ahead → no friction, cache remains valid (features only grow)
  - Devs running old Xcode get a warning before hitting mysterious build issues
  - The CLR_CMAKE_SKIP_PLATFORM_CACHE=1 escape hatch remains for edge cases

Differences from browser/WASM:

  - WASM has a version file (emscripten-version.txt) to check against
  - macOS doesn't have an equivalent, so we detect Xcode at configure time instead

We also need to factor in cross-compilation.
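
The version check outlined in this summary might be sketched as follows. It is only an illustration under stated assumptions: both function names are hypothetical, the "2 major versions ahead" threshold is taken from the summary, and the sample string stands in for real `xcodebuild -version` output (whose first line has the form `Xcode <major>.<minor>`).

```shell
# Hypothetical helper: pull the major version out of `xcodebuild -version`
# style output, whose first line looks like "Xcode 26.2".
xcode_major_from_output() {
    printf '%s\n' "$1" | sed -n '1s/^Xcode \([0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: classify the detected major version.
#   $1 = detected major, $2 = minimum major the cache is valid for,
#   $3 = number of major versions ahead that triggers a reminder (assumed 2)
classify_xcode_version() {
    if [ "$1" -lt "$2" ]; then
        echo "too-old"      # cache may reference features that don't exist
    elif [ "$1" -ge $(( $2 + $3 )) ]; then
        echo "far-ahead"    # gentle reminder to consider regenerating
    else
        echo "ok"           # at or slightly above the minimum: stay silent
    fi
}

# Sample input; a real script would capture "$(xcodebuild -version)" instead.
sample_output='Xcode 26.2
Build version 17C52'
major=$(xcode_major_from_output "$sample_output")
echo "Xcode major $major: $(classify_xcode_version "$major" 15 2)"
```

Keeping the parsing separate from the classification makes it possible to exercise the policy on machines without Xcode installed, which matters if the same check is later reused for cross-compilation hosts.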

@steveisok
Member Author

One more thing... There are comments in wasm highlighting how to reconstruct the cache. We need to automate that part to make it easier on whoever is charged with updating the cache.

@jkotas
Member

jkotas commented Feb 6, 2026

> I think we have to try because the various build time savings

An alternative way to address this is to ask why we have so many of these and why they are so slow to evaluate.

@am11
Member

am11 commented Feb 6, 2026

> I think we have to try because the various build time savings (this plus ninja to start) are hard to ignore. I think around 18% for caching is about what you'd find in most cases (in addition to the ~8% ninja boost). Those savings add up.

I am thinking about the potential risks. The whole point of CMake introspection is to adapt to the machine or to ensure the "desired state" of the machine. Otherwise, we could have a hand-rolled config.h file. :)

In CMake, there is no single type of introspection. In places, we raise a manual error in the CMake script for an unexpected state, so this will paper over those situations where something can go wrong. IOW, it has the potential to break stuff in non-obvious ways.

Perhaps we can add a ./build.sh --validate-config mode, so if someone runs into a weird issue (or we recommend users run it every once in a while, e.g. after upgrading the system), they can validate whether their machine matches the cached preset.

@steveisok
Member Author

> I am thinking about the potential risks. The whole point of CMake introspection is to adapt to the machine or to ensure the "desired state" of the machine. Otherwise, we could have a hand-rolled config.h file. :)

Agreed - I think part of what we're also trying to figure out is where the line is. It's likely different for every platform.

> Perhaps we can add a ./build.sh --validate-config mode, so if someone runs into a weird issue (or we recommend users run it every once in a while, e.g. after upgrading the system), they can validate whether their machine matches the cached preset.

I think that's a good idea.

@steveisok
Member Author

steveisok commented Feb 6, 2026

> An alternative way to address this is to ask why we have so many of these and why they are so slow to evaluate.

Summary of analysis from the coreclr part of configure:

  1. Runtime checks are ~2.8s (6.3%) - These test behaviors standardized 20-30 years ago. Extremely safe to cache.
  2. The rest of configure time (~42s) is spent on:
     • try_compile / check_*_source_compiles (~20s) - Compile-only checks
     • fetchcontent (~4.4s) - Fetching zlib-ng, zstd, brotli
     • CMake processing overhead
  3. CMake already caches these results in CMakeCache.txt for incremental builds - the problem is only for clean builds (like CI or fresh clones).

The fundamental issue is that cmake has to invoke the compiler many times (~300+ try_compile calls) to probe the toolchain/platform. There's no way to make those faster other than:

  • Caching (the PR 124046 approach)
  • Eliminating obsolete checks (minor gains, code churn)
  • Precomputing results for known CI environments

So yes - caching is the right lever to pull. The values are stable, and for CI where build times matter most, a checked-in cache makes sense.

https://gist.githubusercontent.com/steveisok/97a54cfd082562ac5893a3e4fc49d49e/raw/28361a98f1da445e9f022edc0747496951634df1/cmake-macarm64-analysis.md

Add options to validate and regenerate platform-specific CMake cache files:

- --validate-config: Runs cmake configure without the cache and compares
  detected values against the cached file, reporting any differences.

- --regenerate-config: Same as validate, but updates the cache file if
  differences are found.

The new validate-platform-cache.sh script supports multiple platforms:
- osx-arm64, osx-x64, linux-x64, linux-arm64, browser, ios/tvos

This addresses feedback about providing a way to verify cached preset
values match the current system configuration.
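
The comparison that --validate-config describes could be sketched like this. It is not validate-platform-cache.sh itself: `compare_cache_values` is a hypothetical function, it assumes both the cached and the freshly detected values have already been flattened to one `NAME=VALUE` line each, and `HAVE_MMAP` is only an illustrative variable name.

```shell
# Hypothetical helper: compare two NAME=VALUE listings and report every
# variable from the cached file whose detected value differs.
#   $1 = file with cached values, $2 = file with freshly detected values
# Values containing "=" are not handled in this sketch.
compare_cache_values() {
    awk -F= '
        NR == FNR { cached[$1] = $2; next }   # first file: remember cached values
        ($1 in cached) {                      # second file: check known variables
            total++
            if ((cached[$1] "") != ($2 "")) { # force a string comparison
                diffs++
                printf "  - %s: cached=\047%s\047 detected=\047%s\047\n", $1, cached[$1], $2
            }
        }
        END { printf "Total variables checked: %d\nDifferences: %d\n", total, diffs }
    ' "$1" "$2"
}

# Small demonstration with temporary files.
tmpdir=$(mktemp -d)
printf 'HAVE_SYS_ENDIAN_H=\nHAVE_MMAP=1\n' > "$tmpdir/cached.txt"
printf 'HAVE_SYS_ENDIAN_H=1\nHAVE_MMAP=1\n' > "$tmpdir/detected.txt"
compare_cache_values "$tmpdir/cached.txt" "$tmpdir/detected.txt"
rm -rf "$tmpdir"
```

A regenerate mode would differ only in what happens after the report: instead of stopping, it would rewrite the cache file from the detected values.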
@steveisok
Member Author

> > Perhaps we can add a ./build.sh --validate-config mode, so if someone runs into a weird issue (or we recommend users run it every once in a while, e.g. after upgrading the system), they can validate whether their machine matches the cached preset.
>
> I think that's a good idea.

I added a validation and regeneration switch along with a script to carry it out. Is the concept in line with what you were thinking? We may want to tweak the regen script, but what matters first is if this is what we want.

Comment on lines +310 to +311
echo "Run with --regenerate to update the cache file:"
echo " $0 --regenerate $platform"
Member


Suggested change
- echo "Run with --regenerate to update the cache file:"
- echo " $0 --regenerate $platform"
+ echo "Run with --regenerate-config to update the cache file:"

Maybe it's better to use the arg name in the top-level script.

@am11
Member

am11 commented Feb 6, 2026

Just tried and it has caught one difference:

$ ./build.sh --regenerate-config
...
============================================================
Results
============================================================
Total variables checked: 312
Matched: 311
Differences: 1

Differences found:

  - HAVE_SYS_ENDIAN_H: cached='' detected='1'

Run with --regenerate to update the cache file:
  /Users/adeel/projects/runtime5/eng/native/validate-platform-cache.sh --regenerate osx-arm64

@steveisok
Member Author

steveisok commented Feb 6, 2026

> Just tried and it has caught one difference:

Curious, what is your mac setup?

I surfaced it on my setup. I generated the cache from macOS 15.4 / Xcode 16.3. My other Mac has 26.2 and a newer SDK.

The initial cache as part of this PR is closer to the floor that we want. You're too new.

@am11
Member

am11 commented Feb 6, 2026

Mine is macOS 26.2 (25C56) with Xcode 26.2 (17C52).

