-
Notifications
You must be signed in to change notification settings - Fork 281
chore: update target-cpus in published binaries to x86-64-v3 and neoverse-n1 #3368
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… -Ctarget-feature=-prefer-256-bit which doesn't make sense for Skylake, and on native it's unclear what effect it's supposed to have.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3368 +/- ##
============================================
+ Coverage 56.12% 59.99% +3.86%
- Complexity 976 1475 +499
============================================
Files 119 175 +56
Lines 11743 16165 +4422
Branches 2251 2681 +430
============================================
+ Hits 6591 9698 +3107
- Misses 4012 5114 +1102
- Partials 1140 1353 +213 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
I asked Claude to summarize the net benefit of this PR based on how rustc chooses targets when not given a target-cpu argument: Target CPU comparison
Key features gained
|
40b5ba4 to
a66cedc
Compare
|
I ran the release script on main and this PR, and ran objdump to look for relevant instructions: I also included the binary size: x86-64 (baseline → x86-64-v3)
ARM64 (baseline → neoverse-n1)
We saw a handful of specialized instructions before because some libraries implement architecture specific implementations and dispatch at runtime, but now we're getting these instructions in a lot more places. |
|
A couple of thoughts as I mark this ready for review:
|
| # build native libs for amd64 architecture Linux/MacOS on a Linux/amd64 machine/container | ||
| core-amd64-libs: | ||
| cd native && cargo build -j 2 --release $(FEATURES_ARG) | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=x86-64-v3" cargo build -j 2 --release $(FEATURES_ARG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a target for released binaries (build-release-comet.sh).
| # build native libs for arm64 architecture Linux/MacOS on a Linux/arm64 machine/container | ||
| core-arm64-libs: | ||
| cd native && cargo build -j 2 --release $(FEATURES_ARG) | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=neoverse-n1" cargo build -j 2 --release $(FEATURES_ARG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also a target for released binaries (build-release-comet.sh).
| core-amd64: | ||
| rustup target add x86_64-apple-darwin | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=skylake -Ctarget-feature=-prefer-256-bit" CC=o64-clang CXX=o64-clang++ cargo build --target x86_64-apple-darwin --release $(FEATURES_ARG) | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=skylake" CC=o64-clang CXX=o64-clang++ cargo build --target x86_64-apple-darwin --release $(FEATURES_ARG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This flag doesn't make sense on this target (Skylake), so I suspect it's mistakenly copy-pasted from elsewhere.
| mkdir -p common/target/classes/org/apache/comet/darwin/x86_64 | ||
| cp native/target/x86_64-apple-darwin/release/libcomet.dylib common/target/classes/org/apache/comet/darwin/x86_64 | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=haswell -Ctarget-feature=-prefer-256-bit" cargo build --release $(FEATURES_ARG) | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=x86-64-v3" cargo build --release $(FEATURES_ARG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for local development I guess, but made consistent. Also this flag doesn't make sense on this target (Haswell), so I suspect it's mistakenly copy-pasted from elsewhere.
| mkdir -p common/target/classes/org/apache/comet/darwin/aarch64 | ||
| cp native/target/aarch64-apple-darwin/release/libcomet.dylib common/target/classes/org/apache/comet/darwin/aarch64 | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=native" cargo build --release $(FEATURES_ARG) | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=neoverse-n1" cargo build --release $(FEATURES_ARG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is for local development I guess, but made consistent.
| release-linux: clean | ||
| rustup target add aarch64-apple-darwin x86_64-apple-darwin | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=apple-m1" CC=arm64-apple-darwin21.4-clang CXX=arm64-apple-darwin21.4-clang++ CARGO_FEATURE_NEON=1 cargo build --target aarch64-apple-darwin --release $(FEATURES_ARG) | ||
| cd native && RUSTFLAGS="-Ctarget-cpu=skylake -Ctarget-feature=-prefer-256-bit" CC=o64-clang CXX=o64-clang++ cargo build --target x86_64-apple-darwin --release $(FEATURES_ARG) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This flag doesn't make sense on this target (Skylake), so I suspect it's mistakenly copy-pasted from elsewhere.
andygrove
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mbutrovich. This makes sense to me.
|
Note that even with x86-64-v3 target, we still get native AES intrinsics for encryption/decryption in the Ring crate because they embed multiple implementations and dispatch at runtime. I counted them in the binary:
|
This is an alternative to #3366, but has larger ramifications for Comet's published binaries. I suspect this will provide a better experience for users trying out Comet binaries in a cloud environment in 2026.
Which issue does this PR close?
N/A.
Rationale for this change
The release Docker builds (core-amd64-libs and core-arm64-libs) previously used no explicit target-cpu, relying on rustc defaults. This produces binaries targeting a minimal baseline that may miss important optimizations.
When running on a Graviton2 CPU, for example, we see significant time spent in atomics that should have been optimized by LSE instructions.
For x86-64, x86-64-v3 is the appropriate baseline - it covers AVX2/FMA which are available on all modern cloud instances (Haswell-era and newer, ~2013+).
For ARM64, neoverse-n1 is the appropriate baseline - it covers AWS Graviton2, Ampere Altra, and newer cloud ARM processors from Azure and GCP. Key features include LSE atomics which significantly improve concurrent workload performance. Code compiled for neoverse-n1 also runs on Apple Silicon since M1 is a superset. Newer neoverse targets are likely too aggressive for released binaries, but n1 (2019) is a good start.
What changes are included in this PR?
Updates Makefile to set explicit CPU targets for release builds:
Development targets (core, release, bench) continue to use native for optimal local performance. x86-64-v3 was chosen over x86-64-v4 because v4 requires AVX-512, which is unavailable on AMD Zen 1-3 (EPYC Rome/Milan) and older Intel chips. Most current cloud instances use these processors.
How are these changes tested?
These are build configuration changes. The release Docker build process (dev/release/build-release-comet.sh) will use the new targets. The binaries produced will be compatible with a wider range of cloud instances while still enabling important optimizations like AVX2 (x86) and LSE atomics (ARM). I will do a build and dig into the instructions.