-
Notifications
You must be signed in to change notification settings - Fork 281
build: graviton specific build #3366
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Also modify the dependencies. For graviton reqwest does not pull in serde transitively so specify it explicitly
| # Requires Rust nightly for -Z tls-model flag | ||
| # Uses initial-exec TLS model, LSE, and native CPU optimizations for Graviton | ||
| core-graviton-libs: | ||
| cd native && RUSTFLAGS="-Z tls-model=initial-exec -C target-feature=+lse -C target-cpu=native" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The -Z tls-model=initial-exec flag requires nightly. The gain from this change only came to about 1% so we can revert this if we don't want to use nightly
|
I propose just changing the |
One reason for doing it this way was that building for LSE and running on an ARM version that doesn't support LSE causes an instant crash. I don't have any real world data on how many people may be on older versions of ARM but didn't want to take the chance. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3366 +/- ##
============================================
+ Coverage 56.12% 59.89% +3.77%
- Complexity 976 1465 +489
============================================
Files 119 175 +56
Lines 11743 16196 +4453
Branches 2251 2692 +441
============================================
+ Hits 6591 9701 +3110
- Misses 4012 5138 +1126
- Partials 1140 1357 +217 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
As best as I can tell, LSE was added in ARMv-8.1-A, which predates anything I think we'd find in modern data centers. Azure and GCP use something newer than I summarized my thoughts in #3368 |
Which issue does this PR close?
Rationale for this change
While running an Iceberg benchmark on a graviton cluster, @hsiang-c found the following calls taking ~12% of the execution time
The
__aarch64_cas8_acq_rel, __aarch64_swp8_acq_rel, __aarch64_ldadd8_relcalls can be optimized on graviton V2 and newer ARM processors into a single instruction.The
_dl_tlsdesc_dynamiccall can be optimized to reduce the overhead of thread local storage resolution by the using a static linking model (https://maskray.me/blog/2021-02-14-all-about-thread-local-storage)What changes are included in this PR?
Adds a separate build for a graviton binary. Loading of the native binary checks for graviton processors and then loads the graviton version if necessary.
This can be overridden by specifying the environment variable -
NATIVE_VARIANT_ENVThe build for graviton uses
nightlyin the build container. This is for the-Z tls-model=initial-execflag which is only available in nightly.The build also adds explicit dependencies on
serdeandserde_urlencodedwhich are depended on transitively byreqwestand are somehow not resolved while building on graviton.The build adds to the size of the uber jar
The
build-comet-release.shscript is also updated and makes it a little easier to specify which platforms to build the binary for. Building for graviton is optionalHow are these changes tested?
Manually by running the
build-release-comet.shscript. Script and binary independently verified by @hsiang-cContributor note: @hsiang-c contributed to this PR. Also Claude Code contributed.