build: graviton specific build #3366

parthchandra · 2026-02-02T19:29:44Z

Which issue does this PR close?

Rationale for this change

While running an Iceberg benchmark on a graviton cluster, @hsiang-c found the following calls taking ~12% of the execution time

_dl_tlsdesc_dynamic
__aarch64_cas8_acq_rel
__aarch64_swp8_acq_rel
__aarch64_ldadd8_rel

The __aarch64_cas8_acq_rel, __aarch64_swp8_acq_rel, __aarch64_ldadd8_rel calls can be optimized on graviton V2 and newer ARM processors into a single instruction.
The _dl_tlsdesc_dynamic call can be optimized to reduce the overhead of thread local storage resolution by the using a static linking model (https://maskray.me/blog/2021-02-14-all-about-thread-local-storage)

What changes are included in this PR?

Adds a separate build for a graviton binary. Loading of the native binary checks for graviton processors and then loads the graviton version if necessary.

This can be overridden by specifying the environment variable - NATIVE_VARIANT_ENV

The build for graviton uses nightly in the build container. This is for the -Z tls-model=initial-exec flag which is only available in nightly.

The build also adds explicit dependencies on serde and serde_urlencoded which are depended on transitively by reqwest and are somehow not resolved while building on graviton.

The build adds to the size of the uber jar

-rwxr-xr-x@ 1 parth  staff   90455632 Feb  2 10:32 ./common/target/classes/org/apache/comet/linux/aarch64/libcomet.so
-rwxr-xr-x@ 1 parth  staff   98510816 Feb  2 10:20 ./common/target/classes/org/apache/comet/linux/amd64/libcomet.so
-rwxr-xr-x@ 1 parth  staff   95560232 Feb  2 10:43 ./common/target/classes/org/apache/comet/linux/graviton/libcomet.so

The build-comet-release.sh script is also updated and makes it a little easier to specify which platforms to build the binary for. Building for graviton is optional

How are these changes tested?

Manually by running the build-release-comet.sh script. Script and binary independently verified by @hsiang-c

Contributor note: @hsiang-c contributed to this PR. Also Claude Code contributed.

Also modify the dependencies. For graviton reqwest does not pull in serde transitively so specify it explicitly

parthchandra · 2026-02-02T19:32:48Z

Makefile

+# Requires Rust nightly for -Z tls-model flag
+# Uses initial-exec TLS model, LSE, and native CPU optimizations for Graviton
+core-graviton-libs:
+	cd native && RUSTFLAGS="-Z tls-model=initial-exec -C target-feature=+lse -C target-cpu=native" \


The -Z tls-model=initial-exec flag requires nightly. The gain from this change only came to about 1% so we can revert this if we don't want to use nightly

mbutrovich · 2026-02-02T22:43:27Z

I propose just changing the target-cpu on ARM builds to neoverse-n1 which gets you LSE (atomics) and is a reasonable baseline for ARM architectures available in 2026 from cloud vendors. It's also a subset of what Apple Silicon provides. Newer neoverse targets would be great, but I don't know if we want to require new Graviton and eliminate stuff like Ampere Altra. I'd need to do a survey of GCP and Azure ARM cores.

parthchandra · 2026-02-02T23:45:19Z

I propose just changing the target-cpu on ARM builds to neoverse-n1 which gets you LSE (atomics) and is a reasonable baseline for ARM architectures available in 2026 from cloud vendors. It's also a subset of what Apple Silicon provides. Newer neoverse targets would be great, but I don't know if we want to require new Graviton and eliminate stuff like Ampere Altra. I'd need to do a survey of GCP and Azure ARM cores.

One reason for doing it this way was that building for LSE and running on an ARM version that doesn't support LSE causes an instant crash. I don't have any real world data on how many people may be on older versions of ARM but didn't want to take the chance.

codecov-commenter · 2026-02-03T00:03:12Z

Codecov Report

❌ Patch coverage is 12.50000% with 28 lines in your changes missing coverage. Please review.
✅ Project coverage is 59.89%. Comparing base (f09f8af) to head (525a556).
⚠️ Report is 918 commits behind head on main.

Files with missing lines	Patch %	Lines
...mon/src/main/java/org/apache/comet/NativeBase.java	12.50%	24 Missing and 4 partials ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #3366      +/-   ##
============================================
+ Coverage     56.12%   59.89%   +3.77%     
- Complexity      976     1465     +489     
============================================
  Files           119      175      +56     
  Lines         11743    16196    +4453     
  Branches       2251     2692     +441     
============================================
+ Hits           6591     9701    +3110     
- Misses         4012     5138    +1126     
- Partials       1140     1357     +217

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

mbutrovich · 2026-02-03T02:01:23Z

I propose just changing the target-cpu on ARM builds to neoverse-n1 which gets you LSE (atomics) and is a reasonable baseline for ARM architectures available in 2026 from cloud vendors. It's also a subset of what Apple Silicon provides. Newer neoverse targets would be great, but I don't know if we want to require new Graviton and eliminate stuff like Ampere Altra. I'd need to do a survey of GCP and Azure ARM cores.

One reason for doing it this way was that building for LSE and running on an ARM version that doesn't support LSE causes an instant crash. I don't have any real world data on how many people may be on older versions of ARM but didn't want to take the chance.

As best as I can tell, LSE was added in ARMv-8.1-A, which predates anything I think we'd find in modern data centers. Azure and GCP use something newer than neoverse-n1, and n1 seems like a reasonable baselines for current Graviton offerings. In the same way probably don't want to ship binaries for anything that doesn't have AVX2, I think it's reasonable to cut off ARM without LSE, at least for shipped binaries. Folks are free to compile their own if they need, in the same way that they'll get a faster build if they compile on their target architecture.

I summarized my thoughts in #3368

graviton specific build

22fe173

Also modify the dependencies. For graviton reqwest does not pull in serde transitively so specify it explicitly

parthchandra requested review from andygrove and mbutrovich and removed request for andygrove February 2, 2026 19:29

parthchandra commented Feb 2, 2026

View reviewed changes

parthchandra changed the title ~~graviton specific build~~ build: graviton specific build Feb 2, 2026

mbutrovich mentioned this pull request Feb 2, 2026

chore: update target-cpus in published binaries to x86-64-v3 and neoverse-n1 #3368

Open

prettier

e0eb578

add serde_urlencoded to machete ignore list

525a556

parthchandra closed this Feb 3, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

build: graviton specific build #3366

build: graviton specific build #3366

Uh oh!

parthchandra commented Feb 2, 2026

Uh oh!

parthchandra Feb 2, 2026

Uh oh!

mbutrovich commented Feb 2, 2026

Uh oh!

parthchandra commented Feb 2, 2026

Uh oh!

codecov-commenter commented Feb 3, 2026 •

edited

Loading

Uh oh!

mbutrovich commented Feb 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

build: graviton specific build #3366

build: graviton specific build #3366

Uh oh!

Conversation

parthchandra commented Feb 2, 2026

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

How are these changes tested?

Uh oh!

parthchandra Feb 2, 2026

Choose a reason for hiding this comment

Uh oh!

mbutrovich commented Feb 2, 2026

Uh oh!

parthchandra commented Feb 2, 2026

Uh oh!

codecov-commenter commented Feb 3, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

mbutrovich commented Feb 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov-commenter commented Feb 3, 2026 •

edited

Loading