8379630: Add JMH benchmark to measure the overhead of using captured call state#30719

Open
Arraying wants to merge 3 commits into openjdk:master from Arraying:JDK-8379630

Arraying (Member) commented Apr 14, 2026

Hi all,

The Java FF&M API can both initialize thread-local data immediately before a downcall and read it back immediately after the downcall, through the Linker.Option::captureCallState API. This is useful, for example, for setting or capturing errno when interfacing with C functions. However, using this linker option introduces some invocation overhead at runtime.

This RFE introduces a JMH microbenchmark which quantifies this overhead. A simple downcall to strtol is measured with and without call state capturing.
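
For readers unfamiliar with the option, here is a minimal sketch of the two kinds of downcall being compared. It is not the benchmark code from this PR: the class and method names are hypothetical, it assumes JDK 22 or later (final FFM API) and an LP64 platform where C `long` is 64 bits (so not Windows), and it needs native access enabled.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Illustrative sketch only; names are hypothetical and not taken from this PR.
public class CaptureCallStateSketch {
    private static final Linker LINKER = Linker.nativeLinker();
    private static final MemorySegment STRTOL =
            LINKER.defaultLookup().find("strtol").orElseThrow();

    // long strtol(const char *str, char **endptr, int base);
    // Assumption: C long is 64 bits (LP64), which does not hold on Windows.
    private static final FunctionDescriptor DESC = FunctionDescriptor.of(
            ValueLayout.JAVA_LONG, ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT);

    // Same native function, linked without and with call state capturing.
    private static final MethodHandle PLAIN = LINKER.downcallHandle(STRTOL, DESC);
    private static final MethodHandle CAPTURING =
            LINKER.downcallHandle(STRTOL, DESC, Linker.Option.captureCallState("errno"));

    // Layout of the capture-state segment and an accessor for its errno field.
    private static final StructLayout STATE_LAYOUT = Linker.Option.captureStateLayout();
    private static final VarHandle ERRNO =
            STATE_LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("errno"));

    record Parsed(long value, int errno) {}

    static long parsePlain(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text); // NUL-terminated C string
            return (long) PLAIN.invokeExact(str, MemorySegment.NULL, 10);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    static Parsed parseCapturing(String text) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment str = arena.allocateFrom(text);
            // The capture-state segment becomes the leading argument of the handle.
            MemorySegment state = arena.allocate(STATE_LAYOUT); // zero-initialized
            long value = (long) CAPTURING.invokeExact(state, str, MemorySegment.NULL, 10);
            int errno = (int) ERRNO.get(state, 0L);
            return new Parsed(value, errno);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePlain("42"));
        System.out.println(parseCapturing("42"));
    }
}
```

The only difference at the call site is the extra leading segment argument; the overhead in question comes from the work the linker does around the native call to populate and read that segment.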

Testing: GHA for sanity testing.

Benchmarking Results

I have executed this benchmark on the Oracle-supported platforms; the results are below. For each platform, I ran two trials on different JDK builds: the current HEAD ("current", based on 8357de8) and the same commit with JDK-8378559 reverted ("legacy", a revert commit on top of 8357de8). This one-off experiment is insightful because JDK-8378559 increased the overhead of downcalls that use state capturing by introducing thread-local data initialization. The performance impact of that change was previously unquantified.

Linux x64

Current:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  38.442 ± 0.016  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  45.425 ± 1.826  ns/op

Legacy:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  39.224 ± 0.789  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  41.011 ± 0.058  ns/op

Linux AArch64

Current:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  45.396 ± 0.185  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  56.116 ± 0.463  ns/op

Legacy:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  44.859 ± 0.183  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  51.721 ± 0.153  ns/op

macOS x64

Current:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  41.388 ± 0.547  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  46.102 ± 0.208  ns/op

Legacy:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  40.929 ± 0.138  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  43.938 ± 0.385  ns/op

macOS AArch64

Current:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  16.892 ± 0.008  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  21.933 ± 0.016  ns/op

Legacy:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  16.894 ± 0.008  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  19.083 ± 0.009  ns/op

Windows x64

Current:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  39.711 ± 0.031  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  63.534 ± 0.208  ns/op

Legacy:

Benchmark                                               Mode  Cnt   Score   Error  Units
CaptureCallStateOverheadBench.doNotUseCaptureCallState  avgt   30  40.532 ± 0.544  ns/op
CaptureCallStateOverheadBench.useCaptureCallState       avgt   30  50.759 ± 0.275  ns/op
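
To make the comparison easier to read, the per-platform overhead (capturing score minus non-capturing score) can be derived from the tables above. The sketch below is only arithmetic on the reported mean scores: the class name is hypothetical, the error bars are ignored, and baseline noise between the two builds is not accounted for.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative arithmetic on the mean scores reported above (ns/op); names are hypothetical.
public class CaptureOverheadDelta {
    record Trial(double plain, double capturing) {
        // Extra cost of the capturing downcall relative to the plain downcall.
        double overhead() { return capturing - plain; }
    }

    record Platform(Trial current, Trial legacy) {
        // Difference in overhead between the current and the legacy build.
        double delta() { return current.overhead() - legacy.overhead(); }
    }

    static Map<String, Platform> results() {
        Map<String, Platform> m = new LinkedHashMap<>();
        m.put("Linux x64",     new Platform(new Trial(38.442, 45.425), new Trial(39.224, 41.011)));
        m.put("Linux AArch64", new Platform(new Trial(45.396, 56.116), new Trial(44.859, 51.721)));
        m.put("macOS x64",     new Platform(new Trial(41.388, 46.102), new Trial(40.929, 43.938)));
        m.put("macOS AArch64", new Platform(new Trial(16.892, 21.933), new Trial(16.894, 19.083)));
        m.put("Windows x64",   new Platform(new Trial(39.711, 63.534), new Trial(40.532, 50.759)));
        return m;
    }

    public static void main(String[] args) {
        results().forEach((name, p) -> System.out.printf(
                "%-14s current %+.3f  legacy %+.3f  delta %+.3f ns/op%n",
                name, p.current().overhead(), p.legacy().overhead(), p.delta()));
    }
}
```

On these numbers the capturing overhead is largest on Windows x64 (about 23.8 ns/op on current versus about 10.2 ns/op on legacy) and is higher on current than on legacy for every platform measured.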


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8379630: Add JMH benchmark to measure the overhead of using captured call state (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/30719/head:pull/30719
$ git checkout pull/30719

Update a local copy of the PR:
$ git checkout pull/30719
$ git pull https://git.openjdk.org/jdk.git pull/30719/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 30719

View PR using the GUI difftool:
$ git pr show -t 30719

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/30719.diff

Using Webrev

Link to Webrev Comment

Arraying (Member, Author) commented

/cc core-libs hotspot

bridgekeeper bot commented Apr 14, 2026

👋 Welcome back phubner! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk bot commented Apr 14, 2026

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk bot changed the title from "8379630" to "8379630: Add JMH benchmark to measure the overhead of using captured call state" Apr 14, 2026
openjdk bot added the core-libs (core-libs-dev@openjdk.org) and hotspot (hotspot-dev@openjdk.org) labels Apr 14, 2026

openjdk bot commented Apr 14, 2026

@Arraying
The core-libs label was successfully added.

The hotspot label was successfully added.

openjdk bot commented Apr 14, 2026

@Arraying To determine the appropriate audience for reviewing this pull request, one or more labels corresponding to different subsystems will normally be applied automatically. However, no automatic labelling rule matches the changes in this pull request. In order to have an "RFR" email sent to the correct mailing list, you will need to add one or more applicable labels manually using the /label pull request command.

Applicable Labels
  • build
  • client
  • compiler
  • core-libs
  • hotspot
  • hotspot-compiler
  • hotspot-gc
  • hotspot-jfr
  • hotspot-runtime
  • i18n
  • ide-support
  • javadoc
  • jdk
  • net
  • nio
  • security
  • serviceability
  • shenandoah

Arraying marked this pull request as ready for review April 14, 2026 11:26
openjdk bot added the rfr (Pull request is ready for review) label Apr 14, 2026

mlbridge bot commented Apr 14, 2026

Webrevs

@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Fork(value = 3, jvmArgs = {"--add-exports=java.base/jdk.internal.foreign=ALL-UNNAMED",

Member

The --add-exports flag does not seem necessary; you are not using anything from this package.
