Allow setting of opsPerInvocation from setup functions #97

bbrehm · 2023-03-17T12:54:38Z

In many cases, opsPerInvocation is known at the @setup function, but cannot be reasonably known at any other point. This example is somewhat lame in that regard; I actually needed that for some not-quite-micro benchmarks (the setup function loads real-ish data from disk, and the opsPerInvocation depends on the data that has been loaded).

Since every good example teaches multiple things, I added one that also warns about the branch history table stitching together surprisingly long patterns.

The super-duper correct way of doing this example (cf https://discourse.julialang.org/t/psa-microbenchmarks-remember-branch-history/17436) would be to have constant opsPerInvocation and one large array; then a smaller (@param) array is copied into that multiple times. Thus we could exclude effects of memory hierarchy. That would look quite unnatural and harder to relate to "normal looking" code (and there are no effects of memory anyway, since first, the benchmark is too slow to saturate DRAM bandwidth, and second even the largest set fits into L2).

Progress

Change must not contain extraneous whitespace
Change must be properly reviewed (1 review required, with at least 1 Committer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jmh.git pull/97/head:pull/97
$ git checkout pull/97

Update a local copy of the PR:
$ git checkout pull/97
$ git pull https://git.openjdk.org/jmh.git pull/97/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 97

View PR using the GUI difftool:
$ git pr show -t 97

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jmh/pull/97.diff

Webrev

Link to Webrev Comment

bridgekeeper · 2023-03-17T12:56:22Z

Hi @bbrehm, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user bbrehm" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

bbrehm · 2023-03-17T15:18:29Z

/signed
(but signature review at oracle is still pending)

bridgekeeper · 2023-03-17T15:20:21Z

Thank you! Please allow for up to two weeks to process your OCA, although it is usually done within one to two business days. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

shipilev · 2023-03-23T10:27:03Z

Interesting example!

However, the benchmark parameters (like all other params) are supposed to be immutable for the run. It is pure luck that all uses of opsPerInvocation happen after @Setup, and thus calculations are not subtly broken. This is not actually guaranteed. Trying to reconcile what to do when the benchmark code mutates its own configuration is a much larger task. (Things to think about: Do we allow to set opsPerInvocation multiple times during the run? Do we allow doing this in @Setup(Level.Invocation)?)

The usual answer for things that cannot be captured in declarative form via annotations is going to Java API and doing it in imperative form there. For your example, it would be something like (drafting from memory, might not readily compile):

    public static void main(String[] args) throws RunnerException {
        Options parent = new OptionsBuilder()
                .include(".*" + JMHSample_40_InfraParamNormalization.class.getSimpleName() + ".*")
                .addProfiler(LinuxPerfNormProfiler.class)
                .build();

        for (int size : new int[] { 100, 1000, 5000, 10000, 100000 }) {
            Options opts = new OptionsBuilder()
                  .parent(parent)
                  .param("size", String.valueOf(size))
                  .operationsPerInvocation(size)
                  .build();
             new Runner(opt).run();
        }
    }

mlbridge · 2023-03-24T10:05:05Z

Webrevs

00: Full (58bd6698)

bbrehm · 2023-03-24T16:54:49Z

Interesting example!

Thanks! I think the example is fun, the issue is pervasive in micro-benchmarking, and "branch history table stitching" (making up a word here) is not a very well-known effect. So I think some form of this example would be worthwhile to add, even without dynamic opsPerInvocation.

I could change the example to work with static opsPerInvocation, following the variant in https://discourse.julialang.org/t/psa-microbenchmarks-remember-branch-history/17436

This has the advantage of obviously excluding memory as the source of the effect (no one can claim that this is due to the working set still being in L1 from the last benchmark iteration), and the disadvantage of being less relatable: Iterating over a length 100k array that has 100 copies of the same random length 1000 pattern looks unnatural, and furthermore obscures the effect that this is a genuine benchmarking artifact -- the CPU cheats because the branch history table remembers previous iterations from the hidden jmh-generated benchmarking loop.

Things to think about: Do we allow to set opsPerInvocation multiple times during the run? Do we allow doing this in @setup(Level.Invocation)?

True enough. We'd need to document the specifics of the implementation (which is: You can overwrite opsPerInvocation as often as you want, but only the last write counts). That issue alone could make this PR non-workable as stable public API :(

The usual answer for things that cannot be captured in declarative form via annotations is going to Java API ...

I'd consider doing that instead if there was a sensible API to individually configure benchmarks and combine them into a single Runner for output formatting. Alas, almost all relevant functions in Runner are private instead of protected.

Nevertheless, I think figuring out or documenting a low-effort way of proper rescaling / dynamic opsPerInvocation would be worthwhile.

Maybe the rescaling operation happening in the generated functions like public BenchmarkTaskResult Foo_Throughput(InfraControl control, ThreadParams threadParams) throws Throwable could get a hook?

How would e.g. you use jmh to benchmark a piece of complex code, where performance is meaningful only relative to a corpus of test-data (> 100 mb), like e.g. parse a stream of webserver logs?

Obviously the test-data cannot be known at compile time, and obviously some kind of normalization is necessary, otherwise the time will only reflect the size of the corpus (and it's up to the user to select a corpus that matches the intended deployment, and to experiment to decide between cycles-per-byte, messages-per-second or whatever metric turns out to be most meaningful).

Using reflection to modify opsPerInvocation is good enough for my needs at the moment, now that I figured out how to do that. But I'd feel very dirty recommending that to other people.

bridgekeeper · 2023-12-05T01:41:15Z

@bbrehm This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper · 2024-01-02T06:51:44Z

@bbrehm This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

bbrehm added 2 commits March 17, 2023 10:10

add example, using reflection

1214ce4

add setter function for opsPerInvocation; use it in example

58bd669

bridgekeeper bot added the oca Needs verification of OCA signatory status label Mar 17, 2023

bridgekeeper bot added the oca-verify Needs verification of OCA signatory status label Mar 17, 2023

bbrehm mentioned this pull request Mar 20, 2023

Add first proper benchmarks joernio/flatgraph#17

Merged

bridgekeeper bot removed oca Needs verification of OCA signatory status oca-verify Needs verification of OCA signatory status labels Mar 24, 2023

openjdk bot added the rfr Pull request is ready for review label Mar 24, 2023

bridgekeeper bot closed this Jan 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Allow setting of opsPerInvocation from setup functions #97

Allow setting of opsPerInvocation from setup functions #97

Uh oh!

bbrehm commented Mar 17, 2023 •

edited by openjdk bot

Loading

Uh oh!

bridgekeeper bot commented Mar 17, 2023

Uh oh!

bbrehm commented Mar 17, 2023

Uh oh!

bridgekeeper bot commented Mar 17, 2023

Uh oh!

shipilev commented Mar 23, 2023

Uh oh!

mlbridge bot commented Mar 24, 2023

Uh oh!

bbrehm commented Mar 24, 2023

Uh oh!

bridgekeeper bot commented Dec 5, 2023

Uh oh!

bridgekeeper bot commented Jan 2, 2024

Uh oh!

Uh oh!

Allow setting of opsPerInvocation from setup functions #97

Allow setting of opsPerInvocation from setup functions #97

Uh oh!

Conversation

bbrehm commented Mar 17, 2023 • edited by openjdk bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Progress

Reviewing

Webrev

Uh oh!

bridgekeeper bot commented Mar 17, 2023

Uh oh!

bbrehm commented Mar 17, 2023

Uh oh!

bridgekeeper bot commented Mar 17, 2023

Uh oh!

shipilev commented Mar 23, 2023

Uh oh!

mlbridge bot commented Mar 24, 2023

Webrevs

Uh oh!

bbrehm commented Mar 24, 2023

Uh oh!

bridgekeeper bot commented Dec 5, 2023

Uh oh!

bridgekeeper bot commented Jan 2, 2024

Uh oh!

Uh oh!

bbrehm commented Mar 17, 2023 •

edited by openjdk bot

Loading