Skip to content

Allow setting of opsPerInvocation from setup functions #97

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

bbrehm
Copy link

@bbrehm bbrehm commented Mar 17, 2023

In many cases, opsPerInvocation is known at the @setup function, but cannot be reasonably known at any other point. This example is somewhat lame in that regard; I actually needed that for some not-quite-micro benchmarks (the setup function loads real-ish data from disk, and the opsPerInvocation depends on the data that has been loaded).

Since every good example teaches multiple things, I added one that also warns about the branch history table stitching together surprisingly long patterns.

The super-duper correct way of doing this example (cf https://discourse.julialang.org/t/psa-microbenchmarks-remember-branch-history/17436) would be to have constant opsPerInvocation and one large array; then a smaller (@param) array is copied into that multiple times. Thus we could exclude effects of memory hierarchy. That would look quite unnatural and harder to relate to "normal looking" code (and there are no effects of memory anyway, since first, the benchmark is too slow to saturate DRAM bandwidth, and second even the largest set fits into L2).


Progress

  • Change must not contain extraneous whitespace
  • Change must be properly reviewed (1 review required, with at least 1 Committer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jmh.git pull/97/head:pull/97
$ git checkout pull/97

Update a local copy of the PR:
$ git checkout pull/97
$ git pull https://git.openjdk.org/jmh.git pull/97/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 97

View PR using the GUI difftool:
$ git pr show -t 97

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jmh/pull/97.diff

Webrev

Link to Webrev Comment

@bridgekeeper bridgekeeper bot added the oca Needs verification of OCA signatory status label Mar 17, 2023
@bridgekeeper
Copy link

bridgekeeper bot commented Mar 17, 2023

Hi @bbrehm, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user bbrehm" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

@bbrehm
Copy link
Author

bbrehm commented Mar 17, 2023

/signed
(but signature review at oracle is still pending)

@bridgekeeper bridgekeeper bot added the oca-verify Needs verification of OCA signatory status label Mar 17, 2023
@bridgekeeper
Copy link

bridgekeeper bot commented Mar 17, 2023

Thank you! Please allow for up to two weeks to process your OCA, although it is usually done within one to two business days. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

@shipilev
Copy link
Member

Interesting example!

However, the benchmark parameters (like all other params) are supposed to be immutable for the run. It is pure luck that all uses of opsPerInvocation happen after @Setup, and thus calculations are not subtly broken. This is not actually guaranteed. Trying to reconcile what to do when the benchmark code mutates its own configuration is a much larger task. (Things to think about: Do we allow to set opsPerInvocation multiple times during the run? Do we allow doing this in @Setup(Level.Invocation)?)

The usual answer for things that cannot be captured in declarative form via annotations is going to Java API and doing it in imperative form there. For your example, it would be something like (drafting from memory, might not readily compile):

    public static void main(String[] args) throws RunnerException {
        Options parent = new OptionsBuilder()
                .include(".*" + JMHSample_40_InfraParamNormalization.class.getSimpleName() + ".*")
                .addProfiler(LinuxPerfNormProfiler.class)
                .build();

        for (int size : new int[] { 100, 1000, 5000, 10000, 100000 }) {
            Options opts = new OptionsBuilder()
                  .parent(parent)
                  .param("size", String.valueOf(size))
                  .operationsPerInvocation(size)
                  .build();
             new Runner(opt).run();
        }
    }

@bridgekeeper bridgekeeper bot removed oca Needs verification of OCA signatory status oca-verify Needs verification of OCA signatory status labels Mar 24, 2023
@openjdk openjdk bot added the rfr Pull request is ready for review label Mar 24, 2023
@mlbridge
Copy link

mlbridge bot commented Mar 24, 2023

Webrevs

@bbrehm
Copy link
Author

bbrehm commented Mar 24, 2023

Interesting example!

Thanks! I think the example is fun, the issue is pervasive in micro-benchmarking, and "branch history table stitching" (making up a word here) is not a very well-known effect. So I think some form of this example would be worthwhile to add, even without dynamic opsPerInvocation.

I could change the example to work with static opsPerInvocation, following the variant in https://discourse.julialang.org/t/psa-microbenchmarks-remember-branch-history/17436

This has the advantage of obviously excluding memory as the source of the effect (no one can claim that this is due to the working set still being in L1 from the last benchmark iteration), and the disadvantage of being less relatable: Iterating over a length 100k array that has 100 copies of the same random length 1000 pattern looks unnatural, and furthermore obscures the effect that this is a genuine benchmarking artifact -- the CPU cheats because the branch history table remembers previous iterations from the hidden jmh-generated benchmarking loop.

Things to think about: Do we allow to set opsPerInvocation multiple times during the run? Do we allow doing this in @setup(Level.Invocation)?

True enough. We'd need to document the specifics of the implementation (which is: You can overwrite opsPerInvocation as often as you want, but only the last write counts). That issue alone could make this PR non-workable as stable public API :(

The usual answer for things that cannot be captured in declarative form via annotations is going to Java API ...

I'd consider doing that instead if there was a sensible API to individually configure benchmarks and combine them into a single Runner for output formatting. Alas, almost all relevant functions in Runner are private instead of protected.

Nevertheless, I think figuring out or documenting a low-effort way of proper rescaling / dynamic opsPerInvocation would be worthwhile.

Maybe the rescaling operation happening in the generated functions like public BenchmarkTaskResult Foo_Throughput(InfraControl control, ThreadParams threadParams) throws Throwable could get a hook?

How would e.g. you use jmh to benchmark a piece of complex code, where performance is meaningful only relative to a corpus of test-data (> 100 mb), like e.g. parse a stream of webserver logs?

Obviously the test-data cannot be known at compile time, and obviously some kind of normalization is necessary, otherwise the time will only reflect the size of the corpus (and it's up to the user to select a corpus that matches the intended deployment, and to experiment to decide between cycles-per-byte, messages-per-second or whatever metric turns out to be most meaningful).

Using reflection to modify opsPerInvocation is good enough for my needs at the moment, now that I figured out how to do that. But I'd feel very dirty recommending that to other people.

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 5, 2023

@bbrehm This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 2, 2024

@bbrehm This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rfr Pull request is ready for review
Development

Successfully merging this pull request may close these issues.

2 participants