8281518: New optimization: convert "(x|y)-(x^y)" into "x&y" #7395

Closed
wants to merge 4 commits into from

Conversation

CptGit
Contributor

@CptGit CptGit commented Feb 9, 2022

Convert (x|y)-(x^y) into x&y, in SubINode::Ideal and SubLNode::Ideal.
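The identity holds because the set bits of `x|y` split into two disjoint groups, those of `x^y` and those of `x&y`, so `(x|y) == (x^y) + (x&y)` with no carries, and subtracting `x^y` leaves `x&y` even under two's-complement wraparound. The following is a minimal sketch that checks this on edge cases and pseudo-random inputs; the class `SubOrXorIdentity` and its method names are hypothetical and are not part of the patch.

```java
import java.util.Random;

// Hypothetical helper, not part of the PR: checks that
// (x | y) - (x ^ y) == (x & y) for int and long operands.
final class SubOrXorIdentity {

    // The form the PR proposes to rewrite.
    static int original(int x, int y) { return (x | y) - (x ^ y); }
    static long original(long x, long y) { return (x | y) - (x ^ y); }

    // The form the PR proposes to produce.
    static int simplified(int x, int y) { return x & y; }
    static long simplified(long x, long y) { return x & y; }

    // Compares both forms on edge values and on pseudo-random inputs.
    static boolean holdsOnSamples(int samples, long seed) {
        long[] edges = {0L, 1L, -1L, Integer.MIN_VALUE, Integer.MAX_VALUE,
                        Long.MIN_VALUE, Long.MAX_VALUE};
        for (long x : edges) {
            for (long y : edges) {
                if (original(x, y) != simplified(x, y)) return false;
                // The casts select the int overloads.
                if (original((int) x, (int) y) != simplified((int) x, (int) y)) return false;
            }
        }
        Random rnd = new Random(seed);
        for (int i = 0; i < samples; i++) {
            int xi = rnd.nextInt(), yi = rnd.nextInt();
            long xl = rnd.nextLong(), yl = rnd.nextLong();
            if (original(xi, yi) != simplified(xi, yi)) return false;
            if (original(xl, yl) != simplified(xl, yl)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(holdsOnSamples(1_000_000, 42L));
    }
}
```

A sampled check like this is only a sanity test; the carry-free argument above is what actually justifies the rewrite.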

The results of the microbenchmark are as follows:

Baseline:
Benchmark                                Mode  Cnt  Score   Error  Units
SubIdeal_XOrY_Minus_XXorY_.baselineInt   avgt   60  0.481 ± 0.003  ns/op
SubIdeal_XOrY_Minus_XXorY_.baselineLong  avgt   60  0.482 ± 0.004  ns/op
SubIdeal_XOrY_Minus_XXorY_.testInt       avgt   60  0.901 ± 0.007  ns/op
SubIdeal_XOrY_Minus_XXorY_.testLong      avgt   60  0.894 ± 0.004  ns/op

Patch:
Benchmark                                Mode  Cnt  Score   Error  Units
SubIdeal_XOrY_Minus_XXorY_.baselineInt   avgt   60  0.480 ± 0.003  ns/op
SubIdeal_XOrY_Minus_XXorY_.baselineLong  avgt   60  0.483 ± 0.005  ns/op
SubIdeal_XOrY_Minus_XXorY_.testInt       avgt   60  0.600 ± 0.004  ns/op
SubIdeal_XOrY_Minus_XXorY_.testLong      avgt   60  0.602 ± 0.004  ns/op

Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8281518: New optimization: convert "(x|y)-(x^y)" into "x&y"

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/7395/head:pull/7395
$ git checkout pull/7395

Update a local copy of the PR:
$ git checkout pull/7395
$ git pull https://git.openjdk.java.net/jdk pull/7395/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 7395

View PR using the GUI difftool:
$ git pr show -t 7395

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/7395.diff

@bridgekeeper

bridgekeeper bot commented Feb 9, 2022

👋 Welcome back CptGit! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Feb 9, 2022

@CptGit The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the hotspot-compiler hotspot-compiler-dev@openjdk.org label Feb 9, 2022
@chhagedorn
Member

A bug has been filed https://bugs.openjdk.java.net/browse/JDK-8281518.

@CptGit CptGit changed the title from [TBD]: New optimization: convert "(x|y)-(x^y)" into "x&y" to 8281518: New optimization: convert "(x|y)-(x^y)" into "x&y" Feb 9, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Feb 9, 2022
@mlbridge

mlbridge bot commented Feb 9, 2022

Webrevs

@merykitty
Member

There is a large number of transformations in this bitwise operation family, such as (x & y) | (x ^ y) == x | y, (x & y) ^ (x | y) == x ^ y, (x ^ y) ^ (x | y) == x & y, etc., and that does not even touch operations involving 3 operands. It seems both gcc and clang perform these 2-operand bitwise transformations, so you could look at them to see if there is a more general way to achieve all combinations.
Thanks.
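The three identities listed above can be checked mechanically. The sketch below compares each pair of expressions over all pairs of 8-bit values; because these expressions use only bitwise operators, every bit position behaves independently, so agreement on small inputs carries over to full-width operands. The class `BitwiseFamilyCheck` is a hypothetical name and is not taken from gcc, clang, or HotSpot.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical checker for two-operand bitwise identities.
final class BitwiseFamilyCheck {

    // True if lhs and rhs agree for every pair of 8-bit values.
    static boolean equivalent(IntBinaryOperator lhs, IntBinaryOperator rhs) {
        for (int x = 0; x < 256; x++) {
            for (int y = 0; y < 256; y++) {
                if (lhs.applyAsInt(x, y) != rhs.applyAsInt(x, y)) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // (x & y) | (x ^ y) == x | y
        System.out.println(equivalent((x, y) -> (x & y) | (x ^ y), (x, y) -> x | y));
        // (x & y) ^ (x | y) == x ^ y
        System.out.println(equivalent((x, y) -> (x & y) ^ (x | y), (x, y) -> x ^ y));
        // (x ^ y) ^ (x | y) == x & y
        System.out.println(equivalent((x, y) -> (x ^ y) ^ (x | y), (x, y) -> x & y));
    }
}
```

The bit-independence argument does not cover expressions that mix in arithmetic, such as the subtraction in this PR, where carries and borrows link neighbouring bits.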

@adinn
Contributor

adinn commented Feb 9, 2022

I am not clear whether there is a justification for pushing this change. We are in danger of heading down the garden path looking for optimization fairies.

The above transformation adds extra case-handling overhead to the AD matcher (correction: the ideal code) when processing a Subtract node, which slows down compilation to a small degree for a relatively common case (most apps use subtraction). On the credit side, it may generate a small speed-up in generated code when the pattern is matched, with the saving depending not just on seeing this pattern but also on how often the resulting generated code gets executed. So, we have a trade-off.

For any app there are probably going to be a lot of times where the compiler matches subtract nodes. There are probably going to be very few cases where this pattern will turn up -- even if you include cases where it happens through recursive reduction -- and even less where the resulting generated code gets executed many times. At some point we need to trade off the compiler overhead for all applications against the potential gains for some applications. The micro-benchmark only addresses one side of that trade-off.

I'd really like to see a better justification for including this patch and the related transformations suggested by @merykitty before proceeding.

n.b. the fact that gcc and clang do this is not really a good argument. In Java the trade-off is one runtime cost against another which is not the case for those compilers.
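To make the compile-time side of this trade-off concrete, here is a toy sketch of the kind of shape test such a rewrite adds to the handling of a subtract node. It is written in Java over a made-up expression IR, whereas the real change lives in HotSpot's C++ `SubINode::Ideal` and `SubLNode::Ideal`. Every name in it (`ToySubIdeal`, `Node`, `Op`, `idealSub`) is hypothetical, and comparing operands by reference assumes that equal subexpressions are shared nodes.

```java
// Toy IR and rewrite; hypothetical names, not HotSpot code.
final class ToySubIdeal {
    enum Op { VAR, OR, XOR, AND, SUB }

    static final class Node {
        final Op op;
        final Node in1, in2;   // null for VAR nodes
        final String name;     // null for non-VAR nodes

        Node(Op op, Node in1, Node in2, String name) {
            this.op = op; this.in1 = in1; this.in2 = in2; this.name = name;
        }
        static Node var(String name) { return new Node(Op.VAR, null, null, name); }
        static Node of(Op op, Node a, Node b) { return new Node(op, a, b, null); }
    }

    // Returns (x & y) if sub has the shape (x | y) - (x ^ y), else null.
    // Each test below runs for subtract nodes that get this far, matching or not.
    static Node idealSub(Node sub) {
        if (sub.op != Op.SUB) return null;
        Node or = sub.in1, xor = sub.in2;
        if (or.op != Op.OR || xor.op != Op.XOR) return null;
        // Both operators are commutative, so accept either operand order.
        boolean same = or.in1 == xor.in1 && or.in2 == xor.in2;
        boolean swapped = or.in1 == xor.in2 && or.in2 == xor.in1;
        if (!same && !swapped) return null;
        return Node.of(Op.AND, or.in1, or.in2);
    }

    public static void main(String[] args) {
        Node x = Node.var("x"), y = Node.var("y");
        Node sub = Node.of(Op.SUB, Node.of(Op.OR, x, y), Node.of(Op.XOR, y, x));
        System.out.println(idealSub(sub).op);
    }
}
```

The non-matching path is short, but it is taken far more often than the matching one, which is the cost being weighed here.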

@merykitty
Member

merykitty commented Feb 9, 2022

Hi,

For clarification, my idea is to look at GCC and clang's codebases to see if there is a more general way to achieve every transformation elegantly instead of naively matching every combination, which may mitigate the cost for each additional transformation.

Thanks.

@theRealAph
Contributor

> I am not clear whether there is a justification for pushing this change. We are in danger of heading down the garden path looking for optimization fairies.

> n.b. the fact that gcc and clang do this is not really a good argument. In Java the trade-off is one runtime cost against another which is not the case for those compilers.

I agree. If we're doing this kind of optimization it makes little sense to do it piecemeal. Maybe, just maybe, there's some opportunity for some more general boolean simplification, but even then it's not clear how much of it is worth doing.

@CptGit
Contributor Author

CptGit commented Feb 9, 2022

> For any app there are probably going to be a lot of times where the compiler matches subtract nodes. There are probably going to be very few cases where this pattern will turn up -- even if you include cases where it happens through recursive reduction -- and even less where the resulting generated code gets executed many times. At some point we need to trade off the compiler overhead for all applications against the potential gains for some applications. The micro-benchmark only addresses one side of that trade-off.

Thanks for your input. I totally agree that a JIT compiler cares about compilation overhead far more than static compilers do, but I was wondering whether there is a good way to benchmark the general case, where this pattern is rarely seen. I know there are benchmark suites for Java such as SPECjvm or Renaissance, but I don't think they are a good fit here. What I wanted to ask is whether the community has an objective metric for deciding if a new optimization should be adopted.

@adinn
Contributor

adinn commented Feb 10, 2022

> I was wondering whether there is a good way to benchmark the general case, where this pattern is rarely seen

It's very difficult to find a way to assess the positive and negative aspects of a change like this. Micro-benchmarks only really provide a ballpark guide to the potential benefit because they test the effect of the change in isolation. Even then they only tell part of the story because they ignore the degree to which that benefit will be realized. The potential costs are even harder to estimate. They will vary from app to app according to what gets compiled and which paths the compilation takes. They will even vary from run to run of the same app because the JVM does not guarantee precise repeatability across restarts even if you keep all inputs the same.

For quite a few ideal transformations it is clear that they will be applicable very frequently and hence that they are worth implementing. That's often clear because we know that frequently used Java language constructs translate to graphs that will have a shape that matches the input checked for by the ideal code. In other cases, we can know that related ideal transforms will recursively combine to generate the target shape. For many other possible transforms we are in a grey area where we cannot know whether the cost of checking for them will be repaid in saved execution time.

@mlbridge

mlbridge bot commented Feb 11, 2022

Mailing list message from John Rose on hotspot-compiler-dev:

On 9 Feb 2022, at 8:38, Quan Anh Mai wrote:

> For clarification, my idea is to look at GCC and clang's codebases to
> see if there is a more general way to achieve every transformation
> elegantly instead of naively matching every combination, which may
> mitigate the cost for each additional transformation.

Yes, that thought occurred to me as well. It seems like we are on the
verge of seeing dozens of identities like the currently proposed one,
each with its own wad of hand-maintained optimizer code. On balance it
will make the optimizer harder to maintain. That would be OK if it got
us significant performance benefit, but surely it doesn't.

What *would* get us benefit in a cost-effective way would be to take
this optimization and several previous ones (and future ones) and
refactor them into a single comprehensive analysis based on truth
tables, which would normalize bitwise expressions up to some particular
complexity (say, up to three input variables and up to depth 2, with
variety of operators). This is a minor research project, but (I think)
a worthy one.

-- John
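One way to picture the truth-table idea is the small sketch below. It is restricted to two variables, whereas the message suggests up to three inputs and depth 2, and it is an illustration under stated assumptions rather than a proposed design. Each expression is summarised by a 4-bit table with one bit per combination of an x bit and a y bit, and a small lookup maps tables to a canonical form. Subtraction is not a bitwise operator, so the sketch handles `a - b` only when the table shows every set bit of `b` is also set in `a`; in that case no borrow can occur and the result equals `a & ~b`. The names `TruthTableSketch`, `canonical`, and `UNKNOWN` are hypothetical.

```java
// Hypothetical two-variable truth-table normaliser; not HotSpot code.
final class TruthTableSketch {
    // Rows, from high bit to low: (x=1,y=1), (x=1,y=0), (x=0,y=1), (x=0,y=0).
    static final int X = 0b1100;
    static final int Y = 0b1010;
    static final int MASK = 0b1111;
    static final int UNKNOWN = -1;   // expression not expressible as a table

    static boolean known(int a, int b) { return a != UNKNOWN && b != UNKNOWN; }

    static int and(int a, int b) { return known(a, b) ? (a & b) : UNKNOWN; }
    static int or(int a, int b)  { return known(a, b) ? (a | b) : UNKNOWN; }
    static int xor(int a, int b) { return known(a, b) ? (a ^ b) : UNKNOWN; }

    // a - b is treated as bitwise only when b's bits are a subset of a's,
    // because then the subtraction never borrows and equals a & ~b.
    static int sub(int a, int b) {
        if (!known(a, b) || (b & ~a & MASK) != 0) return UNKNOWN;
        return a & ~b & MASK;
    }

    // Maps a table to a canonical expression; null means "not covered here".
    static String canonical(int table) {
        switch (table) {
            case 0b0000: return "0";
            case 0b1000: return "x & y";
            case 0b0110: return "x ^ y";
            case 0b1110: return "x | y";
            case 0b1100: return "x";
            case 0b1010: return "y";
            default:     return null;
        }
    }

    public static void main(String[] args) {
        // (x | y) - (x ^ y)
        System.out.println(canonical(sub(or(X, Y), xor(X, Y))));
        // (x ^ y) ^ (x | y)
        System.out.println(canonical(xor(xor(X, Y), or(X, Y))));
    }
}
```

In this scheme the rewrite proposed in this PR and the identities mentioned earlier in the thread all fall out of one table lookup instead of one hand-written pattern each.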

@bridgekeeper

bridgekeeper bot commented Mar 11, 2022

@CptGit This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@CptGit CptGit marked this pull request as draft March 11, 2022 16:46
@openjdk openjdk bot removed the rfr Pull request is ready for review label Mar 11, 2022
@bridgekeeper

bridgekeeper bot commented May 6, 2022

@CptGit This pull request has been inactive for more than 16 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this May 6, 2022
Labels
hotspot-compiler hotspot-compiler-dev@openjdk.org

5 participants