-
Notifications
You must be signed in to change notification settings - Fork 720
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Escape Analysis under OSR #5737
Improve Escape Analysis under OSR #5737
Conversation
This change is dependent upon OMR pull request 3841 Vijay @vijaysun-omr, may I ask you to review this change? Andrew @andrewcraik, may I ask you to verify my descriptions of your improvements? 2019-05-13: Removed WIP from title, as OMR pull request 3841 has been merged. |
e450af0
to
c31e473
Compare
Force pushed to c31e473 in order to rectify copyright problem |
CI builds were failing because I had failed to include |
description is correct - review from @vijaysun-omr is required since I was involved with the development of this change. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note - we need @vijaysun-omr to review since I was involved with the implementation. I verify the docs are correct and the implementation looks reasonable.
Looks reasonable to me. What kind of testing has been done on this change ? |
@hzongaro can you comment on the testing? |
@vijaysun-omr Vijay asked,
Thanks for reviewing, Vijay. I ran internal "gold" testing (level.sanity, level.regression, level.promotion) on all platforms. I also tested this change with OMR pull request #3842 against the benchmark referenced in issue #2072. Unfortunately, even after this change, there still seems to be lingering heap allocation of |
5aedfd0
to
5c9af1b
Compare
I wrote,
I think the lingering object allocation that I mentioned was due to a "fix" I made for an infinite recursion. I will mark this pull request as WIP while I fix the problem, so the intended improvement to the Escape Analysis optimization isn't hobbled. |
Is this PR functionally correct? Does it improve the EA and the amount of objets that get stack allocated? If the answer is yes to those questions then I do not see why we do not merge this PR and the result of the investigation into further refinements can not happen in another PR. |
@hzongaro could you please answer ? |
@charliegracie Charlie, I believe the PR is functionally correct, improves EA and reduces the amount of objects that get stack allocated. My concern is that Andrew's primary motivation for these changes was to improve on the number of objects stack allocated for issue #2072. In testing the change with the benchmark snippet that appeared in the description from issue #2072 -- which I've reproduced below -- I verified that the objects in the loop were stack allocated with Andrew's changes.
However, last week I went back to the original benchmark link (which now seems to be broken) in the issue. There, the relevant code in the benchmark was slightly different:
With the addition of those boxed So, although the stack allocations now happen with the version of the code that appeared in issue #2072, the fact that there was no improvement in stack allocation using the original version of the benchmark led me to move this pull request back to WIP. If you feel it's still worthwhile merging this change while the work on ensuring the objects in the original version of the benchmark can be stack allocated continues, I would be happy to remove the WIP from the title. Please accept my apologies for failing to notice this difference earlier. |
@hzongaro I think it is worth putting this in. Getting the |
OK. Removing WIP from the title. |
Jenkins test sanity xlinux,plinux,win jdk8,jdk11 |
Hi, Vijay @vijaysun-omr - I'm still working on tracking down the source of the intermittent crash from |
Sorry for the long delay on updating this. I've marked this change as a work in progress. As was mentioned in #5737 (comment), there is intermittent failure that occurs with this change. The problem is in unrelated code, but happens to exposed by this change. OpenJ9 pull request #6626, OMR pull request #4177 and OMR pull request #4182 address that problem. Once they are delivered, I will remove the WIP prefix from this pull request. |
bfef953
to
69739bb
Compare
Just force pushed rebasing the branch on the latest master. No changes otherwise. |
Jenkins test sanity xlinux,win,plinux jdk8,jdk11 |
have started sanity since the sequence of changes needed to fix the OSR metadata bug exposed by this work has now landed. |
Removed WIP prefix, because as Andrew mentioned, the bug this change exposes has been fixed. |
@hzongaro I don't think we should merge this until the new Pre/Post are added to the opt strategy in this commit - the danger is that we would keep running regular EA and ignore the 'real' prepares thinking the fake prepares will save us, but we won't have added them |
Thanks, Andrew @andrewcraik. So I think that means that I need to separate out the part of this change that adds PreEscapeAnalysis and PostEscapeAnalysis. Then after that's merged, OMR pull request #3842, which adds them to the optimization strategy can be merged, and finally, the remainder of this change, with the actual EA changes, can be delivered. Do I have that right? |
Yes I think so - we can't change main EA until we have pre/post running, but it is safe to run pre/post with existing EA - you just see more escapes (escapes that would have happened anyway) |
69739bb
to
5791293
Compare
Reorganized changes so that PreEscapeAnaysis and PostEscapeAnalysis are available and running when the Escape Analysis changes are delivered, as Andrew recommends above. This makes this change now dependent upon pull request #6654 and OMR pull request #3842. |
394c9ea
to
f3f7383
Compare
Just force pushed to bring the |
The preEscapeAnalysis optimization, which was introduced in a previous change, looks for calls to OSRInductionHelper and adds a fake prepareForOSR call that references all live auto symrefs and pending pushes. Any object that ends up as a candidate for stack allocation that appears to be used by such a fake prepareForOSR call can be heapified at that point by Escape Analysis. During EA itself, those references on prepareForOSR calls are marked as ignoreable for the purposes of determining whether an object can be stack allocated. Submitted on behalf of Andrew Craik <ajcraik@ca.ibm.com> Signed-off-by: Henry Zongaro <zongaro@ca.ibm.com>
The code in findIgnoreableUses that populates the TR_BitVector, _ignoreableUses, uses the global indices of nodes to index the TR_BitVector. Code in checkDefsAndUses was using the value numbers of nodes to check the entries in _ignoreableUses. Fixed checkDefsAndUses to use nodes' global indices instead. Also, reorganized the code checking _ignoreableUses to pull the invariant check out of a loop. Signed-off-by: Henry Zongaro <zongaro@ca.ibm.com>
f3f7383
to
3fb2b2a
Compare
Andrew @andrewcraik, I've brought this branch up to date with the latest changes in OpenJ9. Before this change is sanity tested, we'll need OMR pull request 3842 to make it through the OpenJ9 OMR acceptance testing. I will leave this marked as a work in progress until then. |
@hzongaro I believe 3842 has merged and been accepted. If you could confirm and un-WIP I'll restart testing on this change. |
Removed WIP prefix now that OMR pull request 3842 has been merged and passed OpenJ9 OMR acceptance testing. |
Jenkins test sanity all jdk8,jdk11 |
This change defines new optimization passes,
preEscapeAnalysis
andpostEscapeAnalysis
, that help Escape Analysis (EA) do better under voluntary OSR.The new
preEscapeAnalysis
optimization looks for calls toOSRInductionHelper
and adds a fakeprepareForOSR
call that references all live auto symrefs and pending pushes. Any object that ends up as a candidate for stack allocation that appears to be used by such a fakeprepareForOSR
call can be heapified at that point.During EA itself, those references on
prepareForOSR
calls are marked as ignoreable for the purposes of determining whether an object can be stack allocated.After EA, the
postEscapeAnalysis
pass will remove the fake calls toprepareForOSR
.Submitted on behalf of Andrew Craik ajcraik@ca.ibm.com
Signed-off-by: Henry Zongaro zongaro@ca.ibm.com