Investigate and implement locals compaction on Power, AArch64 #5910

Open
0xdaryl opened this issue May 28, 2019 · 8 comments

Comments

@0xdaryl
Contributor

0xdaryl commented May 28, 2019

I noticed that the JIT Power codegen doesn't support the compact locals optimization (where the JIT frame is compacted by sharing non-interfering autos). Was that a conscious decision (e.g., it wouldn't be worth the effort because there are so many registers), or is it just an oversight? @gita-omr @zl-wang

Ditto for ARM and AArch64.
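For anyone unfamiliar with the optimization, here is a minimal sketch of the idea, not the OMR implementation. It assumes each auto has a single linear live range, whereas a real JIT would derive interference from liveness analysis over the CFG. `LocalInfo`, `interferes`, `compactLocals`, and `slotsUsed` are hypothetical names invented for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of an auto: live over the half-open range [start, end).
struct LocalInfo {
    int start;
    int end;
};

// Two locals interfere when their live ranges overlap.
inline bool interferes(const LocalInfo &a, const LocalInfo &b) {
    return a.start < b.end && b.start < a.end;
}

// Greedy assignment: each local gets the lowest-numbered slot that is not
// already used by an interfering, previously placed local.
// Returns the slot index chosen for each local.
inline std::vector<int> compactLocals(const std::vector<LocalInfo> &locals) {
    std::vector<int> slotOf(locals.size(), -1);
    for (std::size_t i = 0; i < locals.size(); ++i) {
        std::vector<bool> taken(locals.size(), false);
        for (std::size_t j = 0; j < i; ++j) {
            if (interferes(locals[i], locals[j]))
                taken[slotOf[j]] = true;
        }
        int slot = 0;
        while (taken[slot])
            ++slot;
        slotOf[i] = slot;
    }
    return slotOf;
}

// Number of distinct frame slots needed after compaction.
inline int slotsUsed(const std::vector<int> &slotOf) {
    int maxSlot = -1;
    for (int s : slotOf) {
        if (s > maxSlot)
            maxSlot = s;
    }
    return maxSlot + 1;
}
```

With four locals live over [0,4), [2,6), [5,9), and [7,10), this sketch places them in two slots instead of four, because the first and third (and the second and fourth) never overlap.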

@zl-wang
Contributor

zl-wang commented May 28, 2019

My impression is that it was mainly the former (i.e., there are so many registers). Also, a comprehensive test of stack frame size was conducted around the time 64-bit support was introduced: even the bigger stack frames in 64-bit accounted for only a small portion of the 64-bit performance impact. The major performance issue was largely the effect of bigger objects on cache capacity efficiency.

@0xdaryl
Contributor Author

0xdaryl commented May 28, 2019

If the last investigation was when 64-bit was introduced, that was 15 years ago. I think the workloads of interest to OpenJ9 have changed: profiles have gotten flatter and stacks deeper. Do you think it's worth repeating the investigation?

@zl-wang
Contributor

zl-wang commented May 28, 2019

It might be an interesting item for one of the students. Let's gather some data on how many stack slots can typically be saved with compaction on, and on the resulting performance impact.

@gita-omr
Contributor

Yes, I think it's worth trying. @0xdaryl you mean CompactLocals done by the optimizer?

@0xdaryl
Contributor Author

0xdaryl commented May 28, 2019

Yes. It's probably worth doing the investigation on either X or Z, where the optimization is enabled. Locals should be the same across architectures (i.e., the optimization doesn't work on spill temps or other local artifacts created by backends). On X86 there is some code in OMR::Linkage::mapCompactedStack, guarded by DEBUG macros, that can report some statistics on how the compaction went. It's dusty code, but it can probably be resurrected pretty easily.
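As a rough illustration of the kind of statistics that would be useful here (slots before and after compaction), a sketch might look like the following. This is not the code in OMR::Linkage::mapCompactedStack. `CompactionStats`, `summarize`, `report`, and the `COMPACT_LOCALS_STATS` macro are hypothetical names, and the sketch assumes every local occupies exactly one slot before compaction.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical per-method summary of a compaction run.
struct CompactionStats {
    std::size_t localsBefore; // one slot per local before compaction (assumption)
    std::size_t slotsAfter;   // distinct slots in use after compaction
    std::size_t slotsSaved;   // localsBefore - slotsAfter
};

// slotOf[i] is the slot index assigned to local i (0-based, densely numbered).
inline CompactionStats summarize(const std::vector<int> &slotOf) {
    int maxSlot = -1;
    for (int s : slotOf) {
        if (s > maxSlot)
            maxSlot = s;
    }
    CompactionStats stats;
    stats.localsBefore = slotOf.size();
    stats.slotsAfter = static_cast<std::size_t>(maxSlot + 1);
    stats.slotsSaved = stats.localsBefore - stats.slotsAfter;
    return stats;
}

// Prints the summary only when the (hypothetical) stats macro is defined,
// mirroring the idea of debug-only reporting.
inline void report(const char *methodName, const CompactionStats &stats) {
#ifdef COMPACT_LOCALS_STATS
    std::fprintf(stderr, "%s: %zu locals -> %zu slots (%zu saved)\n",
                 methodName, stats.localsBefore, stats.slotsAfter,
                 stats.slotsSaved);
#else
    (void)methodName;
    (void)stats;
#endif
}
```

Aggregating `slotsSaved` per method across a workload would give the "how many slots can be saved" number asked about above; the performance impact would still need to be measured separately.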

0xdaryl changed the title from "Locals compaction on Power, AArch64 ?" to "Investigate and implement locals compaction on Power, AArch64" on Oct 10, 2019

@fjeremic
Contributor

fjeremic commented Jun 4, 2020

IMO if any work is done, this feature needs to sink down into OMR and be made cross-platform. Nothing about locals compaction is really codegen-specific (maybe some corner cases?). See eclipse-omr/omr#5284 for details.

@0xdaryl
Contributor Author

0xdaryl commented Jun 4, 2020

I agree, as much as possible. There are some language-specific interference issues that may need to be abstracted away (e.g., can a reference and an integer share the same slot? Some languages might say yes, others disallow it).
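One way to abstract that away, sketched here under assumptions and not taken from OMR, is to let the language front end supply a sharing policy that the common compaction code consults alongside live-range overlap. `SlotKind`, `TypedLocal`, `SharePolicy`, `strictPolicy`, `permissivePolicy`, and `mustNotShare` are all hypothetical names.

```cpp
#include <functional>

// Hypothetical classification of what a local holds.
enum class SlotKind { Reference, Integer };

// Hypothetical local with a half-open live range [start, end) and a kind.
struct TypedLocal {
    int start;
    int end;
    SlotKind kind;
};

// Language-supplied hook: may locals of these two kinds ever share a slot?
using SharePolicy = std::function<bool(SlotKind, SlotKind)>;

// A language that forbids mixing references and integers in one slot.
inline bool strictPolicy(SlotKind a, SlotKind b) { return a == b; }

// A language that allows any two kinds to share a slot.
inline bool permissivePolicy(SlotKind, SlotKind) { return true; }

// Common code: two locals must be kept apart if their live ranges overlap
// or if the language policy disallows sharing between their kinds.
inline bool mustNotShare(const TypedLocal &a, const TypedLocal &b,
                         const SharePolicy &mayShare) {
    bool overlap = a.start < b.end && b.start < a.end;
    return overlap || !mayShare(a.kind, b.kind);
}
```

Under `strictPolicy`, a reference live over [0,4) and an integer live over [5,9) are kept in separate slots even though they never overlap; under `permissivePolicy` they can share one.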

@knn-k
Contributor

knn-k commented Mar 7, 2022

@Akira1Saitoh has implemented and enabled locals compaction on AArch64 in eclipse-omr/omr#6387 and related PRs.


5 participants