Skip to content

Conversation

@preames
Copy link
Collaborator

@preames preames commented Sep 25, 2025

This heuristic was originally added in 40c4aa with the stated purpose of avoiding global split on live long ranges created by MachineLICM hoisting trivially rematerializable instructions. In the meantime, various backends have introduced non-trivial rematerialization cases, MachineLICM gained an explicitly triviality check, and we've reworked our APIs to match naming wise. Let's move this heuristic back to truely trivial remat only.

This is a functional change, though somewhat hard to hit. This change will cause non-trivially rematerializable instructions to be globally split more often. This is likely a good thing since non-trivial remat may not be legal at all possible points in the live interval, but may cost slightly more compile time.

I don't have a motivating example; I found it when reviewing the callers of isRemMaterializable(MI).

This heuristic was originally added in 40c4aa with the stated purpose
of avoiding global split on live long ranges created by MachineLICM
hoisting trivially rematerializable instructions.  In the meantime,
various backends have introduced non-trivial rematerialization cases,
MachineLICM gained an explicitly triviality check, and we've reworked
our APIs to match naming wise.  Let's move this heuristic back to
truely trivial remat only.

This is a functional change, though somewhat hard to hit.  This
change will cause non-trivially rematerializable instructions
to be globally split more often.  This is likely a good thing
since non-trivial remat may not be legal at all possible points
in the live interval, but may cost slightly more compile time.

I don't have a motivating example; I found it when reviewing the
callers of isRemMaterializable(MI).
@preames
Copy link
Collaborator Author

preames commented Sep 25, 2025

@lukel97 You appear to have good infrastructure for looking at spill/fill impacts, would you mind sanity checking this one just to make sure there's no unexpected negative interactions? Not expecting any, but well, RegAlloc is famous for the unexpected.

Copy link
Contributor

@lukel97 lukel97 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, tested and there were no codegen/spills/fills changes on llvm test suite or SPEC CPU 2017 on rva23u64 -O3.

@preames preames merged commit 84df412 into llvm:main Sep 26, 2025
11 checks passed
@preames preames deleted the pr-remat-region-split-heuristic branch September 26, 2025 13:53
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
)

This heuristic was originally added in 40c4aa with the stated purpose of
avoiding global split on live long ranges created by MachineLICM
hoisting trivially rematerializable instructions. In the meantime,
various backends have introduced non-trivial rematerialization cases,
MachineLICM gained an explicitly triviality check, and we've reworked
our APIs to match naming wise. Let's move this heuristic back to truely
trivial remat only.

This is a functional change, though somewhat hard to hit. This change
will cause non-trivially rematerializable instructions to be globally
split more often. This is likely a good thing since non-trivial remat
may not be legal at all possible points in the live interval, but may
cost slightly more compile time.

I don't have a motivating example; I found it when reviewing the callers
of isRemMaterializable(MI).
lukel97 added a commit that referenced this pull request Oct 4, 2025
#159211)

In the register allocator we define non-trivial rematerialization as the
rematerlization of an instruction with virtual register uses.

We have been able to perform non-trivial rematerialization for a while,
but it has been prevented by default unless specifically overriden by
the target in `TargetTransformInfo::isReMaterializableImpl`. The
original reasoning for this given by the comment in the default
implementation is because we might increase a live range of the virtual
register, but we don't actually do this.
LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize
instructions whose virtual registers are already live at the use sites.

https://reviews.llvm.org/D106408 had originally tried to remove this
restriction but it was reverted after some performance regressions were
reported. We think it is likely that the regressions were caused by the
fact that the old isTriviallyReMaterializable API sometimes returned
true for non-trivial rematerializations.

However #160377 recently split
the API out into a separate non-trivial and trivial version and updated
the call-sites accordingly, and
#160709 and #159180 fixed
heuristics which weren't accounting for the difference between
non-trivial and trivial.

With these fixes in place, this patch proposes to again allow
non-trivial rematerialization by default which reduces a significant
amount of spills and reloads across various targets.

For llvm-test-suite built with -O3 -flto, we get the following geomean
reduction in reloads:

- arm64-apple-darwin: 11.6%
- riscv64-linux-gnu: 8.1%
- x86_64-linux-gnu: 6.5%
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Oct 4, 2025
…erialization (#159211)

In the register allocator we define non-trivial rematerialization as the
rematerlization of an instruction with virtual register uses.

We have been able to perform non-trivial rematerialization for a while,
but it has been prevented by default unless specifically overriden by
the target in `TargetTransformInfo::isReMaterializableImpl`. The
original reasoning for this given by the comment in the default
implementation is because we might increase a live range of the virtual
register, but we don't actually do this.
LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize
instructions whose virtual registers are already live at the use sites.

https://reviews.llvm.org/D106408 had originally tried to remove this
restriction but it was reverted after some performance regressions were
reported. We think it is likely that the regressions were caused by the
fact that the old isTriviallyReMaterializable API sometimes returned
true for non-trivial rematerializations.

However llvm/llvm-project#160377 recently split
the API out into a separate non-trivial and trivial version and updated
the call-sites accordingly, and
llvm/llvm-project#160709 and #159180 fixed
heuristics which weren't accounting for the difference between
non-trivial and trivial.

With these fixes in place, this patch proposes to again allow
non-trivial rematerialization by default which reduces a significant
amount of spills and reloads across various targets.

For llvm-test-suite built with -O3 -flto, we get the following geomean
reduction in reloads:

- arm64-apple-darwin: 11.6%
- riscv64-linux-gnu: 8.1%
- x86_64-linux-gnu: 6.5%
aokblast pushed a commit to aokblast/llvm-project that referenced this pull request Oct 6, 2025
llvm#159211)

In the register allocator we define non-trivial rematerialization as the
rematerlization of an instruction with virtual register uses.

We have been able to perform non-trivial rematerialization for a while,
but it has been prevented by default unless specifically overriden by
the target in `TargetTransformInfo::isReMaterializableImpl`. The
original reasoning for this given by the comment in the default
implementation is because we might increase a live range of the virtual
register, but we don't actually do this.
LiveRangeEdit::allUsesAvailableAt makes sure that we only rematerialize
instructions whose virtual registers are already live at the use sites.

https://reviews.llvm.org/D106408 had originally tried to remove this
restriction but it was reverted after some performance regressions were
reported. We think it is likely that the regressions were caused by the
fact that the old isTriviallyReMaterializable API sometimes returned
true for non-trivial rematerializations.

However llvm#160377 recently split
the API out into a separate non-trivial and trivial version and updated
the call-sites accordingly, and
llvm#160709 and llvm#159180 fixed
heuristics which weren't accounting for the difference between
non-trivial and trivial.

With these fixes in place, this patch proposes to again allow
non-trivial rematerialization by default which reduces a significant
amount of spills and reloads across various targets.

For llvm-test-suite built with -O3 -flto, we get the following geomean
reduction in reloads:

- arm64-apple-darwin: 11.6%
- riscv64-linux-gnu: 8.1%
- x86_64-linux-gnu: 6.5%
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants