[MachineCopyPropagation] Check CrossCopyRegClass for cross-class copys
On some AMDGPU subtargets, copying to and from AGPR registers using another AGPR register is not possible. An intermediate VGPR register is needed for an AGPR to AGPR copy. This is an issue when machine copy propagation forwards a COPY $agpr, replacing a COPY $vgpr, which results in $agpr = COPY $agpr. It removes a cross-class copy that may have been optimized by previous passes and potentially creates an unoptimized cross-class copy later on.

To avoid this issue, check CrossCopyRegClass to see whether a different register class will be needed for the copy. If so, avoid forwarding the copy when the destination does not match the desired register class and the original copy already matches the desired register class.

Issue seen while attempting to optimize another AGPR to AGPR issue:

  Live-ins: $agpr0
  $vgpr0 = COPY $agpr0
  $agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
  $agpr2 = COPY $vgpr0
  $agpr3 = COPY $vgpr0
  $agpr4 = COPY $vgpr0

After machine-cp:

  $vgpr0 = COPY $agpr0
  $agpr1 = V_ACCVGPR_WRITE_B32 $vgpr0
  $agpr2 = COPY $agpr0
  $agpr3 = COPY $agpr0
  $agpr4 = COPY $agpr0

Machine-cp propagated COPY $agpr0 to replace $vgpr0, creating 3 AGPR to AGPR copies. Later this creates a cross-register copy from AGPR->VGPR->AGPR for each copy when the prior VGPR->AGPR copy was already optimal.

Reviewed By: lkail, rampitec

Differential Revision: https://reviews.llvm.org/D108011
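The forwarding rule in the commit message can be sketched as a small standalone model. This is not the code from the patch: `RegClass`, `Subtarget`, `crossCopyRegClass` and `shouldForwardCopy` are hypothetical names invented for illustration, and the model assumes only what the message and the test expectations state, namely that gfx908 needs a VGPR intermediate for an AGPR to AGPR copy while gfx90a does not.

```cpp
// Toy model of the forwarding check. All names here are hypothetical
// stand-ins for illustration; none of them are LLVM APIs.

enum class RegClass { VGPR, AGPR };
enum class Subtarget { GFX908, GFX90A };

// Register class that a copy out of SrcRC has to go through.
// Assumption: on gfx908 an AGPR source needs a VGPR intermediate;
// everywhere else in this model a direct copy is possible.
constexpr RegClass crossCopyRegClass(RegClass SrcRC, Subtarget ST) {
  return (SrcRC == RegClass::AGPR && ST == Subtarget::GFX908)
             ? RegClass::VGPR
             : SrcRC;
}

// Given the pair
//   CopyDst = COPY CopySrc
//   UseDst  = COPY CopyDst
// decide whether the second copy may be rewritten to UseDst = COPY CopySrc.
// Forwarding is refused only when the source needs a cross-class copy,
// the use's destination is not in that class, and the original copy's
// destination already is.
constexpr bool shouldForwardCopy(RegClass CopySrcRC, RegClass CopyDstRC,
                                 RegClass UseDstRC, Subtarget ST) {
  return crossCopyRegClass(CopySrcRC, ST) == CopySrcRC ||
         UseDstRC == crossCopyRegClass(CopySrcRC, ST) ||
         CopyDstRC != crossCopyRegClass(CopySrcRC, ST);
}

// The AGPR -> VGPR -> AGPR case from the commit message:
// kept as-is on gfx908, forwarded on gfx90a.
static_assert(!shouldForwardCopy(RegClass::AGPR, RegClass::VGPR,
                                 RegClass::AGPR, Subtarget::GFX908), "");
static_assert(shouldForwardCopy(RegClass::AGPR, RegClass::VGPR,
                                RegClass::AGPR, Subtarget::GFX90A), "");
```

The functions are `constexpr` so the cases can be checked at compile time with `static_assert`. In the model, a VGPR source is always forwarded, and an AGPR source is still forwarded into a VGPR destination on gfx908 because that copy needs no intermediate register.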
1 parent 2a35d59 commit 549f6a8
Showing 4 changed files with 110 additions and 3 deletions.
@@ -0,0 +1,70 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -march=amdgcn -mcpu=gfx908 %s -o - -run-pass machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX908 %s
# RUN: llc -march=amdgcn -mcpu=gfx90a %s -o - -run-pass machine-cp -verify-machineinstrs | FileCheck -check-prefix=GFX90A %s

---
name: do_not_propagate_agpr_to_agpr
body: |
  bb.0:
    successors:
    liveins: $agpr0
    ; GFX908-LABEL: name: do_not_propagate_agpr_to_agpr
    ; GFX908: renamable $vgpr0 = COPY renamable $agpr0, implicit $exec
    ; GFX908: renamable $agpr1 = COPY renamable $vgpr0, implicit $exec
    ; GFX908: renamable $agpr2 = COPY renamable $vgpr0, implicit $exec
    ; GFX908: S_ENDPGM 0, implicit $vgpr0, implicit $agpr1, implicit $agpr2
    ; GFX90A-LABEL: name: do_not_propagate_agpr_to_agpr
    ; GFX90A: renamable $vgpr0 = COPY renamable $agpr0, implicit $exec
    ; GFX90A: renamable $agpr1 = COPY $agpr0, implicit $exec
    ; GFX90A: renamable $agpr2 = COPY $agpr0, implicit $exec
    ; GFX90A: S_ENDPGM 0, implicit $vgpr0, implicit $agpr1, implicit $agpr2
    renamable $vgpr0 = COPY renamable $agpr0, implicit $exec
    renamable $agpr1 = COPY renamable $vgpr0, implicit $exec
    renamable $agpr2 = COPY renamable $vgpr0, implicit $exec
    S_ENDPGM 0, implicit $vgpr0, implicit $agpr1, implicit $agpr2
...
---
name: propagate_vgpr_to_agpr
body: |
  bb.0:
    successors:
    liveins: $vgpr0
    ; GFX908-LABEL: name: propagate_vgpr_to_agpr
    ; GFX908: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
    ; GFX908: renamable $agpr1 = COPY $vgpr0, implicit $exec
    ; GFX908: renamable $agpr2 = COPY $vgpr0, implicit $exec
    ; GFX908: S_ENDPGM 0, implicit $agpr0, implicit $agpr1, implicit $agpr2
    ; GFX90A-LABEL: name: propagate_vgpr_to_agpr
    ; GFX90A: renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
    ; GFX90A: renamable $agpr1 = COPY $vgpr0, implicit $exec
    ; GFX90A: renamable $agpr2 = COPY $vgpr0, implicit $exec
    ; GFX90A: S_ENDPGM 0, implicit $agpr0, implicit $agpr1, implicit $agpr2
    renamable $agpr0 = COPY renamable $vgpr0, implicit $exec
    renamable $agpr1 = COPY renamable $agpr0, implicit $exec
    renamable $agpr2 = COPY renamable $agpr0, implicit $exec
    S_ENDPGM 0, implicit $agpr0, implicit $agpr1, implicit $agpr2
...
---
name: propagate_agpr_to_vgpr
body: |
  bb.0:
    successors:
    liveins: $agpr0
    ; GFX908-LABEL: name: propagate_agpr_to_vgpr
    ; GFX908: renamable $vgpr0 = COPY renamable $agpr0, implicit $exec
    ; GFX908: renamable $vgpr1 = COPY $agpr0, implicit $exec
    ; GFX908: renamable $vgpr2 = COPY $agpr0, implicit $exec
    ; GFX908: S_ENDPGM 0, implicit $vgpr0, implicit $vgpr1, implicit $vgpr2
    ; GFX90A-LABEL: name: propagate_agpr_to_vgpr
    ; GFX90A: renamable $vgpr0 = COPY renamable $agpr0, implicit $exec
    ; GFX90A: renamable $vgpr1 = COPY $agpr0, implicit $exec
    ; GFX90A: renamable $vgpr2 = COPY $agpr0, implicit $exec
    ; GFX90A: S_ENDPGM 0, implicit $vgpr0, implicit $vgpr1, implicit $vgpr2
    renamable $vgpr0 = COPY renamable $agpr0, implicit $exec
    renamable $vgpr1 = COPY renamable $vgpr0, implicit $exec
    renamable $vgpr2 = COPY renamable $vgpr0, implicit $exec
    S_ENDPGM 0, implicit $vgpr0, implicit $vgpr1, implicit $vgpr2
...