Skip to content

[PPC] Lower unreachable IR instruction to a trap. #101379

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

mandlebug
Copy link
Contributor

Lower unreachables to traps so that we don't fallthrough to execute the wrong blocks or falloff the end of empty functions.

Most of the affected tests are updated to reflect the new behaviour, while some tests are modified to no longer use unreachables, with a couple of tests using a hidden option to disable emitting traps for unreachables to preserve the test coverage.

Lower unreachables to traps so that we don't fallthrough to execute
the wrong blocks or falloff the end of empty functions.

Most of the affected tests are updated to reflect the new behaviour,
while some tests are modified to no longer use unreachables, with a
couple of test using a hidden option to disable emitting traps for
unreachables to preserve the test coverage.
@llvmbot
Copy link
Member

llvmbot commented Jul 31, 2024

@llvm/pr-subscribers-backend-powerpc

Author: Sean Fertile (mandlebug)

Changes

Lower unreachables to traps so that we don't fallthrough to execute the wrong blocks or falloff the end of empty functions.

Most of the affected tests are updated to reflect the new behaviour, while some tests are modified to no longer use unreachables, with a couple of tests using a hidden option to disable emitting traps for unreachables to preserve the test coverage.


Full diff: https://github.com/llvm/llvm-project/pull/101379.diff

11 Files Affected:

  • (modified) llvm/lib/Target/PowerPC/PPCTargetMachine.cpp (+11)
  • (added) llvm/test/CodeGen/PowerPC/aix-ub-unreachable.ll (+18)
  • (modified) llvm/test/CodeGen/PowerPC/empty-functions.ll (+4-1)
  • (modified) llvm/test/CodeGen/PowerPC/fptoui-be-crash.ll (+2)
  • (modified) llvm/test/CodeGen/PowerPC/ifcvt-forked-bug-2016-08-08.ll (+6-2)
  • (modified) llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll (+4-3)
  • (modified) llvm/test/CodeGen/PowerPC/pr43527.ll (+2-2)
  • (modified) llvm/test/CodeGen/PowerPC/pr45448.ll (+7-10)
  • (modified) llvm/test/CodeGen/PowerPC/reduce_scalarization.ll (+6-6)
  • (modified) llvm/test/CodeGen/PowerPC/test-and-cmp-folding.ll (+2-1)
  • (modified) llvm/test/CodeGen/PowerPC/trunc-srl-load.ll (+1-1)
diff --git a/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp b/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
index 1ef891d1b677a..f74a354552803 100644
--- a/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
+++ b/llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
@@ -111,6 +111,11 @@ static cl::opt<bool> EnablePPCGenScalarMASSEntries(
              "(scalar) entries"),
     cl::Hidden);
 
+static cl::opt<bool> PPCDisableTrapOnUnreachable(
+  "ppc-disable-trap-on-unreachable", cl::init(false),
+  cl::desc("Disable lowering of unreachable IR intruction to a trap instuction"),
+           cl::Hidden);
+
 extern "C" LLVM_EXTERNAL_VISIBILITY void LLVMInitializePowerPCTarget() {
   // Register the targets
   RegisterTargetMachine<PPCTargetMachine> A(getThePPC32Target());
@@ -354,6 +359,12 @@ PPCTargetMachine::PPCTargetMachine(const Target &T, const Triple &TT,
       TLOF(createTLOF(getTargetTriple())),
       TargetABI(computeTargetABI(TT, Options)),
       Endianness(isLittleEndianTriple(TT) ? Endian::LITTLE : Endian::BIG) {
+
+  if (!PPCDisableTrapOnUnreachable) {
+    this->Options.TrapUnreachable = true;
+    this->Options.NoTrapAfterNoreturn = true;
+  }
+
   initAsmInfo();
 }
 
diff --git a/llvm/test/CodeGen/PowerPC/aix-ub-unreachable.ll b/llvm/test/CodeGen/PowerPC/aix-ub-unreachable.ll
new file mode 100644
index 0000000000000..21d98bb3e1117
--- /dev/null
+++ b/llvm/test/CodeGen/PowerPC/aix-ub-unreachable.ll
@@ -0,0 +1,18 @@
+; RUN: llc -verify-machineinstrs -mcpu=pwr7 -mtriple powerpc-ibm-aix-xcoff < \
+; RUN: %s | FileCheck %s
+
+; RUN: llc -verify-machineinstrs -mcpu=pwr7 -mtriple powerpc64-ibm-aix-xcoff < \
+; RUN: %s | FileCheck %s
+
+define dso_local i32 @main() {
+entry:
+  unreachable
+
+unreachabl_bb:
+  ret i32 0
+}
+
+; CHECK:       .main:
+; CHECK-NEXT:  # %bb.0:                                # %entry
+; CHECK-NEXT:    trap
+
diff --git a/llvm/test/CodeGen/PowerPC/empty-functions.ll b/llvm/test/CodeGen/PowerPC/empty-functions.ll
index f3238bae560e6..de7c4e2ea8185 100644
--- a/llvm/test/CodeGen/PowerPC/empty-functions.ll
+++ b/llvm/test/CodeGen/PowerPC/empty-functions.ll
@@ -6,11 +6,12 @@ entry:
   unreachable
 }
 
-; An empty function is perfectly fine on ELF.
+
 ; LINUX-NO-FP: func:
 ; LINUX-NO-FP-NEXT: {{^}}.L[[BEGIN:.*]]:{{$}}
 ; LINUX-NO-FP-NEXT: .cfi_startproc
 ; LINUX-NO-FP-NEXT: {{^}}#
+; LINUX-NO-FP-NEXT: trap
 ; LINUX-NO-FP-NEXT: {{^}}.L[[END:.*]]:{{$}}
 ; LINUX-NO-FP-NEXT: .size   func, .L[[END]]-.L[[BEGIN]]
 ; LINUX-NO-FP-NEXT: .cfi_endproc
@@ -25,6 +26,8 @@ entry:
 ; LINUX-FP-NEXT:  .cfi_def_cfa_offset 16
 ; LINUX-FP-NEXT: .cfi_offset r31, -4
 ; LINUX-FP-NEXT: mr 31, 1
+; LINUX-FP-NEXT: .cfi_def_cfa_register r31
+; LINUX-FP-NEXT: trap
 ; LINUX-FP-NEXT: {{^}}.L[[END:.*]]:{{$}}
 ; LINUX-FP-NEXT: .size   func, .L[[END]]-.L[[BEGIN]]
 ; LINUX-FP-NEXT: .cfi_endproc
diff --git a/llvm/test/CodeGen/PowerPC/fptoui-be-crash.ll b/llvm/test/CodeGen/PowerPC/fptoui-be-crash.ll
index 004d77eb33933..875be83658961 100644
--- a/llvm/test/CodeGen/PowerPC/fptoui-be-crash.ll
+++ b/llvm/test/CodeGen/PowerPC/fptoui-be-crash.ll
@@ -62,9 +62,11 @@ define dso_local void @calc_buffer() local_unnamed_addr #0 {
 ; CHECK-NEXT:    rldic r4, r4, 63, 0
 ; CHECK-NEXT:    xor r3, r3, r4
 ; CHECK-NEXT:    std r3, 0(r3)
+; CHECK-NEXT:    trap
 ; CHECK-NEXT:  .LBB0_10:
 ; CHECK-NEXT:    ld r3, -16(r1)
 ; CHECK-NEXT:    std r3, 0(r3)
+; CHECK-NEXT:    trap
   %load_initial = load i64, ptr poison, align 8
   %conv39 = uitofp i64 %load_initial to float
   %add48 = fadd float 0.000000e+00, %conv39
diff --git a/llvm/test/CodeGen/PowerPC/ifcvt-forked-bug-2016-08-08.ll b/llvm/test/CodeGen/PowerPC/ifcvt-forked-bug-2016-08-08.ll
index 64c60bffa31d7..b9bee4abbdae9 100644
--- a/llvm/test/CodeGen/PowerPC/ifcvt-forked-bug-2016-08-08.ll
+++ b/llvm/test/CodeGen/PowerPC/ifcvt-forked-bug-2016-08-08.ll
@@ -11,7 +11,11 @@ entry:
 
 ; CHECK: # %land.lhs.true
 ; Test updated due D63152 where any load/store prevents shrink-wrapping
-; CHECK-NEXT: bc
+; CHECK-NEXT:   bc 4, 20, .LBB0_4
+; CHECK-NEXT: # %bb.2:
+; CHECK-NEXT:   blr
+; CHECK-NEXT: .LBB0_3:                                # %if.end
+; CHECK:      beqlr
 ; CHECK-NEXT: # %if.end4
 land.lhs.true:                                    ; preds = %entry
   br i1 undef, label %return, label %if.end4
@@ -22,7 +26,7 @@ if.end:                                           ; preds = %entry
 
 if.end4:                                          ; preds = %if.end, %land.lhs.true
   %call5 = tail call ptr @_ZN11__sanitizer21internal_start_threadEPFvPvES0_(ptr nonnull @_ZN11__sanitizer16BackgroundThreadEPv, ptr null) #7
-  unreachable
+  ret void
 
 return:                                           ; preds = %if.end, %land.lhs.true
   ret void
diff --git a/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll b/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
index 4b032781c3764..7fe3b5c2fd77e 100644
--- a/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
+++ b/llvm/test/CodeGen/PowerPC/p10-spill-crlt.ll
@@ -1,9 +1,10 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
-; RUN:     -mcpu=pwr10 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | \
-; RUN:     FileCheck %s
+; RUN:     -mcpu=pwr10 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
+; RUN:     -ppc-disable-trap-on-unreachable < %s | FileCheck %s
 ; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu \
-; RUN:     -mcpu=pwr10 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr < %s | \
+; RUN:     -mcpu=pwr10 -ppc-asm-full-reg-names -ppc-vsr-nums-as-vr \
+; RUN:     -ppc-disable-trap-on-unreachable < %s | \
 ; RUN:     FileCheck %s --check-prefix=CHECK-BE
 
 ; This test case tests spilling the CR LT bit on Power10. On Power10, this is
diff --git a/llvm/test/CodeGen/PowerPC/pr43527.ll b/llvm/test/CodeGen/PowerPC/pr43527.ll
index 379bd6c070c77..97d47509f58fc 100644
--- a/llvm/test/CodeGen/PowerPC/pr43527.ll
+++ b/llvm/test/CodeGen/PowerPC/pr43527.ll
@@ -7,7 +7,7 @@ define dso_local void @test(i64 %arg, i64 %arg1) {
 ; CHECK:       # %bb.0: # %bb
 ; CHECK-NEXT:    bc 4, 4*cr5+lt, .LBB0_5
 ; CHECK-NEXT:  # %bb.1: # %bb3
-; CHECK-NEXT:    bc 12, 4*cr5+lt, .LBB0_6
+; CHECK-NEXT:    bc 12, 4*cr5+lt, .LBB0_5
 ; CHECK-NEXT:  # %bb.2: # %bb4
 ; CHECK-NEXT:    mflr r0
 ; CHECK-NEXT:    .cfi_def_cfa_offset 64
@@ -38,7 +38,7 @@ define dso_local void @test(i64 %arg, i64 %arg1) {
 ; CHECK-NEXT:    mtlr r0
 ; CHECK-NEXT:    blr
 ; CHECK-NEXT:  .LBB0_5: # %bb2
-; CHECK-NEXT:  .LBB0_6: # %bb14
+; CHECK-NEXT:    trap
 bb:
   br i1 undef, label %bb3, label %bb2
 
diff --git a/llvm/test/CodeGen/PowerPC/pr45448.ll b/llvm/test/CodeGen/PowerPC/pr45448.ll
index 0f2dcb3ccc8a0..e4fb7050f499d 100644
--- a/llvm/test/CodeGen/PowerPC/pr45448.ll
+++ b/llvm/test/CodeGen/PowerPC/pr45448.ll
@@ -7,18 +7,17 @@ define hidden void @julia_tryparse_internal_45896() #0 {
 ; CHECK:       # %bb.0: # %top
 ; CHECK-NEXT:    ld r3, 0(r3)
 ; CHECK-NEXT:    cmpldi r3, 0
-; CHECK-NEXT:    beq cr0, .LBB0_6
+; CHECK-NEXT:    beq cr0, .LBB0_2
 ; CHECK-NEXT:  # %bb.1: # %top
 ; CHECK-NEXT:    cmpldi r3, 10
 ; CHECK-NEXT:    beq cr0, .LBB0_3
-; CHECK-NEXT:  # %bb.2: # %top
+; CHECK-NEXT:  .LBB0_2:
+; CHECK-NEXT:    trap
 ; CHECK-NEXT:  .LBB0_3: # %L294
-; CHECK-NEXT:    bc 12, 4*cr5+lt, .LBB0_5
+; CHECK-NEXT:    bc 12, 4*cr5+lt, .LBB0_2
 ; CHECK-NEXT:  # %bb.4: # %L294
-; CHECK-NEXT:    bc 4, 4*cr5+lt, .LBB0_7
-; CHECK-NEXT:  .LBB0_5: # %L1057.preheader
-; CHECK-NEXT:  .LBB0_6: # %fail194
-; CHECK-NEXT:  .LBB0_7: # %L670
+; CHECK-NEXT:    bc 12, 4*cr5+lt, .LBB0_2
+; CHECK-NEXT:  # %bb.5:
 ; CHECK-NEXT:    li r5, -3
 ; CHECK-NEXT:    sradi r4, r3, 63
 ; CHECK-NEXT:    rldic r5, r5, 4, 32
@@ -27,9 +26,7 @@ define hidden void @julia_tryparse_internal_45896() #0 {
 ; CHECK-NEXT:    cmpld cr1, r6, r3
 ; CHECK-NEXT:    mulhdu. r3, r4, r5
 ; CHECK-NEXT:    crorc 4*cr5+lt, 4*cr1+lt, eq
-; CHECK-NEXT:    bc 4, 4*cr5+lt, .LBB0_9
-; CHECK-NEXT:  # %bb.8: # %L917
-; CHECK-NEXT:  .LBB0_9: # %L994
+; CHECK-NEXT:    trap
 top:
   %0 = load i64, ptr undef, align 8
   %1 = icmp ne i64 %0, 0
diff --git a/llvm/test/CodeGen/PowerPC/reduce_scalarization.ll b/llvm/test/CodeGen/PowerPC/reduce_scalarization.ll
index 6d188ce3b4a5e..67264a2104b88 100644
--- a/llvm/test/CodeGen/PowerPC/reduce_scalarization.ll
+++ b/llvm/test/CodeGen/PowerPC/reduce_scalarization.ll
@@ -239,7 +239,7 @@ define dso_local i32 @test6() #0 {
 ; CHECK-P10-NEXT:    andi. r3, r3, 1
 ; CHECK-P10-NEXT:    bc 4, gt, .LBB5_2
 ; CHECK-P10-NEXT:  # %bb.1: # %bb8
-; CHECK-P10-NEXT:  .LBB5_2: # %bb7
+; CHECK-P10:       .LBB5_2: # %bb7
 ;
 ; CHECK-P10-BE-LABEL: test6:
 ; CHECK-P10-BE:       # %bb.0: # %bb
@@ -256,7 +256,7 @@ define dso_local i32 @test6() #0 {
 ; CHECK-P10-BE-NEXT:    andi. r3, r3, 1
 ; CHECK-P10-BE-NEXT:    bc 4, gt, .LBB5_2
 ; CHECK-P10-BE-NEXT:  # %bb.1: # %bb8
-; CHECK-P10-BE-NEXT:  .LBB5_2: # %bb7
+; CHECK-P10-BE:       .LBB5_2: # %bb7
 ;
 ; AIX-64-LABEL: test6:
 ; AIX-64:       # %bb.0: # %bb
@@ -274,7 +274,7 @@ define dso_local i32 @test6() #0 {
 ; AIX-64-NEXT:    andi. r3, r3, 1
 ; AIX-64-NEXT:    bc 4, gt, L..BB5_2
 ; AIX-64-NEXT:  # %bb.1: # %bb8
-; AIX-64-NEXT:  L..BB5_2: # %bb7
+; AIX-64:       L..BB5_2: # %bb7
 ;
 ; AIX-32-LABEL: test6:
 ; AIX-32:       # %bb.0: # %bb
@@ -294,7 +294,7 @@ define dso_local i32 @test6() #0 {
 ; AIX-32-NEXT:    andi. r3, r3, 1
 ; AIX-32-NEXT:    bc 4, gt, L..BB5_2
 ; AIX-32-NEXT:  # %bb.1: # %bb8
-; AIX-32-NEXT:  L..BB5_2: # %bb7
+; AIX-32:       L..BB5_2: # %bb7
 bb:
   br label %bb1
 
@@ -308,8 +308,8 @@ bb1:                                              ; preds = %bb
   br i1 %i6, label %bb8, label %bb7
 
 bb7:                                              ; preds = %bb1
-  unreachable
+  ret i32 0
 
 bb8:                                              ; preds = %bb1
-  unreachable
+  ret i32 1
 }
diff --git a/llvm/test/CodeGen/PowerPC/test-and-cmp-folding.ll b/llvm/test/CodeGen/PowerPC/test-and-cmp-folding.ll
index 0e924eabbed24..65c3a99d6ca98 100644
--- a/llvm/test/CodeGen/PowerPC/test-and-cmp-folding.ll
+++ b/llvm/test/CodeGen/PowerPC/test-and-cmp-folding.ll
@@ -30,7 +30,8 @@ if.then.i.i.i636:                                 ; preds = %if.then14.i
 dummy.exit.i:               ; preds = %if.then.i.i.i636, %if.then14.i
 ; CHECK: # %dummy.exit.i
 ; CHECK-NEXT: andi.
-; CHECK-NEXT: bc 12
+; CHECK-NEXT: .LBB0_6:
+; CHECK-NEXT: trap
   %cond82.i = icmp eq i64 %and.i126.i, 0
   br i1 %cond82.i, label %if.end50.i, label %dummy.exit
 
diff --git a/llvm/test/CodeGen/PowerPC/trunc-srl-load.ll b/llvm/test/CodeGen/PowerPC/trunc-srl-load.ll
index dfc1bd46601cc..8b864ce9ed922 100644
--- a/llvm/test/CodeGen/PowerPC/trunc-srl-load.ll
+++ b/llvm/test/CodeGen/PowerPC/trunc-srl-load.ll
@@ -1,5 +1,5 @@
 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
-; RUN: llc -mtriple=powerpc64-unknown-unknown %s -o - | FileCheck %s
+; RUN: llc -ppc-disable-trap-on-unreachable -mtriple=powerpc64-unknown-unknown %s -o - | FileCheck %s
 
 define dso_local fastcc void @trunc_srl_load(i32 zeroext %AttrArgNo) {
 ; CHECK-LABEL: trunc_srl_load:

Copy link

github-actions bot commented Jul 31, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Collaborator

@chenzheng1030 chenzheng1030 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except one concern for the case change. It is OK for me with either option, change the case or add the option.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's appropriate to do this as a per-target opt-in (as long as there is no technical requirement that is actually target specific). If you want to do this change, it should happen globally in clang, for all targets.

@RolandF77
Copy link
Collaborator

I don't think it's appropriate to do this as a per-target opt-in (as long as there is no technical requirement that is actually target specific). If you want to do this change, it should happen globally in clang, for all targets.

I think the target requirement is having way to terminate that is lightweight and non-intrusive. PPC has a h/w instruction trap so it can do this with single inline instruction that has no dependencies. Other targets may require a branching sequence, or maybe a call. The unreachable may be assumed to be hit rarely but a heavier implementation may impact the code around it, the processor state, or the containing function ABI overhead.

@nikic
Copy link
Contributor

nikic commented Aug 9, 2024

That property is not unique to PowerPC, pretty much every target will have some form of hardware trap.

A valid technical requirement to opt-in on a per target basis would be if omitting the trap causes verifier failures (relevant for wasm, or other targets with structurization requirements).

@efriedma-quic
Copy link
Collaborator

IMO, this makes sense to do this as the default for all targets... just nobody has actually proposed a patch to do that. (See also https://discourse.llvm.org/t/can-we-keep-must-progress-optimizations-around-infinite-loop-in-c-while-avoiding-some-surprising-behavior/69205 https://discourse.llvm.org/t/zero-length-function-pointer-equality/55984/26 .)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants