-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[IndVars] Teach widenLoopCompare to use sext if narrow IV is positive and other operand is already sext. #142703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
… and other operand is already sext. This prevents us from ending up with (zext (sext X)). The zext will require an instruction on targets where zext isn't free like RISC-V.
@llvm/pr-subscribers-llvm-transforms Author: Craig Topper (topperc) ChangesThis prevents us from ending up with (zext (sext X)). The zext will Full diff: https://github.com/llvm/llvm-project/pull/142703.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
index 3fe4621eca70d..6d2266a5a98b5 100644
--- a/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
+++ b/llvm/lib/Transforms/Utils/SimplifyIndVar.cpp
@@ -1630,6 +1630,12 @@ bool WidenIV::widenLoopCompare(WidenIV::NarrowIVDefUse DU) {
// Widen the other operand of the compare, if necessary.
if (CastWidth < IVWidth) {
+ // If the narrow IV is always postive and the other operand is sext, widen
+ // using sext so we can combine them. This works for all comparison
+ // predicates.
+ if (DU.NeverNegative && isa<SExtInst>(Op))
+ CmpPreferredSign = true;
+
Value *ExtOp = createExtendInst(Op, WideType, CmpPreferredSign, Cmp);
DU.NarrowUse->replaceUsesOfWith(Op, ExtOp);
}
diff --git a/llvm/test/Transforms/IndVarSimplify/iv-cmp-sext.ll b/llvm/test/Transforms/IndVarSimplify/iv-cmp-sext.ll
new file mode 100644
index 0000000000000..34925a6cca955
--- /dev/null
+++ b/llvm/test/Transforms/IndVarSimplify/iv-cmp-sext.ll
@@ -0,0 +1,66 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=indvars -S | FileCheck %s
+
+target datalayout = "e-m:e-p:64:64-i64:64-i128:128-n32:64-S128"
+target triple = "riscv64"
+
+define void @foo(ptr %x, i32 %n) {
+; CHECK-LABEL: define void @foo(
+; CHECK-SAME: ptr [[X:%.*]], i32 [[N:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[CMP10:%.*]] = icmp sgt i32 [[N]], 0
+; CHECK-NEXT: br i1 [[CMP10]], label %[[FOR_BODY_PREHEADER:.*]], label %[[FOR_COND_CLEANUP:.*]]
+; CHECK: [[FOR_BODY_PREHEADER]]:
+; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[N]] to i64
+; CHECK-NEXT: br label %[[FOR_BODY:.*]]
+; CHECK: [[FOR_COND_CLEANUP_LOOPEXIT:.*]]:
+; CHECK-NEXT: br label %[[FOR_COND_CLEANUP]]
+; CHECK: [[FOR_COND_CLEANUP]]:
+; CHECK-NEXT: ret void
+; CHECK: [[FOR_BODY]]:
+; CHECK-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], %[[FOR_INC:.*]] ], [ 0, %[[FOR_BODY_PREHEADER]] ]
+; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds nuw i16, ptr [[X]], i64 [[INDVARS_IV]]
+; CHECK-NEXT: [[TMP0:%.*]] = load i16, ptr [[ARRAYIDX]], align 2
+; CHECK-NEXT: [[CONV:%.*]] = sext i16 [[TMP0]] to i32
+; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[CONV]] to i64
+; CHECK-NEXT: [[CMP1:%.*]] = icmp eq i64 [[INDVARS_IV]], [[TMP1]]
+; CHECK-NEXT: br i1 [[CMP1]], label %[[IF_THEN:.*]], label %[[FOR_INC]]
+; CHECK: [[IF_THEN]]:
+; CHECK-NEXT: store i16 0, ptr [[ARRAYIDX]], align 2
+; CHECK-NEXT: br label %[[FOR_INC]]
+; CHECK: [[FOR_INC]]:
+; CHECK-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT: [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT: br i1 [[EXITCOND]], label %[[FOR_BODY]], label %[[FOR_COND_CLEANUP_LOOPEXIT]]
+;
+entry:
+ %cmp10 = icmp sgt i32 %n, 0
+ br i1 %cmp10, label %for.body.preheader, label %for.cond.cleanup
+
+for.body.preheader: ; preds = %entry
+ br label %for.body
+
+for.cond.cleanup.loopexit: ; preds = %for.inc
+ br label %for.cond.cleanup
+
+for.cond.cleanup: ; preds = %for.cond.cleanup.loopexit, %entry
+ ret void
+
+for.body: ; preds = %for.body.preheader, %for.inc
+ %i.011 = phi i32 [ %inc, %for.inc ], [ 0, %for.body.preheader ]
+ %idxprom = zext nneg i32 %i.011 to i64
+ %arrayidx = getelementptr inbounds nuw i16, ptr %x, i64 %idxprom
+ %0 = load i16, ptr %arrayidx, align 2
+ %conv = sext i16 %0 to i32
+ %cmp1 = icmp eq i32 %i.011, %conv
+ br i1 %cmp1, label %if.then, label %for.inc
+
+if.then: ; preds = %for.body
+ store i16 0, ptr %arrayidx, align 2
+ br label %for.inc
+
+for.inc: ; preds = %for.body, %if.then
+ %inc = add nuw nsw i32 %i.011, 1
+ %cmp = icmp slt i32 %inc, %n
+ br i1 %cmp, label %for.body, label %for.cond.cleanup.loopexit
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As the narrow IV is always positive, can we use zext nneg
instead?
We're extending the non-IV operand, we don't know about its sign bit. |
@@ -1630,6 +1630,12 @@ bool WidenIV::widenLoopCompare(WidenIV::NarrowIVDefUse DU) { | |||
|
|||
// Widen the other operand of the compare, if necessary. | |||
if (CastWidth < IVWidth) { | |||
// If the narrow IV is always postive and the other operand is sext, widen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To check my understanding, this comment basically says that icmp PRED (zext nonneg X), (zext Y) == icmp PRED (zext nonneg X), (sext Y)?
Thinking through this...
If the high bit of X is set, then this is poison.
If the high bit of X is not set, then there are two cases:
- The high bit of Y is not set, zext == sext by definition.
- The high bit of Y is set, zext and sext produce different values. Neither can be equal to zext nonneg X (because it would be poison).
I can buy this for equality, but does this hold for inequality? In particular, icmp slt (zext nonneg X), ext(255) gives different results doesn't it?
Edit: Fixed a typo which changes meaning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can buy this for equality, but does this hold for inequality? In particular, icmp slt (zext nonneg X), ext(255) gives different results doesn't it?
Was ext(255)
supposed to be sext(255)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were never going to use zext for a slt/sgt/sle/sge compare. That's checked earlier when CmpPreferredSign
was first calculated. I've only changed what happens for eq/ne/ult/ugt/ule/uge.
This prevents us from ending up with (zext (sext X)). The zext will
require an instruction on targets where zext isn't free like RISC-V.