[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). #106047

fhahn · 2024-08-26T09:03:57Z

Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments.

llvmbot · 2024-08-26T09:04:31Z

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments.

Full diff: https://github.com/llvm/llvm-project/pull/106047.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/VPlan.h (+4)
(modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+40)

diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 36a1aa08654d5b..ea88fe6720d96f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1565,6 +1565,10 @@ class VPWidenCallRecipe : public VPSingleDefRecipe {
   /// Produce a widened version of the call instruction.
   void execute(VPTransformState &State) override;
 
+  /// Return the cost of this VPWidenCallRecipe.
+  InstructionCost computeCost(ElementCount VF,
+                              VPCostContext &Ctx) const override;
+
   Function *getCalledScalarFunction() const {
     return cast<Function>(getOperand(getNumOperands() - 1)->getLiveInIRValue());
   }
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index fe1325f4163004..8bc0c6621fbde2 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -924,6 +924,46 @@ void VPWidenCallRecipe::execute(VPTransformState &State) {
   }
 }
 
+InstructionCost VPWidenCallRecipe::computeCost(ElementCount VF,
+                                               VPCostContext &Ctx) const {
+  TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
+  if (Variant) {
+    return Ctx.TTI.getCallInstrCost(nullptr, Variant->getReturnType(),
+                                    Variant->getFunctionType()->params(),
+                                    CostKind);
+  }
+
+  FastMathFlags FMF;
+  // TODO: Manage flags via VPRecipeWithIRFlags.
+  if (auto *FPMO = dyn_cast_or_null<FPMathOperator>(getUnderlyingValue()))
+    FMF = FPMO->getFastMathFlags();
+
+  // Some backends analyze intrinsic arguments to determine cost. If all
+  // operands are VPValues with an underlying IR value, use the original IR
+  // values for cost computations.
+  SmallVector<const Value *> Arguments;
+  for (VPValue *Op : operands()) {
+    auto *V = Op->getUnderlyingValue();
+    if (!V) {
+      Arguments.clear();
+      break;
+    }
+    Arguments.push_back(V);
+  }
+
+  Type *RetTy =
+      ToVectorTy(Ctx.Types.inferScalarType(this->getVPSingleValue()), VF);
+  SmallVector<Type *> ParamTys;
+  for (unsigned I = 0; I != getNumOperands(); ++I)
+    ParamTys.push_back(
+        ToVectorTy(Ctx.Types.inferScalarType(getOperand(I)), VF));
+
+  IntrinsicCostAttributes CostAttrs(VectorIntrinsicID, RetTy, Arguments,
+                                    ParamTys, FMF);
+  return Ctx.TTI.getIntrinsicInstrCost(
+      CostAttrs, TargetTransformInfo::TCK_RecipThroughput);
+}
+
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 void VPWidenCallRecipe::print(raw_ostream &O, const Twine &Indent,
                               VPSlotTracker &SlotTracker) const {

arcbbb

LGTM

arcbbb · 2024-08-28T03:00:22Z

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

+  IntrinsicCostAttributes CostAttrs(VectorIntrinsicID, RetTy, Arguments,
+                                    ParamTys, FMF);
+  return Ctx.TTI.getIntrinsicInstrCost(
+      CostAttrs, TargetTransformInfo::TCK_RecipThroughput);


nit: we can just use CostKind here

Done thanks!

Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments.

llvm-ci · 2024-09-01T16:40:27Z

LLVM Buildbot has detected a new failure on builder clang-aarch64-sve-vls-2stage running on linaro-g3-03 while building llvm at step 18 "test-suite".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/4/builds/1815

Here is the relevant piece of the build log for the reference

Step 18 (test-suite) failure: test (failure)
...
cd /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Fortran/gfortran/regression && /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/stage2.install/bin/llvm-size --format=sysv /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Fortran/gfortran/regression/gfortran-regression-execute-regression__widechar_intrinsics_5_f90 > /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Fortran/gfortran/regression/gfortran-regression-execute-regression__widechar_intrinsics_5_f90.size
gmake[2]: Leaving directory '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build'
[ 36%] Built target gfortran-regression-execute-regression__widechar_intrinsics_5_f90
gmake[1]: Leaving directory '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build'
gmake: *** [Makefile:136: all] Error 2

2024-09-01 16:38:36 INFO: Testing...
2024-09-01 16:38:36 INFO: Execute: /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/stage1/bin/llvm-lit -v -j 32 /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build -o /home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/outputi47f1v5p.json
-- Testing: 10005 tests, 32 workers --
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_1026.test (1 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_1026.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_1026' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_1028.test (2 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_1028.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_1028' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_366.test (3 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_366.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_366' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_1027.test (4 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_1027.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_1027' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_367.test (5 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_367.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_367' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_38.test (6 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_38.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_38' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_698.test (7 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_698.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_698' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_add_1029.test (8 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_add_1029.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_add_1029' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_add_1030.test (9 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_add_1030.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_add_1030' is missing
********************
NOEXE: test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_697.test (10 of 10005)
******************** TEST 'test-suite :: Bitcode/simd_ops/simd_ops_test_op_abs_697.test' FAILED ********************
Executable '/home/tcwg-buildbot/worker/clang-aarch64-sve-vls-2stage/test/sandbox/build/Bitcode/simd_ops/simd_ops_test_op_abs_697' is missing
********************

chapuni · 2024-09-02T02:37:19Z

Looks like this causes buildbot failures.

In my case, stage1-clang crashes while compiling ADTTests on the aarch64 host.

FAILED: unittests/ADT/CMakeFiles/ADTTests.dir/BitVectorTest.cpp.o 
clang++: llvm-project/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7366: llvm::VectorizationFactor llvm::LoopVectorizationPlanner::computeBestVF(): Assertion `(BestFactor.Width == LegacyVF.Width || planContainsAdditionalSimplifications(getPlanFor(BestFactor.Width), CostCtx, OrigLoop)) && " VPlan cost model and legacy cost model disagreed"' failed.

fhahn · 2024-09-02T07:58:25Z

@chapuni could you share the full crashing compiler invocation? Bootstrapping on ARM64 macOS works fine for me.

DavidSpickett · 2024-09-02T08:07:58Z

This is also the cause of test suite failures on AArch64 Linux: https://lab.llvm.org/buildbot/#/builders/41/builds/1722

It's just that build configuration issues are hiding them under a pile of unintentional NOEXE results. We'll find what is actually failing.

fhahn · 2024-09-02T08:11:42Z

@DavidSpickett thanks, I think it looks like an assertion is triggered when building a Fortran workload from the llvm-test-suite with Flang.

Is there any chance to extract the IR reproducer from Flang?

antmox · 2024-09-02T08:35:43Z

Looks like the failing test is typebound_operator_9.f03

…m#106047)" Breaks build of 534.hpgmg This reverts commit 9ccf825 and b0de7fa [VPlan] Use op from underlying call in computeCost if needed. Change-Id: Ib8b10cc72bfb787d5f295971aa871e625148aa9d

mstorsjo · 2024-09-03T08:03:44Z

I also ran into assert failures with this, on a multitude of architectures. (I would have caught it already yesterday, but my nightly build yesterday was broken due to other reasons, so yesterday's build went untested otherwise.)

It's reproducible with the following snippets:

float *a;
int b;
double fabs();
__attribute__((cold)) int c() {
  for (; b; b++)
    a[b] = 1. - fabs(b);
}

$ clang -target aarch64-linux-gnu -c -O2 repro.c
clang: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7369: llvm::VectorizationFactor llvm::LoopVectorizationPlanner::computeBestVF(): Assertion `(BestFactor.Width == LegacyVF.Width || planContainsAdditionalSimplifications(getPlanFor(BestFactor.Width), CostCtx, OrigLoop)) && " VPlan cost model and legacy cost model disagreed"' failed.

And:

int *a;
__attribute__((cold)) int b(int c) {
  long d, f;
  double e;
  for (int g = 0; g < c; g++) {
    e = g * 3.14159265358979323846;
    f = e;
    if (f > 0)
      d = 0;
    else
      d = f;
    a[g] = d;
  }
}

$ clang -target x86_64-w64-mingw32 -c -O2 repro.c
clang: ../lib/Transforms/Vectorize/LoopVectorize.cpp:7369: llvm::VectorizationFactor llvm::LoopVectorizationPlanner::computeBestVF(): Assertion `(BestFactor.Width == LegacyVF.Width || planContainsAdditionalSimplifications(getPlanFor(BestFactor.Width), CostCtx, OrigLoop)) && " VPlan cost model and legacy cost model disagreed"' failed.

fhahn · 2024-09-03T13:32:03Z

@mstorsjo thanks, should be fixed by dd94537

@antmox @DavidSpickett AFAICT now the bots with Flang crashes should be back to green?

antmox · 2024-09-03T13:35:12Z

@fhahn yes indeed, bots are back to green, thanks

aeubanks · 2024-09-03T18:22:54Z

this is causing crashes on

target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128-Fn32"
target triple = "aarch64-unknown-linux-android26"

define fastcc void @selfguided_filter(ptr %arrayidx32, ptr %dav1d_sgr_x_by_x) {
entry:
  %sumsq = alloca [26520 x i32], i32 0, align 4
  br label %for.body27

for.cond22.for.cond.cleanup26_crit_edge:          ; preds = %for.body27
  ret void

for.body27:                                       ; preds = %for.body27, %entry
  %indvars.iv1 = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body27 ]
  %0 = load i32, ptr %arrayidx32, align 4
  %cond.i = tail call i32 @llvm.smax.i32(i32 %0, i32 0)
  %cond.i186 = tail call i32 @llvm.umin.i32(i32 %cond.i, i32 1)
  %idxprom45 = zext i32 %cond.i186 to i64
  %arrayidx46 = getelementptr [256 x i8], ptr %dav1d_sgr_x_by_x, i64 0, i64 %idxprom45
  %1 = load i8, ptr %arrayidx46, align 1
  %conv47 = zext i8 %1 to i32
  %mul51 = mul i32 0, %conv47
  store i32 %mul51, ptr %sumsq, align 4
  store i32 0, ptr null, align 4
  %indvars.iv.next = add i64 %indvars.iv1, 1
  %exitcond.not = icmp eq i64 %indvars.iv.next, 0
  br i1 %exitcond.not, label %for.cond22.for.cond.cleanup26_crit_edge, label %for.body27
}

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i32 @llvm.smax.i32(i32, i32) #0

; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare i32 @llvm.umin.i32(i32, i32) #0

attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }

$ opt -passes="function<eager-inv>(loop-vectorize<no-interleave-forced-only;no-vectorize-forced-only;>)" -disable-output /tmp/a.ll
opt: ../../llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7382: VectorizationFactor llvm::LoopVectorizationPlanner::computeBestVF(): Assertion `(BestFactor.Width == LegacyVF.Width || planContainsAdditionalSimplifications(getPlanFor(BestFactor.Width), CostCtx, OrigLoop)) && " VPlan cost model and legacy cost model disagreed"' failed.

fhahn · 2024-09-03T20:08:07Z

@aeubanks thanks for the report, it was the same issue as #107051: setVectorizedCallDecision did not consider forced scalars, meaning the VPlans created did not reflect the legacy cost decisions. Should be fixed now

Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments. PR: llvm#106731 Change-Id: I9eea7ea07341d8662d5cb0798dcb7151f0fb04e2

fhahn requested review from rengolin, ayalz and aniragil August 26, 2024 09:03

llvmbot added vectorizers llvm:transforms labels Aug 26, 2024

fhahn requested review from arcbbb and ElvisWang123 August 27, 2024 10:58

arcbbb approved these changes Aug 28, 2024

View reviewed changes

[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI).

3730ebf

Implement cost computation for VPWidenCallRecipe. In some cases, targets use argument info to compute intrinsic costs. If all operands of the call are VPValues with an underlying IR value, use the IR values as arguments.

fhahn force-pushed the vpwidencall-cost branch from 3fc1eee to 3730ebf Compare August 30, 2024 14:22

fhahn merged commit 9ccf825 into llvm:main Sep 1, 2024
5 of 8 checks passed

fhahn deleted the vpwidencall-cost branch September 1, 2024 15:26

ye-luo mentioned this pull request Sep 2, 2024

[VPlan] clang crash #107016

Closed

[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). #106047

[VPlan] Implement VPWidenCallRecipe::computeCost (NFCI). #106047

Uh oh!

Conversation

fhahn commented Aug 26, 2024

Uh oh!

llvmbot commented Aug 26, 2024

Uh oh!

arcbbb left a comment

Choose a reason for hiding this comment

Uh oh!

arcbbb Aug 28, 2024

Choose a reason for hiding this comment

Uh oh!

fhahn Aug 30, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

llvm-ci commented Sep 1, 2024

Uh oh!

chapuni commented Sep 2, 2024

Uh oh!

fhahn commented Sep 2, 2024

Uh oh!

DavidSpickett commented Sep 2, 2024

Uh oh!

fhahn commented Sep 2, 2024

Uh oh!

antmox commented Sep 2, 2024

Uh oh!

mstorsjo commented Sep 3, 2024

Uh oh!

fhahn commented Sep 3, 2024

Uh oh!

antmox commented Sep 3, 2024

Uh oh!

aeubanks commented Sep 3, 2024

Uh oh!

fhahn commented Sep 3, 2024

Uh oh!

Uh oh!