[VPlan] Factor out logic to common compute costs to helper (NFCI). #153361

fhahn · 2025-08-13T07:35:25Z

A number of recipes compute costs for the same opcodes for scalars or vectors, depending on the recipe.

Move the common logic out to a helper in VPRecipeWithIRFlags, that is then used by VPReplicateRecipe, VPWidenRecipe and VPInstruction.

This makes it easier to cover all relevant opcodes, without duplication.

llvmbot · 2025-08-13T07:35:58Z

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

A number of recipes compute costs for the same opcodes for scalars or vectors, depending on the recipe.

Move the common logic out to a helper in VPRecipeWithIRFlags, that is then used by VPReplicateRecipe, VPWidenRecipe and VPInstruction.

This makes it easier to cover all relevant opcodes, without duplication.

Full diff: https://github.com/llvm/llvm-project/pull/153361.diff

2 Files Affected:

(modified) llvm/lib/Transforms/Vectorize/VPlan.h (+5)
(modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+83-68)

diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 11a7d8b339ae9..1d89d3ade648d 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -898,6 +898,11 @@ struct VPRecipeWithIRFlags : public VPSingleDefRecipe, public VPIRFlags {
   }
 
   void execute(VPTransformState &State) override = 0;
+
+  /// Compute the cost for this recipe using \p Opcode. If \p IsVector is true, compute the vector cost, otherwise compute the scalar cost for a single instance.
+std::optional<InstructionCost>
+getCommonCost(unsigned Opcode, ElementCount VF,
+              VPCostContext &Ctx, bool IsVector)const;
 };
 
 /// Helper to access the operand that contains the unroll part for this recipe
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 23c10d2b25263..0bd094ecf7c99 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -940,28 +940,90 @@ Value *VPInstruction::generate(VPTransformState &State) {
   }
 }
 
+std::optional<InstructionCost>
+VPRecipeWithIRFlags::getCommonCost(unsigned Opcode, ElementCount VF,
+              VPCostContext &Ctx, bool IsVector) const {
+  Type *ScalarTy = Ctx.Types.inferScalarType(this);
+  Type *ResultTy = IsVector ? toVectorTy(ScalarTy, VF) : ScalarTy;
+  switch (Opcode) {
+  case Instruction::FNeg:
+    return Ctx.TTI.getArithmeticInstrCost(
+        Opcode, ResultTy, Ctx.CostKind,
+        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None},
+        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None});
+  case Instruction::UDiv:
+  case Instruction::SDiv:
+  case Instruction::SRem:
+  case Instruction::URem:
+  case Instruction::Add:
+  case Instruction::FAdd:
+  case Instruction::Sub:
+  case Instruction::FSub:
+  case Instruction::Mul:
+  case Instruction::FMul:
+  case Instruction::FDiv:
+  case Instruction::FRem:
+  case Instruction::Shl:
+  case Instruction::LShr:
+  case Instruction::AShr:
+  case Instruction::And:
+  case Instruction::Or:
+  case Instruction::Xor: {
+    VPValue *RHS = getOperand(1);
+    // Certain instructions can be cheaper to vectorize if they have a constant
+    // second vector operand. One example of this are shifts on x86.
+    TargetTransformInfo::OperandValueInfo RHSInfo = Ctx.getOperandInfo(RHS);
+
+    if (RHSInfo.Kind == TargetTransformInfo::OK_AnyValue &&
+        getOperand(1)->isDefinedOutsideLoopRegions())
+      RHSInfo.Kind = TargetTransformInfo::OK_UniformValue;
+    Instruction *CtxI =
+        dyn_cast_or_null<Instruction>(getUnderlyingValue());
+
+    SmallVector<const Value *, 4> Operands;
+    if (CtxI)
+      Operands.append(CtxI->value_op_begin(), CtxI->value_op_end());
+    return Ctx.TTI.getArithmeticInstrCost(
+        Opcode, ResultTy, Ctx.CostKind,
+        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None},
+        RHSInfo, Operands, CtxI, &Ctx.TLI);
+  }
+  case Instruction::Freeze:
+    // This opcode is unknown. Assume that it is the same as 'mul'.
+    return Ctx.TTI.getArithmeticInstrCost(Instruction::Mul, ResultTy,
+                                          Ctx.CostKind);
+  case Instruction::ExtractValue:
+    return Ctx.TTI.getInsertExtractValueCost(Instruction::ExtractValue,
+                                             Ctx.CostKind);
+  case Instruction::ICmp:
+  case Instruction::FCmp: {
+    Type *ScalarOpTy = Ctx.Types.inferScalarType(getOperand(0));
+    Type *OpTy = IsVector ? toVectorTy(ScalarOpTy, VF) : ScalarOpTy;
+    Instruction *CtxI =
+        dyn_cast_or_null<Instruction>(getUnderlyingValue());
+    return Ctx.TTI.getCmpSelInstrCost(
+        Opcode, OpTy, CmpInst::makeCmpResultType(OpTy), getPredicate(),
+        Ctx.CostKind, {TTI::OK_AnyValue, TTI::OP_None},
+        {TTI::OK_AnyValue, TTI::OP_None}, CtxI);
+  }
+  }
+  return std::nullopt;
+}
+
 InstructionCost VPInstruction::computeCost(ElementCount VF,
                                            VPCostContext &Ctx) const {
   if (Instruction::isBinaryOp(getOpcode())) {
-    Type *ResTy = Ctx.Types.inferScalarType(this);
-    if (!vputils::onlyFirstLaneUsed(this))
-      ResTy = toVectorTy(ResTy, VF);
-
-    if (!getUnderlyingValue()) {
-      switch (getOpcode()) {
-      case Instruction::FMul:
-        return Ctx.TTI.getArithmeticInstrCost(getOpcode(), ResTy, Ctx.CostKind);
-      default:
-        // TODO: Compute cost for VPInstructions without underlying values once
-        // the legacy cost model has been retired.
-        return 0;
-      }
+    if (!getUnderlyingValue() && getOpcode() != Instruction::FMul) {
+      // TODO: Compute cost for VPInstructions without underlying values once
+      // the legacy cost model has been retired.
+      return 0;
     }
 
     assert(!doesGeneratePerAllLanes() &&
            "Should only generate a vector value or single scalar, not scalars "
            "for all lanes.");
-    return Ctx.TTI.getArithmeticInstrCost(getOpcode(), ResTy, Ctx.CostKind);
+    return *getCommonCost(getOpcode(), VF, Ctx,
+                          /*IsVector=*/!vputils::onlyFirstLaneUsed(this));
   }
 
   switch (getOpcode()) {
@@ -2025,20 +2087,13 @@ void VPWidenRecipe::execute(VPTransformState &State) {
 InstructionCost VPWidenRecipe::computeCost(ElementCount VF,
                                            VPCostContext &Ctx) const {
   switch (Opcode) {
-  case Instruction::FNeg: {
-    Type *VectorTy = toVectorTy(Ctx.Types.inferScalarType(this), VF);
-    return Ctx.TTI.getArithmeticInstrCost(
-        Opcode, VectorTy, Ctx.CostKind,
-        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None},
-        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None});
-  }
-
   case Instruction::UDiv:
   case Instruction::SDiv:
   case Instruction::SRem:
   case Instruction::URem:
     // More complex computation, let the legacy cost-model handle this for now.
     return Ctx.getLegacyCost(cast<Instruction>(getUnderlyingValue()), VF);
+  case Instruction::FNeg:
   case Instruction::Add:
   case Instruction::FAdd:
   case Instruction::Sub:
@@ -2052,45 +2107,12 @@ InstructionCost VPWidenRecipe::computeCost(ElementCount VF,
   case Instruction::AShr:
   case Instruction::And:
   case Instruction::Or:
-  case Instruction::Xor: {
-    VPValue *RHS = getOperand(1);
-    // Certain instructions can be cheaper to vectorize if they have a constant
-    // second vector operand. One example of this are shifts on x86.
-    TargetTransformInfo::OperandValueInfo RHSInfo = Ctx.getOperandInfo(RHS);
-
-    if (RHSInfo.Kind == TargetTransformInfo::OK_AnyValue &&
-        getOperand(1)->isDefinedOutsideLoopRegions())
-      RHSInfo.Kind = TargetTransformInfo::OK_UniformValue;
-    Type *VectorTy = toVectorTy(Ctx.Types.inferScalarType(this), VF);
-    Instruction *CtxI = dyn_cast_or_null<Instruction>(getUnderlyingValue());
-
-    SmallVector<const Value *, 4> Operands;
-    if (CtxI)
-      Operands.append(CtxI->value_op_begin(), CtxI->value_op_end());
-    return Ctx.TTI.getArithmeticInstrCost(
-        Opcode, VectorTy, Ctx.CostKind,
-        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None},
-        RHSInfo, Operands, CtxI, &Ctx.TLI);
-  }
-  case Instruction::Freeze: {
-    // This opcode is unknown. Assume that it is the same as 'mul'.
-    Type *VectorTy = toVectorTy(Ctx.Types.inferScalarType(this), VF);
-    return Ctx.TTI.getArithmeticInstrCost(Instruction::Mul, VectorTy,
-                                          Ctx.CostKind);
-  }
-  case Instruction::ExtractValue: {
-    return Ctx.TTI.getInsertExtractValueCost(Instruction::ExtractValue,
-                                             Ctx.CostKind);
-  }
+  case Instruction::Xor:
+  case Instruction::Freeze:
+  case Instruction::ExtractValue:
   case Instruction::ICmp:
-  case Instruction::FCmp: {
-    Instruction *CtxI = dyn_cast_or_null<Instruction>(getUnderlyingValue());
-    Type *VectorTy = toVectorTy(Ctx.Types.inferScalarType(getOperand(0)), VF);
-    return Ctx.TTI.getCmpSelInstrCost(
-        Opcode, VectorTy, CmpInst::makeCmpResultType(VectorTy), getPredicate(),
-        Ctx.CostKind, {TTI::OK_AnyValue, TTI::OP_None},
-        {TTI::OK_AnyValue, TTI::OP_None}, CtxI);
-  }
+  case Instruction::FCmp:
+    return *getCommonCost(getOpcode(), VF, Ctx, /*IsVector=*/true);
   default:
     llvm_unreachable("Unsupported opcode for instruction");
   }
@@ -2964,8 +2986,6 @@ InstructionCost VPReplicateRecipe::computeCost(ElementCount VF,
   // transform, avoid computing their cost multiple times for now.
   Ctx.SkipCostComputation.insert(UI);
 
-  TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput;
-  Type *ResultTy = Ctx.Types.inferScalarType(this);
   switch (UI->getOpcode()) {
   case Instruction::GetElementPtr:
     // We mark this instruction as zero-cost because the cost of GEPs in
@@ -2987,12 +3007,7 @@ InstructionCost VPReplicateRecipe::computeCost(ElementCount VF,
   case Instruction::And:
   case Instruction::Or:
   case Instruction::Xor: {
-    auto Op2Info = Ctx.getOperandInfo(getOperand(1));
-    SmallVector<const Value *, 4> Operands(UI->operand_values());
-    return Ctx.TTI.getArithmeticInstrCost(
-               UI->getOpcode(), ResultTy, CostKind,
-               {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None},
-               Op2Info, Operands, UI, &Ctx.TLI) *
+    return *getCommonCost(getOpcode(), VF, Ctx, /*IsVector=*/false) *
            (isSingleScalar() ? 1 : VF.getFixedValue());
   }
   }

github-actions · 2025-08-13T07:38:41Z

✅ With the latest revision this PR passed the C/C++ code formatter.

A number of recipes compute costs for the same opcodes for scalars or vectors, depending on the recipe. Move the common logic out to a helper in VPRecipeWithIRFlags, that is then used by VPReplicateRecipe, VPWidenRecipe and VPInstruction. This makes it easier to cover all relevant opcodes, without duplication.

ElvisWang123

LGTM, thanks!

david-arm · 2025-08-14T08:40:08Z

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

+    Type *OpTy = IsVector ? toVectorTy(ScalarOpTy, VF) : ScalarOpTy;
+    Instruction *CtxI = dyn_cast_or_null<Instruction>(getUnderlyingValue());
+    return Ctx.TTI.getCmpSelInstrCost(
+        Opcode, OpTy, CmpInst::makeCmpResultType(OpTy), getPredicate(),


Shouldn't we assert that result type matches the actual result type in vplan? I don't know if there is anything in vplan that fundamentally restricts icmp/fcmp to be consistent with CmpInst::makeCmpResultType(OpTy)

Hmm, the result type will be a vector of i1 if the operand type is a vector and otherwise a scalar i1. The number of lanes of the operand type must match the number of lanes of the result type, so CmpInst::makeCmpResultType(OpTy) should always do the right thing. And the scalar type of FCmp/ICmp must also be i1.

david-arm · 2025-08-14T08:43:13Z

llvm/lib/Transforms/Vectorize/VPlan.h

@@ -898,6 +898,13 @@ struct VPRecipeWithIRFlags : public VPSingleDefRecipe, public VPIRFlags {
  }

  void execute(VPTransformState &State) override = 0;
+
+  /// Compute the cost for this recipe using \p Opcode. If \p IsVector is true,


nit: Perhaps better to also mention the VF and CostKind, and also explain the apparent contradiction between \p IsVector and \p VF in some cases, i.e. something like

/// Compute the cost for this recipe using \p Opcode, the vector's /// ElementCount \p VF and the cost kind specified by \p Ctx.CostKind. /// If \p IsVector is true, ... /// NOTE: \p IsVector may contradict \p VF in cases where only the /// first lane of the vector is used.

I updated the function to just take an VF argument, and set it to 1 if we compute the scalar cost.

david-arm · 2025-08-14T08:44:13Z

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

+VPRecipeWithIRFlags::getCommonCost(unsigned Opcode, ElementCount VF,
+                                   VPCostContext &Ctx, bool IsVector) const {
+  Type *ScalarTy = Ctx.Types.inferScalarType(this);
+  Type *ResultTy = IsVector ? toVectorTy(ScalarTy, VF) : ScalarTy;


I think it's worth adding a comment explaining the discrepancy between IsVector and VF as it could be confusing.

IsVector is gone now, thanks

david-arm · 2025-08-14T08:53:09Z

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

+    return Ctx.TTI.getArithmeticInstrCost(
+        Opcode, ResultTy, Ctx.CostKind,
+        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None},
+        {TargetTransformInfo::OK_AnyValue, TargetTransformInfo::OP_None});


Can't you just do

return Ctx.TTI.getArithmeticInstrCost(Opcode, ResultTy, Ctx.CostKind);

given that it already has the default parameters that you want?

LLVM_ABI InstructionCost getArithmeticInstrCost( unsigned Opcode, Type *Ty, TTI::TargetCostKind CostKind = TTI::TCK_RecipThroughput, TTI::OperandValueInfo Opd1Info = {TTI::OK_AnyValue, TTI::OP_None}, TTI::OperandValueInfo Opd2Info = {TTI::OK_AnyValue, TTI::OP_None}, ArrayRef<const Value *> Args = {}, const Instruction *CxtI = nullptr, const TargetLibraryInfo *TLibInfo = nullptr) const;

Yep, updated, thanks

david-arm · 2025-08-14T08:57:16Z

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

+    // second vector operand. One example of this are shifts on x86.
+    TargetTransformInfo::OperandValueInfo RHSInfo = Ctx.getOperandInfo(RHS);
+
+    if (RHSInfo.Kind == TargetTransformInfo::OK_AnyValue &&


I think the code here deviates from the code that was previously executed in VPReplicateRecipe::computeCost so I don't think this PR is NFC.

OK_UniformValue is only relevant for vector versions, so the we currently compute exactly the same costs. But it is not necessary to preform the check for scalar VFs, so I made it conditional.

…cost

… (NFCI). (#153361) A number of recipes compute costs for the same opcodes for scalars or vectors, depending on the recipe. Move the common logic out to a helper in VPRecipeWithIRFlags, that is then used by VPReplicateRecipe, VPWidenRecipe and VPInstruction. This makes it easier to cover all relevant opcodes, without duplication. PR: llvm/llvm-project#153361

fhahn requested review from davemgreen, ayalz, david-arm and ElvisWang123 August 13, 2025 07:35

llvmbot added vectorizers llvm:transforms labels Aug 13, 2025

fhahn force-pushed the vplan-compute-common-cost branch from 0a029ea to 59d2dbf Compare August 13, 2025 20:39

ElvisWang123 approved these changes Aug 13, 2025

View reviewed changes

david-arm reviewed Aug 14, 2025

View reviewed changes

fhahn added 4 commits August 18, 2025 11:15

Merge remote-tracking branch 'origin/main' into vplan-compute-common-…

15bf3a0

…cost

!fixup address comments

0d082ef

Merge remote-tracking branch 'origin/main' into vplan-compute-common-…

5d61c17

…cost

!fixup use more specific name

0f4ac21

fhahn merged commit 35be64a into llvm:main Aug 20, 2025
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[VPlan] Factor out logic to common compute costs to helper (NFCI). #153361

[VPlan] Factor out logic to common compute costs to helper (NFCI). #153361

Uh oh!

fhahn commented Aug 13, 2025

Uh oh!

llvmbot commented Aug 13, 2025 •

edited

Loading

Uh oh!

github-actions bot commented Aug 13, 2025 •

edited

Loading

Uh oh!

ElvisWang123 left a comment

Uh oh!

david-arm Aug 14, 2025

Uh oh!

fhahn Aug 17, 2025

Uh oh!

david-arm Aug 14, 2025

Uh oh!

fhahn Aug 18, 2025

Uh oh!

david-arm Aug 14, 2025

Uh oh!

fhahn Aug 18, 2025

Uh oh!

david-arm Aug 14, 2025

Uh oh!

fhahn Aug 18, 2025

Uh oh!

david-arm Aug 14, 2025

Uh oh!

fhahn Aug 18, 2025

Uh oh!

Uh oh!

Uh oh!

[VPlan] Factor out logic to common compute costs to helper (NFCI). #153361

[VPlan] Factor out logic to common compute costs to helper (NFCI). #153361

Uh oh!

Conversation

fhahn commented Aug 13, 2025

Uh oh!

llvmbot commented Aug 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions bot commented Aug 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ElvisWang123 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Aug 13, 2025 •

edited

Loading

github-actions bot commented Aug 13, 2025 •

edited

Loading