[RISCV][TTI] Scale the cost of FP-Int conversion with LMUL #87506

arcbbb · 2024-04-03T15:23:07Z

Widening or narrowing the source data type to match the destination data type may require multiple steps.
To model the cost of each step, this patch derives the interim types by following the logic in RISCVTargetLowering::lowerVPFPIntConvOp.
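
As a rough illustration of the multi-step narrowing case, the sketch below models an fp-to-int narrowing as one vfncvt followed by one vnsrl per remaining halving, and charges each step the LMUL of its interim type. It is a standalone approximation, not the code in RISCVTargetTransformInfo.cpp. VLEN = 128, the helper names lmulCost and narrowingFPToIntCost, and the rule that a fractional LMUL costs 1 are all assumptions made for this example. Types wider than LMUL 8, which are split during legalization, are not modelled.

```cpp
#include <cassert>

// Assumption: VLEN = 128 bits (chosen because it is consistent with the
// RV32 costs shown in the cast.ll diff of this PR).
constexpr unsigned VLEN = 128;

// Hypothetical stand-in for the per-instruction LMUL cost: a vector that
// occupies LMUL registers costs LMUL; fractional LMUL is charged as 1.
constexpr unsigned lmulCost(unsigned NumElts, unsigned EltBits) {
  unsigned Bits = NumElts * EltBits;
  return Bits <= VLEN ? 1 : Bits / VLEN;
}

// Cost of a narrowing fp-to-int conversion (DstEltBits < SrcEltBits):
// one vfncvt into the half-width integer type, then one vnsrl for every
// further halving until the destination element size is reached.
constexpr unsigned narrowingFPToIntCost(unsigned NumElts, unsigned SrcEltBits,
                                        unsigned DstEltBits) {
  unsigned EltBits = SrcEltBits / 2;
  unsigned Cost = lmulCost(NumElts, EltBits); // vfncvt.rtz.x.f.w
  while (DstEltBits < EltBits) {
    EltBits /= 2;
    Cost += lmulCost(NumElts, EltBits); // vnsrl.wi
  }
  return Cost;
}

// Compare the model with a few updated RV32 expectations from cast.ll.
inline void checkAgainstTestDiff() {
  assert(narrowingFPToIntCost(8, 64, 32) == 2);  // <8 x double> to <8 x i32>
  assert(narrowingFPToIntCost(8, 64, 16) == 3);  // <8 x double> to <8 x i16>
  assert(narrowingFPToIntCost(8, 64, 8) == 4);   // <8 x double> to <8 x i8>
  assert(narrowingFPToIntCost(16, 64, 8) == 7);  // <16 x double> to <16 x i8>
  assert(narrowingFPToIntCost(16, 32, 8) == 3);  // <16 x float> to <16 x i8>
}
```

Under these assumptions, <16 x double> to <16 x i8> is costed as 4 + 2 + 1 = 7: the interim types <16 x i32>, <16 x i16> and <16 x i8> occupy 4, 2 and 1 registers respectively.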

llvmbot · 2024-04-03T15:23:43Z

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-analysis

Author: Shih-Po Hung (arcbbb)

Changes

Widening or narrowing the source data type to match the destination data type may require multiple steps.
To model the cost of each step, this patch derives the interim types by following the logic in RISCVTargetLowering::lowerVPFPIntConvOp.

Patch is 352.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/87506.diff

2 Files Affected:

(modified) llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp (+95-20)
(modified) llvm/test/Analysis/CostModel/RISCV/cast.ll (+1108-1108)

diff --git a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
index 38304ff90252f0..6ea17aa1130963 100644
--- a/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
+++ b/llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
@@ -988,31 +988,106 @@ InstructionCost RISCVTTIImpl::getCastInstrCost(unsigned Opcode, Type *Dst,
     return Cost;
   }
   case ISD::FP_TO_SINT:
-  case ISD::FP_TO_UINT:
+  case ISD::FP_TO_UINT: {
+    unsigned IsSigned = ISD == ISD::FP_TO_SINT;
+    unsigned FCVT = IsSigned ? RISCV::VFCVT_RTZ_X_F_V : RISCV::VFCVT_RTZ_XU_F_V;
+    unsigned FWCVT =
+        IsSigned ? RISCV::VFWCVT_RTZ_X_F_V : RISCV::VFWCVT_RTZ_XU_F_V;
+    unsigned FNCVT =
+        IsSigned ? RISCV::VFNCVT_RTZ_X_F_W : RISCV::VFNCVT_RTZ_XU_F_W;
+    unsigned SrcEltSize = Src->getScalarSizeInBits();
+    unsigned DstEltSize = Dst->getScalarSizeInBits();
+    if (DstEltSize == 1) {
+      // For fp vector to mask, we use:
+      // vfncvt.rtz.x.f.w v9, v8
+      // vand.vi v8, v9, 1
+      // vmsne.vi v0, v8, 0
+      SrcEltSize /= 2;
+      MVT ElementVT = MVT::getIntegerVT(SrcEltSize);
+      MVT InterimVT = SrcLT.second.changeVectorElementType(ElementVT);
+      return getRISCVInstructionCost(FNCVT, InterimVT, CostKind) +
+             getRISCVInstructionCost({RISCV::VAND_VI, RISCV::VMSNE_VI},
+                                     DstLT.second, CostKind);
+    }
+    if (DstEltSize == SrcEltSize)
+      return getRISCVInstructionCost(FCVT, DstLT.second, CostKind);
+    if (DstEltSize == (2 * SrcEltSize))
+      return getRISCVInstructionCost(FWCVT, DstLT.second, CostKind);
+    if (DstEltSize == (4 * SrcEltSize) && (SrcEltSize == 16)) {
+      // Convert f16 to f32 then convert f32 to i64.
+      MVT VecF32VT = DstLT.second.changeVectorElementType(MVT::f32);
+      return getRISCVInstructionCost(RISCV::VFWCVT_F_F_V, VecF32VT, CostKind) +
+             getRISCVInstructionCost(FWCVT, DstLT.second, CostKind);
+    }
+    if (DstEltSize < SrcEltSize) {
+      SrcEltSize /= 2;
+      MVT ElementVT = MVT::getIntegerVT(SrcEltSize);
+      MVT InterimVT = DstLT.second.changeVectorElementType(ElementVT);
+      InstructionCost Cost =
+          getRISCVInstructionCost(FNCVT, InterimVT, CostKind);
+      while (DstEltSize < SrcEltSize) {
+        SrcEltSize /= 2;
+        ElementVT = MVT::getIntegerVT(SrcEltSize);
+        InterimVT = DstLT.second.changeVectorElementType(ElementVT);
+        Cost += getRISCVInstructionCost(RISCV::VNSRL_WI, InterimVT, CostKind);
+      }
+      return Cost;
+    }
+    return BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I);
+  }
   case ISD::SINT_TO_FP:
-  case ISD::UINT_TO_FP:
-    if (Src->getScalarSizeInBits() == 1 || Dst->getScalarSizeInBits() == 1) {
-      // The cost of convert from or to mask vector is different from other
-      // cases. We could not use PowDiff to calculate it.
-      // For mask vector to fp, we should use the following instructions:
+  case ISD::UINT_TO_FP: {
+    unsigned IsSigned = ISD == ISD::SINT_TO_FP;
+    unsigned FCVT = IsSigned ? RISCV::VFCVT_F_X_V : RISCV::VFCVT_F_XU_V;
+    unsigned FWCVT = IsSigned ? RISCV::VFWCVT_F_X_V : RISCV::VFWCVT_F_XU_V;
+    unsigned FNCVT = IsSigned ? RISCV::VFNCVT_F_X_W : RISCV::VFNCVT_F_XU_W;
+    unsigned SrcEltSize = Src->getScalarSizeInBits();
+    unsigned DstEltSize = Dst->getScalarSizeInBits();
+
+    if (SrcEltSize == 1) {
+      // For mask vector to fp, we use:
       // vmv.v.i v8, 0
       // vmerge.vim v8, v8, -1, v0
-      // vfcvt.f.x.v v8, v8
+      // vfwcvt.f.x.v v8, v8
+      MVT ElementVT = MVT::getIntegerVT(DstEltSize >> 1);
+      MVT VecHalfVT = DstLT.second.changeVectorElementType(ElementVT);
+      return getRISCVInstructionCost({RISCV::VMV_V_I, RISCV::VMERGE_VIM},
+                                     VecHalfVT, CostKind) +
+             getRISCVInstructionCost(FWCVT, DstLT.second, CostKind);
+    }
 
-      // And for fp vector to mask, we use:
-      // vfncvt.rtz.x.f.w v9, v8
-      // vand.vi v8, v9, 1
-      // vmsne.vi v0, v8, 0
-      return 3;
+    if (DstEltSize == SrcEltSize)
+      return getRISCVInstructionCost(FCVT, DstLT.second, CostKind);
+
+    if (DstEltSize == (2 * SrcEltSize))
+      return getRISCVInstructionCost(FWCVT, DstLT.second, CostKind);
+
+    if (DstEltSize == (4 * SrcEltSize)) {
+      unsigned WidenIntOp = IsSigned ? RISCV::VSEXT_VF2 : RISCV::VZEXT_VF2;
+      MVT ElementVT = MVT::getIntegerVT(DstEltSize >> 1);
+      MVT VecVT = DstLT.second.changeVectorElementType(ElementVT);
+      return getRISCVInstructionCost(WidenIntOp, VecVT, CostKind) +
+             getRISCVInstructionCost(FWCVT, DstLT.second, CostKind);
     }
-    if (std::abs(PowDiff) <= 1)
-      return 1;
-    // Backend could lower (v[sz]ext i8 to double) to vfcvt(v[sz]ext.f8 i8),
-    // so it only need two conversion.
-    if (Src->isIntOrIntVectorTy())
-      return 2;
-    // Counts of narrow/widen instructions.
-    return std::abs(PowDiff);
+    if (DstEltSize == (8 * SrcEltSize)) {
+      unsigned WidenIntOp = IsSigned ? RISCV::VSEXT_VF4 : RISCV::VZEXT_VF4;
+      MVT ElementVT = MVT::getIntegerVT(DstEltSize >> 1);
+      MVT VecVT = DstLT.second.changeVectorElementType(ElementVT);
+      return getRISCVInstructionCost(WidenIntOp, VecVT, CostKind) +
+             getRISCVInstructionCost(FWCVT, DstLT.second, CostKind);
+    }
+    if (SrcEltSize == (2 * DstEltSize))
+      return getRISCVInstructionCost(FNCVT, DstLT.second, CostKind);
+
+    if ((SrcEltSize == (4 * DstEltSize)) && (DstEltSize == 16)) {
+      // Handle i64 to f16: vfncvt.f.x/xu + vfncvt.f.f
+      MVT DstVT = DstLT.second.changeVectorElementType(MVT::f32);
+      return getRISCVInstructionCost(FNCVT, DstVT, CostKind) +
+             getRISCVInstructionCost(RISCV::VFNCVT_F_F_W, DstLT.second,
+                                     CostKind);
+    }
+    return BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I);
+  }
   }
   return BaseT::getCastInstrCost(Opcode, Dst, Src, CCH, CostKind, I);
 }
diff --git a/llvm/test/Analysis/CostModel/RISCV/cast.ll b/llvm/test/Analysis/CostModel/RISCV/cast.ll
index 6ddd57a24c51f5..616310b30d0da9 100644
--- a/llvm/test/Analysis/CostModel/RISCV/cast.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/cast.ll
@@ -1725,87 +1725,87 @@ define void @fptosi() {
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v4f16_v4i32 = fptosi <4 x half> undef to <4 x i32>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v4f32_v4i32 = fptosi <4 x float> undef to <4 x i32>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v4f64_v4i32 = fptosi <4 x double> undef to <4 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f16_v4i64 = fptosi <4 x half> undef to <4 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v4f32_v4i64 = fptosi <4 x float> undef to <4 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v4f64_v4i64 = fptosi <4 x double> undef to <4 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v4f16_v4i64 = fptosi <4 x half> undef to <4 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f32_v4i64 = fptosi <4 x float> undef to <4 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f64_v4i64 = fptosi <4 x double> undef to <4 x i64>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v4f16_v4i1 = fptosi <4 x half> undef to <4 x i1>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v4f32_v4i1 = fptosi <4 x float> undef to <4 x i1>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v4f64_v4i1 = fptosi <4 x double> undef to <4 x i1>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f16_v8i8 = fptosi <8 x half> undef to <8 x i8>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f32_v8i8 = fptosi <8 x float> undef to <8 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v8f64_v8i8 = fptosi <8 x double> undef to <8 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v8f64_v8i8 = fptosi <8 x double> undef to <8 x i8>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f16_v8i16 = fptosi <8 x half> undef to <8 x i16>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f32_v8i16 = fptosi <8 x float> undef to <8 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f64_v8i16 = fptosi <8 x double> undef to <8 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f16_v8i32 = fptosi <8 x half> undef to <8 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f32_v8i32 = fptosi <8 x float> undef to <8 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f64_v8i32 = fptosi <8 x double> undef to <8 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f16_v8i64 = fptosi <8 x half> undef to <8 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f32_v8i64 = fptosi <8 x float> undef to <8 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v8f64_v8i64 = fptosi <8 x double> undef to <8 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v8f64_v8i16 = fptosi <8 x double> undef to <8 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f16_v8i32 = fptosi <8 x half> undef to <8 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f32_v8i32 = fptosi <8 x float> undef to <8 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f64_v8i32 = fptosi <8 x double> undef to <8 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %v8f16_v8i64 = fptosi <8 x half> undef to <8 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v8f32_v8i64 = fptosi <8 x float> undef to <8 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v8f64_v8i64 = fptosi <8 x double> undef to <8 x i64>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v8f16_v8i1 = fptosi <8 x half> undef to <8 x i1>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v8f32_v8i1 = fptosi <8 x float> undef to <8 x i1>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v8f64_v8i1 = fptosi <8 x double> undef to <8 x i1>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v8f64_v8i1 = fptosi <8 x double> undef to <8 x i1>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f16_v16i8 = fptosi <16 x half> undef to <16 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f32_v16i8 = fptosi <16 x float> undef to <16 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v16f64_v16i8 = fptosi <16 x double> undef to <16 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f16_v16i16 = fptosi <16 x half> undef to <16 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f32_v16i16 = fptosi <16 x float> undef to <16 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f64_v16i16 = fptosi <16 x double> undef to <16 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f16_v16i32 = fptosi <16 x half> undef to <16 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f32_v16i32 = fptosi <16 x float> undef to <16 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f64_v16i32 = fptosi <16 x double> undef to <16 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f16_v16i64 = fptosi <16 x half> undef to <16 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f32_v16i64 = fptosi <16 x float> undef to <16 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v16f64_v16i64 = fptosi <16 x double> undef to <16 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v16f32_v16i8 = fptosi <16 x float> undef to <16 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v16f64_v16i8 = fptosi <16 x double> undef to <16 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f16_v16i16 = fptosi <16 x half> undef to <16 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v16f32_v16i16 = fptosi <16 x float> undef to <16 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %v16f64_v16i16 = fptosi <16 x double> undef to <16 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v16f16_v16i32 = fptosi <16 x half> undef to <16 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v16f32_v16i32 = fptosi <16 x float> undef to <16 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v16f64_v16i32 = fptosi <16 x double> undef to <16 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: %v16f16_v16i64 = fptosi <16 x half> undef to <16 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %v16f32_v16i64 = fptosi <16 x float> undef to <16 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %v16f64_v16i64 = fptosi <16 x double> undef to <16 x i64>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v16f16_v16i1 = fptosi <16 x half> undef to <16 x i1>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v16f32_v16i1 = fptosi <16 x float> undef to <16 x i1>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v16f64_v16i1 = fptosi <16 x double> undef to <16 x i1>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32f16_v32i8 = fptosi <32 x half> undef to <32 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32f32_v32i8 = fptosi <32 x float> undef to <32 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v32f64_v32i8 = fptosi <32 x double> undef to <32 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32f16_v32i16 = fptosi <32 x half> undef to <32 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32f32_v32i16 = fptosi <32 x float> undef to <32 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v32f64_v32i16 = fptosi <32 x double> undef to <32 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32f16_v32i32 = fptosi <32 x half> undef to <32 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v32f32_v32i32 = fptosi <32 x float> undef to <32 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v32f64_v32i32 = fptosi <32 x double> undef to <32 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v32f16_v32i64 = fptosi <32 x half> undef to <32 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v32f32_v32i64 = fptosi <32 x float> undef to <32 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v16f32_v16i1 = fptosi <16 x float> undef to <16 x i1>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %v16f64_v16i1 = fptosi <16 x double> undef to <16 x i1>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32f16_v32i8 = fptosi <32 x half> undef to <32 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %v32f32_v32i8 = fptosi <32 x float> undef to <32 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %v32f64_v32i8 = fptosi <32 x double> undef to <32 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v32f16_v32i16 = fptosi <32 x half> undef to <32 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v32f32_v32i16 = fptosi <32 x float> undef to <32 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %v32f64_v32i16 = fptosi <32 x double> undef to <32 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %v32f16_v32i32 = fptosi <32 x half> undef to <32 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %v32f32_v32i32 = fptosi <32 x float> undef to <32 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v32f64_v32i32 = fptosi <32 x double> undef to <32 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 25 for instruction: %v32f16_v32i64 = fptosi <32 x half> undef to <32 x i64>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v32f32_v32i64 = fptosi <32 x float> undef to <32 x i64>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v32f64_v32i64 = fptosi <32 x double> undef to <32 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v32f16_v32i1 = fptosi <32 x half> undef to <32 x i1>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v32f32_v32i1 = fptosi <32 x float> undef to <32 x i1>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %v32f64_v32i1 = fptosi <32 x double> undef to <32 x i1>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v64f16_v64i8 = fptosi <64 x half> undef to <64 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %v64f32_v64i8 = fptosi <64 x float> undef to <64 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 15 for instruction: %v64f64_v64i8 = fptosi <64 x double> undef to <64 x i8>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 1 for instruction: %v64f16_v64i16 = fptosi <64 x half> undef to <64 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v64f32_v64i16 = fptosi <64 x float> undef to <64 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %v64f64_v64i16 = fptosi <64 x double> undef to <64 x i16>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 3 for instruction: %v64f16_v64i32 = fptosi <64 x half> undef to <64 x i32>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v32f16_v32i1 = fptosi <32 x half> undef to <32 x i1>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %v32f32_v32i1 = fptosi <32 x float> undef to <32 x i1>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %v32f64_v32i1 = fptosi <32 x double> undef to <32 x i1>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v64f16_v64i8 = fptosi <64 x half> undef to <64 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %v64f32_v64i8 = fptosi <64 x float> undef to <64 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 31 for instruction: %v64f64_v64i8 = fptosi <64 x double> undef to <64 x i8>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %v64f16_v64i16 = fptosi <64 x half> undef to <64 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 9 for instruction: %v64f32_v64i16 = fptosi <64 x float> undef to <64 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 27 for instruction: %v64f64_v64i16 = fptosi <64 x double> undef to <64 x i16>
+; RV32-NEXT:  Cost Model: Found an estimated cost of 17 for instruction: %v64f16_v64i32 = fptosi <64 x half> undef to <64 x i32>
 ; RV32-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v64f32_v64i32 = fptosi <64 x float> undef to <64 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 6 for instruction: %v64f64_v64i32 = fptosi <64 x double> undef to <64 x i32>
-; RV32-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %v64f16_v64i64 = fptosi <64 x half> undef to <64 x i64>
-; RV32-NEXT:  Cost Model: Found an estimated cost...
[truncated]
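
The widening fp-to-int cases in the diff above can be sketched in the same spirit. The standalone model below covers the same-size, 2x and f16-to-i64 cases and charges each instruction the LMUL of the type it produces. VLEN = 128, the helper names lmulCost and wideningFPToIntCost, and the use of 0 to mark an unmodelled case are assumptions made for this example; this is not the in-tree implementation.

```cpp
#include <cassert>

// Assumption: VLEN = 128 bits, consistent with the RV32 costs in cast.ll.
constexpr unsigned VLEN = 128;

// Hypothetical stand-in for the per-instruction LMUL cost: a vector that
// occupies LMUL registers costs LMUL; fractional LMUL is charged as 1.
constexpr unsigned lmulCost(unsigned NumElts, unsigned EltBits) {
  unsigned Bits = NumElts * EltBits;
  return Bits <= VLEN ? 1 : Bits / VLEN;
}

// Cost of an fp-to-int conversion where the destination is at least as wide
// as the source. Returns 0 for cases this sketch does not model.
constexpr unsigned wideningFPToIntCost(unsigned NumElts, unsigned SrcEltBits,
                                       unsigned DstEltBits) {
  if (DstEltBits == SrcEltBits)
    return lmulCost(NumElts, DstEltBits); // vfcvt.rtz.x.f.v
  if (DstEltBits == 2 * SrcEltBits)
    return lmulCost(NumElts, DstEltBits); // vfwcvt.rtz.x.f.v
  if (DstEltBits == 4 * SrcEltBits && SrcEltBits == 16)
    // f16 -> f32 (vfwcvt.f.f.v), then f32 -> i64 (vfwcvt.rtz.x.f.v).
    return lmulCost(NumElts, 32) + lmulCost(NumElts, DstEltBits);
  return 0; // not modelled here
}

// Compare the model with a few updated RV32 expectations from cast.ll.
inline void checkAgainstTestDiff() {
  assert(wideningFPToIntCost(4, 16, 64) == 3);   // <4 x half> to <4 x i64>
  assert(wideningFPToIntCost(8, 16, 64) == 6);   // <8 x half> to <8 x i64>
  assert(wideningFPToIntCost(16, 16, 64) == 12); // <16 x half> to <16 x i64>
  assert(wideningFPToIntCost(4, 32, 64) == 2);   // <4 x float> to <4 x i64>
  assert(wideningFPToIntCost(8, 32, 32) == 2);   // <8 x float> to <8 x i32>
}
```

For example, <8 x half> to <8 x i64> is costed as 2 for the interim <8 x float> plus 4 for the final <8 x i64>, which gives the 6 seen in the updated test expectations.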

topperc · 2024-04-03T22:16:19Z

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

      // vmv.v.i v8, 0
      // vmerge.vim v8, v8, -1, v0
-      // vfcvt.f.x.v v8, v8
+      // vfwcvt.f.x.v v8, v8


Weirdly, vp.sitofp does not use a widening fwcvt here, but that's a separate issue that doesn't affect this patch.

arcbbb · 2024-05-27T01:49:58Z

ping

arcbbb · 2024-07-24T03:22:14Z

ping

lukel97 · 2024-08-05T05:37:14Z

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

+    if ((SrcEltSize >> 1) > DstEltSize) {
+      // For mask type, we use:
+      // vand.vi v8, v9, 1
+      // vmsne.vi v0, v8, 0
+      VectorType *VecTy =
+          VectorType::get(IntegerType::get(Dst->getContext(), SrcEltSize >> 1),
+                          cast<VectorType>(Dst)->getElementCount());
+      Cost +=
+          getCastInstrCost(Instruction::Trunc, Dst, VecTy, CCH, CostKind, I);
+    }


Can this be moved into the SrcEltSize > DstEltSize branch above, so we can reuse VecVT? Also I'm happy if you want to leave out the mask type comment, thanks for clarifying it in the reply to my review.

I moved it, but I cannot reuse VecVT since one is an MVT and the other is a VectorType.

lukel97 · 2024-08-05T05:47:11Z

llvm/test/Analysis/CostModel/RISCV/cast-vfhmin.ll

Nit, can this be renamed to cast-half since it tests both zvfh and zvfhmin?

Fixed. Thanks!

lukel97 · 2024-08-05T12:24:37Z

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

+        VectorType *VecTy = VectorType::get(
+            IntegerType::get(Dst->getContext(), SrcEltSize >> 1),
+            cast<VectorType>(Dst)->getElementCount());


Can we use getTypeForEVT so that we reuse VecVT? Otherwise I don't think we use the legalized type.

Cool, thanks for catching that!

topperc · 2024-08-05T22:07:47Z

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

+    unsigned DstEltSize = Dst->getScalarSizeInBits();
+    InstructionCost Cost = 0;
+    if ((SrcEltSize == 16) &&
+        (!ST->hasVInstructionsF16() || ((DstEltSize >> 1) > SrcEltSize))) {


Use / 2 instead of >> 1. Leave the conversion of divides to shifts to the compiler as an optimization.

Fixed. Thanks!
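
A small standalone check of the point above: for unsigned element sizes, dividing by 2 and shifting right by 1 give the same value, so the choice is about readability rather than behaviour. The helper names halveByDivide, halveByShift and checkHalving are hypothetical and exist only for this example.

```cpp
#include <cassert>

// Hypothetical helpers, named only for this example.
constexpr unsigned halveByDivide(unsigned EltSize) { return EltSize / 2; }
constexpr unsigned halveByShift(unsigned EltSize) { return EltSize >> 1; }

// For unsigned operands the two forms agree, including for odd values,
// where both round toward zero.
static_assert(halveByDivide(64) == 32 && halveByShift(64) == 32,
              "64 halves to 32 either way");
static_assert(halveByDivide(1) == 0 && halveByShift(1) == 0,
              "both forms round toward zero");

// Check every element size from 1 to 64 bits at run time.
inline void checkHalving() {
  for (unsigned EltSize = 1; EltSize <= 64; ++EltSize)
    assert(halveByDivide(EltSize) == halveByShift(EltSize));
}
```

Since the results are identical, writing the division states the intent (halving an element size) directly and leaves the choice of instruction to the compiler.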

arcbbb · 2024-08-30T03:07:29Z

gentle ping

lukel97 · 2024-08-30T08:16:47Z

llvm/test/Analysis/CostModel/RISCV/cast-half.ll

Is it possible to precommit splitting out the half tests?

Sure, created in #106692

precommit f16 test for #87506 fp-int conversion

Widening/narrowing the source data type to match the destination data type may require multiple steps. To model the costs, the patch generated the interim type by following the logic in RISCVTargetLowering::lowerVPFPIntConvOp.

lukel97

LGTM

llvm-ci · 2024-09-02T02:33:03Z

LLVM Buildbot has detected a new failure on builder clang-armv7-global-isel running on linaro-clang-armv7-global-isel while building llvm at step 7 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/39/builds/1391

Here is the relevant piece of the build log for reference:

Step 7 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'ClangPseudo :: cxx/unsized-array.cpp' FAILED ********************
Exit Code: 2

Command Output (stderr):
--
RUN: at line 1: clang-pseudo -grammar=cxx -source=/home/tcwg-buildbot/worker/clang-armv7-global-isel/llvm/clang-tools-extra/pseudo/test/cxx/unsized-array.cpp --print-forest | /home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv7-global-isel/llvm/clang-tools-extra/pseudo/test/cxx/unsized-array.cpp
+ clang-pseudo -grammar=cxx -source=/home/tcwg-buildbot/worker/clang-armv7-global-isel/llvm/clang-tools-extra/pseudo/test/cxx/unsized-array.cpp --print-forest
clang-pseudo: ../llvm/clang-tools-extra/pseudo/lib/cxx/CXX.cpp:437: auto clang::pseudo::cxx::getLanguage()::(anonymous class)::operator()() const: Assertion `Diags.empty()' failed.
+ /home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv7-global-isel/llvm/clang-tools-extra/pseudo/test/cxx/unsized-array.cpp
#0 0x00c5535c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/bin/clang-pseudo+0x5f35c)
#1 0x00c530e4 llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/bin/clang-pseudo+0x5d0e4)
#2 0x00c55db0 SignalHandler(int) Signals.cpp:0:0
#3 0xf792d6e0 __default_sa_restorer ./signal/../sysdeps/unix/sysv/linux/arm/sigrestorer.S:67:0
#4 0xf791db06 ./csu/../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47:0
#5 0xf795d292 __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
#6 0xf792c840 gsignal ./signal/../sysdeps/posix/raise.c:27:6
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/tcwg-buildbot/worker/clang-armv7-global-isel/stage1/bin/FileCheck /home/tcwg-buildbot/worker/clang-armv7-global-isel/llvm/clang-tools-extra/pseudo/test/cxx/unsized-array.cpp

--

********************

arcbbb requested review from preames, lukel97, jacquesguan, topperc and yetingk April 3, 2024 15:23

llvmbot added backend:RISC-V llvm:analysis labels Apr 3, 2024

topperc reviewed Apr 3, 2024

lukel97 reviewed Jul 24, 2024

preames removed their request for review July 25, 2024 18:04

preames mentioned this pull request Jul 29, 2024

[RISCV][TTI] Cost non-power-of-two size changing casts #101047

Merged

lukel97 reviewed Jul 30, 2024

arcbbb force-pushed the tti-fp-conv-cost branch from 6d01c61 to 0398f01 Compare July 30, 2024 08:33

lukel97 reviewed Aug 5, 2024

topperc reviewed Aug 5, 2024

lukel97 reviewed Aug 6, 2024

arcbbb force-pushed the tti-fp-conv-cost branch from 34d9ba6 to 2f573e1 Compare August 15, 2024 07:24

lukel97 reviewed Aug 30, 2024

arcbbb mentioned this pull request Aug 30, 2024

[RISCV][NFC] Splits f16 cast tests into a separate file #106692

Merged

arcbbb added a commit that referenced this pull request Aug 30, 2024

[RISCV][NFC] Splits f16 cast tests into a separate file (#106692)

8f4aafb

precommit f16 test for #87506 fp-int conversion

[RISCV][TTI] Scale the cost of FP-Int conversion with LMUL

31a4557

Widening/narrowing the source data type to match the destination data type may require multiple steps. To model the costs, the patch generated the interim type by following the logic in RISCVTargetLowering::lowerVPFPIntConvOp.

arcbbb force-pushed the tti-fp-conv-cost branch from 2f573e1 to 31a4557 Compare August 30, 2024 09:33

lukel97 reviewed Aug 30, 2024

Address comments

6eb3294

lukel97 approved these changes Aug 30, 2024

arcbbb merged commit 837ee5b into llvm:main Sep 2, 2024
8 checks passed

arcbbb deleted the tti-fp-conv-cost branch September 2, 2024 01:38
