[TTI][AArch64] Detect OperandInfo from scalable splats. #122469

davemgreen · 2025-01-10T15:07:52Z

Pulled out of #122236, this allows splat constants to be recognized by getOperandInfo, so that "better" costs can be produced for instructions like divides by constants (which are expanded into mul+add+shift). Some of the costs are not very accurate yet, but the comparison of scalar vs fixed-width vs scalable for the same div can become more accurate, especially with patches like #122236.
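
For readers unfamiliar with the mul+add+shift expansion mentioned above, here is a minimal standalone sketch for an unsigned 32-bit divide by 7. It is not code from this patch; the function names `udivBy7` and `checkUdivBy7` are made up for illustration, and the magic constant is the standard one for this divisor (the low 32 bits of ceil(2^35 / 7)).

```cpp
#include <cassert>
#include <cstdint>

// Unsigned 32-bit divide by 7 without a divide instruction.
// The full multiplier ceil(2^35 / 7) = 0x124924925 needs 33 bits, so only its
// low 32 bits are used and the missing top bit is restored by the add step.
constexpr uint32_t udivBy7(uint32_t X) {
  // mul: keep the high half of the 64-bit product.
  uint32_t Hi = static_cast<uint32_t>((uint64_t{X} * 0x24924925u) >> 32);
  // add + shift: (X - Hi) / 2 + Hi stays within 32 bits because Hi <= X.
  uint32_t T = ((X - Hi) >> 1) + Hi;
  // final shift.
  return T >> 2;
}

// Compares the expansion against an ordinary divide for a few sample inputs.
inline void checkUdivBy7() {
  const uint32_t Samples[] = {0u,  6u,       7u,          13u,
                              14u, 1000003u, 0x7fffffffu, 0xffffffffu};
  for (uint32_t X : Samples)
    assert(udivBy7(X) == X / 7u);
}
```

The point for the cost model is that the expansion is a handful of cheap operations, so its cost should be a small constant above 1, not the cost of a full divide and not the cost of a single simple instruction.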

llvmbot · 2025-01-10T15:08:33Z

@llvm/pr-subscribers-llvm-transforms
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-analysis

Author: David Green (davemgreen)

Changes

Pulled out of #122236, this allows splat constants to be recognized by getOperandInfo, so that "better" costs can be produced for instructions like divides by constants (which are expanded into mul+add+shift). Some of the costs are not very accurate yet, but the comparison of scalar vs fixed-width vs scalable for the same div can become more accurate, especially with patches like #122236.

Patch is 44.28 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/122469.diff

3 Files Affected:

(modified) llvm/lib/Analysis/TargetTransformInfo.cpp (+2-1)
(modified) llvm/test/Analysis/CostModel/AArch64/sve-div.ll (+38-38)
(modified) llvm/test/Analysis/CostModel/AArch64/sve-rem.ll (+38-38)

diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index b32dffa9f0fe86..13a56709ed10f5 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -893,7 +893,8 @@ TargetTransformInfo::getOperandInfo(const Value *V) {
 
   // Check for a splat of a constant or for a non uniform vector of constants
   // and check if the constant(s) are all powers of two.
-  if (isa<ConstantVector>(V) || isa<ConstantDataVector>(V)) {
+  if (isa<ConstantVector>(V) || isa<ConstantDataVector>(V) ||
+      isa<ConstantExpr>(V)) {
     OpInfo = OK_NonUniformConstantValue;
     if (Splat) {
       OpInfo = OK_UniformConstantValue;
diff --git a/llvm/test/Analysis/CostModel/AArch64/sve-div.ll b/llvm/test/Analysis/CostModel/AArch64/sve-div.ll
index 4c25e3003177d9..ac5638de332039 100644
--- a/llvm/test/Analysis/CostModel/AArch64/sve-div.ll
+++ b/llvm/test/Analysis/CostModel/AArch64/sve-div.ll
@@ -181,22 +181,22 @@ define void @sdiv_uniformconst() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V16i8 = sdiv <16 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %V32i8 = sdiv <32 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %V64i8 = sdiv <64 x i8> undef, splat (i8 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = sdiv <vscale x 2 x i64> undef, splat (i64 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV2i64 = sdiv <vscale x 2 x i64> undef, splat (i64 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV4i64 = sdiv <vscale x 4 x i64> undef, splat (i64 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i64 = sdiv <vscale x 8 x i64> undef, splat (i64 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i32 = sdiv <vscale x 2 x i32> undef, splat (i32 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i32 = sdiv <vscale x 4 x i32> undef, splat (i32 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV4i32 = sdiv <vscale x 4 x i32> undef, splat (i32 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV8i32 = sdiv <vscale x 8 x i32> undef, splat (i32 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV16i32 = sdiv <vscale x 16 x i32> undef, splat (i32 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i16 = sdiv <vscale x 2 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i16 = sdiv <vscale x 4 x i16> undef, splat (i16 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i16 = sdiv <vscale x 8 x i16> undef, splat (i16 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV8i16 = sdiv <vscale x 8 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i16 = sdiv <vscale x 16 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i16 = sdiv <vscale x 32 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i8 = sdiv <vscale x 2 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i8 = sdiv <vscale x 4 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i8 = sdiv <vscale x 8 x i8> undef, splat (i8 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i8 = sdiv <vscale x 16 x i8> undef, splat (i8 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV16i8 = sdiv <vscale x 16 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i8 = sdiv <vscale x 32 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %NV64i8 = sdiv <vscale x 64 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
@@ -260,22 +260,22 @@ define void @udiv_uniformconst() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V16i8 = udiv <16 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %V32i8 = udiv <32 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %V64i8 = udiv <64 x i8> undef, splat (i8 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV4i64 = udiv <vscale x 4 x i64> undef, splat (i64 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i64 = udiv <vscale x 8 x i64> undef, splat (i64 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i32 = udiv <vscale x 2 x i32> undef, splat (i32 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i32 = udiv <vscale x 4 x i32> undef, splat (i32 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV4i32 = udiv <vscale x 4 x i32> undef, splat (i32 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV8i32 = udiv <vscale x 8 x i32> undef, splat (i32 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV16i32 = udiv <vscale x 16 x i32> undef, splat (i32 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i16 = udiv <vscale x 2 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i16 = udiv <vscale x 4 x i16> undef, splat (i16 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i16 = udiv <vscale x 8 x i16> undef, splat (i16 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV8i16 = udiv <vscale x 8 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i16 = udiv <vscale x 16 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i16 = udiv <vscale x 32 x i16> undef, splat (i16 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i8 = udiv <vscale x 2 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i8 = udiv <vscale x 4 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i8 = udiv <vscale x 8 x i8> undef, splat (i8 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i8 = udiv <vscale x 16 x i8> undef, splat (i8 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV16i8 = udiv <vscale x 16 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i8 = udiv <vscale x 32 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %NV64i8 = udiv <vscale x 64 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
@@ -339,24 +339,24 @@ define void @sdiv_uniformconstpow2() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 99 for instruction: %V16i8 = sdiv <16 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 198 for instruction: %V32i8 = sdiv <32 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 396 for instruction: %V64i8 = sdiv <64 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = sdiv <vscale x 2 x i64> undef, splat (i64 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV4i64 = sdiv <vscale x 4 x i64> undef, splat (i64 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i64 = sdiv <vscale x 8 x i64> undef, splat (i64 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i32 = sdiv <vscale x 2 x i32> undef, splat (i32 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i32 = sdiv <vscale x 4 x i32> undef, splat (i32 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV8i32 = sdiv <vscale x 8 x i32> undef, splat (i32 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV16i32 = sdiv <vscale x 16 x i32> undef, splat (i32 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i16 = sdiv <vscale x 2 x i16> undef, splat (i16 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i16 = sdiv <vscale x 4 x i16> undef, splat (i16 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i16 = sdiv <vscale x 8 x i16> undef, splat (i16 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i16 = sdiv <vscale x 16 x i16> undef, splat (i16 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i16 = sdiv <vscale x 32 x i16> undef, splat (i16 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i8 = sdiv <vscale x 2 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i8 = sdiv <vscale x 4 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i8 = sdiv <vscale x 8 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i8 = sdiv <vscale x 16 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i8 = sdiv <vscale x 32 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %NV64i8 = sdiv <vscale x 64 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV2i64 = sdiv <vscale x 2 x i64> undef, splat (i64 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %NV4i64 = sdiv <vscale x 4 x i64> undef, splat (i64 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %NV8i64 = sdiv <vscale x 8 x i64> undef, splat (i64 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV2i32 = sdiv <vscale x 2 x i32> undef, splat (i32 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV4i32 = sdiv <vscale x 4 x i32> undef, splat (i32 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %NV8i32 = sdiv <vscale x 8 x i32> undef, splat (i32 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %NV16i32 = sdiv <vscale x 16 x i32> undef, splat (i32 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV2i16 = sdiv <vscale x 2 x i16> undef, splat (i16 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV4i16 = sdiv <vscale x 4 x i16> undef, splat (i16 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV8i16 = sdiv <vscale x 8 x i16> undef, splat (i16 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %NV16i16 = sdiv <vscale x 16 x i16> undef, splat (i16 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %NV32i16 = sdiv <vscale x 32 x i16> undef, splat (i16 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV2i8 = sdiv <vscale x 2 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV4i8 = sdiv <vscale x 4 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV8i8 = sdiv <vscale x 8 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV16i8 = sdiv <vscale x 16 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 10 for instruction: %NV32i8 = sdiv <vscale x 32 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 20 for instruction: %NV64i8 = sdiv <vscale x 64 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   %V2i64 = sdiv <2 x i64> undef, splat (i64 16)
@@ -418,22 +418,22 @@ define void @udiv_uniformconstpow2() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V16i8 = udiv <16 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %V32i8 = udiv <32 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %V64i8 = udiv <64 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV4i64 = udiv <vscale x 4 x i64> undef, splat (i64 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i64 = udiv <vscale x 8 x i64> undef, splat (i64 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i32 = udiv <vscale x 2 x i32> undef, splat (i32 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i32 = udiv <vscale x 4 x i32> undef, splat (i32 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV4i32 = udiv <vscale x 4 x i32> undef, splat (i32 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV8i32 = udiv <vscale x 8 x i32> undef, splat (i32 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV16i32 = udiv <vscale x 16 x i32> undef, splat (i32 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i16 = udiv <vscale x 2 x i16> undef, splat (i16 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i16 = udiv <vscale x 4 x i16> undef, splat (i16 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i16 = udiv <vscale x 8 x i16> undef, splat (i16 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV8i16 = udiv <vscale x 8 x i16> undef, splat (i16 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i16 = udiv <vscale x 16 x i16> undef, splat (i16 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i16 = udiv <vscale x 32 x i16> undef, splat (i16 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i8 = udiv <vscale x 2 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i8 = udiv <vscale x 4 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i8 = udiv <vscale x 8 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i8 = udiv <vscale x 16 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV16i8 = udiv <vscale x 16 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i8 = udiv <vscale x 32 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %NV64i8 = udiv <vscale x 64 x i8> undef, splat (i8 16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
@@ -497,22 +497,22 @@ define void @sdiv_uniformconstnegpow2() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V16i8 = sdiv <16 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %V32i8 = sdiv <32 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %V64i8 = sdiv <64 x i8> undef, splat (i8 -16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = sdiv <vscale x 2 x i64> undef, splat (i64 -16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV2i64 = sdiv <vscale x 2 x i64> undef, splat (i64 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV4i64 = sdiv <vscale x 4 x i64> undef, splat (i64 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i64 = sdiv <vscale x 8 x i64> undef, splat (i64 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i32 = sdiv <vscale x 2 x i32> undef, splat (i32 -16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i32 = sdiv <vscale x 4 x i32> undef, splat (i32 -16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV4i32 = sdiv <vscale x 4 x i32> undef, splat (i32 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV8i32 = sdiv <vscale x 8 x i32> undef, splat (i32 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV16i32 = sdiv <vscale x 16 x i32> undef, splat (i32 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i16 = sdiv <vscale x 2 x i16> undef, splat (i16 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i16 = sdiv <vscale x 4 x i16> undef, splat (i16 -16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i16 = sdiv <vscale x 8 x i16> undef, splat (i16 -16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV8i16 = sdiv <vscale x 8 x i16> undef, splat (i16 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i16 = sdiv <vscale x 16 x i16> undef, splat (i16 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i16 = sdiv <vscale x 32 x i16> undef, splat (i16 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i8 = sdiv <vscale x 2 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV4i8 = sdiv <vscale x 4 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %NV8i8 = sdiv <vscale x 8 x i8> undef, splat (i8 -16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i8 = sdiv <vscale x 16 x i8> undef, splat (i8 -16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV16i8 = sdiv <vscale x 16 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i8 = sdiv <vscale x 32 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %NV64i8 = sdiv <vscale x 64 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
@@ -576,22 +576,22 @@ define void @udiv_uniformconstnegpow2() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V16i8 = udiv <16 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %V32i8 = udiv <32 x i8> undef, splat (i8 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %V64i8 = udiv <64 x i8> undef, splat (i8 -16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 -16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 -16)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %NV4i64 = udiv <vscale x 4 x i64> undef, splat (i64 -16)
 ; CHECK-NEXT:  Cost Model: Found a...
[truncated]

hassnaaHamdi · 2025-01-12T21:40:36Z

llvm/test/Analysis/CostModel/AArch64/sve-div.ll

@@ -260,22 +260,22 @@ define void @udiv_uniformconst() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %V16i8 = udiv <16 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %V32i8 = udiv <32 x i8> undef, splat (i8 7)
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %V64i8 = udiv <64 x i8> undef, splat (i8 7)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 7)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 7 for instruction: %NV2i64 = udiv <vscale x 2 x i64> undef, splat (i64 7)


Hi David,
I see why the cost could change, but I can't understand how it makes sense that the legal type <vscale x 2 x i64> has a higher cost than a type that needs custom legalisation, like the one below it: <vscale x 4 x i64>.

If expanding the div op for <vscale x 2 x i64> to the sequence (MULHS + ADD/SUB + SRA + SRL + ADD) results in a higher cost, why do we do that expansion?

davemgreen · 2025-01-13

Hi - yeah I agree it is a bit odd; the costs are not super accurate so far. It is hard to find a very accurate cost for something that depends on the input data, but they should be improved in future patches.

The cost of a udiv/sdiv should not be 1 (or 2), which I think some of these are falling back to, either for scalars or for SVE vectors. The divides use an iterative algorithm that takes multiple cycles to complete and blocks other operations until it finishes. So whilst the codesize cost can be 1, the recip-throughput and latency should be higher.

I believe the expansion is still better than using udiv/sdiv instructions (from experiments). These costs look better in #122236, and the others should be brought into line in future patches.

david-arm · 2025-01-13

Hi @davemgreen, thanks for splitting this patch out from #122236, it's really helpful to see which changes are affecting which costs. However, I do agree with @hassnaaHamdi that something looks wrong here, which isn't fixed by #122236 either. I would assume that in @sdiv_uniformconst, sdiv <vscale x 8 x i32> undef, splat (i32 7) gets legalised into two sdiv <vscale x 4 x i32> undef, splat (i32 7) instructions with the results being concatenated together. Right now the vectoriser will believe that VF=vscale x 8 is almost half the cost of VF=vscale x 4 (4 vs 7), which doesn't seem right. Something similar happens in sdiv_uniformconstnegpow2.

Having said that, in general I do agree this is a step in the right direction for most cases because previously the divide costs were too low. The throughputs for udiv/sdiv are at best 1/7 on neoverse-v1 according to the optimisation guide. This patch just increases the costs for divides of splats, and then #122236 follows on for the more general case. I do like this patch and am happy to accept it, but it is perhaps worth understanding what's going on with the illegal types first?
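
The inconsistency raised above can be made concrete with a small arithmetic sketch. It assumes a deliberately simple model that is not LLVM's actual legalisation cost logic: an SVE register holds vscale x 128 bits, a wider scalable type is split into that many legal parts, and each part costs the same as the legal type. The helper names `numLegalParts` and `splitCost` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: number of vscale x 128-bit registers needed for a
// scalable vector with MinElts elements of EltBits bits each.
constexpr unsigned numLegalParts(unsigned MinElts, unsigned EltBits) {
  unsigned Bits = MinElts * EltBits;
  return Bits <= 128 ? 1 : (Bits + 127) / 128;
}

// Expected cost of a too-wide type if every legal part costs LegalCost.
constexpr unsigned splitCost(unsigned MinElts, unsigned EltBits,
                             unsigned LegalCost) {
  return numLegalParts(MinElts, EltBits) * LegalCost;
}

// <vscale x 4 x i32> is legal and is reported with cost 7 in this revision.
static_assert(splitCost(4, 32, 7) == 7, "legal type keeps its own cost");
// <vscale x 8 x i32> splits in two, so this model expects 14, not the 4 shown
// in the test output.
static_assert(splitCost(8, 32, 7) == 14, "two parts at cost 7 each");
```

Under this model the wider types should never come out cheaper than the legal type they are split into, which is the property the reviewers are asking about.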

sdesmalen-arm

I agree that it makes sense to investigate why illegal types get assigned a lower cost. In a way this regresses the cost model, because for some types it now assumes that a div of a legal type is more expensive than a div of an illegal type that requires legalisation.

sdesmalen-arm · 2025-01-13T09:34:57Z

llvm/lib/Analysis/TargetTransformInfo.cpp

+  if (isa<ConstantVector>(V) || isa<ConstantDataVector>(V) ||
+      isa<ConstantExpr>(V)) {


Rather than adding another case, which I think makes this sensitive to the plethora of constant subclasses, I think the logic here would be easier to follow if it were structured as:

    // Handle splats first since those are all uniform
    if (const Value *Splat = getSplatValue(V)) {
      if (auto *CI = dyn_cast<ConstantInt>(Splat)) {
        OpInfo = OK_UniformConstantValue;
        ...
      } else if (isa<Constant>(Splat))
        OpInfo = OK_UniformConstantValue;
      else if (isa<Argument>(Splat) || isa<GlobalValue>(Splat))
        OpInfo = OK_UniformValue;
    } else if (isa<Constant>(V)) {
      OpInfo = OK_NonUniformConstantValue;
      if (const auto *CDS = dyn_cast<ConstantDataSequential>(V)) {
        ...
      }
    }

This also allows folding in the argument/global value splats case.
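
To illustrate the "splats first" structure without pulling in LLVM headers, the following is a toy, self-contained model. The enum names mirror those in TargetTransformInfo, but `ToyOperand`, `ToyOperandInfo`, and `classify` are hypothetical, and the model simplifies the suggestion above by treating every non-constant splat as uniform instead of checking for arguments and globals.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Enum names mirror TargetTransformInfo; everything else here is a toy model.
enum OperandValueKind {
  OK_AnyValue,
  OK_UniformValue,
  OK_UniformConstantValue,
  OK_NonUniformConstantValue
};
enum OperandValueProperties { OP_None, OP_PowerOf2, OP_NegatedPowerOf2 };

struct ToyOperand {
  bool IsConstant;                 // operand is a compile-time constant
  bool IsSplat;                    // every lane holds the same value
  std::optional<int64_t> SplatInt; // the splatted integer, when known
};

struct ToyOperandInfo {
  OperandValueKind Kind;
  OperandValueProperties Props;
};

constexpr bool isPowerOf2(int64_t V) { return V > 0 && (V & (V - 1)) == 0; }

ToyOperandInfo classify(const ToyOperand &Op) {
  ToyOperandInfo Info{OK_AnyValue, OP_None};
  if (Op.IsSplat) {
    // Splats are uniform whether the vector is fixed-width or scalable.
    // Simplification: any non-constant splat is reported as uniform.
    Info.Kind = Op.IsConstant ? OK_UniformConstantValue : OK_UniformValue;
    if (Op.IsConstant && Op.SplatInt) {
      int64_t V = *Op.SplatInt;
      if (isPowerOf2(V))
        Info.Props = OP_PowerOf2;
      // INT64_MIN is skipped in this toy to avoid overflow when negating.
      else if (V < 0 && V != INT64_MIN && isPowerOf2(-V))
        Info.Props = OP_NegatedPowerOf2;
    }
  } else if (Op.IsConstant) {
    // Constant vector whose lanes differ.
    Info.Kind = OK_NonUniformConstantValue;
  }
  return Info;
}
```

Because the splat check comes first, the kind of constant used to represent the splat (fixed-width data vector, scalable splat expression, and so on) no longer matters to the classification, which is the robustness argument made in the review comment.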

davemgreen · 2025-02-01T13:45:06Z

Changed to use the splat to detect constants - this makes the difference that zeroinitializer is now treated as an OK_UniformConstantValue too. That can be changed if necessary.

Pulled out of llvm#122236, this allows splat constants to be recognized by getOperandInfo, so that "better" costs can be produced for instructions like divides by constants (which are expanded into mul+add+shift). Some of the costs are not very accurate yet, but the comparison of scalar vs fixed-width vs scalable for the same div can become more accurate, especially with patches like llvm#122236.

davemgreen · 2025-02-13T16:40:00Z

Rebase and ping - this is hopefully a little simpler now. Thanks.

davemgreen

Ping

sdesmalen-arm · 2025-02-21T09:31:49Z

llvm/test/Analysis/CostModel/AArch64/sve-div.ll

-; CHECK-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %NV16i8 = sdiv <vscale x 16 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %NV32i8 = sdiv <vscale x 32 x i8> undef, splat (i8 16)
-; CHECK-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %NV64i8 = sdiv <vscale x 64 x i8> undef, splat (i8 16)
+; CHECK-NEXT:  Cost Model: Found an estimated cost of 5 for instruction: %NV2i64 = sdiv <vscale x 2 x i64> undef, splat (i64 16)


I take it that the issue that @hassnaaHamdi reported has gone away with the latest revision?

davemgreen · 2025-02-21

Yep for the moment, due to the opt-out in llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp. They should change again in a future patch, but that should happen in a more reliable way where it applies to all vector types.

davemgreen requested review from hassnaaHamdi, sdesmalen-arm, sjoerdmeijer and david-arm January 10, 2025 15:07

llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jan 10, 2025

davemgreen mentioned this pull request Jan 10, 2025

[AArch64] Improve urem by constant costs #122236

Merged

hassnaaHamdi reviewed Jan 12, 2025

sdesmalen-arm reviewed Jan 13, 2025

davemgreen mentioned this pull request Jan 24, 2025

[AArch64][CostModel] Alter sdiv/srem cost where the divisor is constant #123552

Merged

davemgreen force-pushed the gh-a64-sveoperandinfo branch from 7321d54 to d48e165 Compare February 1, 2025 13:39

llvmbot added backend:AArch64 llvm:transforms labels Feb 1, 2025

davemgreen requested a review from topperc February 1, 2025 13:40

davemgreen requested a review from sushgokh February 4, 2025 06:35

davemgreen force-pushed the gh-a64-sveoperandinfo branch from d48e165 to 6fb39ff Compare February 13, 2025 16:37

davemgreen commented Feb 20, 2025

sdesmalen-arm approved these changes Feb 21, 2025

davemgreen merged commit b9622e8 into llvm:main Feb 21, 2025
8 checks passed

davemgreen deleted the gh-a64-sveoperandinfo branch February 21, 2025 10:41
