[InstCombine][X86] Fold blendv(x,y,shuffle(bitcast(sext(m)))) -> select(shuffle(m),x,y) #96882

RKSimon · 2024-06-27T09:52:37Z

We already handle blendv(x,y,bitcast(sext(m))) -> select(m,x,y) cases, but this adds support for peeking through one-use shuffles as well. VectorCombine should already have canonicalized the IR to shuffle(bitcast(...)) for us.

The particular use case is where we have split generic 256/512-bit code to use target-specific blendv intrinsics (e.g. AVX1 spoofing AVX2 256-bit ops).

Fixes #58895

llvmbot · 2024-06-27T09:53:06Z

@llvm/pr-subscribers-backend-x86

@llvm/pr-subscribers-llvm-transforms

Author: Simon Pilgrim (RKSimon)

Changes

We already handle blendv(x,y,bitcast(sext(m))) -> select(m,x,y) cases, but this adds support for peeking through one-use shuffles as well. VectorCombine should already have canonicalized the IR to shuffle(bitcast(...)) for us.

The particular use case is where we have split generic 256/512-bit code to use target-specific blendv intrinsics (e.g. AVX1 spoofing AVX2 256-bit ops).

Fixes #58895

Patch is 60.84 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96882.diff

3 Files Affected:

(modified) llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp (+34-2)
(modified) llvm/test/Transforms/PhaseOrdering/X86/blendv-select.ll (+121-165)
(modified) llvm/test/Transforms/PhaseOrdering/X86/pr67803.ll (+20-17)

diff --git a/llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp b/llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp
index 8cf502d820e90..374511945db59 100644
--- a/llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp
+++ b/llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp
@@ -2694,6 +2694,23 @@ X86TTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
       return SelectInst::Create(NewSelector, Op1, Op0, "blendv");
     }
 
+    // Peek through a one-use shuffle - VectorCombine should have simplified
+    // this for cases where we're splitting wider vectors to use blendv
+    // intrinsics.
+    Value *MaskSrc = nullptr;
+    ArrayRef<int> ShuffleMask;
+    if (match(Mask, PatternMatch::m_OneUse(PatternMatch::m_Shuffle(
+                        PatternMatch::m_Value(MaskSrc), PatternMatch::m_Undef(),
+                        PatternMatch::m_Mask(ShuffleMask))))) {
+      // Bail if the shuffle was irregular or contains undefs.
+      int NumElts = cast<FixedVectorType>(MaskSrc->getType())->getNumElements();
+      if (NumElts < ShuffleMask.size() || !isPowerOf2_32(NumElts) ||
+          any_of(ShuffleMask,
+                 [NumElts](int M) { return M < 0 || M >= NumElts; }))
+        break;
+      Mask = MaskSrc;
+    }
+
     // Convert to a vector select if we can bypass casts and find a boolean
     // vector condition value.
     Value *BoolVec;
@@ -2703,11 +2720,26 @@ X86TTIImpl::instCombineIntrinsic(InstCombiner &IC, IntrinsicInst &II) const {
         BoolVec->getType()->getScalarSizeInBits() == 1) {
       auto *MaskTy = cast<FixedVectorType>(Mask->getType());
       auto *OpTy = cast<FixedVectorType>(II.getType());
+      unsigned NumMaskElts = MaskTy->getNumElements();
+      unsigned NumOperandElts = OpTy->getNumElements();
+
+      // If we peeked through a shuffle, reapply the shuffle to the bool vector.
+      if (MaskSrc) {
+        unsigned NumMaskSrcElts =
+            cast<FixedVectorType>(MaskSrc->getType())->getNumElements();
+        NumMaskElts = (ShuffleMask.size() * NumMaskElts) / NumMaskSrcElts;
+        // Multiple mask bits maps to the same operand element - bail out.
+        if (NumMaskElts > NumOperandElts)
+          break;
+        SmallVector<int> ScaledMask;
+        if (!llvm::scaleShuffleMaskElts(NumMaskElts, ShuffleMask, ScaledMask))
+          break;
+        BoolVec = IC.Builder.CreateShuffleVector(BoolVec, ScaledMask);
+        MaskTy = FixedVectorType::get(MaskTy->getElementType(), NumMaskElts);
+      }
       assert(MaskTy->getPrimitiveSizeInBits() ==
                  OpTy->getPrimitiveSizeInBits() &&
              "Not expecting mask and operands with different sizes");
-      unsigned NumMaskElts = MaskTy->getNumElements();
-      unsigned NumOperandElts = OpTy->getNumElements();
 
       if (NumMaskElts == NumOperandElts) {
         return SelectInst::Create(BoolVec, Op1, Op0);
diff --git a/llvm/test/Transforms/PhaseOrdering/X86/blendv-select.ll b/llvm/test/Transforms/PhaseOrdering/X86/blendv-select.ll
index bccd189d12a82..67c9c333987f6 100644
--- a/llvm/test/Transforms/PhaseOrdering/X86/blendv-select.ll
+++ b/llvm/test/Transforms/PhaseOrdering/X86/blendv-select.ll
@@ -4,7 +4,7 @@
 ; RUN: opt < %s -O3 -S -mtriple=x86_64-- -mcpu=x86-64-v4 | FileCheck %s
 
 ;
-; TODO: PR58895 - replace shuffled _mm_blendv_epi8+icmp with select+icmp
+; PR58895 - replace shuffled _mm_blendv_epi8+icmp with select+icmp
 ;
 
 ;
@@ -13,21 +13,21 @@
 
 define <4 x double> @x86_pblendvb_v4f64_v2f64(<4 x double> %a, <4 x double> %b, <4 x double> %c, <4 x double> %d) {
 ; CHECK-LABEL: @x86_pblendvb_v4f64_v2f64(
-; CHECK-NEXT:    [[A_BC:%.*]] = bitcast <4 x double> [[A:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[B_BC:%.*]] = bitcast <4 x double> [[B:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[A_LO:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[B_LO:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[A_HI:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[B_HI:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
 ; CHECK-NEXT:    [[CMP:%.*]] = fcmp olt <4 x double> [[C:%.*]], [[D:%.*]]
-; CHECK-NEXT:    [[SEXT:%.*]] = sext <4 x i1> [[CMP]] to <4 x i64>
-; CHECK-NEXT:    [[SEXT_BC:%.*]] = bitcast <4 x i64> [[SEXT]] to <32 x i8>
-; CHECK-NEXT:    [[SEXT_LO:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[SEXT_HI:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[SEL_LO:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_LO]], <16 x i8> [[B_LO]], <16 x i8> [[SEXT_LO]])
-; CHECK-NEXT:    [[SEL_HI:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_HI]], <16 x i8> [[B_HI]], <16 x i8> [[SEXT_HI]])
-; CHECK-NEXT:    [[CONCAT:%.*]] = shufflevector <16 x i8> [[SEL_LO]], <16 x i8> [[SEL_HI]], <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[RES:%.*]] = bitcast <32 x i8> [[CONCAT]] to <4 x double>
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <4 x i1> [[CMP]], <4 x i1> poison, <2 x i32> <i32 0, i32 1>
+; CHECK-NEXT:    [[TMP2:%.*]] = bitcast <4 x double> [[A:%.*]] to <4 x i64>
+; CHECK-NEXT:    [[TMP3:%.*]] = shufflevector <4 x i64> [[TMP2]], <4 x i64> poison, <2 x i32> <i32 0, i32 1>
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast <4 x double> [[B:%.*]] to <4 x i64>
+; CHECK-NEXT:    [[TMP5:%.*]] = shufflevector <4 x i64> [[TMP4]], <4 x i64> poison, <2 x i32> <i32 0, i32 1>
+; CHECK-NEXT:    [[TMP6:%.*]] = select <2 x i1> [[TMP1]], <2 x i64> [[TMP5]], <2 x i64> [[TMP3]]
+; CHECK-NEXT:    [[TMP7:%.*]] = shufflevector <4 x i1> [[CMP]], <4 x i1> poison, <2 x i32> <i32 2, i32 3>
+; CHECK-NEXT:    [[TMP8:%.*]] = bitcast <4 x double> [[A]] to <4 x i64>
+; CHECK-NEXT:    [[TMP9:%.*]] = shufflevector <4 x i64> [[TMP8]], <4 x i64> poison, <2 x i32> <i32 2, i32 3>
+; CHECK-NEXT:    [[TMP10:%.*]] = bitcast <4 x double> [[B]] to <4 x i64>
+; CHECK-NEXT:    [[TMP11:%.*]] = shufflevector <4 x i64> [[TMP10]], <4 x i64> poison, <2 x i32> <i32 2, i32 3>
+; CHECK-NEXT:    [[TMP12:%.*]] = select <2 x i1> [[TMP7]], <2 x i64> [[TMP11]], <2 x i64> [[TMP9]]
+; CHECK-NEXT:    [[TMP13:%.*]] = shufflevector <2 x i64> [[TMP6]], <2 x i64> [[TMP12]], <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[RES:%.*]] = bitcast <4 x i64> [[TMP13]] to <4 x double>
 ; CHECK-NEXT:    ret <4 x double> [[RES]]
 ;
   %a.bc = bitcast <4 x double> %a to <32 x i8>
@@ -50,21 +50,21 @@ define <4 x double> @x86_pblendvb_v4f64_v2f64(<4 x double> %a, <4 x double> %b,
 
 define <8 x float> @x86_pblendvb_v8f32_v4f32(<8 x float> %a, <8 x float> %b, <8 x float> %c, <8 x float> %d) {
 ; CHECK-LABEL: @x86_pblendvb_v8f32_v4f32(
-; CHECK-NEXT:    [[A_BC:%.*]] = bitcast <8 x float> [[A:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[B_BC:%.*]] = bitcast <8 x float> [[B:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[A_LO:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[B_LO:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[A_HI:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[B_HI:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
 ; CHECK-NEXT:    [[CMP:%.*]] = fcmp olt <8 x float> [[C:%.*]], [[D:%.*]]
-; CHECK-NEXT:    [[SEXT:%.*]] = sext <8 x i1> [[CMP]] to <8 x i32>
-; CHECK-NEXT:    [[SEXT_BC:%.*]] = bitcast <8 x i32> [[SEXT]] to <32 x i8>
-; CHECK-NEXT:    [[SEXT_LO:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[SEXT_HI:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[SEL_LO:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_LO]], <16 x i8> [[B_LO]], <16 x i8> [[SEXT_LO]])
-; CHECK-NEXT:    [[SEL_HI:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_HI]], <16 x i8> [[B_HI]], <16 x i8> [[SEXT_HI]])
-; CHECK-NEXT:    [[CONCAT:%.*]] = shufflevector <16 x i8> [[SEL_LO]], <16 x i8> [[SEL_HI]], <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[RES:%.*]] = bitcast <32 x i8> [[CONCAT]] to <8 x float>
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <8 x i1> [[CMP]], <8 x i1> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[TMP2:%.*]] = bitcast <8 x float> [[A:%.*]] to <8 x i32>
+; CHECK-NEXT:    [[TMP3:%.*]] = shufflevector <8 x i32> [[TMP2]], <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast <8 x float> [[B:%.*]] to <8 x i32>
+; CHECK-NEXT:    [[TMP5:%.*]] = shufflevector <8 x i32> [[TMP4]], <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[TMP6:%.*]] = select <4 x i1> [[TMP1]], <4 x i32> [[TMP5]], <4 x i32> [[TMP3]]
+; CHECK-NEXT:    [[TMP7:%.*]] = shufflevector <8 x i1> [[CMP]], <8 x i1> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP8:%.*]] = bitcast <8 x float> [[A]] to <8 x i32>
+; CHECK-NEXT:    [[TMP9:%.*]] = shufflevector <8 x i32> [[TMP8]], <8 x i32> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP10:%.*]] = bitcast <8 x float> [[B]] to <8 x i32>
+; CHECK-NEXT:    [[TMP11:%.*]] = shufflevector <8 x i32> [[TMP10]], <8 x i32> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP12:%.*]] = select <4 x i1> [[TMP7]], <4 x i32> [[TMP11]], <4 x i32> [[TMP9]]
+; CHECK-NEXT:    [[TMP13:%.*]] = shufflevector <4 x i32> [[TMP6]], <4 x i32> [[TMP12]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[RES:%.*]] = bitcast <8 x i32> [[TMP13]] to <8 x float>
 ; CHECK-NEXT:    ret <8 x float> [[RES]]
 ;
   %a.bc = bitcast <8 x float> %a to <32 x i8>
@@ -87,22 +87,9 @@ define <8 x float> @x86_pblendvb_v8f32_v4f32(<8 x float> %a, <8 x float> %b, <8
 
 define <4 x i64> @x86_pblendvb_v4i64_v2i64(<4 x i64> %a, <4 x i64> %b, <4 x i64> %c, <4 x i64> %d) {
 ; CHECK-LABEL: @x86_pblendvb_v4i64_v2i64(
-; CHECK-NEXT:    [[A_BC:%.*]] = bitcast <4 x i64> [[A:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[B_BC:%.*]] = bitcast <4 x i64> [[B:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[A_LO:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[B_LO:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[A_HI:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[B_HI:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt <4 x i64> [[C:%.*]], [[D:%.*]]
-; CHECK-NEXT:    [[SEXT:%.*]] = sext <4 x i1> [[CMP]] to <4 x i64>
-; CHECK-NEXT:    [[SEXT_BC:%.*]] = bitcast <4 x i64> [[SEXT]] to <32 x i8>
-; CHECK-NEXT:    [[SEXT_LO:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[SEXT_HI:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[SEL_LO:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_LO]], <16 x i8> [[B_LO]], <16 x i8> [[SEXT_LO]])
-; CHECK-NEXT:    [[SEL_HI:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_HI]], <16 x i8> [[B_HI]], <16 x i8> [[SEXT_HI]])
-; CHECK-NEXT:    [[CONCAT:%.*]] = shufflevector <16 x i8> [[SEL_LO]], <16 x i8> [[SEL_HI]], <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[RES:%.*]] = bitcast <32 x i8> [[CONCAT]] to <4 x i64>
-; CHECK-NEXT:    ret <4 x i64> [[RES]]
+; CHECK-NEXT:    [[TMP1:%.*]] = select <4 x i1> [[CMP]], <4 x i64> [[B:%.*]], <4 x i64> [[A:%.*]]
+; CHECK-NEXT:    ret <4 x i64> [[TMP1]]
 ;
   %a.bc = bitcast <4 x i64> %a to <32 x i8>
   %b.bc = bitcast <4 x i64> %b to <32 x i8>
@@ -124,23 +111,23 @@ define <4 x i64> @x86_pblendvb_v4i64_v2i64(<4 x i64> %a, <4 x i64> %b, <4 x i64>
 
 define <4 x i64> @x86_pblendvb_v8i32_v4i32(<4 x i64> %a, <4 x i64> %b, <4 x i64> %c, <4 x i64> %d) {
 ; CHECK-LABEL: @x86_pblendvb_v8i32_v4i32(
-; CHECK-NEXT:    [[A_BC:%.*]] = bitcast <4 x i64> [[A:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[B_BC:%.*]] = bitcast <4 x i64> [[B:%.*]] to <32 x i8>
 ; CHECK-NEXT:    [[C_BC:%.*]] = bitcast <4 x i64> [[C:%.*]] to <8 x i32>
 ; CHECK-NEXT:    [[D_BC:%.*]] = bitcast <4 x i64> [[D:%.*]] to <8 x i32>
-; CHECK-NEXT:    [[A_LO:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[B_LO:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[A_HI:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[B_HI:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt <8 x i32> [[C_BC]], [[D_BC]]
-; CHECK-NEXT:    [[SEXT:%.*]] = sext <8 x i1> [[CMP]] to <8 x i32>
-; CHECK-NEXT:    [[SEXT_BC:%.*]] = bitcast <8 x i32> [[SEXT]] to <32 x i8>
-; CHECK-NEXT:    [[SEXT_LO:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[SEXT_HI:%.*]] = shufflevector <32 x i8> [[SEXT_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[SEL_LO:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_LO]], <16 x i8> [[B_LO]], <16 x i8> [[SEXT_LO]])
-; CHECK-NEXT:    [[SEL_HI:%.*]] = tail call <16 x i8> @llvm.x86.sse41.pblendvb(<16 x i8> [[A_HI]], <16 x i8> [[B_HI]], <16 x i8> [[SEXT_HI]])
-; CHECK-NEXT:    [[CONCAT:%.*]] = shufflevector <16 x i8> [[SEL_LO]], <16 x i8> [[SEL_HI]], <32 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15, i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[RES:%.*]] = bitcast <32 x i8> [[CONCAT]] to <4 x i64>
+; CHECK-NEXT:    [[TMP1:%.*]] = shufflevector <8 x i1> [[CMP]], <8 x i1> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[TMP2:%.*]] = bitcast <4 x i64> [[A:%.*]] to <8 x i32>
+; CHECK-NEXT:    [[TMP3:%.*]] = shufflevector <8 x i32> [[TMP2]], <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[TMP4:%.*]] = bitcast <4 x i64> [[B:%.*]] to <8 x i32>
+; CHECK-NEXT:    [[TMP5:%.*]] = shufflevector <8 x i32> [[TMP4]], <8 x i32> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[TMP6:%.*]] = select <4 x i1> [[TMP1]], <4 x i32> [[TMP5]], <4 x i32> [[TMP3]]
+; CHECK-NEXT:    [[TMP7:%.*]] = shufflevector <8 x i1> [[CMP]], <8 x i1> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP8:%.*]] = bitcast <4 x i64> [[A]] to <8 x i32>
+; CHECK-NEXT:    [[TMP9:%.*]] = shufflevector <8 x i32> [[TMP8]], <8 x i32> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP10:%.*]] = bitcast <4 x i64> [[B]] to <8 x i32>
+; CHECK-NEXT:    [[TMP11:%.*]] = shufflevector <8 x i32> [[TMP10]], <8 x i32> poison, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[TMP12:%.*]] = select <4 x i1> [[TMP7]], <4 x i32> [[TMP11]], <4 x i32> [[TMP9]]
+; CHECK-NEXT:    [[TMP13:%.*]] = shufflevector <4 x i32> [[TMP6]], <4 x i32> [[TMP12]], <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[RES:%.*]] = bitcast <8 x i32> [[TMP13]] to <4 x i64>
 ; CHECK-NEXT:    ret <4 x i64> [[RES]]
 ;
   %a.bc = bitcast <4 x i64> %a to <32 x i8>
@@ -165,23 +152,23 @@ define <4 x i64> @x86_pblendvb_v8i32_v4i32(<4 x i64> %a, <4 x i64> %b, <4 x i64>
 
 define <4 x i64> @x86_pblendvb_v16i16_v8i16(<4 x i64> %a, <4 x i64> %b, <4 x i64> %c, <4 x i64> %d) {
 ; CHECK-LABEL: @x86_pblendvb_v16i16_v8i16(
-; CHECK-NEXT:    [[A_BC:%.*]] = bitcast <4 x i64> [[A:%.*]] to <32 x i8>
-; CHECK-NEXT:    [[B_BC:%.*]] = bitcast <4 x i64> [[B:%.*]] to <32 x i8>
 ; CHECK-NEXT:    [[C_BC:%.*]] = bitcast <4 x i64> [[C:%.*]] to <16 x i16>
 ; CHECK-NEXT:    [[D_BC:%.*]] = bitcast <4 x i64> [[D:%.*]] to <16 x i16>
-; CHECK-NEXT:    [[A_LO:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[B_LO:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7, i32 8, i32 9, i32 10, i32 11, i32 12, i32 13, i32 14, i32 15>
-; CHECK-NEXT:    [[A_HI:%.*]] = shufflevector <32 x i8> [[A_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
-; CHECK-NEXT:    [[B_HI:%.*]] = shufflevector <32 x i8> [[B_BC]], <32 x i8> poison, <16 x i32> <i32 16, i32 17, i32 18, i32 19, i32 20, i32 21, i32 22, i32 23, i32 24, i32 25, i32 26, i32 27, i32 28, i32 29, i32 30, i32 31>
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt <16 x i16> [[C_BC]], [[D_BC...
[truncated]

RKSimon · 2024-07-02T14:30:05Z

ping?

KanRobert

LGTM

goldsteinn · 2024-07-03T07:17:06Z

llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp

+    ArrayRef<int> ShuffleMask;
+    if (match(Mask, PatternMatch::m_OneUse(PatternMatch::m_Shuffle(
+                        PatternMatch::m_Value(MaskSrc), PatternMatch::m_Undef(),
+                        PatternMatch::m_Mask(ShuffleMask))))) {


nit: We should probably just start using PatternMatch namespace at the top of the function. That can be a later NFC though.

goldsteinn · 2024-07-03T07:19:35Z

llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp

+    // Peek through a one-use shuffle - VectorCombine should have simplified
+    // this for cases where we're splitting wider vectors to use blendv
+    // intrinsics.
+    Value *MaskSrc = nullptr;


Should we peekThroughBitcast on Mask here?

I'll investigate this as a followup, but I've already tried very hard on VectorCombine to make sure we simplify bitcast(shuffle(bitcast(shuffle()))) chains that we would have to peek through

goldsteinn · 2024-07-03T07:28:43Z

llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp

+      if (MaskSrc) {
+        unsigned NumMaskSrcElts =
+            cast<FixedVectorType>(MaskSrc->getType())->getNumElements();
+        NumMaskElts = (ShuffleMask.size() * NumMaskElts) / NumMaskSrcElts;


Isn't ShuffleMask.size() equal to NumMaskSrcElts?

IR can have different result and argument vector widths (this patch is mainly to help extract subvector style shuffles after all) - only DAG is fixed

goldsteinn · 2024-07-03T07:32:42Z

llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp

+            cast<FixedVectorType>(MaskSrc->getType())->getNumElements();
+        NumMaskElts = (ShuffleMask.size() * NumMaskElts) / NumMaskSrcElts;
+        // Multiple mask bits maps to the same operand element - bail out.
+        if (NumMaskElts > NumOperandElts)


isn't the scaleShuffleMaskElts check sufficient? If the shuffle is only smaller type but keep all OpTy elements together isn't that fine?

we don't handle NumMaskElts > NumOperandElts (IIRC due to endianess issues) - this is an early out so we don't bother creating an unused shuffle instruction

…ct(shuffle(m),x,y) We already handle blendv(x,y,bitcast(sext(m))) -> select(m,x,y) cases, but this adds support for peeking through one-use shuffles as well. VectorCombine should already have canonicalized the IR to shuffle(bitcast(...)) for us. The particular use case is where we have split generic 256/512-bit code to use target-specific blendv intrinsics (e.g. AVX1 spoofing AVX2 256-bit ops). Fixes llvm#58895

llvm-ci · 2024-07-03T11:41:35Z

LLVM Buildbot has detected a new failure on builder ppc64le-lld-multistage-test running on ppc64le-lld-multistage-test while building llvm at step 12 "build-stage2-unified-tree".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/168/builds/662

Here is the relevant piece of the build log for the reference:

Step 12 (build-stage2-unified-tree) failure: build (failure)
...
302.763 [905/440/4816] Building AMDGPUGenPreLegalizeGICombiner.inc...
302.798 [905/439/4817] Building CXX object lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64CallingConvention.cpp.o
302.817 [905/438/4818] Building AMDGPUGenDisassemblerTables.inc...
302.822 [905/437/4819] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaPPC.cpp.o
302.830 [905/436/4820] Building CXX object lib/Target/ARM/Disassembler/CMakeFiles/LLVMARMDisassembler.dir/ARMDisassembler.cpp.o
302.858 [904/436/4821] Linking CXX static library lib/libLLVMARMDisassembler.a
302.893 [904/435/4822] Building CXX object lib/Target/XCore/CMakeFiles/LLVMXCoreCodeGen.dir/XCoreTargetMachine.cpp.o
302.904 [903/435/4823] Building AMDGPUGenMCPseudoLowering.inc...
302.925 [903/434/4824] Linking CXX static library lib/libLLVMXCoreCodeGen.a
302.954 [903/433/4825] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstCombineIntrinsic.cpp.o
FAILED: lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstCombineIntrinsic.cpp.o 
ccache /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/install/stage1/bin/clang++ -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage2/lib/Target/X86 -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/lib/Target/X86 -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/build/stage2/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/include -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -O3 -DNDEBUG -std=c++17 -fvisibility=hidden  -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstCombineIntrinsic.cpp.o -MF lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstCombineIntrinsic.cpp.o.d -o lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86InstCombineIntrinsic.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp
/home/buildbots/llvm-external-buildbots/workers/ppc64le-lld-multistage-test/ppc64le-lld-multistage-test/llvm-project/llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp:2813:19: error: comparison of integers of different signs: 'int' and 'size_t' (aka 'unsigned long') [-Werror,-Wsign-compare]
 2813 |       if (NumElts < ShuffleMask.size() || !isPowerOf2_32(NumElts) ||
      |           ~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~
1 error generated.
302.975 [903/432/4826] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/CGException.cpp.o
303.053 [903/431/4827] Building AMDGPUGenPostLegalizeGICombiner.inc...
303.102 [903/430/4828] Building CXX object lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64StackTaggingPreRA.cpp.o
303.142 [903/429/4829] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaWasm.cpp.o
303.160 [903/428/4830] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyLowerEmscriptenEHSjLj.cpp.o
303.175 [903/427/4831] Building CXX object tools/clang/lib/Interpreter/CMakeFiles/obj.clangInterpreter.dir/IncrementalParser.cpp.o
303.181 [903/426/4832] Building CXX object lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64A57FPLoadBalancing.cpp.o
303.198 [903/425/4833] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/CheckExprLifetime.cpp.o
303.241 [903/424/4834] Building CXX object unittests/ExecutionEngine/Orc/CMakeFiles/OrcJITTests.dir/SharedMemoryMapperTest.cpp.o
303.294 [903/423/4835] Building CXX object lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64CollectLOH.cpp.o
303.333 [903/422/4836] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaAccess.cpp.o
303.432 [903/421/4837] Building CXX object tools/clang/unittests/StaticAnalyzer/CMakeFiles/StaticAnalysisTests.dir/Z3CrosscheckOracleTest.cpp.o
303.498 [903/420/4838] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86MCInstLower.cpp.o
303.517 [903/419/4839] Building CXX object unittests/ExecutionEngine/Orc/CMakeFiles/OrcJITTests.dir/RTDyldObjectLinkingLayerTest.cpp.o
303.549 [903/418/4840] Building CXX object lib/Target/WebAssembly/CMakeFiles/LLVMWebAssemblyCodeGen.dir/WebAssemblyISelLowering.cpp.o
303.559 [903/417/4841] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/CGVTables.cpp.o
303.590 [903/416/4842] Building CXX object unittests/Target/VE/CMakeFiles/VETests.dir/MachineInstrTest.cpp.o
303.618 [903/415/4843] Building CXX object lib/Target/AArch64/CMakeFiles/LLVMAArch64CodeGen.dir/AArch64ExpandPseudoInsts.cpp.o
303.698 [903/414/4844] Building CXX object tools/clang/lib/Driver/CMakeFiles/obj.clangDriver.dir/Driver.cpp.o
303.745 [903/413/4845] Building CXX object tools/clang/lib/Sema/CMakeFiles/obj.clangSema.dir/SemaModule.cpp.o
303.746 [903/412/4846] Building CXX object lib/Target/ARM/CMakeFiles/LLVMARMCodeGen.dir/ARMISelDAGToDAG.cpp.o
303.751 [903/411/4847] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/CGBlocks.cpp.o
303.753 [903/410/4848] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86PreTileConfig.cpp.o
303.797 [903/409/4849] Building CXX object tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/ItaniumCXXABI.cpp.o
303.970 [903/408/4850] Building AMDGPUGenSubtargetInfo.inc...
303.997 [903/407/4851] Building CXX object tools/clang/lib/Index/CMakeFiles/obj.clangIndex.dir/USRGeneration.cpp.o
304.000 [903/406/4852] Building CXX object lib/Target/X86/CMakeFiles/LLVMX86CodeGen.dir/X86CallingConv.cpp.o
304.027 [903/405/4853] Building RISCVGenDAGISel.inc...
304.040 [903/404/4854] Building CXX object lib/Target/ARM/CMakeFiles/LLVMARMCodeGen.dir/ARMInstructionSelector.cpp.o
304.067 [903/403/4855] Building CXX object unittests/Target/AArch64/CMakeFiles/AArch64Tests.dir/MatrixRegisterAliasing.cpp.o
304.213 [903/402/4856] Building CXX object tools/clang/lib/Serialization/CMakeFiles/obj.clangSerialization.dir/ASTWriterDecl.cpp.o
304.226 [903/401/4857] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/Stmt.cpp.o
304.286 [903/400/4858] Building CXX object tools/clang/lib/Serialization/CMakeFiles/obj.clangSerialization.dir/ASTWriterStmt.cpp.o

…ct(shuffle(m),x,y) (llvm#96882) We already handle blendv(x,y,bitcast(sext(m))) -> select(m,x,y) cases, but this adds support for peeking through one-use shuffles as well. VectorCombine should already have canonicalized the IR to shuffle(bitcast(...)) for us. The particular use case is where we have split generic 256/512-bit code to use target-specific blendv intrinsics (e.g. AVX1 spoofing AVX2 256-bit ops). Fixes llvm#58895

…NFC. Followup requested on #96882

…huffle+bitcast sequence to fold BLENDV to SELECT Mentioned on #96882

… folding BLENDV to SELECT Mentioned on #96882

…ct(shuffle(m),x,y) (llvm#96882) We already handle blendv(x,y,bitcast(sext(m))) -> select(m,x,y) cases, but this adds support for peeking through one-use shuffles as well. VectorCombine should already have canonicalized the IR to shuffle(bitcast(...)) for us. The particular use case is where we have split generic 256/512-bit code to use target-specific blendv intrinsics (e.g. AVX1 spoofing AVX2 256-bit ops). Fixes llvm#58895

…NFC. Followup requested on llvm#96882

…huffle+bitcast sequence to fold BLENDV to SELECT Mentioned on llvm#96882

… folding BLENDV to SELECT Mentioned on llvm#96882

RKSimon requested review from phoebewang, KanRobert and goldsteinn June 27, 2024 09:52

llvmbot added backend:X86 llvm:transforms labels Jun 27, 2024

RKSimon mentioned this pull request Jun 27, 2024

[VectorCombine] foldShuffleToIdentity can't handle BITCAST nodes #96884

Open

KanRobert approved these changes Jul 2, 2024

View reviewed changes

RKSimon force-pushed the x86-blendv-select branch from 822a6f7 to 65f5db4 Compare July 2, 2024 17:28

goldsteinn reviewed Jul 3, 2024

View reviewed changes

RKSimon force-pushed the x86-blendv-select branch from 65f5db4 to 0e5ff35 Compare July 3, 2024 10:54

RKSimon merged commit 7de7f50 into llvm:main Jul 3, 2024
4 of 6 checks passed

RKSimon deleted the x86-blendv-select branch July 3, 2024 11:21

RKSimon added a commit that referenced this pull request Jul 5, 2024

[InstCombine][X86] Pull out repeated uses of PatternMatch namespace. …

2bc474b

…NFC. Followup requested on #96882

RKSimon added a commit that referenced this pull request Jul 5, 2024

[InstCombine][X86] Add test showing failure to peek through bitcast+s…

0488f21

…huffle+bitcast sequence to fold BLENDV to SELECT Mentioned on #96882

RKSimon added a commit that referenced this pull request Jul 5, 2024

[InstCombine][X86] Peek through bitcast+shuffle+bitcast sequence when…

6c1c97c

… folding BLENDV to SELECT Mentioned on #96882

kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024

[InstCombine][X86] Pull out repeated uses of PatternMatch namespace. …

7f53b68

…NFC. Followup requested on llvm#96882

kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024

[InstCombine][X86] Add test showing failure to peek through bitcast+s…

08a03a2

…huffle+bitcast sequence to fold BLENDV to SELECT Mentioned on llvm#96882

kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024

[InstCombine][X86] Peek through bitcast+shuffle+bitcast sequence when…

fb53853

… folding BLENDV to SELECT Mentioned on llvm#96882

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[InstCombine][X86] Fold blendv(x,y,shuffle(bitcast(sext(m)))) -> select(shuffle(m),x,y) #96882

[InstCombine][X86] Fold blendv(x,y,shuffle(bitcast(sext(m)))) -> select(shuffle(m),x,y) #96882

Uh oh!

RKSimon commented Jun 27, 2024

Uh oh!

llvmbot commented Jun 27, 2024 •

edited

Loading

Uh oh!

RKSimon commented Jul 2, 2024

Uh oh!

KanRobert left a comment

Uh oh!

goldsteinn Jul 3, 2024

Uh oh!

goldsteinn Jul 3, 2024

Uh oh!

RKSimon Jul 3, 2024

Uh oh!

goldsteinn Jul 3, 2024

Uh oh!

RKSimon Jul 3, 2024

Uh oh!

goldsteinn Jul 3, 2024

Uh oh!

RKSimon Jul 3, 2024

Uh oh!

Uh oh!

llvm-ci commented Jul 3, 2024

Uh oh!

Uh oh!

[InstCombine][X86] Fold blendv(x,y,shuffle(bitcast(sext(m)))) -> select(shuffle(m),x,y) #96882

[InstCombine][X86] Fold blendv(x,y,shuffle(bitcast(sext(m)))) -> select(shuffle(m),x,y) #96882

Uh oh!

Conversation

RKSimon commented Jun 27, 2024

Uh oh!

llvmbot commented Jun 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

RKSimon commented Jul 2, 2024

Uh oh!

KanRobert left a comment

Choose a reason for hiding this comment

Uh oh!

goldsteinn Jul 3, 2024

Choose a reason for hiding this comment

Uh oh!

goldsteinn Jul 3, 2024

Choose a reason for hiding this comment

Uh oh!

RKSimon Jul 3, 2024

Choose a reason for hiding this comment

Uh oh!

goldsteinn Jul 3, 2024

Choose a reason for hiding this comment

Uh oh!

RKSimon Jul 3, 2024

Choose a reason for hiding this comment

Uh oh!

goldsteinn Jul 3, 2024

Choose a reason for hiding this comment

Uh oh!

RKSimon Jul 3, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

llvm-ci commented Jul 3, 2024

Uh oh!

Uh oh!

llvmbot commented Jun 27, 2024 •

edited

Loading