[VectorCombine] Fold ZExt/SExt (Shuffle (ZExt/SExt %src)) to ZExt/SExt (Shuffle %src). #141109


Open
wants to merge 4 commits into main

Conversation

Contributor

@fhahn fhahn commented May 22, 2025

Add a new combine that folds a chain of (scalar load)->ext->ext (with
shuffles/casts/inserts in between) to a single vector load and wide
extend.

This makes the IR simpler to analyze and process, while the backend can
still decide to split the wide operations back up. Chains like this
commonly come from code written with vector intrinsics. Some examples of
real-world use are in https://github.com/ARM-software/astc-encoder/.

Alive2 proof for preserving nneg: https://alive2.llvm.org/ce/z/HyfyPo
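The intended equivalence can be sanity-checked on the host, independent of LLVM. The sketch below is a standalone illustration, not the pass itself; the helper names are made up, and a little-endian byte layout is assumed to match the arm64 test target. It mirrors the first test case: zero-extending the four bytes of a loaded i32 through i16 lanes to i32 lanes gives the same lanes as a single per-byte widening, which is what the folded vector load plus wide zext computes.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Chain from the PR's first test: load i32, reinterpret as 4 x i8,
// zext each byte to i16, then zext the i16 lanes to i32.
std::array<uint32_t, 4> chainedExtend(uint32_t loaded) {
  std::array<uint8_t, 4> bytes; // low half of the bitcast to <8 x i8>
  std::memcpy(bytes.data(), &loaded, 4);
  std::array<uint16_t, 4> ext1; // zext <4 x i8> to <4 x i16>
  for (int i = 0; i < 4; ++i)
    ext1[i] = bytes[i];
  std::array<uint32_t, 4> ext2; // zext <4 x i16> to <4 x i32>
  for (int i = 0; i < 4; ++i)
    ext2[i] = ext1[i];
  return ext2;
}

// Folded form: one <4 x i8> load feeding a single wide zext to <4 x i32>.
std::array<uint32_t, 4> foldedExtend(uint32_t loaded) {
  std::array<uint8_t, 4> bytes;
  std::memcpy(bytes.data(), &loaded, 4);
  std::array<uint32_t, 4> out;
  for (int i = 0; i < 4; ++i)
    out[i] = bytes[i]; // single zext i8 -> i32 per lane
  return out;
}
```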

@llvmbot
Member

llvmbot commented May 22, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

Add a new combine that folds a chain of (scalar load)->ext->ext (with
shuffles/casts/inserts in between) to a single vector load and wide
extend.

This makes the IR simpler to analyze and process, while the backend can
still decide to split the wide operations back up. Chains like this
commonly come from code written with vector intrinsics. Some examples of
real-world use are in https://github.com/ARM-software/astc-encoder/.


Patch is 26.44 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141109.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/VectorCombine.cpp (+51)
  • (added) llvm/test/Transforms/VectorCombine/AArch64/combine-shuffle-ext.ll (+472)
diff --git a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
index fe1d930f295ce..3de60adcd4b2f 100644
--- a/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+++ b/llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -127,6 +127,7 @@ class VectorCombine {
   bool foldShuffleOfShuffles(Instruction &I);
   bool foldShuffleOfIntrinsics(Instruction &I);
   bool foldShuffleToIdentity(Instruction &I);
+  bool foldShuffleExtExtracts(Instruction &I);
   bool foldShuffleFromReductions(Instruction &I);
   bool foldCastFromReductions(Instruction &I);
   bool foldSelectShuffle(Instruction &I, bool FromReduction = false);
@@ -2777,6 +2778,55 @@ bool VectorCombine::foldShuffleToIdentity(Instruction &I) {
   return true;
 }
 
+bool VectorCombine::foldShuffleExtExtracts(Instruction &I) {
+  // Try to fold vector zero- and sign-extends split across multiple operations
+  // into a single extend, removing redundant inserts and shuffles.
+
+  // Check if we have an extended shuffle that selects the first vector, which
+  // itself is another extend fed by a load.
+  Instruction *L;
+  if (!match(
+          &I,
+          m_OneUse(m_Shuffle(
+              m_OneUse(m_ZExtOrSExt(m_OneUse(m_BitCast(m_OneUse(m_InsertElt(
+                  m_Value(), m_OneUse(m_Instruction(L)), m_SpecificInt(0))))))),
+              m_Value()))) ||
+      !cast<ShuffleVectorInst>(&I)->isIdentityWithExtract() ||
+      !isa<LoadInst>(L))
+    return false;
+  auto *InnerExt = cast<Instruction>(I.getOperand(0));
+  auto *OuterExt = dyn_cast<Instruction>(*I.user_begin());
+  if (!isa_and_nonnull<SExtInst, ZExtInst>(OuterExt))
+    return false;
+
+  // If the inner extend is a sign-extend and the outer one isn't (i.e. it is
+  // a zero-extend), don't fold. If the inner one is a zero-extend, it doesn't
+  // matter whether the outer one is a sign- or zero-extend.
+  if (isa<SExtInst>(InnerExt) && !isa<SExtInst>(OuterExt))
+    return false;
+
+  // Don't try to convert the load if it has an odd size.
+  if (!DL->typeSizeEqualsStoreSize(L->getType()))
+    return false;
+  auto *DstTy = cast<FixedVectorType>(OuterExt->getType());
+  auto *SrcTy =
+      FixedVectorType::get(InnerExt->getOperand(0)->getType()->getScalarType(),
+                           DstTy->getNumElements());
+  if (DL->getTypeStoreSize(SrcTy) != DL->getTypeStoreSize(L->getType()))
+    return false;
+
+  // Convert to a vector load feeding a single wide extend.
+  Builder.SetInsertPoint(*L->getInsertionPointAfterDef());
+  auto *NewLoad = cast<LoadInst>(
+      Builder.CreateLoad(SrcTy, L->getOperand(0), L->getName() + ".vec"));
+  auto *NewExt = isa<ZExtInst>(InnerExt) ? Builder.CreateZExt(NewLoad, DstTy)
+                                         : Builder.CreateSExt(NewLoad, DstTy);
+  OuterExt->replaceAllUsesWith(NewExt);
+  replaceValue(*OuterExt, *NewExt);
+  Worklist.pushValue(NewLoad);
+  return true;
+}
+
 /// Given a commutative reduction, the order of the input lanes does not alter
 /// the results. We can use this to remove certain shuffles feeding the
 /// reduction, removing the need to shuffle at all.
@@ -3551,6 +3601,7 @@ bool VectorCombine::run() {
         break;
       case Instruction::ShuffleVector:
         MadeChange |= widenSubvectorLoad(I);
+        MadeChange |= foldShuffleExtExtracts(I);
         break;
       default:
         break;
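The inner/outer extend compatibility check in foldShuffleExtExtracts follows a simple scalar rule: a sign-extend followed by another sign-extend collapses to one sign-extend; a zero-extend followed by either kind collapses to one zero-extend, because the intermediate value is non-negative; a sign-extend followed by a zero-extend is the one combination that cannot collapse. The sketch below (scalar stand-ins for the vector lanes, illustration only) shows the negative-value counterexample behind the rejected case.

```cpp
#include <cstdint>

// sext i8 -> i16, then sext i16 -> i32, equals a single sext i8 -> i32.
int32_t sextSext(int8_t v) { return (int32_t)(int16_t)v; }
int32_t sextOnce(int8_t v) { return (int32_t)v; }

// zext i8 -> i16, then either extend, equals a single zext i8 -> i32:
// the i16 is known non-negative, so the outer extend's kind is irrelevant.
int32_t zextThenSext(uint8_t v) { return (int32_t)(int16_t)(uint16_t)v; }
uint32_t zextOnce(uint8_t v) { return (uint32_t)v; }

// sext i8 -> i16, then zext i16 -> i32, is NOT a single extend of the i8:
// for v = -1 the inner sext yields 0xFFFF, which zext keeps as 0x0000FFFF,
// matching neither sext (0xFFFFFFFF) nor zext (0x000000FF) from i8.
uint32_t sextThenZext(int8_t v) { return (uint32_t)(uint16_t)(int16_t)v; }
```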
diff --git a/llvm/test/Transforms/VectorCombine/AArch64/combine-shuffle-ext.ll b/llvm/test/Transforms/VectorCombine/AArch64/combine-shuffle-ext.ll
new file mode 100644
index 0000000000000..6dca21471a0e9
--- /dev/null
+++ b/llvm/test/Transforms/VectorCombine/AArch64/combine-shuffle-ext.ll
@@ -0,0 +1,472 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -p vector-combine -mtriple=arm64-apple-macosx -S %s | FileCheck %s
+
+declare void @use.i32(i32)
+declare void @use.v2i32(<2 x i32>)
+declare void @use.v8i8(<8 x i8>)
+declare void @use.v8i16(<8 x i16>)
+declare void @use.v4i16(<4 x i16>)
+
+define <4 x i32> @load_i32_zext_to_v4i32(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <4 x i8>, ptr [[DI]], align 4
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext <4 x i8> [[L_VEC]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_both_nneg(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_both_nneg(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <4 x i8>, ptr [[DI]], align 4
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext <4 x i8> [[L_VEC]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext nneg <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_clobber_after_load(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_clobber_after_load(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <4 x i8>, ptr [[DI]], align 4
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext <4 x i8> [[L_VEC]] to <4 x i32>
+; CHECK-NEXT:    call void @use.i32(i32 0)
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  call void @use.i32(i32 0)
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_sext_zext_to_v4i32(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_sext_zext_to_v4i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = sext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = sext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_load_other_users(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_load_other_users(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    call void @use.i32(i32 [[L]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.i32(i32 %l)
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_ins_other_users(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_ins_other_users(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    call void @use.v2i32(<2 x i32> [[VEC_INS]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.v2i32(<2 x i32> %vec.ins)
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_bc_other_users(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_bc_other_users(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    call void @use.v8i8(<8 x i8> [[VEC_BC]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.v8i8(<8 x i8> %vec.bc)
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_ext_other_users(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_ext_other_users(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    call void @use.v8i16(<8 x i16> [[E_1]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.v8i16(<8 x i16> %e.1)
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_shuffle_other_users(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_shuffle_other_users(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    call void @use.v4i16(<4 x i16> [[VEC_SHUFFLE]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.v4i16(<4 x i16> %vec.shuffle)
+  ret <4 x i32> %ext.2
+}
+
+define <8 x i32> @load_i64_zext_to_v8i32(ptr %di) {
+; CHECK-LABEL: define <8 x i32> @load_i64_zext_to_v8i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <8 x i8>, ptr [[DI]], align 8
+; CHECK-NEXT:    [[OUTER_EXT:%.*]] = zext <8 x i8> [[L_VEC]] to <8 x i32>
+; CHECK-NEXT:    ret <8 x i32> [[OUTER_EXT]]
+;
+entry:
+  %l = load i64, ptr %di
+  %vec.ins = insertelement <2 x i64> <i64 poison, i64 0>, i64 %l, i64 0
+  %vec.bc = bitcast <2 x i64> %vec.ins to <16 x i8>
+  %ext.1 = zext <16 x i8> %vec.bc to <16 x i16>
+  %vec.shuffle = shufflevector <16 x i16> %ext.1, <16 x i16> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+  %outer.ext = zext nneg <8 x i16> %vec.shuffle to <8 x i32>
+  ret <8 x i32> %outer.ext
+}
+
+define <3 x i32> @load_i24_zext_to_v3i32(ptr %di) {
+; CHECK-LABEL: define <3 x i32> @load_i24_zext_to_v3i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <3 x i8>, ptr [[DI]], align 4
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext <3 x i8> [[L_VEC]] to <3 x i32>
+; CHECK-NEXT:    ret <3 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i24, ptr %di
+  %vec.ins = insertelement <2 x i24> <i24 poison, i24 0>, i24 %l, i64 0
+  %vec.bc = bitcast <2 x i24> %vec.ins to <6 x i8>
+  %ext.1 = zext <6 x i8> %vec.bc to <6 x i16>
+  %vec.shuffle = shufflevector <6 x i16> %ext.1, <6 x i16> poison, <3 x i32> <i32 0, i32 1, i32 2>
+  %ext.2 = zext nneg <3 x i16> %vec.shuffle to <3 x i32>
+  ret <3 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_insert_idx_1_sext(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_insert_idx_1_sext(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 0, i32 poison>, i32 [[L]], i64 1
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 0, i32 poison>, i32 %l, i64 1
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
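The negative test above hinges on the insert index: the matcher only accepts inserts into lane 0 (m_SpecificInt(0)), because with the scalar in lane 1 the low bytes of the bitcast vector come from the constant zero element, not from the load, so a plain vector load of the pointer would see different bytes. A scalar sketch of that mismatch (little-endian host assumed, hypothetical helper names):

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Bytes seen by the zext chain when the loaded i32 lands in lane 1 of
// <2 x i32> <0, poison>: the extracted low lanes come from the constant 0.
std::array<uint8_t, 4> lowBytesInsertAtLane1(uint32_t loaded) {
  uint32_t vec[2] = {0, loaded}; // insertelement ... i64 1
  std::array<uint8_t, 4> low;    // first 4 bytes of the bitcast to <8 x i8>
  std::memcpy(low.data(), vec, 4);
  return low;
}

// Bytes a direct <4 x i8> load of the same address would produce.
std::array<uint8_t, 4> directLoadBytes(uint32_t loaded) {
  std::array<uint8_t, 4> b;
  std::memcpy(b.data(), &loaded, 4);
  return b;
}
```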
+
+define <4 x i32> @mask_extracts_not_all_elements_1_sext(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @mask_extracts_not_all_elements_1_sext(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 2>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 2>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @mask_extracts_not_all_elements_2_sext(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @mask_extracts_not_all_elements_2_sext(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @mask_extracts_second_vector_sext(ptr %di, <8 x i16> %other) {
+; CHECK-LABEL: define <4 x i32> @mask_extracts_second_vector_sext(
+; CHECK-SAME: ptr [[DI:%.*]], <8 x i16> [[OTHER:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> [[OTHER]], <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> %other, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_sext_to_v4i32(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_sext_to_v4i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <4 x i8>, ptr [[DI]], align 4
+; CHECK-NEXT:    [[EXT_2:%.*]] = sext <4 x i8> [[L_VEC]] to <4 x i32>
+; CHEC...
[truncated]

@llvmbot
Member

llvmbot commented May 22, 2025

@llvm/pr-subscribers-vectorizers

+; CHECK-NEXT:    call void @use.v8i8(<8 x i8> [[VEC_BC]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.v8i8(<8 x i8> %vec.bc)
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_ext_other_users(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_ext_other_users(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    call void @use.v8i16(<8 x i16> [[E_1]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.v8i16(<8 x i16> %e.1)
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_zext_to_v4i32_shuffle_other_users(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_zext_to_v4i32_shuffle_other_users(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[E_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[E_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    call void @use.v8i16(<4 x i16> [[VEC_SHUFFLE]])
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %e.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %e.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2  = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  call void @use.v8i16(<4 x i16> %vec.shuffle)
+  ret <4 x i32> %ext.2
+}
+
+define <8 x i32> @load_i64_zext_to_v8i32(ptr %di) {
+; CHECK-LABEL: define <8 x i32> @load_i64_zext_to_v8i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <8 x i8>, ptr [[DI]], align 8
+; CHECK-NEXT:    [[OUTER_EXT:%.*]] = zext <8 x i8> [[L_VEC]] to <8 x i32>
+; CHECK-NEXT:    ret <8 x i32> [[OUTER_EXT]]
+;
+entry:
+  %l = load i64, ptr %di
+  %vec.ins = insertelement <2 x i64> <i64 poison, i64 0>, i64 %l, i64 0
+  %vec.bc = bitcast <2 x i64> %vec.ins to <16 x i8>
+  %ext.1 = zext <16 x i8> %vec.bc to <16 x i16>
+  %vec.shuffle = shufflevector <16 x i16> %ext.1, <16 x i16> poison, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 4, i32 5, i32 6, i32 7>
+  %outer.ext = zext nneg <8 x i16> %vec.shuffle to <8 x i32>
+  ret <8 x i32> %outer.ext
+}
+
+define <3 x i32> @load_i24_zext_to_v3i32(ptr %di) {
+; CHECK-LABEL: define <3 x i32> @load_i24_zext_to_v3i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <3 x i8>, ptr [[DI]], align 4
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext <3 x i8> [[L_VEC]] to <3 x i32>
+; CHECK-NEXT:    ret <3 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i24, ptr %di
+  %vec.ins = insertelement <2 x i24> <i24 poison, i24 0>, i24 %l, i64 0
+  %vec.bc = bitcast <2 x i24> %vec.ins to <6 x i8>
+  %ext.1 = zext <6 x i8> %vec.bc to <6 x i16>
+  %vec.shuffle = shufflevector <6 x i16> %ext.1, <6 x i16> poison, <3 x i32> <i32 0, i32 1, i32 2>
+  %ext.2 = zext nneg <3 x i16> %vec.shuffle to <3 x i32>
+  ret <3 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_insert_idx_1_sext(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_insert_idx_1_sext(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 0, i32 poison>, i32 [[L]], i64 1
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 0, i32 poison>, i32 %l, i64 1
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @mask_extracts_not_all_elements_1_sext(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @mask_extracts_not_all_elements_1_sext(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 2>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 2>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @mask_extracts_not_all_elements_2_sext(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @mask_extracts_not_all_elements_2_sext(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 4>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @mask_extracts_second_vector_sext(ptr %di, <8 x i16> %other) {
+; CHECK-LABEL: define <4 x i32> @mask_extracts_second_vector_sext(
+; CHECK-SAME: ptr [[DI:%.*]], <8 x i16> [[OTHER:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L:%.*]] = load i32, ptr [[DI]], align 4
+; CHECK-NEXT:    [[VEC_INS:%.*]] = insertelement <2 x i32> <i32 poison, i32 0>, i32 [[L]], i64 0
+; CHECK-NEXT:    [[VEC_BC:%.*]] = bitcast <2 x i32> [[VEC_INS]] to <8 x i8>
+; CHECK-NEXT:    [[EXT_1:%.*]] = zext <8 x i8> [[VEC_BC]] to <8 x i16>
+; CHECK-NEXT:    [[VEC_SHUFFLE:%.*]] = shufflevector <8 x i16> [[EXT_1]], <8 x i16> [[OTHER]], <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+; CHECK-NEXT:    [[EXT_2:%.*]] = zext nneg <4 x i16> [[VEC_SHUFFLE]] to <4 x i32>
+; CHECK-NEXT:    ret <4 x i32> [[EXT_2]]
+;
+entry:
+  %l = load i32, ptr %di
+  %vec.ins = insertelement <2 x i32> <i32 poison, i32 0>, i32 %l, i64 0
+  %vec.bc = bitcast <2 x i32> %vec.ins to <8 x i8>
+  %ext.1 = zext <8 x i8> %vec.bc to <8 x i16>
+  %vec.shuffle = shufflevector <8 x i16> %ext.1, <8 x i16> %other, <4 x i32> <i32 4, i32 5, i32 6, i32 7>
+  %ext.2 = zext nneg <4 x i16> %vec.shuffle to <4 x i32>
+  ret <4 x i32> %ext.2
+}
+
+define <4 x i32> @load_i32_sext_to_v4i32(ptr %di) {
+; CHECK-LABEL: define <4 x i32> @load_i32_sext_to_v4i32(
+; CHECK-SAME: ptr [[DI:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[L_VEC:%.*]] = load <4 x i8>, ptr [[DI]], align 4
+; CHECK-NEXT:    [[EXT_2:%.*]] = sext <4 x i8> [[L_VEC]] to <4 x i32>
+; CHEC...
[truncated]

!isa<LoadInst>(L))
return false;
auto *InnerExt = cast<Instruction>(I.getOperand(0));
auto *OuterExt = dyn_cast<Instruction>(*I.user_begin());
Contributor

Suggested change
auto *OuterExt = dyn_cast<Instruction>(*I.user_begin());
auto *OuterExt = cast<Instruction>(*I.user_begin());

Unchecked dyn_cast

Contributor Author

Fixed, thanks. I might be mis-remembering, but I thought there was a way to capture pointers to matched instructions; I couldn't find how to do it, so I had to resort to the casts after the fact.

Contributor

Maybe m_Instruction?

Builder.SetInsertPoint(*L->getInsertionPointAfterDef());
auto *NewLoad = cast<LoadInst>(
Builder.CreateLoad(SrcTy, L->getOperand(0), L->getName() + ".vec"));
auto *NewExt = isa<ZExtInst>(InnerExt) ? Builder.CreateZExt(NewLoad, DstTy)
Contributor

Loses nneg flag on the cast?

Contributor Author

Thanks updated to preserve, Alive2: https://alive2.llvm.org/ce/z/HyfyPo

fhahn added a commit that referenced this pull request May 23, 2025
@fhahn fhahn force-pushed the vector-combine-combine-load-ext-chain branch from e1d5787 to 5d5fe9b Compare May 23, 2025 13:09

github-actions bot commented May 23, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request May 23, 2025
!isa<LoadInst>(L))
return false;
auto *InnerExt = cast<Instruction>(I.getOperand(0));
auto *OuterExt = cast<Instruction>(*I.user_begin());
Collaborator

Can OuterExt cast ever fail?

Contributor Author

I don't think so; it is a user of an instruction, which in turn requires it to be an instruction itself.

@fhahn fhahn force-pushed the vector-combine-combine-load-ext-chain branch from 23da6b2 to 4f1c76b Compare May 27, 2025 10:46
Collaborator

@RKSimon RKSimon left a comment

No objections, my only thought is why its in VectorCombine and not InstCombine if it isn't cost/target driven?

@preames
Collaborator

preames commented Jun 3, 2025

Two high level comments:

  1. Why does this need to be a single unified transform? It really seems like the scalar-load+insert+bitcast --> vector load, and the ext+shuffle+ext are independent transforms. Is there a reason these can't be separated?

  2. All of your tests appear to use identity shuffles. Those can just be removed. Once that's done, we have two adjacent extends, which should already be handled? Either your tests need updated, or there's something off here?

sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Jun 3, 2025
fhahn added 4 commits June 4, 2025 14:42
Add a new combine that folds a chain of (scalar load)->ext->ext (with
shuffles/casts/inserts in between) to a single vector load and wide
extend.

This makes the IR simpler to analyze and to process, while the backend
can still decide to break them up. Code like that comes from code
written with vector intrinsics. Some examples of real-world use are in
https://github.com/ARM-software/astc-encoder/.
@fhahn fhahn force-pushed the vector-combine-combine-load-ext-chain branch from 4f1c76b to 7df373e Compare June 4, 2025 18:17
@fhahn fhahn changed the title [VectorCombine] Fold chain of (scalar load)->ext->ext to load->ext. [VectorCombine] Fold ZExt/SExt (Shuffle (ZExt/SExt %src)) to ZExt/SExt (Shuffle %src). Jun 4, 2025
Contributor Author

@fhahn fhahn left a comment

Two high level comments:

1. Why does this need to be a single unified transform?  It really seems like the scalar-load+insert+bitcast --> vector load, and the ext+shuffle+ext are independent transforms.  Is there a reason these can't be separated?

Thanks, they can be separated. I generalized the pattern in the patch to just fold the extends and shuffle. Folding the scalar load to a vector load should already be done in the backend. Folding the extends together makes it easier to scalarize them if needed (will share a follow-up soon).

2. All of your tests appear to use identity shuffles.  Those can just be removed.  Once that's done, we have two adjacent extends, which should already be handled?  Either your tests need updated, or there's something off here?

I may be missing something, but the shuffles are needed to extract half the vector elements. In the current version, I create a new shuffle of the narrower vector type, which is trivially foldable by instcombine for the test cases.
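A minimal IR sketch of that rewrite (hypothetical values; types follow the test cases above):

```llvm
; Before: the length-reducing shuffle sits between the two extends.
%ext.1 = zext <8 x i8> %src to <8 x i16>
%shuf  = shufflevector <8 x i16> %ext.1, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%ext.2 = zext nneg <4 x i16> %shuf to <4 x i32>

; After: the shuffle is re-created on the narrower source type and a single
; wide extend is emitted; for the test cases, the narrow shuffle extracting
; the contiguous low half is then trivially folded by instcombine.
%shuf.n = shufflevector <8 x i8> %src, <8 x i8> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%ext    = zext <4 x i8> %shuf.n to <4 x i32>
```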

Contributor Author

@fhahn fhahn left a comment

No objections, my only thought is why its in VectorCombine and not InstCombine if it isn't cost/target driven?

@RKSimon I think this should probably be limited to cases where the cost of the new extend isn't worse than the 2 separate extends. Added a check.

@preames
Collaborator

preames commented Jun 5, 2025

2. All of your tests appear to use identity shuffles.  Those can just be removed.  Once that's done, we have two adjacent extends, which should already be handled?  Either your tests need updated, or there's something off here?

I may be missing something, but the shuffles are needed to extract half the vector elements. In the current version, I create a new shuffle of the narrower vector type, which is trivially fold-able by instcombine for the test cases.

Ah, I'd missed these were length changing shuffles.

Another high level question for you. Could we phrase this as a generic shuffle placement canonicalization? If my memory serves, we generally push truncates early, and extends late in instcombine. What would you think of an analogous canonicalization for narrowing and widening shuffles? In this particular case, it would group the extends together, and then let generic extend combines merge them.
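A rough sketch of what that canonicalization would do here (hypothetical values): hoisting the narrowing shuffle above the inner extend leaves the two extends adjacent, so the existing zext-of-zext combine can merge them.

```llvm
; Original: zext -> narrowing shuffle -> zext.
%inner = zext <8 x i8> %src to <8 x i16>
%shuf  = shufflevector <8 x i16> %inner, <8 x i16> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%outer = zext <4 x i16> %shuf to <4 x i32>

; After hoisting the shuffle onto the narrow source, the two extends are
; adjacent and fold into one wide extend.
%shuf.s  = shufflevector <8 x i8> %src, <8 x i8> poison, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%inner.s = zext <4 x i8> %shuf.s to <4 x i16>
%outer.s = zext <4 x i16> %inner.s to <4 x i32>
```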

// Try to fold vector zero- and sign-extends split across multiple operations
// into a single extend.

// Check if we have ZEXT/SEXT (SHUFFLE (ZEXT/SEXT %src), _, identity-mask),
Collaborator

Code structure wise, I think you're rooting this from the wrong node. I think this should start from the outer extend, and then match the whole pattern.

// Don't perform the fold if the cost of the new extend is worse than the cost
// of the 2 original extends.
InstructionCost OriginalCost =
TTI.getCastInstrCost(InnerExt->getOpcode(), SrcTy, InnerExt->getType(),
Collaborator

It doesn't look like you have the right params for costing the original outer param - the inner params are repeated twice.
