[InstCombine] Fold shuffles through all trivially vectorizable intrinsics #141979

lukel97 · 2025-05-29T16:49:10Z

This addresses a TODO in foldShuffledIntrinsicOperands to use isTriviallyVectorizable instead of a hardcoded list of intrinsics, which in turn allows more intrinsics to be scalarized by VectorCombine.

From what I can tell, every intrinsic here should be speculatable, so an assertion was added.

Because this enables intrinsics like abs, which have a scalar operand, we also need to check isVectorIntrinsicWithScalarOpAtArg.
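
As a rough illustration of why the fold is sound, the sketch below models a single-source shuffle and a lane-wise operation with one scalar operand in plain C++. The names shuffle and powiLanes are hypothetical stand-ins, not LLVM APIs, and poison and fast-math flags are not modelled. The point is only that the scalar operand is passed through untouched while the vector operand is unshuffled.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a single-source shufflevector: out[i] = src[mask[i]].
std::vector<long> shuffle(const std::vector<long> &src,
                          const std::vector<std::size_t> &mask) {
  std::vector<long> out;
  out.reserve(mask.size());
  for (std::size_t idx : mask) {
    assert(idx < src.size() && "mask index out of range");
    out.push_back(src[idx]);
  }
  return out;
}

// Hypothetical lane-wise operation with a scalar operand, loosely modelled on
// powi: every lane is raised to the same non-negative exponent.
std::vector<long> powiLanes(const std::vector<long> &v, unsigned exp) {
  std::vector<long> out;
  out.reserve(v.size());
  for (long x : v) {
    long r = 1;
    for (unsigned i = 0; i < exp; ++i)
      r *= x;
    out.push_back(r);
  }
  return out;
}

// Because the operation acts on each lane independently and the scalar
// operand is the same for every lane, shuffling before or after the
// operation yields the same result.
bool foldIsEquivalent(const std::vector<long> &x,
                      const std::vector<std::size_t> &mask, unsigned exp) {
  return powiLanes(shuffle(x, mask), exp) == shuffle(powiLanes(x, exp), mask);
}
```

For x = {2, 3}, mask = {1, 0} and exponent 3, both orders give {27, 8} in this model.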

llvmbot · 2025-05-29T16:49:45Z

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/141979.diff

4 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (+22-19)
(modified) llvm/test/Transforms/InstCombine/abs-1.ll (+11)
(modified) llvm/test/Transforms/InstCombine/fma.ll (+13)
(modified) llvm/test/Transforms/InstCombine/sqrt.ll (+11)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
index e101edf4a6208..5eb466f6c6df1 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -1401,26 +1401,25 @@ static Instruction *factorizeMinMaxTree(IntrinsicInst *II) {
 /// try to shuffle after the intrinsic.
 Instruction *
 InstCombinerImpl::foldShuffledIntrinsicOperands(IntrinsicInst *II) {
-  // TODO: This should be extended to handle other intrinsics like fshl, ctpop,
-  //       etc. Use llvm::isTriviallyVectorizable() and related to determine
-  //       which intrinsics are safe to shuffle?
-  switch (II->getIntrinsicID()) {
-  case Intrinsic::smax:
-  case Intrinsic::smin:
-  case Intrinsic::umax:
-  case Intrinsic::umin:
-  case Intrinsic::fma:
-  case Intrinsic::fshl:
-  case Intrinsic::fshr:
-    break;
-  default:
+  if (!isTriviallyVectorizable(II->getIntrinsicID()))
+    return nullptr;
+
+  assert(isSafeToSpeculativelyExecute(II) &&
+         "Trivially vectorizable but not safe to speculatively execute?");
+
+  // fabs is canonicalized to fabs (shuffle ...) in foldShuffleOfUnaryOps, so
+  // avoid undoing it.
+  if (match(II, m_FAbs(m_Value())))
     return nullptr;
-  }
 
   Value *X;
   Constant *C;
   ArrayRef<int> Mask;
-  auto *NonConstArg = find_if_not(II->args(), IsaPred<Constant>);
+  auto *NonConstArg = find_if_not(II->args(), [&II](Use &Arg) {
+    return isa<Constant>(Arg.get()) ||
+           isVectorIntrinsicWithScalarOpAtArg(II->getIntrinsicID(),
+                                              Arg.getOperandNo(), nullptr);
+  });
   if (!NonConstArg ||
       !match(NonConstArg, m_Shuffle(m_Value(X), m_Poison(), m_Mask(Mask))))
     return nullptr;
@@ -1432,11 +1431,15 @@ InstCombinerImpl::foldShuffledIntrinsicOperands(IntrinsicInst *II) {
   // See if all arguments are shuffled with the same mask.
   SmallVector<Value *, 4> NewArgs;
   Type *SrcTy = X->getType();
-  for (Value *Arg : II->args()) {
-    if (match(Arg, m_Shuffle(m_Value(X), m_Poison(), m_SpecificMask(Mask))) &&
-        X->getType() == SrcTy)
+  for (Use &Arg : II->args()) {
+    if (isVectorIntrinsicWithScalarOpAtArg(II->getIntrinsicID(),
+                                           Arg.getOperandNo(), nullptr))
+      NewArgs.push_back(Arg);
+    else if (match(&Arg,
+                   m_Shuffle(m_Value(X), m_Poison(), m_SpecificMask(Mask))) &&
+             X->getType() == SrcTy)
       NewArgs.push_back(X);
-    else if (match(Arg, m_ImmConstant(C))) {
+    else if (match(&Arg, m_ImmConstant(C))) {
       // If it's a constant, try find the constant that would be shuffled to C.
       if (Constant *ShuffledC =
               unshuffleConstant(Mask, C, cast<VectorType>(SrcTy)))
diff --git a/llvm/test/Transforms/InstCombine/abs-1.ll b/llvm/test/Transforms/InstCombine/abs-1.ll
index 7037647d116ba..fd67fc3421498 100644
--- a/llvm/test/Transforms/InstCombine/abs-1.ll
+++ b/llvm/test/Transforms/InstCombine/abs-1.ll
@@ -978,3 +978,14 @@ define i32 @abs_diff_signed_slt_no_nsw_swap(i32 %a, i32 %b) {
   %cond = select i1 %cmp, i32 %sub_ba, i32 %sub_ab
   ret i32 %cond
 }
+
+define <2 x i32> @abs_unary_shuffle_ops(<2 x i32> %x) {
+; CHECK-LABEL: @abs_unary_shuffle_ops(
+; CHECK-NEXT:    [[R2:%.*]] = call <2 x i32> @llvm.abs.v2i32(<2 x i32> [[R1:%.*]], i1 false)
+; CHECK-NEXT:    [[R:%.*]] = shufflevector <2 x i32> [[R2]], <2 x i32> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    ret <2 x i32> [[R]]
+;
+  %a = shufflevector <2 x i32> %x, <2 x i32> poison, <2 x i32> <i32 1, i32 0>
+  %r = call <2 x i32> @llvm.abs(<2 x i32> %a, i1 false)
+  ret <2 x i32> %r
+}
diff --git a/llvm/test/Transforms/InstCombine/fma.ll b/llvm/test/Transforms/InstCombine/fma.ll
index f0d4f776a5d90..e3d3e722bcc23 100644
--- a/llvm/test/Transforms/InstCombine/fma.ll
+++ b/llvm/test/Transforms/InstCombine/fma.ll
@@ -972,6 +972,19 @@ define <2 x half> @fma_negone_vec_partial_undef(<2 x half> %x, <2 x half> %y) {
   ret <2 x half> %sub
 }
 
+define <2 x float> @fmuladd_unary_shuffle_ops(<2 x float> %x, <2 x float> %y, <2 x float> %z) {
+; CHECK-LABEL: @fmuladd_unary_shuffle_ops(
+; CHECK-NEXT:    [[R:%.*]] = call <2 x float> @llvm.fmuladd.v2f32(<2 x float> [[A:%.*]], <2 x float> [[B:%.*]], <2 x float> [[C:%.*]])
+; CHECK-NEXT:    [[R1:%.*]] = shufflevector <2 x float> [[R]], <2 x float> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    ret <2 x float> [[R1]]
+;
+  %a = shufflevector <2 x float> %x, <2 x float> poison, <2 x i32> <i32 1, i32 0>
+  %b = shufflevector <2 x float> %y, <2 x float> poison, <2 x i32> <i32 1, i32 0>
+  %c = shufflevector <2 x float> %z, <2 x float> poison, <2 x i32> <i32 1, i32 0>
+  %r = call <2 x float> @llvm.fmuladd(<2 x float> %a, <2 x float> %b, <2 x float> %c)
+  ret <2 x float> %r
+}
+
 ; negative tests
 
 define half @fma_non_negone(half %x, half %y) {
diff --git a/llvm/test/Transforms/InstCombine/sqrt.ll b/llvm/test/Transforms/InstCombine/sqrt.ll
index 0f4db3b3a65ae..2fda5bc37d023 100644
--- a/llvm/test/Transforms/InstCombine/sqrt.ll
+++ b/llvm/test/Transforms/InstCombine/sqrt.ll
@@ -201,6 +201,17 @@ define <2 x float> @sqrt_exp_vec(<2 x float> %x) {
   ret <2 x float> %res
 }
 
+define <2 x float> @sqrt_unary_shuffle_ops(<2 x float> %x) {
+; CHECK-LABEL: @sqrt_unary_shuffle_ops(
+; CHECK-NEXT:    [[R:%.*]] = call <2 x float> @llvm.sqrt.v2f32(<2 x float> [[A:%.*]])
+; CHECK-NEXT:    [[R1:%.*]] = shufflevector <2 x float> [[R]], <2 x float> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    ret <2 x float> [[R1]]
+;
+  %a = shufflevector <2 x float> %x, <2 x float> poison, <2 x i32> <i32 1, i32 0>
+  %r = call <2 x float> @llvm.sqrt(<2 x float> %a)
+  ret <2 x float> %r
+}
+
 declare i32 @foo(double)
 declare double @sqrt(double) readnone
 declare float @sqrtf(float)

dtcxzyw · 2025-05-30T10:14:00Z

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

+  if (!isTriviallyVectorizable(II->getIntrinsicID()))
+    return nullptr;
+
+  assert(isSafeToSpeculativelyExecute(II) &&


If the call base has noundef on the return value, it is not safe to be speculatively executed.

It is safe in the sense of isSafeToSpeculativelyExecute though, right? We create a new call without the attribute.

I mean this assertion is not needed.

Should this be turned into just a check then? I could imagine at some point we might want to mark e.g. udiv.fix as trivially vectorizable even if it's not speculatable. In which case we would not be able to perform this combine.

Can we just check Callee->isSpeculatable()?

What about noundef etc.? isSafeToSpeculativelyExecute also checks hasUBImplyingAttrs.

EDIT: I see that IgnoreUBImplyingAttrs is actually true by default.

Since we ultimately create a new intrinsic call rather than reusing or modifying the original one in place, noundef attributes can be ignored.

nikic · 2025-05-30T11:07:10Z

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

@@ -1432,11 +1431,15 @@ InstCombinerImpl::foldShuffledIntrinsicOperands(IntrinsicInst *II) {
  // See if all arguments are shuffled with the same mask.


Pre-existing, but the one use check above should be about the shuffle operands only, right? It doesn't help us if a constant argument is one-use (which it always is).

Good point. I went poking around, though, and it looks like constants don't return true for hasOneUse(). They seem to show up as having 0 uses, so it happens to work today by coincidence.

Will fix anyway since I think with this patch we also need to worry about non-constant scalar args.
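
To make the concern concrete, here is a minimal sketch of the two variants of the one-use check, using a simplified stand-in for operands. Operand, OperandKind and both function names are hypothetical and exist only for this illustration; the real code works on llvm::Value and pattern matchers.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for an intrinsic operand.
enum class OperandKind { Shuffle, Constant, Scalar };

struct Operand {
  OperandKind kind;
  unsigned numUses;
};

// Broad check: any operand with exactly one use satisfies it, including
// operands that the fold never removes.
bool anyOperandHasOneUse(const std::vector<Operand> &ops) {
  for (const Operand &op : ops)
    if (op.numUses == 1)
      return true;
  return false;
}

// Narrow check: only shuffle operands count, since only a one-use shuffle
// is actually eliminated by moving the shuffle after the intrinsic.
bool anyShuffleHasOneUse(const std::vector<Operand> &ops) {
  for (const Operand &op : ops)
    if (op.kind == OperandKind::Shuffle && op.numUses == 1)
      return true;
  return false;
}
```

With a shuffle that has two uses and a non-constant scalar argument that has one use, the broad check passes while the narrow check rejects the fold, which is the situation described above.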

nikic · 2025-05-30T11:07:49Z

llvm/test/Transforms/InstCombine/abs-1.ll

+; CHECK-NEXT:    [[R:%.*]] = shufflevector <2 x i32> [[R2]], <2 x i32> poison, <2 x i32> <i32 1, i32 0>
+; CHECK-NEXT:    ret <2 x i32> [[R]]
+;
+  %a = shufflevector <2 x i32> %x, <2 x i32> poison, <2 x i32> <i32 1, i32 0>


For example, what happens if you add an extra use to this shufflevector? I think it will fold thanks to the i1 false argument.

dtcxzyw · 2025-05-30T11:18:13Z

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

+  if (!isTriviallyVectorizable(II->getIntrinsicID()))
+    return nullptr;
+
+  assert(isSafeToSpeculativelyExecute(II) &&


I mean this assertion is not needed.

dtcxzyw · 2025-05-30T11:21:27Z

llvm/test/Transforms/SLPVectorizer/AMDGPU/add_sub_sat-inseltpoison.ll

-; GFX8-NEXT:    [[TMP0:%.*]] = shufflevector <3 x i16> [[ARG0]], <3 x i16> poison, <2 x i32> <i32 0, i32 1>
-; GFX8-NEXT:    [[TMP1:%.*]] = shufflevector <3 x i16> [[ARG1]], <3 x i16> poison, <2 x i32> <i32 0, i32 1>
-; GFX8-NEXT:    [[TMP2:%.*]] = call <2 x i16> @llvm.uadd.sat.v2i16(<2 x i16> [[TMP0]], <2 x i16> [[TMP1]])
+; GFX8-NEXT:    [[TMP3:%.*]] = call <3 x i16> @llvm.uadd.sat.v3i16(<3 x i16> [[ARG0]], <3 x i16> [[ARG1]])


This fold changes the vector length of intrinsics.

Yeah it's a bit strange, but I guess that was the existing behaviour. Here's a test case in test/InstCombine/fma.ll:

define <2 x float> @fma_unary_shuffle_ops_narrowing(<3 x float> %x, <3 x float> %y, <3 x float> %z) {
; CHECK-LABEL: @fma_unary_shuffle_ops_narrowing(
; CHECK-NEXT:    [[B:%.*]] = shufflevector <3 x float> [[Y:%.*]], <3 x float> poison, <2 x i32> <i32 1, i32 0>
; CHECK-NEXT:    call void @use_vec(<2 x float> [[B]])
; CHECK-NEXT:    [[TMP1:%.*]] = call nnan nsz <3 x float> @llvm.fma.v3f32(<3 x float> [[X:%.*]], <3 x float> [[Y]], <3 x float> [[Z:%.*]])
; CHECK-NEXT:    [[R:%.*]] = shufflevector <3 x float> [[TMP1]], <3 x float> poison, <2 x i32> <i32 1, i32 0>
; CHECK-NEXT:    ret <2 x float> [[R]]
;
  %a = shufflevector <3 x float> %x, <3 x float> poison, <2 x i32> <i32 1, i32 0>
  %b = shufflevector <3 x float> %y, <3 x float> poison, <2 x i32> <i32 1, i32 0>
  call void @use_vec(<2 x float> %b)
  %c = shufflevector <3 x float> %z, <3 x float> poison, <2 x i32> <i32 1, i32 0>
  %r = call nnan nsz <2 x float> @llvm.fma.v2f32(<2 x float> %a, <2 x float> %b, <2 x float> %c)
  ret <2 x float> %r
}

I'd like to avoid this fold when shufflevectors change the type (as a follow-up). cc @RKSimon

Should still allow the case where we combine into a smaller type, e.g.

define <3 x float> @fma_unary_shuffle_ops_widening(<2 x float> %x, <2 x float> %y, <2 x float> %z) {
; CHECK-LABEL: @fma_unary_shuffle_ops_widening(
; CHECK-NEXT:    [[A:%.*]] = shufflevector <2 x float> [[X:%.*]], <2 x float> poison, <3 x i32> <i32 1, i32 0, i32 1>
; CHECK-NEXT:    call void @use_vec3(<3 x float> [[A]])
; CHECK-NEXT:    [[TMP1:%.*]] = call fast <2 x float> @llvm.fma.v2f32(<2 x float> [[X]], <2 x float> [[Y:%.*]], <2 x float> [[Z:%.*]])
; CHECK-NEXT:    [[R:%.*]] = shufflevector <2 x float> [[TMP1]], <2 x float> poison, <3 x i32> <i32 1, i32 0, i32 1>
; CHECK-NEXT:    ret <3 x float> [[R]]
;
  %a = shufflevector <2 x float> %x, <2 x float> poison, <3 x i32> <i32 1, i32 0, i32 1>
  call void @use_vec3(<3 x float> %a)
  %b = shufflevector <2 x float> %y, <2 x float> poison, <3 x i32> <i32 1, i32 0, i32 1>
  %c = shufflevector <2 x float> %z, <2 x float> poison, <3 x i32> <i32 1, i32 0, i32 1>
  %r = call fast <3 x float> @llvm.fma.v3f32(<3 x float> %a, <3 x float> %b, <3 x float> %c)
  ret <3 x float> %r
}

I agree either way; this would make it more consistent with the binop combine in InstCombinerImpl::foldVectorBinop, which currently only allows the same type for two non-constant operands. For a constant operand, it looks like narrowing is allowed.
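
The follow-up suggested in this thread is not part of this patch, but its intent can be sketched as a small predicate. Everything below is hypothetical: the names are invented for illustration, and the policy simply encodes "avoid the fold when it would make the intrinsic operate on more lanes, but still allow it when the intrinsic ends up on a smaller type".

```cpp
#include <cstddef>

// Hypothetical classification of a single-source shuffle by comparing the
// number of source lanes with the number of mask (result) lanes. The names
// follow the fma.ll tests quoted above.
enum class ShuffleShape { SameLength, Narrowing, Widening };

ShuffleShape classifyShuffle(std::size_t srcLanes, std::size_t maskLanes) {
  if (maskLanes < srcLanes)
    return ShuffleShape::Narrowing; // e.g. <3 x float> -> <2 x float>
  if (maskLanes > srcLanes)
    return ShuffleShape::Widening; // e.g. <2 x float> -> <3 x float>
  return ShuffleShape::SameLength;
}

// Hypothetical policy: after the fold the intrinsic runs on srcLanes instead
// of maskLanes, so only fold when that does not increase the lane count.
bool wouldFoldUnderProposedPolicy(std::size_t srcLanes, std::size_t maskLanes) {
  return classifyShuffle(srcLanes, maskLanes) != ShuffleShape::Narrowing;
}
```

Under this assumed policy the fma_unary_shuffle_ops_narrowing case (3 source lanes, 2 result lanes) would no longer fold, while fma_unary_shuffle_ops_widening (2 source lanes, 3 result lanes) still would.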

dtcxzyw · 2025-05-30T13:05:08Z

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

@@ -1450,7 +1454,7 @@ InstCombinerImpl::foldShuffledIntrinsicOperands(IntrinsicInst *II) {
  // intrinsic (shuf X, M), (shuf Y, M), ... --> shuf (intrinsic X, Y, ...), M
  Instruction *FPI = isa<FPMathOperator>(II) ? II : nullptr;
  Value *NewIntrinsic =
-      Builder.CreateIntrinsic(II->getIntrinsicID(), SrcTy, NewArgs, FPI);
+      Builder.CreateIntrinsic(SrcTy, II->getIntrinsicID(), NewArgs, FPI);


Crash reproducer:

; bin/opt -passes=instcombine test.ll -S
define <3 x i4> @smin_unary_shuffle_ops_uses_const(<3 x i8> %x, <3 x i8> %y) {
  %sx = shufflevector <3 x i8> %x, <3 x i8> poison, <3 x i32> <i32 1, i32 0, i32 2>
  %sy = shufflevector <3 x i8> %y, <3 x i8> poison, <3 x i32> <i32 1, i32 0, i32 2>
  %r = call <3 x i4> @llvm.scmp.v3i4.v3i8(<3 x i8> %sx, <3 x i8> %sy)
  ret <3 x i4> %r
}

opt: /home/dtcxzyw/WorkSpace/Projects/compilers/llvm-project/llvm/lib/IR/Value.cpp:519: void llvm::Value::doRAUW(llvm::Value*, ReplaceMetadataUses): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: bin/opt -passes=instcombine test.ll -S
1. Running pass "function(instcombine<max-iterations=1;verify-fixpoint>)" on module "test.ll"
2. Running pass "instcombine<max-iterations=1;verify-fixpoint>" on function "smin_unary_shuffle_ops_uses_const"
#0 0x00007e0a1d627ab2 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/libLLVMSupport.so.21.0git+0x227ab2)
#1 0x00007e0a1d62498f llvm::sys::RunSignalHandlers() (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/libLLVMSupport.so.21.0git+0x22498f)
#2 0x00007e0a1d624ad4 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
#3 0x00007e0a1d045330 (/lib/x86_64-linux-gnu/libc.so.6+0x45330)
#4 0x00007e0a1d09eb2c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
#5 0x00007e0a1d09eb2c __pthread_kill_internal ./nptl/pthread_kill.c:78:10
#6 0x00007e0a1d09eb2c pthread_kill ./nptl/pthread_kill.c:89:10
#7 0x00007e0a1d04527e raise ./signal/../sysdeps/posix/raise.c:27:6
#8 0x00007e0a1d0288ff abort ./stdlib/abort.c:81:7
#9 0x00007e0a1d02881b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#10 0x00007e0a1d03b517 (/lib/x86_64-linux-gnu/libc.so.6+0x3b517)
#11 0x00007e0a14363cd2 (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMCore.so.21.0git+0x363cd2)
#12 0x00007e0a1546e812 llvm::InstCombinerImpl::run() (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMInstCombine.so.21.0git+0x67812)
#13 0x00007e0a1546f8b3 combineInstructionsOverFunction(llvm::Function&, llvm::InstructionWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::BranchProbabilityInfo*, llvm::ProfileSummaryInfo*, llvm::InstCombineOptions const&) InstructionCombining.cpp:0:0
#14 0x00007e0a154708c2 llvm::InstCombinePass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMInstCombine.so.21.0git+0x698c2)
#15 0x00007e0a179ac525 llvm::detail::PassModel<llvm::Function, llvm::InstCombinePass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libPolly.so.21.0git+0x1ac525)
#16 0x00007e0a143273f4 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMCore.so.21.0git+0x3273f4)
#17 0x00007e0a1c2db975 llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMX86CodeGen.so.21.0git+0xdb975)
#18 0x00007e0a14327910 llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMCore.so.21.0git+0x327910)
#19 0x00007e0a1c2dc335 llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMX86CodeGen.so.21.0git+0xdc335)
#20 0x00007e0a14328a95 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/../lib/libLLVMCore.so.21.0git+0x328a95)
#21 0x00007e0a1d8572e9 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/libLLVMOptDriver.so.21.0git+0x2c2e9)
#22 0x00007e0a1d862306 optMain (/home/dtcxzyw/WorkSpace/Projects/compilers/LLVM/llvm-build/bin/../lib/libLLVMOptDriver.so.21.0git+0x37306)
#23 0x00007e0a1d02a1ca __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#24 0x00007e0a1d02a28b call_init ./csu/../csu/libc-start.c:128:20
#25 0x00007e0a1d02a28b __libc_start_main ./csu/../csu/libc-start.c:347:5
#26 0x000062d8062ed095 _start (bin/opt+0x1095)
Aborted (core dumped)

Thanks, fixed in eae67be

Fixes crashes with scmp where the element types aren't necessarily the same.
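
The shape of that fix can be sketched with a toy type model: for intrinsics such as scmp, the result element type (i4 in the reproducer) differs from the operand element type (i8), so the new call's return type has to combine the original return element type with the lane count of the unshuffled source. VecTy and newIntrinsicReturnType are hypothetical names used only for this illustration, not LLVM APIs.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a fixed-width vector type.
struct VecTy {
  std::string elem; // element type name, e.g. "i8"
  unsigned lanes;   // number of elements
};

// Element type comes from the original call's return type; the lane count
// comes from the unshuffled source operand that the new call will consume.
VecTy newIntrinsicReturnType(const VecTy &origRet, const VecTy &srcTy) {
  assert(srcTy.lanes > 0 && "source vector must have at least one lane");
  return VecTy{origRet.elem, srcTy.lanes};
}
```

For the reproducer above, an original return type of <3 x i4> and a source type of <3 x i8> give <3 x i4> in this model, rather than the <3 x i8> that reusing the source type directly would produce.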

lukel97 added 2 commits May 29, 2025 17:44

Precommit tests

46d23a7

lukel97 requested a review from dtcxzyw May 29, 2025 16:49

lukel97 requested a review from nikic as a code owner May 29, 2025 16:49

llvmbot added llvm:instcombine llvm:transforms labels May 29, 2025

Update SLP tests

3ab1864

llvmbot added the backend:AMDGPU label May 29, 2025

dtcxzyw mentioned this pull request May 30, 2025

Fuzz PR141979 dtcxzyw/llvm-fuzz-service#75

Closed

dtcxzyw reviewed May 30, 2025

nikic reviewed May 30, 2025

dtcxzyw reviewed May 30, 2025

lukel97 added 3 commits May 30, 2025 13:31

Use return type based CreateIntrinsic overload to fix crash with powi

8cf0455

Previously this used the CreateIntrinsic overload that takes the overloaded types, passing a singleton ArrayRef, but powi has multiple overloaded types.

Fix hasOneUse check to only consider shuffles

7bcb1c9

Make isSafeToSpeculativelyExecute assertion a isSpeculatable check

bfaacf7

dtcxzyw reviewed May 30, 2025

Use original return type's element type for return type

eae67be

Fixes crashes with scmp where the element types aren't necessarily the same.
