[SLP] Make getSameOpcode support interchangeable instructions. #133888

HanKuanChen · 2025-04-01T09:27:49Z

We use the term "interchangeable instructions" to refer to different
operators that have the same meaning (e.g., add x, 0 is equivalent to
mul x, 1).
Non-constant values are not supported, as they may incur high costs with
little benefit.

Co-authored-by: Alexey Bataev a.bataev@gmx.com

…132887) We use the term "interchangeable instructions" to refer to different operators that have the same meaning (e.g., `add x, 0` is equivalent to `mul x, 1`). Non-constant values are not supported, as they may incur high costs with little benefit. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>

…eInstruction

github-actions · 2025-04-01T09:30:17Z

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:

git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' 'HEAD~1' HEAD llvm/test/Transforms/SLPVectorizer/X86/BinOpSameOpcodeHelper.ll llvm/test/Transforms/SLPVectorizer/isOpcodeOrAlt.ll llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp llvm/test/Transforms/SLPVectorizer/AArch64/vec3-base.ll llvm/test/Transforms/SLPVectorizer/RISCV/reversed-strided-node-with-external-ptr.ll llvm/test/Transforms/SLPVectorizer/RISCV/vec3-base.ll llvm/test/Transforms/SLPVectorizer/X86/barriercall.ll llvm/test/Transforms/SLPVectorizer/X86/bottom-to-top-reorder.ll llvm/test/Transforms/SLPVectorizer/X86/buildvector-postpone-for-dependency.ll llvm/test/Transforms/SLPVectorizer/X86/bv-shuffle-mask.ll llvm/test/Transforms/SLPVectorizer/X86/extract-scalar-from-undef.ll llvm/test/Transforms/SLPVectorizer/X86/extractcost.ll llvm/test/Transforms/SLPVectorizer/X86/gathered-delayed-nodes-with-reused-user.ll llvm/test/Transforms/SLPVectorizer/X86/minbitwidth-drop-wrapping-flags.ll llvm/test/Transforms/SLPVectorizer/X86/multi-extracts-bv-combined.ll llvm/test/Transforms/SLPVectorizer/X86/non-scheduled-inst-reused-as-last-inst.ll llvm/test/Transforms/SLPVectorizer/X86/propagate_ir_flags.ll llvm/test/Transforms/SLPVectorizer/X86/reduced-val-vectorized-in-transform.ll llvm/test/Transforms/SLPVectorizer/X86/reorder_diamond_match.ll llvm/test/Transforms/SLPVectorizer/X86/shuffle-mask-emission.ll llvm/test/Transforms/SLPVectorizer/X86/vec3-base.ll llvm/test/Transforms/SLPVectorizer/X86/vect_copyable_in_binops.ll llvm/test/Transforms/SLPVectorizer/alternate-opcode-sindle-bv.ll llvm/test/Transforms/SLPVectorizer/resized-alt-shuffle-after-minbw.ll llvm/test/Transforms/SLPVectorizer/shuffle-mask-resized.ll

The following files introduce new uses of undef:

llvm/test/Transforms/SLPVectorizer/X86/extract-scalar-from-undef.ll

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

HanKuanChen · 2025-04-01T09:33:15Z

I test the following libraries.

llvm-test-suite: O3 + -mavx2 -mf16c, O3 + -mavx10.2-512
ffmpeg: O3 + -mavx2 -mf16c, O3 + -mavx10.2-512
libjpeg-turbo: O3 + -mavx2 -mf16c, O3 + -mavx10.2-512
eigen: O3 + -mavx, O3 + -mavx512f -mfma

gbossu

I brushed through the code, looks sensible. But I don't know too much about the SLPVectorizer yet, so do what you want with my comments 😉

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

HanKuanChen · 2025-04-07T10:18:33Z

ping

…-instruction

alexey-bataev · 2025-04-10T13:20:33Z

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

+static bool allSameOpcode(ArrayRef<Value *> VL) {
+  auto *It = find_if(VL, IsaPred<Instruction>);
+  if (It == VL.end())
+    return true;
+  Instruction *MainOp = cast<Instruction>(*It);
+  unsigned Opcode = MainOp->getOpcode();
+  bool IsCmpOp = isa<CmpInst>(MainOp);
+  CmpInst::Predicate BasePred = IsCmpOp ? cast<CmpInst>(MainOp)->getPredicate()
+                                        : CmpInst::BAD_ICMP_PREDICATE;
+  return std::all_of(It, VL.end(), [&](Value *V) {
+    if (auto *CI = dyn_cast<CmpInst>(V))
+      return BasePred == CI->getPredicate();
+    if (auto *I = dyn_cast<Instruction>(V))
+      return I->getOpcode() == Opcode;
+    return isa<PoisonValue>(V);
+  });
+}
+


We have a function isAlternateInstruction, can we use it here?

Cannot.
isAlternateInstruction consider interchangeable instruction.
allSameOpcode does not consider interchangeable instruction.

Can you add a parameter to isAlternateInstruction to skip interchangeable instructions to avoid code duplication?

Cannot implement allSameOpcode by isAlternateInstruction.

The MainOp and AltOp of isAlternateInstruction come from getSameOpcode, but getSameOpcode already takes interchangeable instructions into account.

However, the current allSameOpcode "refinds" MainOp and AltOp without considering alternate or interchangeable instructions.

mikaelholmen · 2025-04-15T07:49:03Z

@HanKuanChen and @alexey-bataev :
The following starts crashing with this patch:

opt -passes=slp-vectorizer bbi-106161.ll -o /dev/null

Result:

opt: ../include/llvm/ADT/APInt.h:1331: void llvm::APInt::setBit(unsigned int): Assertion `BitPosition < BitWidth && "BitPosition out of range"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: build-all/bin/opt -passes=slp-vectorizer bbi-106161.ll -o /dev/null
1.	Running pass "function(slp-vectorizer)" on module "bbi-106161.ll"
2.	Running pass "slp-vectorizer" on function "f_768_3162"
 #0 0x0000560032c2ed06 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build-all/bin/opt+0x45e9d06)
 #1 0x0000560032c2c74e llvm::sys::RunSignalHandlers() (build-all/bin/opt+0x45e774e)
 #2 0x0000560032c2f589 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x00007f9d0a3c8d10 __restore_rt (/lib64/libpthread.so.0+0x12d10)
 #4 0x00007f9d07d6852f raise (/lib64/libc.so.6+0x4e52f)
 #5 0x00007f9d07d3be65 abort (/lib64/libc.so.6+0x21e65)
 #6 0x00007f9d07d3bd39 _nl_load_domain.cold.0 (/lib64/libc.so.6+0x21d39)
 #7 0x00007f9d07d60e86 (/lib64/libc.so.6+0x46e86)
 #8 0x000056003469d911 llvm::slpvectorizer::BoUpSLP::VLOperands::VLOperands(llvm::ArrayRef<llvm::Value*>, (anonymous namespace)::InstructionsState const&, llvm::slpvectorizer::BoUpSLP const&) SLPVectorizer.cpp:0:0
 #9 0x000056003469b0f1 llvm::slpvectorizer::BoUpSLP::TreeEntry::setOperand(llvm::slpvectorizer::BoUpSLP const&, bool) SLPVectorizer.cpp:0:0
#10 0x000056003468cdfd llvm::slpvectorizer::BoUpSLP::buildTree_rec(llvm::ArrayRef<llvm::Value*>, unsigned int, llvm::slpvectorizer::BoUpSLP::EdgeInfo const&, unsigned int) (build-all/bin/opt+0x6047dfd)
#11 0x00005600346f02cc llvm::SLPVectorizerPass::tryToVectorizeList(llvm::ArrayRef<llvm::Value*>, llvm::slpvectorizer::BoUpSLP&, bool) (build-all/bin/opt+0x60ab2cc)
#12 0x00005600346f1440 llvm::SLPVectorizerPass::tryToVectorize(llvm::Instruction*, llvm::slpvectorizer::BoUpSLP&) (build-all/bin/opt+0x60ac440)
#13 0x00005600346f45bc llvm::SLPVectorizerPass::vectorizeRootInstruction(llvm::PHINode*, llvm::Instruction*, llvm::BasicBlock*, llvm::slpvectorizer::BoUpSLP&) (build-all/bin/opt+0x60af5bc)
#14 0x00005600346e92ec llvm::SLPVectorizerPass::vectorizeChainsInBlock(llvm::BasicBlock*, llvm::slpvectorizer::BoUpSLP&) (build-all/bin/opt+0x60a42ec)
#15 0x00005600346e6204 llvm::SLPVectorizerPass::runImpl(llvm::Function&, llvm::ScalarEvolution*, llvm::TargetTransformInfo*, llvm::TargetLibraryInfo*, llvm::AAResults*, llvm::LoopInfo*, llvm::DominatorTree*, llvm::AssumptionCache*, llvm::DemandedBits*, llvm::OptimizationRemarkEmitter*) (build-all/bin/opt+0x60a1204)
#16 0x00005600346e5787 llvm::SLPVectorizerPass::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (build-all/bin/opt+0x60a0787)
#17 0x00005600340c17dd llvm::detail::PassModel<llvm::Function, llvm::SLPVectorizerPass, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilderPipelines.cpp:0:0
#18 0x0000560032e653e7 llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) (build-all/bin/opt+0x48203e7)
#19 0x00005600340bd9ed llvm::detail::PassModel<llvm::Function, llvm::PassManager<llvm::Function, llvm::AnalysisManager<llvm::Function>>, llvm::AnalysisManager<llvm::Function>>::run(llvm::Function&, llvm::AnalysisManager<llvm::Function>&) PassBuilderPipelines.cpp:0:0
#20 0x0000560032e69fbe llvm::ModuleToFunctionPassAdaptor::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (build-all/bin/opt+0x4824fbe)
#21 0x00005600340b99cd llvm::detail::PassModel<llvm::Module, llvm::ModuleToFunctionPassAdaptor, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) PassBuilderPipelines.cpp:0:0
#22 0x0000560032e640d7 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (build-all/bin/opt+0x481f0d7)
#23 0x000056003404631c llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) (build-all/bin/opt+0x5a0131c)
#24 0x0000560032bf10fe optMain (build-all/bin/opt+0x45ac0fe)
#25 0x00007f9d07d547e5 __libc_start_main (/lib64/libc.so.6+0x3a7e5)
#26 0x0000560032beebee _start (build-all/bin/opt+0x45a9bee)
Abort (core dumped)

bbi-106161.ll.gz

#133888)" This reverts commit 123993f.

…133888) We use the term "interchangeable instructions" to refer to different operators that have the same meaning (e.g., `add x, 0` is equivalent to `mul x, 1`). Non-constant values are not supported, as they may incur high costs with little benefit. --------- Co-authored-by: Alexey Bataev <a.bataev@gmx.com>

llvm#133888)" This reverts commit 123993f.

aeubanks · 2025-04-24T00:18:06Z

Bisected a libc++ strict weak ordering check failure to this commit on the following IR:

$ cat /tmp/a.ll
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-grtev4-linux-gnu"

define i64 @foo(i64 %arg, i64 %arg1) {
bb:
  %icmp = icmp eq i64 %arg1, 0
  br i1 %icmp, label %bb7, label %bb2

bb2:                                              ; preds = %bb2, %bb
  %phi = phi i64 [ %sub, %bb2 ], [ %arg1, %bb ]
  %phi3 = phi i64 [ %add, %bb2 ], [ %arg, %bb ]
  %phi4 = phi i64 [ %add5, %bb2 ], [ %arg, %bb ]
  %add = add i64 %phi3, 1
  %call = tail call i64 @pluto(i64 %phi3)
  %add5 = add i64 %phi4, %arg
  %sub = sub i64 %phi, %arg
  %icmp6 = icmp eq i64 %sub, 0
  br i1 %icmp6, label %bb7, label %bb2

bb7:                                              ; preds = %bb2, %bb
  %phi8 = phi i64 [ %arg, %bb ], [ %add5, %bb2 ]
  ret i64 %phi8
}

declare i64 @pluto(i64)

$ opt -p slp-vectorizer -disable-output /tmp/a.ll

it's pointing at the stable_sort here

HanKuanChen · 2025-04-24T02:41:47Z

@aeubanks I have a new version already #135797.

rnk · 2025-04-24T03:06:07Z

@aeubanks I have a new version already #135797.

OK, but can you confirm that the issue has been fixed? If you aren't confident that both issues have been fixed, it's our project policy to revert first and ask questions later:
https://llvm.org/docs/DeveloperPolicy.html#patch-reversion-policy
"As a community, we strongly value having the tip of tree in a good state while allowing rapid iterative development."

At EuroLLVM I spoke to multiple other individuals for whom the SLP vectorizer is the #1 source of toolchain instability. I have seen issues reported to SLP v developers with responses like along the lines of, this feedback is too late, we've already moved on, it will be hard to rollback. IMO these are not valid excuses for allowing tip-of-tree to be unstable. When a user reports a correctness issue in LLVM, as compiler developers, correctness is paramount, and we should take it seriously.

So, with that in mind, I would appreciate confirmation of whether the reported issue has been fixed.

HanKuanChen · 2025-04-24T03:21:36Z

I built using commit ea0dbee, and no bugs were found.

aeubanks · 2025-04-24T17:28:11Z

I can still reproduce this at main (as of a couple minutes ago). You need -stdlib=libc++ -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_DEBUG as CXXFLAGS. Locally I have libc++ 19 installed.

alexey-bataev · 2025-04-24T18:13:41Z

I can still reproduce this at main (as of a couple minutes ago). You need -stdlib=libc++ -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_DEBUG as CXXFLAGS. Locally I have libc++ 19 installed.

Hope f427890 will fix it

hiraditya · 2025-04-24T18:22:02Z

If possible, let's try to keep a new optimization disabled by default and let it soak for a bit (maybe a week or two?) during which more testing can be done.

aeubanks · 2025-04-24T20:43:34Z

thanks for the fix @alexey-bataev, confirmed that fixes the repro

mikaelholmen · 2025-04-25T10:06:22Z

At EuroLLVM I spoke to multiple other individuals for whom the SLP vectorizer is the #1 source of toolchain instability.

@rnk: Thank you for raising this!

This is the case for us. We have an out of tree backend and we merge from upstream trunk llvm several times per day, and fairly often we see new problems with the SLP vectorizer.

When I report a problem it does get fixed quickly though so thumbs up for that, but that the issues appear in the first place is turning into a problem. Digging out a reproducer, and then converting it to something that reproduces on an in-tree target often takes an effort since the pass depends quite a bit on the target.

We have even raised the question internally if we should disable SLP vectorizer since the maintenance cost of dealing with the crashes is sticking out.

Of course bugs and problems will always occur but SLP vectorizer unfortunately stands out in a negative way for us due to the asserts we sometimes see after merging new patches.

alexey-bataev · 2025-04-25T13:20:32Z

At EuroLLVM I spoke to multiple other individuals for whom the SLP vectorizer is the #1 source of toolchain instability.

@rnk: Thank you for raising this!

This is the case for us. We have an out of tree backend and we merge from upstream trunk llvm several times per day, and fairly often we see new problems with the SLP vectorizer.

When I report a problem it does get fixed quickly though so thumbs up for that, but that the issues appear in the first place is turning into a problem. Digging out a reproducer, and then converting it to something that reproduces on an in-tree target often takes an effort since the pass depends quite a bit on the target.

We have even raised the question internally if we should disable SLP vectorizer since the maintenance cost of dealing with the crashes is sticking out.

Of course bugs and problems will always occur but SLP vectorizer unfortunately stands out in a negative way for us due to the asserts we sometimes see after merging new patches.

I'm sorry this causes so many troubles on your end, and I appreciate your help with fixing the bugs. I hope this situation will be improved soon. There is just one big functional change expected, all other changes will be mostly related to redesign and fixes.

HanKuanChen and others added 5 commits March 31, 2025 22:50

add const to isBinOpWithConstantInt

9bfcd78

NFC. Move isConvertible into isOpcodeOrAlt.

2318795

NFC. Add isMainInstruction.

cb4de08

apply interchangeable instruction to isMainInstruction and isAlternat…

f639835

…eInstruction

HanKuanChen requested a review from alexey-bataev April 1, 2025 09:27

HanKuanChen added 3 commits April 1, 2025 03:17

refactor isOpcodeOrAlt

9835c72

refactor isMainInstruction and isAlternateInstruction

585428c

add assert for isOpcodeOrAlt

6d70b91

alexey-bataev requested a review from hiraditya April 1, 2025 13:13

HanKuanChen added 2 commits April 1, 2025 08:23

Pre-commit test.

f48ee28

fix 9835c72

64d3405

gbossu reviewed Apr 2, 2025

View reviewed changes

HanKuanChen added 2 commits April 6, 2025 19:46

add findInstructionWithOpcode

2e06c25

rename isOpcodeOrAlt to getMatchingMainOpOrAltOp

cab23cf

HanKuanChen added 2 commits April 10, 2025 03:55

Merge remote-tracking branch 'upstream/main' into slp-interchangeable…

eccafa0

…-instruction

fix merge conflict

8763ce1

alexey-bataev reviewed Apr 10, 2025

View reviewed changes

alexey-bataev approved these changes Apr 14, 2025

View reviewed changes

HanKuanChen merged commit 123993f into llvm:main Apr 14, 2025
10 of 11 checks passed

HanKuanChen deleted the slp-interchangeable-instruction branch April 14, 2025 11:23

HanKuanChen added a commit that referenced this pull request Apr 15, 2025

Revert "[SLP] Make getSameOpcode support interchangeable instructions. (

e1382b3

#133888)" This reverts commit 123993f.

var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025

Revert "[SLP] Make getSameOpcode support interchangeable instructions. (

33b37aa

llvm#133888)" This reverts commit 123993f.

[SLP] Make getSameOpcode support interchangeable instructions. #133888

[SLP] Make getSameOpcode support interchangeable instructions. #133888

Uh oh!

Conversation

HanKuanChen commented Apr 1, 2025

Uh oh!

github-actions bot commented Apr 1, 2025

Uh oh!

HanKuanChen commented Apr 1, 2025

Uh oh!

gbossu left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

HanKuanChen commented Apr 7, 2025

Uh oh!

alexey-bataev Apr 10, 2025

Choose a reason for hiding this comment

Uh oh!

HanKuanChen Apr 10, 2025

Choose a reason for hiding this comment

Uh oh!

alexey-bataev Apr 10, 2025

Choose a reason for hiding this comment

Uh oh!

HanKuanChen Apr 14, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mikaelholmen commented Apr 15, 2025

Uh oh!

aeubanks commented Apr 24, 2025

Uh oh!

HanKuanChen commented Apr 24, 2025

Uh oh!

rnk commented Apr 24, 2025

Uh oh!

HanKuanChen commented Apr 24, 2025

Uh oh!

aeubanks commented Apr 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alexey-bataev commented Apr 24, 2025

Uh oh!

hiraditya commented Apr 24, 2025

Uh oh!

aeubanks commented Apr 24, 2025

Uh oh!

mikaelholmen commented Apr 25, 2025

Uh oh!

alexey-bataev commented Apr 25, 2025

Uh oh!

Uh oh!

aeubanks commented Apr 24, 2025 •

edited

Loading