[LoopVectorize] Remove redundant code in emitSCEVChecks #111132

Merged
merged 1 commit into from
Oct 7, 2024

david-arm
Contributor

emitSCEVChecks contained code to update the dominator tree
when LoopBypassBlocks is empty. However, no tests fail when
this code is replaced with an assert. I also built both
SPEC2017 and the LLVM test suite and saw no build failures.
I've removed the code for now and added an assert to guard
against this case should anything change, since it seems
pointless to keep code that no test can defend.

@llvmbot
Collaborator

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-llvm-transforms

Author: David Sherwood (david-arm)

Changes

emitSCEVChecks contained code to update the dominator tree
when LoopBypassBlocks is empty. However, no tests fail when
this code is replaced with an assert. I also built both
SPEC2017 and the LLVM test suite and saw no build failures.
I've removed the code for now and added an assert to guard
against this case should anything change, since it seems
pointless to keep code that no test can defend.


Full diff: https://github.com/llvm/llvm-project/pull/111132.diff

1 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+2-12)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index de3b981a4fe390..7c7d6693a1dccd 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -2486,18 +2486,8 @@ BasicBlock *InnerLoopVectorizer::emitSCEVChecks(BasicBlock *Bypass) {
            (OptForSizeBasedOnProfile &&
             Cost->Hints->getForce() != LoopVectorizeHints::FK_Enabled)) &&
          "Cannot SCEV check stride or overflow when optimizing for size");
-
-
-  // Update dominator only if this is first RT check.
-  if (LoopBypassBlocks.empty()) {
-    DT->changeImmediateDominator(Bypass, SCEVCheckBlock);
-    if (!Cost->requiresScalarEpilogue(VF.isVector()))
-      // If there is an epilogue which must run, there's no edge from the
-      // middle block to exit blocks  and thus no need to update the immediate
-      // dominator of the exit blocks.
-      DT->changeImmediateDominator(LoopExitBlock, SCEVCheckBlock);
-  }
-
+  assert(!LoopBypassBlocks.empty() &&
+         "Should already be a bypass block due to iteration count check");
   LoopBypassBlocks.push_back(SCEVCheckBlock);
   AddedSafetyChecks = true;
   return SCEVCheckBlock;
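
The invariant the new assert relies on can be illustrated with a small standalone sketch. This is a toy model, not LLVM code: `ToyVectorizer`, `emitIterationCountCheck`, and the block-name strings are hypothetical stand-ins, and the sketch assumes (as the assert message states) that an iteration count check always registers a bypass block before any SCEV check is emitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the vectorizer's bookkeeping. Hypothetical type; the real
// InnerLoopVectorizer tracks BasicBlock pointers, not strings.
struct ToyVectorizer {
  std::vector<std::string> LoopBypassBlocks;

  // Assumed to run first, so it records the first bypass block.
  void emitIterationCountCheck() {
    LoopBypassBlocks.push_back("iter.check"); // illustrative name only
  }

  // Mirrors the shape of the patched code: no dominator-tree fix-up for the
  // "first runtime check" case, only an assert that the case cannot occur.
  void emitSCEVChecks() {
    assert(!LoopBypassBlocks.empty() &&
           "Should already be a bypass block due to iteration count check");
    LoopBypassBlocks.push_back("scev.check"); // illustrative name only
  }
};
```

Calling `emitIterationCountCheck()` and then `emitSCEVChecks()` on a `ToyVectorizer` leaves two entries in `LoopBypassBlocks`; calling `emitSCEVChecks()` first would trip the assert in a build with assertions enabled.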

Contributor

@fhahn fhahn left a comment

LGTM, thanks! This should never be the first RT check in the current version.

@david-arm david-arm merged commit 66b2820 into llvm:main Oct 7, 2024
12 checks passed
@llvm-ci
Collaborator

llvm-ci commented Oct 7, 2024

LLVM Buildbot has detected a new failure on builder bolt-x86_64-ubuntu-shared running on bolt-worker while building llvm at step 6 "test-build-bolt-check-bolt".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/151/builds/2658

Here is the relevant piece of the build log for the reference
Step 6 (test-build-bolt-check-bolt) failure: test (failure)
******************** TEST 'BOLT :: perf2bolt/perf_test.test' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 5: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -fuse-ld=lld -Wl,--script=/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.lds -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/clang /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.c -fuse-ld=lld -Wl,--script=/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/Inputs/perf_test.lds -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
RUN: at line 6: perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
+ perf record -Fmax -e cycles:u -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -- /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp
info: Using a maximum frequency rate of 2000 Hz
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 (11 samples) ]
RUN: at line 7: /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 -nl -ignore-build-id 2>&1 | /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/FileCheck /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test
+ /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/bin/perf2bolt /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp -p=/home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp2 -o /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 -nl -ignore-build-id
/home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test:10:12: error: CHECK-NOT: excluded string found in input
CHECK-NOT: !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection.
           ^
<stdin>:26:2: note: found here
 !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection. The generated data may be ineffective for improving performance.
 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Input file: <stdin>
Check file: /home/worker/bolt-worker2/llvm-project/bolt/test/perf2bolt/perf_test.test

-dump-input=help explains the following input dump.

Input was:
<<<<<<
        .
        .
        .
       21: PERF2BOLT: waiting for perf mem events collection to finish... 
       22: PERF2BOLT: processing basic events (without LBR)... 
       23: PERF2BOLT: read 11 samples 
       24: PERF2BOLT: out of range samples recorded in unknown regions: 9 (81.8%) 
       25:  
       26:  !! WARNING !! This high mismatch ratio indicates the input binary is probably not the same binary used during profiling collection. The generated data may be ineffective for improving performance. 
not:10      !~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                                   error: no match expected
       27:  
       28: PERF2BOLT: wrote 2 objects and 0 memory objects to /home/worker/bolt-worker2/bolt-x86_64-ubuntu-shared/build/tools/bolt/test/perf2bolt/Output/perf_test.test.tmp3 
       29: BOLT-INFO: 2 out of 13 functions in the binary (15.4%) have non-empty execution profile 
>>>>>>

--

********************

