[LICM] TSVC s113: not vectorized because LICM doesn't work #74262

Closed
@yus3710-fj

Description

Flang fails to vectorize the loop in s113 of TSVC, whereas Clang vectorizes the equivalent loop written in C.

! Fortran version
      do 1 nl = 1,ntimes
      do 10 i = 2,n
        a(i) = a(1) + b(i)
   10 continue
      call dummy(ld,n,a,b,c,d,e,aa,bb,cc,1.)
   1  continue

// C version
for (int nl = 0; nl < ntimes; nl++) {
  for (int i = 1; i < n; i++) {
    a[i] = a[0] + b[i];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
}

$ flang-new -v -Ofast s113.f -S -Rpass=licm\|vector -falias-analysis
flang-new version 18.0.0 (https://github.com/llvm/llvm-project.git 1c1227846425883a3d39ff56700660236a97152c)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /path/to/install/bin
Found candidate GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Selected GCC installation: /path/to/lib/gcc/aarch64-unknown-linux-gnu/11.2.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/path/to/install/bin/flang-new" -fc1 -triple aarch64-unknown-linux-gnu -S -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -ffast-math -target-cpu generic -target-feature +neon -target-feature +v8a -fstack-arrays -fversion-loops-for-stride -falias-analysis -Rpass=vector -O3 -o s113.s -x f95-cpp-input s113.f
$ clang -Ofast s113.c -Rpass=licm\|vector
/path/to/s113.c:16:11: remark: hoisting load [-Rpass=licm]
   16 |                         a[i] = a[0] + b[i];
      |                                ^
/path/to/s113.c:15:3: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize]
   15 |                 for (int i = 1; i < LEN; i++) {
      |                 ^

The problem can be reproduced with Clang using the following C code, which is essentially the same program as the C version above.

// C version
for (int nl = 0; nl < ntimes; nl++) {
  for (int i = 2; i <= n; i++) {
    a[i-1] = a[0] + b[i-1];
  }
  dummy(a, b, c, d, e, aa, bb, cc, 0.);
}

In fact, Flang generates LLVM IR that corresponds to this C code.
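
The claim that the two C index forms are the same program can be checked directly. The sketch below runs both forms on identical data and compares the results. The function names and the test data are illustrative assumptions and are not part of TSVC.

```c
/* Zero-based form, as in the first C version of s113. */
void s113_zero_based(float *a, const float *b, int n) {
  for (int i = 1; i < n; i++) {
    a[i] = a[0] + b[i];
  }
}

/* Shifted-index form, mirroring the Fortran loop bounds (i = 2..n). */
void s113_shifted(float *a, const float *b, int n) {
  for (int i = 2; i <= n; i++) {
    a[i - 1] = a[0] + b[i - 1];
  }
}

/* Returns 1 if both forms leave identical contents in a[0..N-1], else 0. */
int s113_index_forms_agree(void) {
  enum { N = 64 };
  float a1[N], a2[N], b[N];
  for (int i = 0; i < N; i++) {
    a1[i] = a2[i] = (float)i;   /* same starting data for both runs */
    b[i] = 0.5f * (float)i;
  }
  s113_zero_based(a1, b, N);
  s113_shifted(a2, b, N);
  for (int i = 0; i < N; i++) {
    if (a1[i] != a2[i]) return 0;
  }
  return 1;
}
```

Both loops store to a[1] through a[n-1] and read the same elements of b, so only the way the index is written differs.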

LICM is required for vectorization here: unless the load of a[0] is hoisted out of the loop, LoopAccessAnalysis can't analyze the a[0] access correctly.
LICM appears to fail because of the linear expressions in the array indices (i-1 instead of i).
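
To illustrate what the hoisting would accomplish, here is a minimal source-level sketch of the shifted-index loop with the load of a[0] lifted out by hand. The hoist is legal because the loop stores only to a[1] through a[n-1], so a[0] never changes while the loop runs. The function names and test data are hypothetical, and the code only shows the intended transformation; it is not what LICM literally emits.

```c
/* Shifted-index loop as written: a[0] is reloaded on every iteration. */
void s113_reload(float *a, const float *b, int n) {
  for (int i = 2; i <= n; i++) {
    a[i - 1] = a[0] + b[i - 1];
  }
}

/* Same loop with the loop-invariant load of a[0] hoisted by hand. */
void s113_hoisted(float *a, const float *b, int n) {
  const float a0 = a[0];        /* invariant: the loop never stores to a[0] */
  for (int i = 2; i <= n; i++) {
    a[i - 1] = a0 + b[i - 1];
  }
}

/* Returns 1 if the hoisted loop produces the same array as the original. */
int s113_hoisting_preserves_result(void) {
  enum { N = 64 };
  float a1[N], a2[N], b[N];
  for (int i = 0; i < N; i++) {
    a1[i] = a2[i] = 1.0f + (float)i;
    b[i] = 0.25f * (float)i;
  }
  s113_reload(a1, b, N);
  s113_hoisted(a2, b, N);
  for (int i = 0; i < N; i++) {
    if (a1[i] != a2[i]) return 0;
  }
  return 1;
}
```

With a0 held in a local, the loop body no longer reads from a at all, which leaves only the stores to a and the loads from b for the dependence analysis to consider.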

Labels: llvm:analysis
