[LAA] Keep pointer checks on partial analysis #139719

john-brawn-arm · 2025-05-13T12:35:36Z

Currently if there's any memory access that AccessAnalysis couldn't analyze then all of the runtime pointer check results are discarded. This patch changes that so they're not, and we generate the runtime check information for those pointers that we could analyze, as transformations may still be able to make use of the partial information.

Of the transformations that use LoopAccessAnalysis, only LoopVersioningLICM changes behaviour as a result of this change. This is because the others either:

Check canVectorizeMemory, which will return false when we have partial pointer information as analyzeLoop() will return false.
Examine the dependencies returned by getDepChecker(), which will be empty as we exit analyzeLoop if we have partial pointer information before calling areDepsSafe(), which is what fills in the dependency information.

Currently if there's any memory access that AccessAnalysis couldn't analyze then all of the runtime pointer check results are discarded. This patch changes that so they're not, and we generate the runtime check information for those pointers that we could analyze, as transformations may still be able to make use of the partial information. Of the transformations that use LoopAccessAnalysis, only LoopVersioningLICM changes behaviour as a result of this change. This is because the others either: * Check canVectorizeMemory, which will return false when we have partial pointer information as analyzeLoop() will return false. * Examine the dependencies returned by getDepChecker(), which will be empty as we exit analyzeLoop if we have partial pointer information before calling areDepsSafe(), which is what fills in the dependency information.

llvmbot · 2025-05-13T12:36:11Z

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Author: John Brawn (john-brawn-arm)

Changes

Currently if there's any memory access that AccessAnalysis couldn't analyze then all of the runtime pointer check results are discarded. This patch changes that so they're not, and we generate the runtime check information for those pointers that we could analyze, as transformations may still be able to make use of the partial information.

Of the transformations that use LoopAccessAnalysis, only LoopVersioningLICM changes behaviour as a result of this change. This is because the others either:

Check canVectorizeMemory, which will return false when we have partial pointer information as analyzeLoop() will return false.
Examine the dependencies returned by getDepChecker(), which will be empty as we exit analyzeLoop if we have partial pointer information before calling areDepsSafe(), which is what fills in the dependency information.

Patch is 64.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/139719.diff

9 Files Affected:

(modified) llvm/include/llvm/Analysis/LoopAccessAnalysis.h (+7-2)
(modified) llvm/lib/Analysis/LoopAccessAnalysis.cpp (+9-10)
(modified) llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll (+254-53)
(modified) llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-backward.ll (+12)
(modified) llvm/test/Analysis/LoopAccessAnalysis/non-constant-strides-forward.ll (+9)
(modified) llvm/test/Analysis/LoopAccessAnalysis/pointer-phis.ll (+12-9)
(modified) llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll (+14)
(modified) llvm/test/Analysis/LoopAccessAnalysis/underlying-object-loop-varying-phi.ll (+14)
(modified) llvm/test/Transforms/LoopVersioningLICM/load-from-unknown-address.ll (+140-28)

diff --git a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
index f715e0ec8dbb4..0abd7f1182108 100644
--- a/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
+++ b/llvm/include/llvm/Analysis/LoopAccessAnalysis.h
@@ -192,7 +192,8 @@ class MemoryDepChecker {
   /// of a write access.
   void addAccess(LoadInst *LI);
 
-  /// Check whether the dependencies between the accesses are safe.
+  /// Check whether the dependencies between the accesses are safe, and records
+  /// the dependence information in Dependences if so.
   ///
   /// Only checks sets with elements in \p CheckDeps.
   bool areDepsSafe(const DepCandidates &AccessSets,
@@ -779,10 +780,14 @@ class LoopAccessInfo {
 
   /// We need to check that all of the pointers in this list are disjoint
   /// at runtime. Using std::unique_ptr to make using move ctor simpler.
+  /// This list may contain only partial information when we've failed to
+  /// analyze all the memory accesses in the loop (i.e. CanVecMem is false).
   std::unique_ptr<RuntimePointerChecking> PtrRtChecking;
 
-  /// the Memory Dependence Checker which can determine the
+  /// The Memory Dependence Checker which can determine the
   /// loop-independent and loop-carried dependences between memory accesses.
+  /// This will be empty if we've failed to analyze all the memory access in the
+  /// loop (i.e. CanVecMem is false).
   std::unique_ptr<MemoryDepChecker> DepChecker;
 
   Loop *TheLoop;
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index af1a3c593c514..4d912abdf59a8 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -529,8 +529,10 @@ void RuntimePointerChecking::groupChecks(
     // equivalence class, the iteration order is deterministic.
     for (auto M : DepCands.members(Access)) {
       auto PointerI = PositionMap.find(M.getPointer());
-      assert(PointerI != PositionMap.end() &&
-             "pointer in equivalence class not found in PositionMap");
+      // If we can't find the pointer in PositionMap that means we can't
+      // generate a memcheck for it.
+      if (PointerI == PositionMap.end())
+        continue;
       for (unsigned Pointer : PointerI->second) {
         bool Merged = false;
         // Mark this pointer as seen.
@@ -682,7 +684,9 @@ class AccessAnalysis {
   /// non-intersection.
   ///
   /// Returns true if we need no check or if we do and we can generate them
-  /// (i.e. the pointers have computable bounds).
+  /// (i.e. the pointers have computable bounds). A return value of false means
+  /// we couldn't analyze and generate runtime checks for all pointers in the
+  /// loop, but we will have checks for those pointers we could analyze.
   bool canCheckPtrAtRT(RuntimePointerChecking &RtCheck, ScalarEvolution *SE,
                        Loop *TheLoop,
                        const DenseMap<Value *, const SCEV *> &Strides,
@@ -1273,7 +1277,6 @@ bool AccessAnalysis::canCheckPtrAtRT(
                                   /*Assume=*/true)) {
           CanDoAliasSetRT = false;
           UncomputablePtr = Access.getPointer();
-          break;
         }
       }
     }
@@ -1313,7 +1316,7 @@ bool AccessAnalysis::canCheckPtrAtRT(
     }
   }
 
-  if (MayNeedRTCheck && CanDoRT)
+  if (MayNeedRTCheck)
     RtCheck.generateChecks(DepCands, IsDepCheckNeeded);
 
   LLVM_DEBUG(dbgs() << "LAA: We need to do " << RtCheck.getNumberOfChecks()
@@ -1323,11 +1326,7 @@ bool AccessAnalysis::canCheckPtrAtRT(
   // are needed. This can happen when all pointers point to the same underlying
   // object for example.
   RtCheck.Need = CanDoRT ? RtCheck.getNumberOfChecks() != 0 : MayNeedRTCheck;
-
-  bool CanDoRTIfNeeded = !RtCheck.Need || CanDoRT;
-  if (!CanDoRTIfNeeded)
-    RtCheck.reset();
-  return CanDoRTIfNeeded;
+  return !RtCheck.Need || CanDoRT;
 }
 
 void AccessAnalysis::processMemAccesses() {
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
index cd388b4ee87f2..f0885cf0a6043 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll
@@ -633,7 +633,18 @@ define dso_local void @forked_ptrs_same_base_different_offset(ptr nocapture read
 ; CHECK-NEXT:      Report: cannot identify array bounds
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP47:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; CHECK-NEXT:        Against group ([[GRP48:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP47]]:
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; CHECK-NEXT:        Group [[GRP48]]:
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -645,7 +656,18 @@ define dso_local void @forked_ptrs_same_base_different_offset(ptr nocapture read
 ; RECURSE-NEXT:      Report: cannot identify array bounds
 ; RECURSE-NEXT:      Dependences:
 ; RECURSE-NEXT:      Run-time memory checks:
+; RECURSE-NEXT:      Check 0:
+; RECURSE-NEXT:        Comparing group ([[GRP49:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; RECURSE-NEXT:        Against group ([[GRP50:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
+; RECURSE-NEXT:        Group [[GRP49]]:
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; RECURSE-NEXT:        Group [[GRP50]]:
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-EMPTY:
 ; RECURSE-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; RECURSE-NEXT:      SCEV assumptions:
@@ -684,24 +706,24 @@ define dso_local void @forked_ptrs_add_to_offset(ptr nocapture readonly %Base, p
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP47:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP51:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
-; CHECK-NEXT:        Against group ([[GRP48:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP52:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; CHECK-NEXT:      Check 1:
-; CHECK-NEXT:        Comparing group ([[GRP47]]):
+; CHECK-NEXT:        Comparing group ([[GRP51]]):
 ; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
-; CHECK-NEXT:        Against group ([[GRP49:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP53:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP47]]:
+; CHECK-NEXT:        Group [[GRP51]]:
 ; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
-; CHECK-NEXT:        Group [[GRP48]]:
+; CHECK-NEXT:        Group [[GRP52]]:
 ; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
-; CHECK-NEXT:        Group [[GRP49]]:
+; CHECK-NEXT:        Group [[GRP53]]:
 ; CHECK-NEXT:          (Low: ((4 * %extra_offset) + %Base) High: (404 + (4 * %extra_offset) + %Base))
 ; CHECK-NEXT:            Member: {(4 + (4 * %extra_offset) + %Base),+,4}<%for.body>
 ; CHECK-NEXT:            Member: {((4 * %extra_offset) + %Base),+,4}<%for.body>
@@ -716,7 +738,18 @@ define dso_local void @forked_ptrs_add_to_offset(ptr nocapture readonly %Base, p
 ; RECURSE-NEXT:      Report: cannot identify array bounds
 ; RECURSE-NEXT:      Dependences:
 ; RECURSE-NEXT:      Run-time memory checks:
+; RECURSE-NEXT:      Check 0:
+; RECURSE-NEXT:        Comparing group ([[GRP54:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; RECURSE-NEXT:        Against group ([[GRP55:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
+; RECURSE-NEXT:        Group [[GRP54]]:
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; RECURSE-NEXT:        Group [[GRP55]]:
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-EMPTY:
 ; RECURSE-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; RECURSE-NEXT:      SCEV assumptions:
@@ -752,24 +785,24 @@ define dso_local void @forked_ptrs_sub_from_offset(ptr nocapture readonly %Base,
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP50:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP56:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
-; CHECK-NEXT:        Against group ([[GRP51:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP57:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; CHECK-NEXT:      Check 1:
-; CHECK-NEXT:        Comparing group ([[GRP50]]):
+; CHECK-NEXT:        Comparing group ([[GRP56]]):
 ; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
-; CHECK-NEXT:        Against group ([[GRP52:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP58:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP50]]:
+; CHECK-NEXT:        Group [[GRP56]]:
 ; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
-; CHECK-NEXT:        Group [[GRP51]]:
+; CHECK-NEXT:        Group [[GRP57]]:
 ; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
-; CHECK-NEXT:        Group [[GRP52]]:
+; CHECK-NEXT:        Group [[GRP58]]:
 ; CHECK-NEXT:          (Low: ((-4 * %extra_offset) + %Base) High: (404 + (-4 * %extra_offset) + %Base))
 ; CHECK-NEXT:            Member: {(4 + (-4 * %extra_offset) + %Base),+,4}<%for.body>
 ; CHECK-NEXT:            Member: {((-4 * %extra_offset) + %Base),+,4}<%for.body>
@@ -784,7 +817,18 @@ define dso_local void @forked_ptrs_sub_from_offset(ptr nocapture readonly %Base,
 ; RECURSE-NEXT:      Report: cannot identify array bounds
 ; RECURSE-NEXT:      Dependences:
 ; RECURSE-NEXT:      Run-time memory checks:
+; RECURSE-NEXT:      Check 0:
+; RECURSE-NEXT:        Comparing group ([[GRP59:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; RECURSE-NEXT:        Against group ([[GRP60:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
+; RECURSE-NEXT:        Group [[GRP59]]:
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; RECURSE-NEXT:        Group [[GRP60]]:
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-EMPTY:
 ; RECURSE-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; RECURSE-NEXT:      SCEV assumptions:
@@ -820,24 +864,24 @@ define dso_local void @forked_ptrs_add_sub_offset(ptr nocapture readonly %Base,
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
 ; CHECK-NEXT:      Check 0:
-; CHECK-NEXT:        Comparing group ([[GRP53:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Comparing group ([[GRP61:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
-; CHECK-NEXT:        Against group ([[GRP54:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP62:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; CHECK-NEXT:      Check 1:
-; CHECK-NEXT:        Comparing group ([[GRP53]]):
+; CHECK-NEXT:        Comparing group ([[GRP61]]):
 ; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
-; CHECK-NEXT:        Against group ([[GRP55:0x[0-9a-f]+]]):
+; CHECK-NEXT:        Against group ([[GRP63:0x[0-9a-f]+]]):
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:          %arrayidx3 = getelementptr inbounds float, ptr %Base, i64 %offset
 ; CHECK-NEXT:      Grouped accesses:
-; CHECK-NEXT:        Group [[GRP53]]:
+; CHECK-NEXT:        Group [[GRP61]]:
 ; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
 ; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
-; CHECK-NEXT:        Group [[GRP54]]:
+; CHECK-NEXT:        Group [[GRP62]]:
 ; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
 ; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
-; CHECK-NEXT:        Group [[GRP55]]:
+; CHECK-NEXT:        Group [[GRP63]]:
 ; CHECK-NEXT:          (Low: ((4 * %to_add) + (-4 * %to_sub) + %Base) High: (404 + (4 * %to_add) + (-4 * %to_sub) + %Base))
 ; CHECK-NEXT:            Member: {(4 + (4 * %to_add) + (-4 * %to_sub) + %Base),+,4}<%for.body>
 ; CHECK-NEXT:            Member: {((4 * %to_add) + (-4 * %to_sub) + %Base),+,4}<%for.body>
@@ -852,7 +896,18 @@ define dso_local void @forked_ptrs_add_sub_offset(ptr nocapture readonly %Base,
 ; RECURSE-NEXT:      Report: cannot identify array bounds
 ; RECURSE-NEXT:      Dependences:
 ; RECURSE-NEXT:      Run-time memory checks:
+; RECURSE-NEXT:      Check 0:
+; RECURSE-NEXT:        Comparing group ([[GRP64:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; RECURSE-NEXT:        Against group ([[GRP65:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
+; RECURSE-NEXT:        Group [[GRP64]]:
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; RECURSE-NEXT:        Group [[GRP65]]:
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-EMPTY:
 ; RECURSE-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; RECURSE-NEXT:      SCEV assumptions:
@@ -890,7 +945,18 @@ define dso_local void @forked_ptrs_mul_by_offset(ptr nocapture readonly %Base, p
 ; CHECK-NEXT:      Report: cannot identify array bounds
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP66:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; CHECK-NEXT:        Against group ([[GRP67:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP66]]:
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; CHECK-NEXT:        Group [[GRP67]]:
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -902,7 +968,18 @@ define dso_local void @forked_ptrs_mul_by_offset(ptr nocapture readonly %Base, p
 ; RECURSE-NEXT:      Report: cannot identify array bounds
 ; RECURSE-NEXT:      Dependences:
 ; RECURSE-NEXT:      Run-time memory checks:
+; RECURSE-NEXT:      Check 0:
+; RECURSE-NEXT:        Comparing group ([[GRP68:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; RECURSE-NEXT:        Against group ([[GRP69:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
+; RECURSE-NEXT:        Group [[GRP68]]:
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; RECURSE-NEXT:        Group [[GRP69]]:
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-EMPTY:
 ; RECURSE-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; RECURSE-NEXT:      SCEV assumptions:
@@ -940,7 +1017,18 @@ define dso_local void @forked_ptrs_uniform_and_strided_forks(ptr nocapture reado
 ; CHECK-NEXT:      Report: cannot identify array bounds
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP70:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; CHECK-NEXT:        Against group ([[GRP71:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; CHECK-NEXT:      Grouped accesses:
+; CHECK-NEXT:        Group [[GRP70]]:
+; CHECK-NEXT:          (Low: %Dest High: (400 + %Dest))
+; CHECK-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; CHECK-NEXT:        Group [[GRP71]]:
+; CHECK-NEXT:          (Low: %Preds High: (400 + %Preds))
+; CHECK-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; CHECK-EMPTY:
 ; CHECK-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; CHECK-NEXT:      SCEV assumptions:
@@ -952,7 +1040,18 @@ define dso_local void @forked_ptrs_uniform_and_strided_forks(ptr nocapture reado
 ; RECURSE-NEXT:      Report: cannot identify array bounds
 ; RECURSE-NEXT:      Dependences:
 ; RECURSE-NEXT:      Run-time memory checks:
+; RECURSE-NEXT:      Check 0:
+; RECURSE-NEXT:        Comparing group ([[GRP72:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx5 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; RECURSE-NEXT:        Against group ([[GRP73:0x[0-9a-f]+]]):
+; RECURSE-NEXT:          %arrayidx = getelementptr inbounds i32, ptr %Preds, i64 %indvars.iv
 ; RECURSE-NEXT:      Grouped accesses:
+; RECURSE-NEXT:        Group [[GRP72]]:
+; RECURSE-NEXT:          (Low: %Dest High: (400 + %Dest))
+; RECURSE-NEXT:            Member: {%Dest,+,4}<nuw><%for.body>
+; RECURSE-NEXT:        Group [[GRP73]]:
+; RECURSE-NEXT:          (Low: %Preds High: (400 + %Preds))
+; RECURSE-NEXT:            Member: {%Preds,+,4}<nuw><%for.body>
 ; RECURSE-EMPTY:
 ; RECURSE-NEXT:      Non vectorizable stores to invariant address were not found in loop.
 ; RECURSE-NEXT:      SCEV assumptions:
@@ -995,7 +1094,18 @@ define dso_local void @forked_ptrs_gather_and_contiguous_forks(ptr nocapture rea
 ; CHECK-NEXT:      Report: cannot identify array bounds
 ; CHECK-NEXT:      Dependences:
 ; CHECK-NEXT:      Run-time memory checks:
+; CHECK-NEXT:      Check 0:
+; CHECK-NEXT:        Comparing group ([[GRP74:0x[0-9a-f]+]]):
+; CHECK-NEXT:          %1 = getelementptr inbounds float, ptr %Dest, i64 %indvars.iv
+; CHECK-...
[truncated]

artagnon

I think this is a sound change, and the test updates to LoopVersioningLICM look good!

artagnon · 2025-05-13T13:48:50Z

llvm/test/Analysis/LoopAccessAnalysis/forked-pointers.ll

@@ -633,7 +633,18 @@ define dso_local void @forked_ptrs_same_base_different_offset(ptr nocapture read
 ; CHECK-NEXT:      Report: cannot identify array bounds


Sorry, but why does the report say this? Can we update this message to "Cannot generate RT checks for pointers: ..."?

This piece of text is what goes in the -Rpass-analysis=loop-vectorize output, so here it's

remarks.c:8:12: remark: loop not vectorized: cannot identify array bounds [-Rpass-analysis=loop-vectorize] 8 | B[i] = A[offset]; | ^

It's intended to mean "cannot identify array bounds of this specific memory access".

Oh. Hm, then is there something we can add to the output to indicate that runtime checks were not generated for some pointers? Otherwise, the output seems a bit misleading.

I've added an extra line to the output saying this.

llvm/lib/Analysis/LoopAccessAnalysis.cpp

… all checks

artagnon

This should be good to go, modulo the spurious messages. I checked LVerLICM, and it seems to do the right thing:

    // Don't allow stores that we don't have runtime checks for, as we won't be
    // able to mark them noalias meaning they would prevent any code motion.
    auto &Pointers = LAI->getRuntimePointerChecking()->Pointers;
    if (!any_of(Pointers, [&](auto &P) { return P.PointerValue == Ptr; })) {
      LLVM_DEBUG(dbgs() << "    Found a store without a runtime check.\n");
      return false;
    }

llvm/test/Analysis/LoopAccessAnalysis/pointer-phis.ll

artagnon · 2025-05-15T17:31:09Z

llvm/lib/Analysis/LoopAccessAnalysis.cpp

@@ -3002,6 +3001,8 @@ void LoopAccessInfo::print(raw_ostream &OS, unsigned Depth) const {

  // List the pair of accesses need run-time checks to prove independence.
  PtrRtChecking->print(OS, Depth);
+  if (PtrRtChecking->Need && !CanVecMem)


I think this check is a bit too lax, and generates spurious messages. Can you check CheckingGroups against Pointers instead?

This doesn't work, as CheckingGroups is generated from Pointers in groupChecks, so there won't be any Pointer that doesn't have a corresponding CheckingGroup. I've gone instead with recording what the result of canCheckPtrAtRT was.

fhahn

Would it be possible to make this behavior opt-in? Otherwise we unnecessarily increase compile-time for all other users that do not need this, just to benefit LoopVerLICM which isn't on by default.

Expanding low/end for groups can be costly in some cases.

john-brawn-arm · 2025-05-19T16:05:09Z

Would it be possible to make this behavior opt-in? Otherwise we unnecessarily increase compile-time for all other users that do not need this, just to benefit LoopVerLICM which isn't on by default.

Expanding low/end for groups can be costly in some cases.

Doing a simple test with the following:

void fn(char *p, char *q, char **r, int n) {
#define LOOP \
  for (int i = 0; i < n; i++) { \
    p[i] += q[i] + r[0][i]; \
    p[i] += q[i+1] + r[1][i]; \
    p[i] += q[i+2] + r[2][i]; \
    p[i] += q[i+3] + r[3][i]; \
    p[i] += q[i+4] + r[4][i]; \
    p[i] += q[i+5] + r[5][i]; \
  }

#define DO10 LOOP LOOP LOOP LOOP LOOP LOOP LOOP LOOP LOOP LOOP
#define DO100 DO10 DO10 DO10 DO10 DO10 DO10 DO10 DO10 DO10 DO10
#define DO1000 DO100 DO100 DO100 DO100 DO100 DO100 DO100 DO100 DO100 DO100

  DO1000
}

and doing:

Compile with clang -O2 -emit-llvm
Run opt -passes="print<access-info>" -disable-output --time-trace 20 times with and without this change
Get the duration of LoopAccessAnalysis (which is in microseconds) from the time trace file

I get:
Without: mean 1307, median 1347, standard deviation 119
With: mean 1301, median 1335, standard deviation 116
So it looks like any increase in compile-time is negligible and well within the measurement noise, for this test case at least.

artagnon · 2025-05-19T18:11:29Z

See 234cc40 ([LAA] Limit no-overlap check to at least one loop-invariant accesses.): @fhahn has good reason to believe that compile-time will blow up. From the comment:

  // Check if we can prove that Sink only accesses memory after Src's end or
  // vice versa. At the moment this is limited to cases where either source or
  // sink are loop invariant to avoid compile-time increases. This is not
  // required for correctness.

Please use llvm-compile-time-tracker for real measurements.

This records whether canCheckPtrAtRT succeeded, and is used to decide if we print the "checks are incomplete" message.

john-brawn-arm · 2025-05-22T10:07:39Z

LLVM compile-time tracker results: https://llvm-compile-time-tracker.com/compare.php?from=a7ede51b556f40163db9e3cc67c98c27ba2364d8&to=a4dd603c004278e86f80cf6d62597535f0fee126&stat=instructions%3Au
There's a couple of benchmarks (mafft and lencod) that are showing as red, but the largest regression is 0.06% which doesn't seem like a lot. Is it still large enough to need to do something about it?

artagnon · 2025-05-23T08:24:15Z

but the largest regression is 0.06% which doesn't seem like a lot. Is it still large enough to need to do something about it?

Regressions of 0.01-0.02% are noise (the compile time tracker doesn't color them), 0.03% is a noticeable regression (the compile time tracker colors these in light red), and 0.05%+ is serious (compile time tracker colors these in deep red).

john-brawn-arm · 2025-06-03T09:55:18Z

I've added an option to LoopAccessAnalysis to control whether partial results are allowed, and enabled it only in LoopVersioningLICM. With that there are no reported compile time regressions: https://llvm-compile-time-tracker.com/compare.php?from=5c28af409978c19a35021855a29dcaa65e95da00&to=c21056925992f7e536faa766b8773c6582ebd916&stat=instructions:u

artagnon

LGTM, thanks! A few minor suggestions follow.

artagnon · 2025-06-03T13:55:17Z

llvm/lib/Passes/PassRegistry.def

+    [](StringRef Params) {
+      return PassBuilder::parseSinglePassOption(Params, "allow-partial",
+						"LoopAccessInfoPrinterPass");
+    },


Can lift this out into parseLoopAccessInfoPrinterOptions to match the rest of the code?

artagnon · 2025-06-03T13:56:05Z

llvm/lib/Analysis/LoopAccessAnalysis.cpp

@@ -1325,11 +1330,10 @@ bool AccessAnalysis::canCheckPtrAtRT(
  // are needed. This can happen when all pointers point to the same underlying
  // object for example.
  RtCheck.Need = CanDoRT ? RtCheck.getNumberOfChecks() != 0 : MayNeedRTCheck;
-


Stray newline strip?

artagnon · 2025-06-03T13:59:50Z

llvm/test/Analysis/LoopAccessAnalysis/allow-partial.ll

+; RUN: opt -disable-output -passes='print<access-info><allow-partial>,print<access-info>' %s 2>&1 | FileCheck %s --check-prefixes=ALLOW-BEFORE
+; RUN: opt -disable-output -passes='print<access-info>,print<access-info><allow-partial>' %s 2>&1 | FileCheck %s --check-prefixes=ALLOW-AFTER


artagnon · 2025-06-03T14:02:10Z

llvm/include/llvm/Analysis/LoopAccessAnalysis.h

@@ -654,7 +655,8 @@ class RuntimePointerChecking {
 /// For memory dependences that cannot be determined at compile time, it
 /// generates run-time checks to prove independence.  This is done by
 /// AccessAnalysis::canCheckPtrAtRT and the checks are maintained by the
-/// RuntimePointerCheck class.
+/// RuntimePointerCheck class. AllowPartial determines whether partial checks


\p AllowPartial to aid Doxygen?

artagnon · 2025-06-03T14:04:30Z

llvm/lib/Analysis/LoopAccessAnalysis.cpp

-  /// (i.e. the pointers have computable bounds).
+  /// (i.e. the pointers have computable bounds). A return value of false means
+  /// we couldn't analyze and generate runtime checks for all pointers in the
+  /// loop, but we will have checks for those pointers we could analyze.


... if \p AllowPartial is set?

fhahn · 2025-06-04T10:31:38Z

llvm/include/llvm/Analysis/LoopAccessAnalysis.h

  std::unique_ptr<MemoryDepChecker> DepChecker;

  Loop *TheLoop;

+  bool AllowPartial;


Needs some documentation. As AllowPartial is the default, it might be better to have the flag indicate that all runtime that can be generated will be collected?

What about the naming?

I'm not sure what you're asking here?

It just seemed like the comment might have been missed?

fhahn · 2025-06-04T10:34:02Z

llvm/lib/Analysis/LoopAccessAnalysis.cpp

-  bool CanDoRTIfNeeded = Accesses.canCheckPtrAtRT(
-      *PtrRtChecking, TheLoop, SymbolicStrides, UncomputablePtr);
-  if (!CanDoRTIfNeeded) {
+  HasCompletePtrRtChecking = Accesses.canCheckPtrAtRT(


Why does this need to be a member variable? Can we keep the name which still seems accurate?

This is used in LoopAccessInfo::print to decide whether to print the "Generated run-time checks are incomplete" message.

fhahn · 2025-06-04T10:36:37Z

llvm/lib/Analysis/LoopAccessAnalysis.cpp

+  // We need to create the LoopAccessInfo if either we don't already have one,
+  // or if it was created with a different value of AllowPartial.


Could we keep the original one if it has AllowPartial = true and AllowPartial = false?

It might be clearer if the flag is renamed to indicate that all runtime checks are collected, even if not valid.

I had thought about that, but I think having consistent output is more useful than any minor efficiency gain from not having to recompute the LoopAccessInfo.

Ok would have been good to resolve this discussion before merging.

What consistent output are you thinking about? The first only output change would be from the printing pass?

If we consider test/Analysis/LoopAccessAnalysis/allow-partial.ll, then with this change if we compare the output of -passes='print<access-info>' and -passes='print<access-info><allow-partial>,print<access-info>' then the print<access-info> gives different output in the second as it re-uses the LoopAccessInfo from the previous pass.

Potentially we could see the same thing if we has a transform pass that has AllowPartial=true followed by print<access-info>, but the only transform that has AllowPartial=true is LoopVersioningLICM which constructs LoopAccessInfoManager itself instead of using the LoopAccessAnalysis pass, so the analysis results won't be re-used anyway.

Ah I see, thanks. But that would only impact the print<access-info>, which is only used for testing?

fhahn · 2025-06-04T10:38:41Z

llvm/test/Analysis/LoopAccessAnalysis/allow-partial.ll

@@ -0,0 +1,99 @@
+; RUN: opt -disable-output -passes='print<access-info><allow-partial>,print<access-info>' %s 2>&1 | FileCheck %s --check-prefixes=ALLOW-BEFORE


I don't think we have other print passes that take arguments already, but would it be more consistent to pare this as print<access-info<allow-partial>> to indicate this is an argument to access-info instead of print<>?

This unfortunately can't be currently done with how passes are defined in PassRegistry.def: "print<access-info>" is the name of the pass, and the options come after that.

john-brawn-arm requested review from artagnon, nikic, fhahn and VladiKrapp-Arm May 13, 2025 12:35

llvmbot added llvm:analysis llvm:transforms labels May 13, 2025

artagnon reviewed May 13, 2025

View reviewed changes

artagnon changed the title ~~[LoopAccessAnalysis] Keep pointer checks on partial analysis~~ [LAA] Keep pointer checks on partial analysis May 13, 2025

Add extra line to LoopAccessInfo::print when we generate some but not…

6dc3239

… all checks

artagnon reviewed May 15, 2025

View reviewed changes

fhahn reviewed May 15, 2025

View reviewed changes

john-brawn-arm added 3 commits May 21, 2025 10:42

Merge branch 'main' into loop_access_analysis_partial_result

686b462

Add HasCompletePtrRtChecking

d78421a

This records whether canCheckPtrAtRT succeeded, and is used to decide if we print the "checks are incomplete" message.

Merge branch 'main' into loop_access_analysis_partial_result

be7173d

john-brawn-arm added 2 commits May 30, 2025 12:12

Merge branch 'main' into HEAD

5da2cc9

Add an option to control whether partial results are allowed

a77c6f9

artagnon approved these changes Jun 3, 2025

View reviewed changes

Adjust based on review comments.

32dcdac

fhahn reviewed Jun 4, 2025

View reviewed changes

Add an extra comment

2b02a5b

john-brawn-arm merged commit 81d3189 into llvm:main Jun 4, 2025
7 of 10 checks passed

		@@ -633,7 +633,18 @@ define dso_local void @forked_ptrs_same_base_different_offset(ptr nocapture read
		; CHECK-NEXT: Report: cannot identify array bounds

		; RUN: opt -disable-output -passes='print<access-info><allow-partial>,print<access-info>' %s 2>&1 \| FileCheck %s --check-prefixes=ALLOW-BEFORE
		; RUN: opt -disable-output -passes='print<access-info>,print<access-info><allow-partial>' %s 2>&1 \| FileCheck %s --check-prefixes=ALLOW-AFTER

		// We need to create the LoopAccessInfo if either we don't already have one,
		// or if it was created with a different value of AllowPartial.

		@@ -0,0 +1,99 @@
		; RUN: opt -disable-output -passes='print<access-info><allow-partial>,print<access-info>' %s 2>&1 \| FileCheck %s --check-prefixes=ALLOW-BEFORE

[LAA] Keep pointer checks on partial analysis #139719

[LAA] Keep pointer checks on partial analysis #139719

Uh oh!

Conversation

john-brawn-arm commented May 13, 2025

Uh oh!

llvmbot commented May 13, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

artagnon left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

artagnon left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

artagnon May 15, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fhahn left a comment

Choose a reason for hiding this comment

Uh oh!

john-brawn-arm commented May 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

artagnon commented May 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

john-brawn-arm commented May 22, 2025

Uh oh!

artagnon commented May 23, 2025

Uh oh!

john-brawn-arm commented Jun 3, 2025

Uh oh!

artagnon left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

llvmbot commented May 13, 2025 •

edited

Loading

artagnon May 15, 2025 •

edited

Loading

john-brawn-arm commented May 19, 2025 •

edited

Loading

artagnon commented May 19, 2025 •

edited

Loading

john-brawn-arm Jun 4, 2025 •

edited

Loading