Skip to content

[IPO] Added attributor for identifying invariant loads #141800

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

zGoldthorpe
Copy link
Contributor

The attributor conservatively marks pointers whose loads are eligible to be marked as !invariant.load.
It does so by identifying:

  1. Pointers marked noalias and readonly
  2. Pointers whose underlying objects are all eligible for invariant loads.

The attributor then manifests this attribute at non-atomic non-volatile load instructions.

@llvmbot
Copy link
Member

llvmbot commented May 28, 2025

@llvm/pr-subscribers-backend-amdgpu

@llvm/pr-subscribers-llvm-transforms

Author: None (zGoldthorpe)

Changes

The attributor conservatively marks pointers whose loads are eligible to be marked as !invariant.load.
It does so by identifying:

  1. Pointers marked noalias and readonly
  2. Pointers whose underlying objects are all eligible for invariant loads.

The attributor then manifests this attribute at non-atomic non-volatile load instructions.


Patch is 23.62 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/141800.diff

5 Files Affected:

  • (modified) llvm/include/llvm/Transforms/IPO/Attributor.h (+38)
  • (modified) llvm/lib/Transforms/IPO/Attributor.cpp (+2)
  • (modified) llvm/lib/Transforms/IPO/AttributorAttributes.cpp (+245)
  • (modified) llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll (+4-4)
  • (added) llvm/test/Transforms/Attributor/tag-invariant-loads.ll (+220)
diff --git a/llvm/include/llvm/Transforms/IPO/Attributor.h b/llvm/include/llvm/Transforms/IPO/Attributor.h
index c628bbb007230..53fa7a04dc5b5 100644
--- a/llvm/include/llvm/Transforms/IPO/Attributor.h
+++ b/llvm/include/llvm/Transforms/IPO/Attributor.h
@@ -6289,6 +6289,44 @@ struct AAUnderlyingObjects : AbstractAttribute {
                           AA::ValueScope Scope = AA::Interprocedural) const = 0;
 };
 
+/// An abstract interface for identifying pointers from which loads can be
+/// marked invariant.
+struct AAInvariantLoadPointer : public AbstractAttribute {
+  AAInvariantLoadPointer(const IRPosition &IRP) : AbstractAttribute(IRP) {}
+
+  /// See AbstractAttribute::isValidIRPositionForInit
+  static bool isValidIRPositionForInit(Attributor &A, const IRPosition &IRP) {
+    if (!IRP.getAssociatedType()->isPointerTy())
+      return false;
+    return AbstractAttribute::isValidIRPositionForInit(A, IRP);
+  }
+
+  /// Create an abstract attribute view for the position \p IRP.
+  static AAInvariantLoadPointer &createForPosition(const IRPosition &IRP,
+                                                   Attributor &A);
+
+  /// Return true if the pointer's contents are known to remain invariant.
+  virtual bool isKnownInvariant() const = 0;
+
+  /// Return true if the pointer's contents are assumed to remain invariant.
+  virtual bool isAssumedInvariant() const = 0;
+
+  /// See AbstractAttribute::getName().
+  StringRef getName() const override { return "AAInvariantLoadPointer"; }
+
+  /// See AbstractAttribute::getIdAddr().
+  const char *getIdAddr() const override { return &ID; }
+
+  /// This function should return true if the type of the \p AA is
+  /// AAInvariantLoadPointer
+  static bool classof(const AbstractAttribute *AA) {
+    return (AA->getIdAddr() == &ID);
+  }
+
+  /// Unique ID (due to the unique address).
+  static const char ID;
+};
+
 /// An abstract interface for address space information.
 struct AAAddressSpace : public StateWrapper<BooleanState, AbstractAttribute> {
   AAAddressSpace(const IRPosition &IRP, Attributor &A)
diff --git a/llvm/lib/Transforms/IPO/Attributor.cpp b/llvm/lib/Transforms/IPO/Attributor.cpp
index cbdbf9ae1494d..1dc576656d12a 100644
--- a/llvm/lib/Transforms/IPO/Attributor.cpp
+++ b/llvm/lib/Transforms/IPO/Attributor.cpp
@@ -3620,6 +3620,8 @@ void Attributor::identifyDefaultAbstractAttributes(Function &F) {
       if (SimplifyAllLoads)
         getAssumedSimplified(IRPosition::value(I), nullptr,
                              UsedAssumedInformation, AA::Intraprocedural);
+      getOrCreateAAFor<AAInvariantLoadPointer>(
+          IRPosition::value(*LI->getPointerOperand()));
       getOrCreateAAFor<AAAddressSpace>(
           IRPosition::value(*LI->getPointerOperand()));
     } else {
diff --git a/llvm/lib/Transforms/IPO/AttributorAttributes.cpp b/llvm/lib/Transforms/IPO/AttributorAttributes.cpp
index 470c5308edca4..f0647747d6c7f 100644
--- a/llvm/lib/Transforms/IPO/AttributorAttributes.cpp
+++ b/llvm/lib/Transforms/IPO/AttributorAttributes.cpp
@@ -191,6 +191,7 @@ PIPE_OPERATOR(AAInterFnReachability)
 PIPE_OPERATOR(AAPointerInfo)
 PIPE_OPERATOR(AAAssumptionInfo)
 PIPE_OPERATOR(AAUnderlyingObjects)
+PIPE_OPERATOR(AAInvariantLoadPointer)
 PIPE_OPERATOR(AAAddressSpace)
 PIPE_OPERATOR(AAAllocationInfo)
 PIPE_OPERATOR(AAIndirectCallInfo)
@@ -12534,6 +12535,248 @@ struct AAIndirectCallInfoCallSite : public AAIndirectCallInfo {
 };
 } // namespace
 
+/// --------------------- Invariant Load Pointer -------------------------------
+namespace {
+
+struct AAInvariantLoadPointerImpl
+    : public StateWrapper<BitIntegerState<uint8_t, 7>, AAInvariantLoadPointer,
+                          uint8_t> {
+  // load invariance is implied by, but not equivalent to IS_NOALIAS |
+  // IS_READONLY, as load invariance is also implied by all underlying objects
+  // being load invariant.
+  //
+  // IS_INVARIANT is set to indicate that the contents of the pointer are
+  // *known* to be invariant.
+  enum {
+    IS_INVARIANT = 1 << 0,
+    IS_NOALIAS = 1 << 1,
+    IS_READONLY = 1 << 2,
+  };
+  static_assert(getBestState() == (IS_INVARIANT | IS_NOALIAS | IS_READONLY),
+                "Unexpected best state!");
+
+  using Base = StateWrapper<BitIntegerState<uint8_t, 7>, AAInvariantLoadPointer,
+                            uint8_t>;
+
+  // the BitIntegerState is optimistic about noalias and readonly, but
+  // pessimistic about invariance
+  AAInvariantLoadPointerImpl(const IRPosition &IRP, Attributor &A)
+      : Base(IRP, IS_NOALIAS | IS_READONLY) {}
+
+  void initialize(Attributor &A) final {
+    // conservatively assume that the pointer's contents are not invariant,
+    // until proven otherwise.
+    removeAssumedBits(IS_INVARIANT);
+  }
+
+  bool isKnownInvariant() const final {
+    return isKnown(IS_INVARIANT) || isKnown(IS_NOALIAS | IS_READONLY);
+  }
+
+  bool isAssumedInvariant() const final {
+    return isAssumed(IS_INVARIANT) || isAssumed(IS_NOALIAS | IS_READONLY);
+  }
+
+  ChangeStatus updateImpl(Attributor &A) override {
+    if (isKnownInvariant())
+      return ChangeStatus::UNCHANGED;
+
+    ChangeStatus Changed = ChangeStatus::UNCHANGED;
+
+    Changed |= updateNoAlias(A);
+    Changed |= updateReadOnly(A);
+
+    bool UsedAssumedInformation = false;
+    const auto IsInvariantLoadIfPointer = [&](const Value &V) {
+      if (!V.getType()->isPointerTy())
+        return true;
+      const auto *IsInvariantLoadPointer =
+          A.getOrCreateAAFor<AAInvariantLoadPointer>(IRPosition::value(V), this,
+                                                     DepClassTy::REQUIRED);
+      if (IsInvariantLoadPointer->isKnownInvariant())
+        return true;
+      if (!IsInvariantLoadPointer->isAssumedInvariant())
+        return false;
+
+      UsedAssumedInformation = true;
+      return true;
+    };
+
+    const auto *AUO = A.getOrCreateAAFor<AAUnderlyingObjects>(
+        getIRPosition(), this, DepClassTy::REQUIRED);
+
+    if (!AUO->forallUnderlyingObjects(IsInvariantLoadIfPointer)) {
+      removeAssumedBits(IS_INVARIANT);
+      return ChangeStatus::CHANGED;
+    }
+
+    if (!UsedAssumedInformation) {
+      // pointer is known (not assumed) to be invariant
+      addKnownBits(IS_INVARIANT);
+      return ChangeStatus::CHANGED;
+    }
+
+    return Changed;
+  }
+
+  ChangeStatus manifest(Attributor &A) override {
+    if (!isKnownInvariant())
+      return ChangeStatus::UNCHANGED;
+
+    ChangeStatus Changed = ChangeStatus::UNCHANGED;
+    Value *Ptr = &getAssociatedValue();
+    const auto TagInvariantLoads = [&](const Use &U, bool &) {
+      if (U.get() != Ptr)
+        return true;
+      auto *I = dyn_cast<Instruction>(U.getUser());
+      if (!I)
+        return true;
+
+      // Ensure that we are only changing uses from the corresponding callgraph
+      // SSC in the case that the AA isn't run on the entire module
+      if (!A.isRunOn(I->getFunction()))
+        return true;
+
+      if (I->hasMetadata(LLVMContext::MD_invariant_load))
+        return true;
+
+      if (auto *LI = dyn_cast<LoadInst>(I)) {
+        if (LI->isVolatile() || LI->isAtomic())
+          return true;
+
+        LI->setMetadata(LLVMContext::MD_invariant_load,
+                        MDNode::get(LI->getContext(), {}));
+        Changed = ChangeStatus::CHANGED;
+      }
+      return true;
+    };
+
+    (void)A.checkForAllUses(TagInvariantLoads, *this, *Ptr);
+    return Changed;
+  }
+
+  /// See AbstractAttribute::getAsStr().
+  const std::string getAsStr(Attributor *) const override {
+    std::string Str;
+    raw_string_ostream OS(Str);
+    OS << "load invariant pointer: " << isKnown() << '\n';
+    return Str;
+  }
+
+  /// See AbstractAttribute::trackStatistics().
+  void trackStatistics() const override {}
+
+protected:
+  ChangeStatus updateNoAlias(Attributor &A) {
+    if (isKnown(IS_NOALIAS) || !isAssumed(IS_NOALIAS))
+      return ChangeStatus::UNCHANGED;
+
+    const auto *ANoAlias = A.getOrCreateAAFor<AANoAlias>(getIRPosition(), this,
+                                                         DepClassTy::REQUIRED);
+    if (!ANoAlias)
+      return tryInferNoAlias(A);
+
+    if (!ANoAlias->isAssumedNoAlias()) {
+      removeAssumedBits(IS_NOALIAS);
+      return ChangeStatus::CHANGED;
+    }
+    if (ANoAlias->isKnownNoAlias())
+      addKnownBits(IS_NOALIAS);
+
+    return ChangeStatus::UNCHANGED;
+  }
+
+  /// Fallback method if updateNoAlias fails to infer noalias information from
+  /// AANoAlias.
+  virtual ChangeStatus tryInferNoAlias(Attributor &A) {
+    return ChangeStatus::UNCHANGED;
+  }
+
+  ChangeStatus updateReadOnly(Attributor &A) {
+    if (isKnown(IS_READONLY) || !isAssumed(IS_READONLY))
+      return ChangeStatus::UNCHANGED;
+
+    // AAMemoryBehavior may crash if value is global
+    if (!getAssociatedFunction())
+      return tryInferReadOnly(A);
+
+    const auto *AMemoryBehavior = A.getOrCreateAAFor<AAMemoryBehavior>(
+        getIRPosition(), this, DepClassTy::REQUIRED);
+    if (!AMemoryBehavior)
+      return tryInferReadOnly(A);
+
+    if (!AMemoryBehavior->isAssumedReadOnly()) {
+      removeAssumedBits(IS_READONLY);
+      return ChangeStatus::CHANGED;
+    }
+    if (AMemoryBehavior->isKnownReadOnly())
+      addKnownBits(IS_READONLY);
+
+    return ChangeStatus::UNCHANGED;
+  }
+
+  /// Fallback method if updateReadOnly fails to infer readonly information from
+  /// AAMemoryBehavior.
+  virtual ChangeStatus tryInferReadOnly(Attributor &A) {
+    return ChangeStatus::UNCHANGED;
+  }
+};
+
+struct AAInvariantLoadPointerFloating final : AAInvariantLoadPointerImpl {
+  AAInvariantLoadPointerFloating(const IRPosition &IRP, Attributor &A)
+      : AAInvariantLoadPointerImpl(IRP, A) {}
+};
+
+struct AAInvariantLoadPointerReturned final : AAInvariantLoadPointerImpl {
+  AAInvariantLoadPointerReturned(const IRPosition &IRP, Attributor &A)
+      : AAInvariantLoadPointerImpl(IRP, A) {}
+};
+
+struct AAInvariantLoadPointerCallSiteReturned final
+    : AAInvariantLoadPointerImpl {
+  AAInvariantLoadPointerCallSiteReturned(const IRPosition &IRP, Attributor &A)
+      : AAInvariantLoadPointerImpl(IRP, A) {}
+};
+
+struct AAInvariantLoadPointerArgument final : AAInvariantLoadPointerImpl {
+  AAInvariantLoadPointerArgument(const IRPosition &IRP, Attributor &A)
+      : AAInvariantLoadPointerImpl(IRP, A) {}
+
+protected:
+  ChangeStatus tryInferNoAlias(Attributor &A) override {
+    const auto *Arg = getAssociatedArgument();
+    if (Arg->hasNoAliasAttr()) {
+      addKnownBits(IS_NOALIAS);
+      return ChangeStatus::UNCHANGED;
+    }
+
+    // noalias information is not provided, and cannot be inferred from
+    // AANoAlias
+    removeAssumedBits(IS_NOALIAS);
+    return ChangeStatus::CHANGED;
+  }
+
+  ChangeStatus tryInferReadOnly(Attributor &A) override {
+    const auto *Arg = getAssociatedArgument();
+    if (Arg->onlyReadsMemory()) {
+      addKnownBits(IS_READONLY);
+      return ChangeStatus::UNCHANGED;
+    }
+
+    // readonly information is not provided, and cannot be inferred from
+    // AAMemoryBehavior
+    removeAssumedBits(IS_READONLY);
+    return ChangeStatus::CHANGED;
+  }
+};
+
+struct AAInvariantLoadPointerCallSiteArgument final
+    : AAInvariantLoadPointerImpl {
+  AAInvariantLoadPointerCallSiteArgument(const IRPosition &IRP, Attributor &A)
+      : AAInvariantLoadPointerImpl(IRP, A) {}
+};
+} // namespace
+
 /// ------------------------ Address Space  ------------------------------------
 namespace {
 
@@ -13031,6 +13274,7 @@ const char AAInterFnReachability::ID = 0;
 const char AAPointerInfo::ID = 0;
 const char AAAssumptionInfo::ID = 0;
 const char AAUnderlyingObjects::ID = 0;
+const char AAInvariantLoadPointer::ID = 0;
 const char AAAddressSpace::ID = 0;
 const char AAAllocationInfo::ID = 0;
 const char AAIndirectCallInfo::ID = 0;
@@ -13165,6 +13409,7 @@ CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAPotentialValues)
 CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoUndef)
 CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AANoFPClass)
 CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAPointerInfo)
+CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAInvariantLoadPointer)
 CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAAddressSpace)
 CREATE_VALUE_ABSTRACT_ATTRIBUTE_FOR_POSITION(AAAllocationInfo)
 
diff --git a/llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll b/llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
index f04ac4d73340f..9e58a35107491 100644
--- a/llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
+++ b/llvm/test/Transforms/Attributor/multiple-offsets-pointer-info.ll
@@ -10,7 +10,7 @@ define i8 @select_offsets_simplifiable_1(i1 %cnd1, i1 %cnd2) {
 ; CHECK-LABEL: define {{[^@]+}}@select_offsets_simplifiable_1
 ; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[BYTES:%.*]] = call ptr @calloc(i64 noundef 1024, i64 noundef 1)
+; CHECK-NEXT:    [[BYTES:%.*]] = call noalias ptr @calloc(i64 noundef 1024, i64 noundef 1)
 ; CHECK-NEXT:    [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], ptr [[BYTES]], i64 0, i64 23
 ; CHECK-NEXT:    store i8 23, ptr [[GEP23]], align 4
 ; CHECK-NEXT:    [[GEP29:%.*]] = getelementptr inbounds [1024 x i8], ptr [[BYTES]], i64 0, i64 29
@@ -190,7 +190,7 @@ define i8 @select_offsets_not_simplifiable_3(i1 %cnd1, i1 %cnd2) {
 ; CHECK-LABEL: define {{[^@]+}}@select_offsets_not_simplifiable_3
 ; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[BYTES:%.*]] = call ptr @calloc(i64 noundef 1024, i64 noundef 1)
+; CHECK-NEXT:    [[BYTES:%.*]] = call noalias ptr @calloc(i64 noundef 1024, i64 noundef 1)
 ; CHECK-NEXT:    [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
 ; CHECK-NEXT:    [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
 ; CHECK-NEXT:    [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], ptr [[BYTES]], i64 0, i64 [[SEL1]]
@@ -214,7 +214,7 @@ define i8 @select_offsets_not_simplifiable_4(i1 %cnd1, i1 %cnd2) {
 ; CHECK-LABEL: define {{[^@]+}}@select_offsets_not_simplifiable_4
 ; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[BYTES:%.*]] = call ptr @calloc(i64 noundef 1024, i64 noundef 1)
+; CHECK-NEXT:    [[BYTES:%.*]] = call noalias ptr @calloc(i64 noundef 1024, i64 noundef 1)
 ; CHECK-NEXT:    [[SEL0:%.*]] = select i1 [[CND1]], i64 23, i64 29
 ; CHECK-NEXT:    [[SEL1:%.*]] = select i1 [[CND2]], i64 [[SEL0]], i64 7
 ; CHECK-NEXT:    [[GEP_SEL:%.*]] = getelementptr inbounds [1024 x i8], ptr [[BYTES]], i64 0, i64 [[SEL1]]
@@ -445,7 +445,7 @@ define i8 @phi_gep_not_simplifiable_2(i1 %cnd1, i1 %cnd2) {
 ; CHECK-LABEL: define {{[^@]+}}@phi_gep_not_simplifiable_2
 ; CHECK-SAME: (i1 [[CND1:%.*]], i1 [[CND2:%.*]]) {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[BYTES:%.*]] = call ptr @calloc(i64 noundef 1024, i64 noundef 1)
+; CHECK-NEXT:    [[BYTES:%.*]] = call noalias ptr @calloc(i64 noundef 1024, i64 noundef 1)
 ; CHECK-NEXT:    [[GEP23:%.*]] = getelementptr inbounds [1024 x i8], ptr [[BYTES]], i64 0, i64 23
 ; CHECK-NEXT:    br i1 [[CND1]], label [[THEN:%.*]], label [[ELSE:%.*]]
 ; CHECK:       then:
diff --git a/llvm/test/Transforms/Attributor/tag-invariant-loads.ll b/llvm/test/Transforms/Attributor/tag-invariant-loads.ll
new file mode 100644
index 0000000000000..6df07a0d68bee
--- /dev/null
+++ b/llvm/test/Transforms/Attributor/tag-invariant-loads.ll
@@ -0,0 +1,220 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=attributor %s -S | FileCheck %s
+
+@G = global i32 zeroinitializer, align 4
+
+declare ptr @get_ptr()
+declare noalias ptr @get_noalias_ptr()
+
+define i32 @test_plain(ptr %ptr) {
+; CHECK-LABEL: define i32 @test_plain(
+; CHECK-SAME: ptr nofree noundef nonnull readonly align 4 captures(none) dereferenceable(4) [[PTR:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[PTR]], align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load i32, ptr %ptr, align 4
+  ret i32 %val
+}
+
+define i32 @test_noalias_ptr(ptr noalias %ptr) {
+; CHECK-LABEL: define i32 @test_noalias_ptr(
+; CHECK-SAME: ptr noalias nofree noundef nonnull readonly align 4 captures(none) dereferenceable(4) [[PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[PTR]], align 4, !invariant.load [[META0:![0-9]+]]
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load i32, ptr %ptr, align 4
+  ret i32 %val
+}
+
+define i32 @test_swap(ptr noalias %ptr, i32 %write) {
+; CHECK-LABEL: define i32 @test_swap(
+; CHECK-SAME: ptr noalias nofree noundef nonnull align 4 captures(none) dereferenceable(4) [[PTR:%.*]], i32 [[WRITE:%.*]]) #[[ATTR1:[0-9]+]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[PTR]], align 4
+; CHECK-NEXT:    store i32 [[WRITE]], ptr [[PTR]], align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load i32, ptr %ptr, align 4
+  store i32 %write, ptr %ptr, align 4
+  ret i32 %val
+}
+
+define i32 @test_volatile_load(ptr noalias %ptr) {
+; CHECK-LABEL: define i32 @test_volatile_load(
+; CHECK-SAME: ptr noalias nofree noundef align 4 [[PTR:%.*]]) #[[ATTR2:[0-9]+]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load volatile i32, ptr [[PTR]], align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load volatile i32, ptr %ptr, align 4
+  ret i32 %val
+}
+
+define i32 @test_atomic_load(ptr noalias %ptr) {
+; CHECK-LABEL: define i32 @test_atomic_load(
+; CHECK-SAME: ptr noalias nofree noundef nonnull readonly align 4 captures(none) dereferenceable(4) [[PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load atomic i32, ptr [[PTR]] unordered, align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load atomic i32, ptr %ptr unordered, align 4
+  ret i32 %val
+}
+
+define i32 @test_atomic_volatile_load(ptr noalias %ptr) {
+; CHECK-LABEL: define i32 @test_atomic_volatile_load(
+; CHECK-SAME: ptr noalias nofree noundef align 4 [[PTR:%.*]]) #[[ATTR2]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load atomic volatile i32, ptr [[PTR]] unordered, align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load atomic volatile i32, ptr %ptr unordered, align 4
+  ret i32 %val
+}
+
+define i32 @test_global() {
+; CHECK-LABEL: define i32 @test_global(
+; CHECK-SAME: ) #[[ATTR3:[0-9]+]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr @G, align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load i32, ptr @G, align 4
+  ret i32 %val
+}
+
+define internal i32 @test_internal_noalias_load(ptr %ptr) {
+; CHECK-LABEL: define internal i32 @test_internal_noalias_load(
+; CHECK-SAME: ptr noalias nofree noundef nonnull readonly align 4 captures(none) dereferenceable(4) [[PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[PTR]], align 4, !invariant.load [[META0]]
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load i32, ptr %ptr, align 4
+  ret i32 %val
+}
+
+define i32 @test_call_internal_noalias(ptr noalias %ptr) {
+; CHECK-LABEL: define i32 @test_call_internal_noalias(
+; CHECK-SAME: ptr noalias nofree readonly captures(none) [[PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    [[VAL:%.*]] = call i32 @test_internal_noalias_load(ptr noalias nofree noundef readonly align 4 captures(none) [[PTR]]) #[[ATTR4:[0-9]+]]
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = call i32 @test_internal_noalias_load(ptr %ptr)
+  ret i32 %val
+}
+
+define internal i32 @test_internal_load(ptr %ptr) {
+; CHECK-LABEL: define internal i32 @test_internal_load(
+; CHECK-SAME: ptr nofree noundef nonnull readonly align 4 captures(none) dereferenceable(4) [[PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[PTR]], align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = load i32, ptr %ptr, align 4
+  ret i32 %val
+}
+
+define i32 @test_call_internal(ptr %ptr) {
+; CHECK-LABEL: define i32 @test_call_internal(
+; CHECK-SAME: ptr nofree readonly captures(none) [[PTR:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT:    [[VAL:%.*]] = call i32 @test_internal_load(ptr nofree noundef readonly align 4 captures(none) [[PTR]]) #[[ATTR4]]
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %val = call i32 @test_internal_load(ptr %ptr)
+  ret i32 %val
+}
+
+define i32 @test_call_ptr() {
+; CHECK-LABEL: define i32 @test_call_ptr() {
+; CHECK-NEXT:    [[PTR:%.*]] = call ptr @get_ptr()
+; CHECK-NEXT:    [[VAL:%.*]] = load i32, ptr [[PTR]], align 4
+; CHECK-NEXT:    ret i32 [[VAL]]
+;
+  %ptr = call ptr @get_ptr()
+  %val = load i32, ptr %ptr, align 4
+  ret i32 %val
+}
+
+define i32 @test_call_noalias_ptr() ...
[truncated]

@shiltian shiltian requested review from arsenm and jdoerfert May 28, 2025 17:07
@zGoldthorpe
Copy link
Contributor Author

zGoldthorpe commented May 28, 2025

I have tried addressing most of the feedback.
However, I haven't done anything regarding atomics, yet (besides never marking atomic loads as !invariant.load).

Assuming !isCallableCC, are nonatomic loads to readonly noalias pointers potentially non-invariant once there is also an atomic load to the same pointer in the same function?

"Side effects" include volatile loads and atomic loads that are at least
monotonic.
@zGoldthorpe zGoldthorpe requested review from krzysz00 and shiltian June 2, 2025 16:57
Copy link
Contributor

@krzysz00 krzysz00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels reasonable, but I don't fully know the attributer framework well enough to review it properly.

Also, do we want to add some mechanism for declaring that ptr(4) and ptr(6) are all definitionally constant and use it here?

Finally, call for a test

define amdgpu_kernel void @test_noalias_buffer_desc(ptr addrspace(1) align 4 noalias %ptr) {
  %buffer.rsrc = call ptr addrspace(7) @llvm.amdgcn.make.buffer.rsrc.p7.p1(ptr addrspace(1) %ptr, i16 0, i32 256, i32 0)
  %val = load i32, ptr addrspace(7) %buffer.rsrc, align 4
  call void @clobber(i32 %val)
  ret void
}

@shiltian
Copy link
Contributor

shiltian commented Jun 3, 2025

Also, do we want to add some mechanism for declaring that ptr(4) and ptr(6) are all definitionally constant and use it here?

No, this file is target agnostic so all target related stuff can be here. If anything target related needed, it has to go through TTI hook.

@krzysz00
Copy link
Contributor

krzysz00 commented Jun 3, 2025

No, this file is target agnostic so all target related stuff can be here. If anything target related needed, it has to go through TTI hook.

Yeah, that's what I meant. TTI::isGlobalConstantsAddressSpace(unsigned) if such a thing doesn't already exist

@zGoldthorpe
Copy link
Contributor Author

zGoldthorpe commented Jun 3, 2025

Yeah, that's what I meant. TTI::isGlobalConstantsAddressSpace(unsigned) if such a thing doesn't already exist

I don't think such a thing already exists, but I feel like adding such support would make more sense as a separate PR.

@krzysz00
Copy link
Contributor

krzysz00 commented Jun 4, 2025

Yeah, separate PR, but worth doing as a followup

@shiltian
Copy link
Contributor

shiltian commented Jun 5, 2025

Yeah, that's what I meant. TTI::isGlobalConstantsAddressSpace(unsigned) if such a thing doesn't already exist

I don't think such a thing already exists, but I feel like adding such support would make more sense as a separate PR.

Ideally yes, but in practice maybe challenging. I've tried to add stuff about address spaces into some common files several times in the past but got a lot of push back. Almost none of those effort succeeded. The essential question is the definition. For this particular case, I'd imagine the challenge is how to define a "global constant" address space? What is its characteristic?

@zGoldthorpe
Copy link
Contributor Author

Ideally yes, but in practice maybe challenging. I've tried to add stuff about address spaces into some common files several times in the past but got a lot of push back. Almost none of those effort succeeded. The essential question is the definition. For this particular case, I'd imagine the challenge is how to define a "global constant" address space? What is its characteristic?

Maybe a constant addrspace should be analogous to (external) constant global variables? E.g., we may assume that the contents of the address space are never modified, and stores to it are UB.

@arsenm
Copy link
Contributor

arsenm commented Jun 5, 2025

Ideally yes, but in practice maybe challenging. I've tried to add stuff about address spaces into some common files several times in the past but got a lot of push back. Almost none of those effort succeeded. The essential question is the definition. For this particular case, I'd imagine the challenge is how to define a "global constant" address space? What is its characteristic?

Maybe a constant addrspace should be analogous to (external) constant global variables? E.g., we may assume that the contents of the address space are never modified, and stores to it are UB.

No, constant global variable is already a constant. The constant address space is wholly redundant. We should work to eliminate the notion of constant address space we have now. I wouldn't put any effort into trying to infer constantness out of the address space

@krzysz00
Copy link
Contributor

krzysz00 commented Jun 5, 2025

If we want to get rid of the constant address space idiom, that's fine by me - we'll just want to get the frontends to tag their loads from constants as invariant

@arsenm
Copy link
Contributor

arsenm commented Jun 5, 2025

If we want to get rid of the constant address space idiom, that's fine by me - we'll just want to get the frontends to tag their loads from constants as invariant

The main problem is invariant load doesn't do anything on memcpy or other memory intrinsics

@zGoldthorpe
Copy link
Contributor Author

@jdoerfert pinging to request review

@krzysz00
Copy link
Contributor

krzysz00 commented Jun 5, 2025

The main problem is invariant load doesn't do anything on memcpy or other memory intrinsics

That feels ... fixable? (Especially since MachineMemOperand has an invariant field)

But that's probably a low-priority eventually sort of thing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants