Skip to content

Conversation

Renaud-K
Copy link
Contributor

@Renaud-K Renaud-K commented Feb 6, 2025

Typically, we do not track memory sources after a load because of the dynamic nature of the load and the fact that the alias analysis is a simple static analysis.

However, the code is written in a way that makes it seem like we are continuing to track memory but in reality we are only doing so when we know that the tracked memory is a leaf and therefore when there will only be one more iteration through the switch statement. In other words, we are iterating one more time, to gather data about a box, anticipating that this will be the last time. This is a hack that helped avoid cut-and-paste from other case statements but gives the wrong impression about the intention of the code and makes it confusing.

To make it clear that there is no more tracking, we gather all the necessary data from the memref of the load, in the case statement for the load, and exit the loop. I am also limiting this data gathering for the case when we load a box reference while we were actually following data, as tests have shows, is the only case when we need it for. Other cases will be handled conservatively, but this can change in the future, on a case-by-case basis.

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir labels Feb 6, 2025
@llvmbot
Copy link
Member

llvmbot commented Feb 6, 2025

@llvm/pr-subscribers-flang-fir-hlfir

Author: Renaud Kauffmann (Renaud-K)

Changes

Typically, we do not track memory sources after a load because of the dynamic nature of the load and the fact that the alias analysis is a simple static analysis.

However the code is written in a way that makes it seem like we are continuing to track memory but in reality we are only doing so when we know that the tracked memory is a leaf and therefore that there will only be one more iteration through the switch statement. In other words, we are iterating one more time, to gather data about a box, anticipating that this will be the last time. This is a hack that helped avoid cut-and-paste from other case statements but gives the wrong impression about the intention of the code and makes it confusing.

To make it clear that there is no more tracking, we gather all the necessary data from the memref of the load, in the case statement for the load, and exit the loop. I am also limiting this data gathering for the case when we load a box reference while we were actually following data, as tests have shows, is the only case when we need it for. Other cases will be handled conservatively, but this can change in the future, on a case-by-case basis.


Full diff: https://github.com/llvm/llvm-project/pull/126156.diff

2 Files Affected:

  • (modified) flang/lib/Optimizer/Analysis/AliasAnalysis.cpp (+37-11)
  • (modified) flang/test/Analysis/AliasAnalysis/alias-analysis-2.fir (+4-2)
diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
index 01f3a0326db216e..6f65ca18c3074bf 100644
--- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
+++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
@@ -51,7 +51,7 @@ static bool hasGlobalOpTargetAttr(mlir::Value v, fir::AddrOfOp op) {
       v, fir::GlobalOp::getTargetAttrName(globalOpName));
 }
 
-mlir::Value getOriginalDef(mlir::Value v) {
+static mlir::Value getOriginalDef(mlir::Value v) {
   mlir::Operation *defOp;
   bool breakFromLoop = false;
   while (!breakFromLoop && (defOp = v.getDefiningOp())) {
@@ -578,16 +578,7 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
             breakFromLoop = true;
         })
         .Case<fir::LoadOp>([&](auto op) {
-          // If the load is from a leaf source, return the leaf. Do not track
-          // through indirections otherwise.
-          // TODO: Add support to fir.alloca and fir.allocmem
-          auto def = getOriginalDef(op.getMemref());
-          if (isDummyArgument(def) ||
-              def.template getDefiningOp<fir::AddrOfOp>()) {
-            v = def;
-            defOp = v.getDefiningOp();
-            return;
-          }
+          
           // If load is inside target and it points to mapped item,
           // continue tracking.
           Operation *loadMemrefOp = op.getMemref().getDefiningOp();
@@ -600,6 +591,41 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
             defOp = v.getDefiningOp();
             return;
           }
+
+          // If we are loading a box reference, but following the data,
+          // we gather the attributes of the box to populate the source
+          // and stop tracking.
+          if (auto boxTy = mlir::dyn_cast<fir::BaseBoxType>(ty);
+              boxTy && followingData) {
+
+            if (mlir::isa<fir::PointerType>(boxTy.getEleTy())) {
+              attributes.set(Attribute::Pointer);
+            }
+
+            auto def = getOriginalDef(op.getMemref());
+            if (auto addrOfOp = def.template getDefiningOp<fir::AddrOfOp>()) {
+              global = addrOfOp.getSymbol();
+
+              if (hasGlobalOpTargetAttr(def, addrOfOp))
+                attributes.set(Attribute::Target);
+
+              type = SourceKind::Global;
+            }
+
+            // TODO: Add support to fir.alloca and fir.allocmem
+            // if (auto allocOp = def.template getDefiningOp<fir::AllocaOp>()) {
+            //   ...
+            // }
+
+            if (isDummyArgument(def)) {
+              defOp = nullptr;
+              v = def;
+            }
+            
+            breakFromLoop = true;
+            return;
+          }
+
           // No further tracking for addresses loaded from memory for now.
           type = SourceKind::Indirect;
           breakFromLoop = true;
diff --git a/flang/test/Analysis/AliasAnalysis/alias-analysis-2.fir b/flang/test/Analysis/AliasAnalysis/alias-analysis-2.fir
index ca97c5900281d64..3fbf29ab2eb2926 100644
--- a/flang/test/Analysis/AliasAnalysis/alias-analysis-2.fir
+++ b/flang/test/Analysis/AliasAnalysis/alias-analysis-2.fir
@@ -49,11 +49,13 @@
 
 // TODO: Can the address in a pointer alias the address of a pointer, even when the
 // pointer has no box. Should this be NoAlias?
-// T3: CHECK-DAG: p1.addr#0 <-> p1.tgt#0: MayAlias
+// T3: 
+// CHECK-DAG: p1.addr#0 <-> p1.tgt#0: MayAlias
 
 // The addresses stored in two different pointers can alias, even if one has no
 // box.  In this program, they happen to be the same address.
-// T4: CHECK-DAG: p1.tgt#0 <-> boxp1.addr#0: MayAlias
+// T4: 
+// CHECK-DAG: p1.tgt#0 <-> boxp1.addr#0: MayAlias
 
 func.func @_QFPtest(%arg0: !fir.ref<f32> {fir.bindc_name = "v1", fir.target}, %arg1: !fir.ref<f32> {fir.bindc_name = "v2", fir.target}, %arg2: !fir.ref<!fir.box<!fir.ptr<f32>>> ) attributes {test.ptr = "func"} {
 

Copy link

github-actions bot commented Feb 6, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Renaud, this makes sense to me. I wonder however if you could not use getSource recursively and decide what attributes must be propagated to the address being analyzed from the address that contains it.

That way, you do not have to implement the walk/attribute gathering again, but only need to decide the "attribute propagation" logic.

attributes.set(Attribute::Target);

type = SourceKind::Global;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't here other cases where the Target attribute should be set (local/dummy allocatables with TARGET attribute for instance)?

Why not calling getSource on op.getMemref() and propagating the TARGET/POINTER attribute/whatever is relevant to propagate from the variable containing the address to the address?

Copy link
Contributor Author

@Renaud-K Renaud-K Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plan was to be doing this in getOriginalDef as in #117785 which has the tests that expose the need for it.
I could do it here, if you have concerns that we might be losing something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though, I have added a test that shows that we are reading the target attribute properly. It's only when we add support for local boxes that the attribute needs to be extracted from fir.declare. This is coming.

llvm::isa<omp::TargetOp>(loadMemrefOp->getParentOp())) {
v = op.getMemref();
defOp = v.getDefiningOp();
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea why the OpenMP case should not follow the same logic and continue the walk?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why I added @DominikAdamski to the review. To show that we want to set a better precedent for loads and see if maybe something similar can be done for Omp. I did not quite follow the need for it.

@Renaud-K
Copy link
Contributor Author

Renaud-K commented Feb 7, 2025

Using getSource recursively, is a bit what got us into trouble because it is stretching the original goal of the function which is designed to get the source. To gather information about the memref we have getOriginalDef which is much more limited in scope. In #117785 getOriginalDef will also be gathering more information.

Comment on lines +597 to +598
if (auto boxTy = mlir::dyn_cast<fir::BaseBoxType>(ty);
boxTy && followingData) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is a goal of this new condition to ensure cases like fir.alloca !fir.ptr<T> and !fir.ref<!fir.ptr<T>> will never be handled by this code? Thus, T3 and T4 will stay MayAlias, even after we extend this for alloca? If so, that works for my current use cases.

For now (before we extend for alloca), is this PR intended to be NFC?

Copy link
Contributor Author

@Renaud-K Renaud-K Feb 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does do that but it is not really the intention. The functionality should remain vastly the same, so maybe NFCI. Before and after, the load of a fir.alloca will be a SourceKind::Indirect. If we are to extend data vs non-data, we will, it is just code. What is emerging from this change and the fact that everything still works with it, is a new idea for me: it could make sense to have a SourceKind::Emboxed memory source in the future. But this is a topic for a different discussion.

@Renaud-K Renaud-K changed the title [flang] Stop tracking memory source after a load in a more explicit manner. [flang][NFCI] Stop tracking memory source after a load in a more explicit manner. Feb 7, 2025
Renaud-K and others added 2 commits February 7, 2025 11:21
Point to where T3 is discussed.

Co-authored-by: Joel E. Denny <jdenny.ornl@gmail.com>
Slight comment change.

Co-authored-by: Joel E. Denny <jdenny.ornl@gmail.com>
@Renaud-K Renaud-K merged commit 9d7177a into llvm:main Feb 11, 2025
8 checks passed
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…icit manner. (llvm#126156)

Typically, we do not track memory sources after a load because of the
dynamic nature of the load and the fact that the alias analysis is a
simple static analysis.

However, the code is written in a way that makes it seem like we are
continuing to track memory but in reality we are only doing so when we
know that the tracked memory is a leaf and therefore when there will
only be one more iteration through the switch statement. In other words,
we are iterating one more time, to gather data about a box, anticipating
that this will be the last time. This is a hack that helped avoid
cut-and-paste from other case statements but gives the wrong impression
about the intention of the code and makes it confusing.

To make it clear that there is no more tracking, we gather all the
necessary data from the memref of the load, in the case statement for
the load, and exit the loop. I am also limiting this data gathering for
the case when we load a box reference while we were actually following
data, as tests have shows, is the only case when we need it for. Other
cases will be handled conservatively, but this can change in the future,
on a case-by-case basis.

---------

Co-authored-by: Joel E. Denny <jdenny.ornl@gmail.com>
}

breakFromLoop = true;
return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed that we now return here without setting type = SourceKind::Indirect in some cases when we don't have a dummy arg or AddrOfOp. As a result, some cases that used to be Indirect are now Unknown. That seems fine for AliasAnalysis::alias, whose logic treats them equivalently. Is this change intentional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it was not. Though, offline @jeanPerier has been keen on having getSource replace getOriginalDef. Maybe we can fix it then. #126156 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I am afraid we will end-up duplicating a lot of logic to walk and gather information, and that people will update getSource without getOriginalDef.

I think it would be more powerfull/generic to call getSource on the memref indirection and to propagate the info from the memref source info to the address being analyzed.

For instance, if the memref is a POINTER, technically the loaded address is not a pointer, it is a TARGET. If we cared about OPTIONAL, this aspect would not propagate from the memref to the loaded address, but things like VOLATILE and ASYNCHRONOUS probably would. TARGET propagates (an allocatable component of a deriver type that is TARGET can be taken by a pointer).... INTENT propagates. I believe Fortran committee wants to introduce a way to specify data vs descriptor "constness" (since INTENT always apply to both), and whatever attribute they come up with would not propagate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say, let's proceed with #117785 which is now approved. Then revisit this with getSource.

flovent pushed a commit to flovent/llvm-project that referenced this pull request Feb 13, 2025
…icit manner. (llvm#126156)

Typically, we do not track memory sources after a load because of the
dynamic nature of the load and the fact that the alias analysis is a
simple static analysis.

However, the code is written in a way that makes it seem like we are
continuing to track memory but in reality we are only doing so when we
know that the tracked memory is a leaf and therefore when there will
only be one more iteration through the switch statement. In other words,
we are iterating one more time, to gather data about a box, anticipating
that this will be the last time. This is a hack that helped avoid
cut-and-paste from other case statements but gives the wrong impression
about the intention of the code and makes it confusing.

To make it clear that there is no more tracking, we gather all the
necessary data from the memref of the load, in the case statement for
the load, and exit the loop. I am also limiting this data gathering for
the case when we load a box reference while we were actually following
data, as tests have shows, is the only case when we need it for. Other
cases will be handled conservatively, but this can change in the future,
on a case-by-case basis.

---------

Co-authored-by: Joel E. Denny <jdenny.ornl@gmail.com>
joaosaffran pushed a commit to joaosaffran/llvm-project that referenced this pull request Feb 14, 2025
…icit manner. (llvm#126156)

Typically, we do not track memory sources after a load because of the
dynamic nature of the load and the fact that the alias analysis is a
simple static analysis.

However, the code is written in a way that makes it seem like we are
continuing to track memory but in reality we are only doing so when we
know that the tracked memory is a leaf and therefore when there will
only be one more iteration through the switch statement. In other words,
we are iterating one more time, to gather data about a box, anticipating
that this will be the last time. This is a hack that helped avoid
cut-and-paste from other case statements but gives the wrong impression
about the intention of the code and makes it confusing.

To make it clear that there is no more tracking, we gather all the
necessary data from the memref of the load, in the case statement for
the load, and exit the loop. I am also limiting this data gathering for
the case when we load a box reference while we were actually following
data, as tests have shows, is the only case when we need it for. Other
cases will be handled conservatively, but this can change in the future,
on a case-by-case basis.

---------

Co-authored-by: Joel E. Denny <jdenny.ornl@gmail.com>
jdenny-ornl added a commit to jdenny-ornl/llvm-project that referenced this pull request Feb 19, 2025
As mentioned at
<llvm#126156 (comment)>:
PR llvm#126156 causes AliasAnalysis::getSource to sometimes return
SourceKind::Unknown when it used to return SourceKind::Indirect for a
LoadOp.  This patch restores that part of the old behavior.  It should
not affect user-visible behavior because AliasAnalysis::alias treats
SourceKind::Unknown and SourceKind::Indirect equivalently, but it does
improve debugging output.
jdenny-ornl added a commit that referenced this pull request Feb 21, 2025
)

As mentioned at
<#126156 (comment)>:
PR #126156 causes AliasAnalysis::getSource to sometimes return
SourceKind::Unknown when it used to return SourceKind::Indirect for a
LoadOp. This patch restores that part of the old behavior. It should not
affect user-visible behavior because AliasAnalysis::alias treats
SourceKind::Unknown and SourceKind::Indirect equivalently, but it does
improve debugging output.
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Feb 21, 2025
…oadOp (#127845)

As mentioned at
<llvm/llvm-project#126156 (comment)>:
PR #126156 causes AliasAnalysis::getSource to sometimes return
SourceKind::Unknown when it used to return SourceKind::Indirect for a
LoadOp. This patch restores that part of the old behavior. It should not
affect user-visible behavior because AliasAnalysis::alias treats
SourceKind::Unknown and SourceKind::Indirect equivalently, but it does
improve debugging output.
sivan-shani pushed a commit to sivan-shani/llvm-project that referenced this pull request Feb 24, 2025
…icit manner. (llvm#126156)

Typically, we do not track memory sources after a load because of the
dynamic nature of the load and the fact that the alias analysis is a
simple static analysis.

However, the code is written in a way that makes it seem like we are
continuing to track memory but in reality we are only doing so when we
know that the tracked memory is a leaf and therefore when there will
only be one more iteration through the switch statement. In other words,
we are iterating one more time, to gather data about a box, anticipating
that this will be the last time. This is a hack that helped avoid
cut-and-paste from other case statements but gives the wrong impression
about the intention of the code and makes it confusing.

To make it clear that there is no more tracking, we gather all the
necessary data from the memref of the load, in the case statement for
the load, and exit the loop. I am also limiting this data gathering for
the case when we load a box reference while we were actually following
data, as tests have shows, is the only case when we need it for. Other
cases will be handled conservatively, but this can change in the future,
on a case-by-case basis.

---------

Co-authored-by: Joel E. Denny <jdenny.ornl@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants