Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mlir] [dataflow] Refactoring the definition of program points in dat… #105656

Merged
merged 3 commits into from
Aug 25, 2024

Conversation

cxy-1993
Copy link
Contributor

@cxy-1993 cxy-1993 commented Aug 22, 2024

…a flow analysis

This patch distinguishes between program points and lattice anchors in data flow analysis, where lattice anchors represent locations where a lattice can be attached, while program points denote points in program execution.

Related discussions: https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8

…a flow analysis

This path distinguishes between program points and lattice anchors in data flow
analysis, where lattice anchors represent locations where a lattice can be
attached, while program points denote points in program execution.

Related discussions: https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
@llvmbot
Copy link
Collaborator

llvmbot commented Aug 22, 2024

@llvm/pr-subscribers-mlir

Author: donald chen (cxy-1993)

Changes

…a flow analysis

This path distinguishes between program points and lattice anchors in data flow analysis, where lattice anchors represent locations where a lattice can be attached, while program points denote points in program execution.

Related discussions: https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8


Patch is 45.52 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/105656.diff

13 Files Affected:

  • (modified) mlir/include/mlir/Analysis/DataFlow/DeadCodeAnalysis.h (+8-8)
  • (modified) mlir/include/mlir/Analysis/DataFlow/DenseAnalysis.h (+19-18)
  • (modified) mlir/include/mlir/Analysis/DataFlow/IntegerRangeAnalysis.h (+1-1)
  • (modified) mlir/include/mlir/Analysis/DataFlow/SparseAnalysis.h (+4-4)
  • (modified) mlir/include/mlir/Analysis/DataFlowFramework.h (+141-107)
  • (modified) mlir/lib/Analysis/DataFlow/DeadCodeAnalysis.cpp (+16-14)
  • (modified) mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp (+14-18)
  • (modified) mlir/lib/Analysis/DataFlow/IntegerRangeAnalysis.cpp (+1-1)
  • (modified) mlir/lib/Analysis/DataFlow/SparseAnalysis.cpp (+13-19)
  • (modified) mlir/lib/Analysis/DataFlowFramework.cpp (+29-16)
  • (modified) mlir/test/lib/Analysis/DataFlow/TestDeadCodeAnalysis.cpp (+1-1)
  • (modified) mlir/test/lib/Analysis/DataFlow/TestDenseDataFlowAnalysis.h (+1-1)
  • (modified) mlir/test/lib/Analysis/TestDataFlowFramework.cpp (+2-5)
diff --git a/mlir/include/mlir/Analysis/DataFlow/DeadCodeAnalysis.h b/mlir/include/mlir/Analysis/DataFlow/DeadCodeAnalysis.h
index 10ef8b6ba5843a..80c8b86c63678a 100644
--- a/mlir/include/mlir/Analysis/DataFlow/DeadCodeAnalysis.h
+++ b/mlir/include/mlir/Analysis/DataFlow/DeadCodeAnalysis.h
@@ -35,21 +35,21 @@ namespace dataflow {
 //===----------------------------------------------------------------------===//
 
 /// This is a simple analysis state that represents whether the associated
-/// program point (either a block or a control-flow edge) is live.
+/// lattice anchor (either a block or a control-flow edge) is live.
 class Executable : public AnalysisState {
 public:
   using AnalysisState::AnalysisState;
 
-  /// Set the state of the program point to live.
+  /// Set the state of the lattice anchor to live.
   ChangeResult setToLive();
 
-  /// Get whether the program point is live.
+  /// Get whether the lattice anchor is live.
   bool isLive() const { return live; }
 
   /// Print the liveness.
   void print(raw_ostream &os) const override;
 
-  /// When the state of the program point is changed to live, re-invoke
+  /// When the state of the lattice anchor is changed to live, re-invoke
   /// subscribed analyses on the operations in the block and on the block
   /// itself.
   void onUpdate(DataFlowSolver *solver) const override;
@@ -60,8 +60,8 @@ class Executable : public AnalysisState {
   }
 
 private:
-  /// Whether the program point is live. Optimistically assume that the program
-  /// point is dead.
+  /// Whether the lattice anchor is live. Optimistically assume that the lattice
+  /// anchor is dead.
   bool live = false;
 
   /// A set of analyses that should be updated when this state changes.
@@ -140,10 +140,10 @@ class PredecessorState : public AnalysisState {
 // CFGEdge
 //===----------------------------------------------------------------------===//
 
-/// This program point represents a control-flow edge between a block and one
+/// This lattice anchor represents a control-flow edge between a block and one
 /// of its successors.
 class CFGEdge
-    : public GenericProgramPointBase<CFGEdge, std::pair<Block *, Block *>> {
+    : public GenericLatticeAnchorBase<CFGEdge, std::pair<Block *, Block *>> {
 public:
   using Base::Base;
 
diff --git a/mlir/include/mlir/Analysis/DataFlow/DenseAnalysis.h b/mlir/include/mlir/Analysis/DataFlow/DenseAnalysis.h
index 4ad5f3fcd838c0..7917f1e3ba6485 100644
--- a/mlir/include/mlir/Analysis/DataFlow/DenseAnalysis.h
+++ b/mlir/include/mlir/Analysis/DataFlow/DenseAnalysis.h
@@ -91,15 +91,16 @@ class AbstractDenseForwardDataFlowAnalysis : public DataFlowAnalysis {
                                            const AbstractDenseLattice &before,
                                            AbstractDenseLattice *after) = 0;
 
-  /// Get the dense lattice after the execution of the given program point.
-  virtual AbstractDenseLattice *getLattice(ProgramPoint point) = 0;
+  /// Get the dense lattice after the execution of the given lattice anchor.
+  virtual AbstractDenseLattice *getLattice(LatticeAnchor anchor) = 0;
 
   /// Get the dense lattice after the execution of the given program point and
-  /// add it as a dependency to a program point. That is, every time the lattice
-  /// after point is updated, the dependent program point must be visited, and
-  /// the newly triggered visit might update the lattice after dependent.
+  /// add it as a dependency to a lattice anchor. That is, every time the
+  /// lattice after anchor is updated, the dependent program point must be
+  /// visited, and the newly triggered visit might update the lattice after
+  /// dependent.
   const AbstractDenseLattice *getLatticeFor(ProgramPoint dependent,
-                                            ProgramPoint point);
+                                            LatticeAnchor anchor);
 
   /// Set the dense lattice at control flow entry point and propagate an update
   /// if it changed.
@@ -249,9 +250,9 @@ class DenseForwardDataFlowAnalysis
   }
 
 protected:
-  /// Get the dense lattice after this program point.
-  LatticeT *getLattice(ProgramPoint point) override {
-    return getOrCreate<LatticeT>(point);
+  /// Get the dense lattice on this lattice anchor.
+  LatticeT *getLattice(LatticeAnchor anchor) override {
+    return getOrCreate<LatticeT>(anchor);
   }
 
   /// Set the dense lattice at control flow entry point and propagate an update
@@ -331,16 +332,16 @@ class AbstractDenseBackwardDataFlowAnalysis : public DataFlowAnalysis {
                                            const AbstractDenseLattice &after,
                                            AbstractDenseLattice *before) = 0;
 
-  /// Get the dense lattice before the execution of the program point. That is,
+  /// Get the dense lattice before the execution of the lattice anchor. That is,
   /// before the execution of the given operation or after the execution of the
   /// block.
-  virtual AbstractDenseLattice *getLattice(ProgramPoint point) = 0;
+  virtual AbstractDenseLattice *getLattice(LatticeAnchor anchor) = 0;
 
-  /// Get the dense lattice before the execution of the program point `point`
-  /// and declare that the `dependent` program point must be updated every time
-  /// `point` is.
+  /// Get the dense lattice before the execution of the program point in
+  /// `anchor` and declare that the `dependent` program point must be updated
+  /// every time `point` is.
   const AbstractDenseLattice *getLatticeFor(ProgramPoint dependent,
-                                            ProgramPoint point);
+                                            LatticeAnchor anchor);
 
   /// Set the dense lattice before at the control flow exit point and propagate
   /// the update if it changed.
@@ -500,9 +501,9 @@ class DenseBackwardDataFlowAnalysis
   }
 
 protected:
-  /// Get the dense lattice at the given program point.
-  LatticeT *getLattice(ProgramPoint point) override {
-    return getOrCreate<LatticeT>(point);
+  /// Get the dense lattice at the given lattice anchor.
+  LatticeT *getLattice(LatticeAnchor anchor) override {
+    return getOrCreate<LatticeT>(anchor);
   }
 
   /// Set the dense lattice at control flow exit point (after the terminator)
diff --git a/mlir/include/mlir/Analysis/DataFlow/IntegerRangeAnalysis.h b/mlir/include/mlir/Analysis/DataFlow/IntegerRangeAnalysis.h
index d4a5472cfde868..f99eae379596b6 100644
--- a/mlir/include/mlir/Analysis/DataFlow/IntegerRangeAnalysis.h
+++ b/mlir/include/mlir/Analysis/DataFlow/IntegerRangeAnalysis.h
@@ -50,7 +50,7 @@ class IntegerRangeAnalysis
   /// At an entry point, we cannot reason about interger value ranges.
   void setToEntryState(IntegerValueRangeLattice *lattice) override {
     propagateIfChanged(lattice, lattice->join(IntegerValueRange::getMaxRange(
-                                    lattice->getPoint())));
+                                    lattice->getAnchor())));
   }
 
   /// Visit an operation. Invoke the transfer function on each operation that
diff --git a/mlir/include/mlir/Analysis/DataFlow/SparseAnalysis.h b/mlir/include/mlir/Analysis/DataFlow/SparseAnalysis.h
index 89726ae3a855c8..933790b4f2a6eb 100644
--- a/mlir/include/mlir/Analysis/DataFlow/SparseAnalysis.h
+++ b/mlir/include/mlir/Analysis/DataFlow/SparseAnalysis.h
@@ -36,8 +36,8 @@ class AbstractSparseLattice : public AnalysisState {
   /// Lattices can only be created for values.
   AbstractSparseLattice(Value value) : AnalysisState(value) {}
 
-  /// Return the program point this lattice is located at.
-  Value getPoint() const { return AnalysisState::getPoint().get<Value>(); }
+  /// Return the value this lattice is located at.
+  Value getAnchor() const { return AnalysisState::getAnchor().get<Value>(); }
 
   /// Join the information contained in 'rhs' into this lattice. Returns
   /// if the value of the lattice changed.
@@ -86,8 +86,8 @@ class Lattice : public AbstractSparseLattice {
 public:
   using AbstractSparseLattice::AbstractSparseLattice;
 
-  /// Return the program point this lattice is located at.
-  Value getPoint() const { return point.get<Value>(); }
+  /// Return the value this lattice is located at.
+  Value getAnchor() const { return anchor.get<Value>(); }
 
   /// Return the value held by this lattice. This requires that the value is
   /// initialized.
diff --git a/mlir/include/mlir/Analysis/DataFlowFramework.h b/mlir/include/mlir/Analysis/DataFlowFramework.h
index 2580ec28b51902..472c5cd5406946 100644
--- a/mlir/include/mlir/Analysis/DataFlowFramework.h
+++ b/mlir/include/mlir/Analysis/DataFlowFramework.h
@@ -31,9 +31,9 @@ namespace mlir {
 
 /// A result type used to indicate if a change happened. Boolean operations on
 /// ChangeResult behave as though `Change` is truth.
-enum class [[nodiscard]] ChangeResult {
-  NoChange,
-  Change,
+enum class [[nodiscard]] ChangeResult{
+    NoChange,
+    Change,
 };
 inline ChangeResult operator|(ChangeResult lhs, ChangeResult rhs) {
   return lhs == ChangeResult::Change ? lhs : rhs;
@@ -49,79 +49,93 @@ inline ChangeResult operator&(ChangeResult lhs, ChangeResult rhs) {
 /// Forward declare the analysis state class.
 class AnalysisState;
 
+/// Program point represents a specific location in the execution of a program.
+/// A sequence of program points can be combined into a control flow graph.
+struct ProgramPoint : public PointerUnion<Operation *, Block *> {
+  using ParentTy = PointerUnion<Operation *, Block *>;
+  /// Inherit constructors.
+  using ParentTy::PointerUnion;
+  /// Allow implicit conversion from the parent type.
+  ProgramPoint(ParentTy point = nullptr) : ParentTy(point) {}
+  /// Allow implicit conversions from operation wrappers.
+  /// TODO: For Windows only. Find a better solution.
+  template <typename OpT, typename = std::enable_if_t<
+                              std::is_convertible<OpT, Operation *>::value &&
+                              !std::is_same<OpT, Operation *>::value>>
+  ProgramPoint(OpT op) : ParentTy(op) {}
+
+  /// Print the program point.
+  void print(raw_ostream &os) const;
+};
+
 //===----------------------------------------------------------------------===//
-// GenericProgramPoint
+// GenericLatticeAnchor
 //===----------------------------------------------------------------------===//
 
-/// Abstract class for generic program points. In classical data-flow analysis,
-/// programs points represent positions in a program to which lattice elements
+/// Abstract class for generic lattice anchor. In classical data-flow analysis,
+/// lattice anchor represent positions in a program to which lattice elements
 /// are attached. In sparse data-flow analysis, these can be SSA values, and in
 /// dense data-flow analysis, these are the program points before and after
 /// every operation.
 ///
-/// In the general MLIR data-flow analysis framework, program points are an
-/// extensible concept. Program points are uniquely identifiable objects to
-/// which analysis states can be attached. The semantics of program points are
-/// defined by the analyses that specify their transfer functions.
-///
-/// Program points are implemented using MLIR's storage uniquer framework and
+/// Lattice anchor are implemented using MLIR's storage uniquer framework and
 /// type ID system to provide RTTI.
-class GenericProgramPoint : public StorageUniquer::BaseStorage {
+class GenericLatticeAnchor : public StorageUniquer::BaseStorage {
 public:
-  virtual ~GenericProgramPoint();
+  virtual ~GenericLatticeAnchor();
 
-  /// Get the abstract program point's type identifier.
+  /// Get the abstract lattice anchor's type identifier.
   TypeID getTypeID() const { return typeID; }
 
-  /// Get a derived source location for the program point.
+  /// Get a derived source location for the lattice anchor.
   virtual Location getLoc() const = 0;
 
-  /// Print the program point.
+  /// Print the lattice anchor.
   virtual void print(raw_ostream &os) const = 0;
 
 protected:
-  /// Create an abstract program point with type identifier.
-  explicit GenericProgramPoint(TypeID typeID) : typeID(typeID) {}
+  /// Create an abstract lattice anchor with type identifier.
+  explicit GenericLatticeAnchor(TypeID typeID) : typeID(typeID) {}
 
 private:
-  /// The type identifier of the program point.
+  /// The type identifier of the lattice anchor.
   TypeID typeID;
 };
 
 //===----------------------------------------------------------------------===//
-// GenericProgramPointBase
+// GenericLatticeAnchorBase
 //===----------------------------------------------------------------------===//
 
-/// Base class for generic program points based on a concrete program point
+/// Base class for generic lattice anchor based on a concrete lattice anchor
 /// type and a content key. This class defines the common methods required for
 /// operability with the storage uniquer framework.
 ///
-/// The provided key type uniquely identifies the concrete program point
+/// The provided key type uniquely identifies the concrete lattice anchor
 /// instance and are the data members of the class.
 template <typename ConcreteT, typename Value>
-class GenericProgramPointBase : public GenericProgramPoint {
+class GenericLatticeAnchorBase : public GenericLatticeAnchor {
 public:
   /// The concrete key type used by the storage uniquer. This class is uniqued
   /// by its contents.
   using KeyTy = Value;
   /// Alias for the base class.
-  using Base = GenericProgramPointBase<ConcreteT, Value>;
+  using Base = GenericLatticeAnchorBase<ConcreteT, Value>;
 
-  /// Construct an instance of the program point using the provided value and
+  /// Construct an instance of the lattice anchor using the provided value and
   /// the type ID of the concrete type.
   template <typename ValueT>
-  explicit GenericProgramPointBase(ValueT &&value)
-      : GenericProgramPoint(TypeID::get<ConcreteT>()),
+  explicit GenericLatticeAnchorBase(ValueT &&value)
+      : GenericLatticeAnchor(TypeID::get<ConcreteT>()),
         value(std::forward<ValueT>(value)) {}
 
-  /// Get a uniqued instance of this program point class with the given
+  /// Get a uniqued instance of this lattice anchor class with the given
   /// arguments.
   template <typename... Args>
-  static ConcreteT *get(StorageUniquer &uniquer, Args &&...args) {
+  static ConcreteT *get(StorageUniquer &uniquer, Args &&... args) {
     return uniquer.get<ConcreteT>(/*initFn=*/{}, std::forward<Args>(args)...);
   }
 
-  /// Allocate space for a program point and construct it in-place.
+  /// Allocate space for a lattice anchor and construct it in-place.
   template <typename ValueT>
   static ConcreteT *construct(StorageUniquer::StorageAllocator &alloc,
                               ValueT &&value) {
@@ -129,46 +143,48 @@ class GenericProgramPointBase : public GenericProgramPoint {
         ConcreteT(std::forward<ValueT>(value));
   }
 
-  /// Two program points are equal if their values are equal.
+  /// Two lattice anchors are equal if their values are equal.
   bool operator==(const Value &value) const { return this->value == value; }
 
   /// Provide LLVM-style RTTI using type IDs.
-  static bool classof(const GenericProgramPoint *point) {
+  static bool classof(const GenericLatticeAnchor *point) {
     return point->getTypeID() == TypeID::get<ConcreteT>();
   }
 
-  /// Get the contents of the program point.
+  /// Get the contents of the lattice anchor.
   const Value &getValue() const { return value; }
 
 private:
-  /// The program point value.
+  /// The lattice anchor value.
   Value value;
 };
 
 //===----------------------------------------------------------------------===//
-// ProgramPoint
+// LatticeAnchor
 //===----------------------------------------------------------------------===//
 
-/// Fundamental IR components are supported as first-class program points.
-struct ProgramPoint
-    : public PointerUnion<GenericProgramPoint *, Operation *, Value, Block *> {
-  using ParentTy =
-      PointerUnion<GenericProgramPoint *, Operation *, Value, Block *>;
+/// Fundamental IR components are supported as first-class lattice anchor.
+struct LatticeAnchor
+    : public PointerUnion<GenericLatticeAnchor *, ProgramPoint, Value> {
+  using ParentTy = PointerUnion<GenericLatticeAnchor *, ProgramPoint, Value>;
   /// Inherit constructors.
   using ParentTy::PointerUnion;
   /// Allow implicit conversion from the parent type.
-  ProgramPoint(ParentTy point = nullptr) : ParentTy(point) {}
+  LatticeAnchor(ParentTy point = nullptr) : ParentTy(point) {}
   /// Allow implicit conversions from operation wrappers.
   /// TODO: For Windows only. Find a better solution.
   template <typename OpT, typename = std::enable_if_t<
                               std::is_convertible<OpT, Operation *>::value &&
                               !std::is_same<OpT, Operation *>::value>>
-  ProgramPoint(OpT op) : ParentTy(op) {}
+  LatticeAnchor(OpT op) : ParentTy(ProgramPoint(op)) {}
 
-  /// Print the program point.
+  LatticeAnchor(Operation *op) : ParentTy(ProgramPoint(op)) {}
+  LatticeAnchor(Block *block) : ParentTy(ProgramPoint(block)) {}
+
+  /// Print the lattice anchor.
   void print(raw_ostream &os) const;
 
-  /// Get the source location of the program point.
+  /// Get the source location of the lattice anchor.
   Location getLoc() const;
 };
 
@@ -207,8 +223,8 @@ class DataFlowConfig {
 
 /// The general data-flow analysis solver. This class is responsible for
 /// orchestrating child data-flow analyses, running the fixed-point iteration
-/// algorithm, managing analysis state and program point memory, and tracking
-/// dependencies between analyses, program points, and analysis states.
+/// algorithm, managing analysis state and lattice anchor memory, and tracking
+/// dependencies between analyses, lattice anchor, and analysis states.
 ///
 /// Steps to run a data-flow analysis:
 ///
@@ -226,38 +242,39 @@ class DataFlowSolver {
 
   /// Load an analysis into the solver. Return the analysis instance.
   template <typename AnalysisT, typename... Args>
-  AnalysisT *load(Args &&...args);
+  AnalysisT *load(Args &&... args);
 
   /// Initialize the children analyses starting from the provided top-level
   /// operation and run the analysis until fixpoint.
   LogicalResult initializeAndRun(Operation *top);
 
-  /// Lookup an analysis state for the given program point. Returns null if one
+  /// Lookup an analysis state for the given lattice anchor. Returns null if one
   /// does not exist.
-  template <typename StateT, typename PointT>
-  const StateT *lookupState(PointT point) const {
-    auto it = analysisStates.find({ProgramPoint(point), TypeID::get<StateT>()});
+  template <typename StateT, typename AnchorT>
+  const StateT *lookupState(AnchorT anchor) const {
+    auto it =
+        analysisStates.find({LatticeAnchor(anchor), TypeID::get<StateT>()});
     if (it == analysisStates.end())
       return nullptr;
     return static_cast<const StateT *>(it->second.get());
   }
 
-  /// Erase any analysis state associated with the given program point.
-  template <typename PointT>
-  void eraseState(PointT point) {
-    ProgramPoint pp(point);
+  /// Erase any analysis state associated with the given lattice anchor.
+  template <typename AnchorT>
+  void eraseState(AnchorT anchor) {
+    LatticeAnchor la(anchor);
 
     for (auto it = analysisStates.begin(); it != analysisStates.end(); ++it) {
-      if (it->first.first == pp)
+      if (it->first.first == la)
         analysisStates.erase(it);
     }
   }
 
-  /// Get a uniqued program point instance. If one is not present, it is
+  /// Get a uniqued lattice anchor instance. If one is not present, it is
   /// created with the provided arguments.
-  template <typename PointT, typename... Args>
-  PointT *getProgramPoint(Args &&...args) {
-    return PointT::get(uniquer, std::forward<Args>(args)...);
+  template <typename AnchorT, typename... Args>
+  AnchorT *getLatticeAnchor(Args &&... args) {
+    return AnchorT::get(uniquer, std::forward<Args>(args)...);
   }
 
   /// A work item on the solver queue is a program point, child analysis pair.
@@ -267,10 +284,10 @@ class DataFlowSolver {
   /// Push a work item onto the worklist.
   void enqueue(WorkItem item) { worklist.push(std::move(item)); }
 
-  /// Get the state associated with the given program point. If it does not
+  /// Get the state associated with the given lattice anchor....
[truncated]

@cxy-1993 cxy-1993 requested a review from ftynse August 22, 2024 13:19
Copy link

github-actions bot commented Aug 22, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

Copy link
Contributor

@Mogball Mogball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for driving this!

@cxy-1993 cxy-1993 merged commit b6603e1 into llvm:main Aug 25, 2024
8 checks passed
MaheshRavishankar added a commit to iree-org/llvm-project that referenced this pull request Aug 28, 2024
@MaheshRavishankar
Copy link
Contributor

This change seems to have broken torch-mlir. I fully expect that this is probably an issue in torch-mlir itself, but The way the data flow framework is used in InlineGlobalSlots.cpp seems incompatible with this change. I havent been able to make changes to this to get this to build.

I dont have a repro or anything, and I am not fully upto speed on the DataFlow analysis or how it is used in Torch-MLIR but just leaving a note here that this seems to cause a breakage (due to potentially mis-use of the data flow framework)

@MaheshRavishankar
Copy link
Contributor

Cross listing this change with IREE PR that brings in this MLIR change (iree-org/iree#18384) for reference

@cxy-1993
Copy link
Contributor Author

This change seems to have broken torch-mlir. I fully expect that this is probably an issue in torch-mlir itself, but The way the data flow framework is used in InlineGlobalSlots.cpp seems incompatible with this change. I havent been able to make changes to this to get this to build.

I dont have a repro or anything, and I am not fully upto speed on the DataFlow analysis or how it is used in Torch-MLIR but just leaving a note here that this seems to cause a breakage (due to potentially mis-use of the data flow framework)

Thanks for the feedback. It's clear that our current analysis doesn't align well with the patch's modifications. After a brief review of the code (please point out any errors), I believe this analysis is essentially a more robust version of sparse data flow analysis, extended to track attrs. The distinguishing characteristic is that the visit process might be triggered by an attr (symbol ref) rather than an operation.

In my opinion, this doesn't fundamentally deviate from the essence of current data flow analysis. When we update flat symbol attrs or op result, we still rely on initializeGlobalSlotsOp/GlobalSlotGetOp to update operand states. It's just that the current updates only modify the states of some operands in initializeGlobalSlotsOp(visit op will update all). Therefore, I suggest modifying the existing visit op result at (https://github.com/llvm/torch-mlir/blob/98e08023bbf71e00ab81e980eac9f7c96f1f24b4/lib/Dialect/Torch/Transforms/InlineGlobalSlots.cpp#L193C31-L193C46) to visit GlobalSlotGetOp, and modifying the visit flatSymbolRefPoint at (https://github.com/llvm/torch-mlir/blob/98e08023bbf71e00ab81e980eac9f7c96f1f24b4/lib/Dialect/Torch/Transforms/InlineGlobalSlots.cpp#L210) to visit initializeGlobalSlotsOp. This would align with the current dataflow analysis's definition of a program point.

If you have any further questions, please don't hesitate to discuss them with me. Additionally, I'm more than happy to assist you in modifying this section of code if needed.

I believe this modification is a step in the right direction for data flow analysis, and I hope it can be accepted by everyone. I plan to further standardize the semantics of program points, which may introduce some incompatibilities. I'm more than happy to help fix any issues that may arise.

dmpolukhin pushed a commit to dmpolukhin/llvm-project that referenced this pull request Sep 2, 2024
…a flow analysis (llvm#105656)

This patch distinguishes between program points and lattice anchors in
data flow analysis, where lattice anchors represent locations where a
lattice can be attached, while program points denote points in program
execution.

Related discussions:
https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671/8
@marbre
Copy link
Member

marbre commented Sep 4, 2024

In my opinion, this doesn't fundamentally deviate from the essence of current data flow analysis. When we update flat symbol attrs or op result, we still rely on initializeGlobalSlotsOp/GlobalSlotGetOp to update operand states. It's just that the current updates only modify the states of some operands in initializeGlobalSlotsOp(visit op will update all). Therefore, I suggest modifying the existing visit op result at (https://github.com/llvm/torch-mlir/blob/98e08023bbf71e00ab81e980eac9f7c96f1f24b4/lib/Dialect/Torch/Transforms/InlineGlobalSlots.cpp#L193C31-L193C46) to visit GlobalSlotGetOp, and modifying the visit flatSymbolRefPoint at (https://github.com/llvm/torch-mlir/blob/98e08023bbf71e00ab81e980eac9f7c96f1f24b4/lib/Dialect/Torch/Transforms/InlineGlobalSlots.cpp#L210) to visit initializeGlobalSlotsOp. This would align with the current dataflow analysis's definition of a program point.

If you have any further questions, please don't hesitate to discuss them with me. Additionally, I'm more than happy to assist you in modifying this section of code if needed.

Thanks for your suggestion @cxy-1993! I am currently looking into how to adapt torch-mlir but I am unfortunately not familiar with the data-flow analysis and might have missed something. Therefore please allow to ask for further clarification.

While the argument to the visit(ProgramPoint point) method still is of a ProgramPoint, the latter was changed and now essentially is a PointerUnion. Before this change, point could be casted to a Value (within visit()) which is no longer possible. To get the value, one could instead cast point to an Operation and obtain the result value from this op. Furthermore, getProgramPoint needs to be replaced with getLatticeAnchor, thus

auto *flatSymbolRefPoint = getProgramPoint<FlatSymbolRefProgramPoint>(globalSlotGet.getSlotAttr());
auto *valueState = getOrCreateFor<InlineGlobalSlotsAnalysisState>(flatSymbolRefPoint, globalSlotGet.getResult()):

becomes

auto *flatSymbolRefPoint = getLatticeAnchor<FlatSymbolRefProgramPoint>(globalSlotGet.getSlotAttr());
auto *valueState = getOrCreateFor<InlineGlobalSlotsAnalysisState>(*flatSymbolRefPoint, globalSlotGet.getResult()):

What is failing here is getOrCreateFor as the first argument needs to be a ProgramPoint but now is a FlatSymbolRefProgramPoint, while assuming that the constructor of InlineGlobalSlotsAnalysisState needs to be changed from

InlineGlobalSlotsAnalysisState(ProgramPoint point) : AnalysisState(point) {

to

InlineGlobalSlotsAnalysisState(LatticeAnchor anchor) : AnalysisState(anchor) {

Any pointers would be appreciated!

@cxy-1993
Copy link
Contributor Author

cxy-1993 commented Sep 5, 2024

In my opinion, this doesn't fundamentally deviate from the essence of current data flow analysis. When we update flat symbol attrs or op result, we still rely on initializeGlobalSlotsOp/GlobalSlotGetOp to update operand states. It's just that the current updates only modify the states of some operands in initializeGlobalSlotsOp(visit op will update all). Therefore, I suggest modifying the existing visit op result at (https://github.com/llvm/torch-mlir/blob/98e08023bbf71e00ab81e980eac9f7c96f1f24b4/lib/Dialect/Torch/Transforms/InlineGlobalSlots.cpp#L193C31-L193C46) to visit GlobalSlotGetOp, and modifying the visit flatSymbolRefPoint at (https://github.com/llvm/torch-mlir/blob/98e08023bbf71e00ab81e980eac9f7c96f1f24b4/lib/Dialect/Torch/Transforms/InlineGlobalSlots.cpp#L210) to visit initializeGlobalSlotsOp. This would align with the current dataflow analysis's definition of a program point.
If you have any further questions, please don't hesitate to discuss them with me. Additionally, I'm more than happy to assist you in modifying this section of code if needed.

Thanks for your suggestion @cxy-1993! I am currently looking into how to adapt torch-mlir but I am unfortunately not familiar with the data-flow analysis and might have missed something. Therefore please allow to ask for further clarification.

While the argument to the visit(ProgramPoint point) method still is of a ProgramPoint, the latter was changed and now essentially is a PointerUnion. Before this change, point could be casted to a Value (within visit()) which is no longer possible. To get the value, one could instead cast point to an Operation and obtain the result value from this op. Furthermore, getProgramPoint needs to be replaced with getLatticeAnchor, thus

auto *flatSymbolRefPoint = getProgramPoint<FlatSymbolRefProgramPoint>(globalSlotGet.getSlotAttr());
auto *valueState = getOrCreateFor<InlineGlobalSlotsAnalysisState>(flatSymbolRefPoint, globalSlotGet.getResult()):

becomes

auto *flatSymbolRefPoint = getLatticeAnchor<FlatSymbolRefProgramPoint>(globalSlotGet.getSlotAttr());
auto *valueState = getOrCreateFor<InlineGlobalSlotsAnalysisState>(*flatSymbolRefPoint, globalSlotGet.getResult()):

What is failing here is getOrCreateFor as the first argument needs to be a ProgramPoint but now is a FlatSymbolRefProgramPoint, while assuming that the constructor of InlineGlobalSlotsAnalysisState needs to be changed from

InlineGlobalSlotsAnalysisState(ProgramPoint point) : AnalysisState(point) {

to

InlineGlobalSlotsAnalysisState(LatticeAnchor anchor) : AnalysisState(anchor) {

Any pointers would be appreciated!

I have a limited understanding of the specific functionality of this pass. Please point out any issues if you find any. For the sake of clarity, let's call the program point before this patch the 'ori program point'. My overall modification idea is as follows:

The InlineGlobalSlotsAnalysis::visit function might accept two types of ori program points and update the lattice state: one is the result of Torch::GlobalSlotGetOp, and the other is FlatSymbolRefProgramPoint. They will respectively use Torch::GlobalSlotGetOp and Torch::InitializeGlobalSlotsOp to update the states of other lattices.

Therefore, I suggest modifying the original code:

getOrCreateFor<InlineGlobalSlotsAnalysisState>(flatSymbolRefPoint, xxx) to getOrCreateFor<InlineGlobalSlotsAnalysisState>(initializeGlobalSlotsOp, xxx).
getOrCreateFor<InlineGlobalSlotsAnalysisState>(value, xxx) to something like getOrCreateFor<InlineGlobalSlotsAnalysisState>(value.getDefiningOp<Torch::GlobalSlotGetOp>(), xxx).

At this point, the visit's program point will only have to deal initializeGlobalSlotsOp and GlobalSlotGetOp, and we can use the information from these two ops to update the lattice state related to these op.

marbre added a commit to marbre/torch-mlir that referenced this pull request Sep 5, 2024
@rsuderman
Copy link
Contributor

So I believe we will want to revert this and reland in the future with changes.

The main issue is the changes to ProgramPoint. Where previously we allowed it to be an old ProgramPoint (now LatticeAnchor) it now assumes that the location is an operation or block. However the torch-mlir pass uses symnames instead to check for cases. Other frameworks do similar cases of symnames for referencing a shared handle not mapped to any particular operation. Likely the right choice is to remove ProgramPoint and have the dependent location be some type of LatticeAnchor or generalized type rather than hardcoding to an Operation* or Block*. If we don't want to assume the same type as LatticeAnchor we can create a similar generic for storing the program point used to visit.

In short, this actually removes some functionality that would require completely reworking downstream analysis passes to correct without there being a clear benefit in imposing this new restriction.

@stellaraccident
Copy link
Contributor

Flyby comment 1: I am almost certain that all of the features behind InlineGlobalSlots.cpp are very close to being decommissioned, and then this very old code can be outright deleted vs carried forward. We've been clear since January that the day will come soon when the cost of maintaining this path cannot be carried anymore. If this is that day, then delete any remaining tests and I will approve.

Flyby comment 2: While we may get lucky in this case and simply not need to address this, in general, when doing an in-situ upgrade of an API, it is important to not just address where things get to but to not remove semantics that people are relying on along the way. I know this is hard/annoying. For this very specific case of impact to torch-mlir, we may be able to not worry about this at this time, but torch-mlir is not the only user of these APIs, and we need to be evolving them without semantic/representability reduction in intermediate steps. If that isn't possible, then it strongly indicates that an in-situ refactor isn't the appropriate way to go.

@Mogball
Copy link
Contributor

Mogball commented Sep 10, 2024

This was the RFC https://discourse.llvm.org/t/rfc-unify-the-semantics-of-program-points/80671. Not many people engaged on the RFC. Waiting to be broken downstream before entering the discussion is probably not the ideal situation, but maybe we should have been more aggressive in pinging potential stakeholders.

That being said, I am sympathetic to the loss of functionality here (my approval was based on the supposed acceptance of the RFC and the code changes themselves). I am +1 to reverting this change but I think we should have a follow-up discussion on ProgramPoint afterwards. Maybe we can have an ODM for the first time in ages!

@Mogball
Copy link
Contributor

Mogball commented Sep 10, 2024

@cxy-1993 Would you be okay with opening a revert? I would also love to understand the specific use cases you have in mind for being more able to generically reason about program points.

@stellaraccident
Copy link
Contributor

From my perspective, this falls into "these things happen" bucket. I read the RFC a few times and failed to see the issue. I wasn't calling for any action this time but more just commenting for the future. I had forgotten that this use existed in this very old code path and didn't think to check deeply. It can be hard to fight inertia on this stuff, and I appreciate the effort it takes.

Maybe give it another day? I still think we can just outright delete the live use downstream. I can't speak for whether other used this feature of the API. So it's a judgment call as to whether to proceed as is or better understand this detail and address with a revert.

@Mogball
Copy link
Contributor

Mogball commented Sep 10, 2024

I read the RFC a few times and failed to see the issue

This is fair. And yeah I really meant "ideally" in an aspiration sense. I know that these things are inevitable. I also read the RFC and didn't pick up on this too :P

@cxy-1993
Copy link
Contributor Author

I sincerely regret any inconvenience this submission may have caused to the downstream repo. I should have given more thought to how to make this change without breaking compatibility.

@cxy-1993 Would you be okay with opening a revert? I would also love to understand the specific use cases you have in mind for being more able to generically reason about program points.

If necessary, I'm glad to revert this code. Should I revert this now?

We definitely need a way to notify all downstream stakeholders about the necessary infrastructure changes. I have another related commit in development, and I'm curious about what measures we can take to ensure it doesn't break compatibility with downstream repos (for example, perhaps creating a new dataflow directory for this commit?). And I'm also wondering at what point it would be safe to break compatibility.Any help is appreciated.

@marbre
Copy link
Member

marbre commented Sep 10, 2024

If necessary, I'm glad to revert this code. Should I revert this now?

It seems @rsuderman found a way that work for torch-mlir, so let's wait for his feedback.

We definitely need a way to notify all downstream stakeholders about the necessary infrastructure changes. I have another related commit in development, and I'm curious about what measures we can take to ensure it doesn't break compatibility with downstream repos (for example, perhaps creating a new dataflow directory for this commit?). And I'm also wondering at what point it would be safe to break compatibility.Any help is appreciated.

Posting (another?) RFC could be an option. If the changes break compatibility, it seems common to send out a PSA (https://discourse.llvm.org/search?q=MLIR%20PSA) to inform users prior to merging. Probably discussing in and ODM as @Mogball suggested would be a good thing as well.

@cxy-1993
Copy link
Contributor Author

If necessary, I'm glad to revert this code. Should I revert this now?

It seems @rsuderman found a way that work for torch-mlir, so let's wait for his feedback.

We definitely need a way to notify all downstream stakeholders about the necessary infrastructure changes. I have another related commit in development, and I'm curious about what measures we can take to ensure it doesn't break compatibility with downstream repos (for example, perhaps creating a new dataflow directory for this commit?). And I'm also wondering at what point it would be safe to break compatibility.Any help is appreciated.

Posting (another?) RFC could be an option. If the changes break compatibility, it seems common to send out a PSA (https://discourse.llvm.org/search?q=MLIR%20PSA) to inform users prior to merging. Probably discussing in and ODM as @Mogball suggested would be a good thing as well.

Thanks for the information.

@stellaraccident
Copy link
Contributor

stellaraccident commented Sep 10, 2024

No apology necessary -- we've all done it. Especially for these older, established APIs, it is a matter of experience to know how to sequence changes in a way that works well for everyone. Generally, we are fine pushing "syntax" changes downstream but removing functionality without replacement is harder to deal with. It's tricky because sometimes it's not obvious that some corner case got used in some way.

I was mainly being explicit about my comments so that we could learn for next time vs suggesting that we need to backtrack this time. The contributions are appreciated.

nirvedhmeshram pushed a commit to iree-org/llvm-project that referenced this pull request Sep 10, 2024
@joker-eph
Copy link
Collaborator

We definitely need a way to notify all downstream stakeholders about the necessary infrastructure changes. I have another related commit in development, and I'm curious about what measures we can take to ensure it doesn't break compatibility with downstream repos (for example, perhaps creating a new dataflow directory for this commit?). And I'm also wondering at what point it would be safe to break compatibility.Any help is appreciated.

You did nothing wrong, it's fine. We break APIs all the time.
It's also hard to evolve components without breaking any use-cases, usually downstream find a way to update but sometimes we miss one use-case that can't be implemented anymore. In such case we just revert and work it out collaboratively (downstream are expected to actively contribute to the solution) to be able to re-land.

Downstream should implement better upstream test to model their use-cases if they want these to be better supported/maintained, otherwise they deal with occasional breakages.

nirvedhmeshram pushed a commit to iree-org/llvm-project that referenced this pull request Sep 12, 2024
qiaojbao pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Sep 30, 2024
…64ddfc135

Local branch amd-gfx 3ab64dd Merged main:2847020dbd9b8f932ee564651ec72ce15fa37d07 into amd-gfx:95a2ef3ad1f2
Remote branch main b6603e1 [mlir] [dataflow] Refactoring the definition of program points in data flow analysis (llvm#105656)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants