[custom op] Generalize shape library logic to work with dtypes (llvm#…

…1594) * [custom op] Generalize shape library logic to work with dtypes This commit generalizes the shape library logic, so that dtype rules for ops can also be expressed using the same mechanism. In other words, each op can now have a shape function and a dtype function specified in Python that is imported during lowering to calculate the shapes and dtypes throughout a program. For more information about how to specify a dtype function, see the updated `docs/adding_a_shape_and_dtype_function.md`. For those not familiar with how the shape library works, the file `docs/calculations_lib.md` provides an overview.
chenkanhw · Dec 13, 2022 · a710237
1 parent 2acf7da
commit a710237
Showing 45 changed files with 3,739 additions and 2,312 deletions.
diff --git a/.github/workflows/RollPyTorch.yml b/.github/workflows/RollPyTorch.yml
@@ -61,14 +61,14 @@ jobs:
         echo "PT_RELEASE=${PT_RELEASE}" >> ${GITHUB_ENV}
         echo "PT_HASH_CHANGED=${PT_HASH_CHANGED}" >> ${GITHUB_ENV}
 
-    - name: Build and test (in-tree), also update ODS and shape library
+    - name: Build and test (in-tree), also update ODS and abstract interpretation library
       if: env.PT_HASH_CHANGED != '0'
       run: |
         cd ${GITHUB_WORKSPACE}
         TM_PACKAGES="in-tree" TM_USE_PYTORCH_BINARY="OFF" \
         TORCH_MLIR_SRC_PYTORCH_BRANCH="${{ env.PT_HASH }}" \
         TORCH_MLIR_SRC_PYTORCH_RELEASE="${{ env.PT_RELEASE }}" \
-        TM_UPDATE_ODS_AND_SHAPE_LIB="ON" \
+        TM_UPDATE_ODS_AND_ABSTRACT_INTERP_LIB="ON" \
         ./build_tools/python_deploy/build_linux_packages.sh
 
     - name: Push changes to main branch
@@ -79,7 +79,7 @@ jobs:
         git config user.name "Roll PyTorch Action"
         git fetch --recurse-submodules=no
         git checkout main
-        git add pytorch-hash.txt pytorch-requirements.txt lib/Dialect/Torch/Transforms/ShapeLibrary.cpp include/torch-mlir/Dialect/Torch/IR/GeneratedTorchOps.td
+        git add pytorch-hash.txt pytorch-requirements.txt lib/Dialect/Torch/Transforms/AbstractInterpLibrary.cpp include/torch-mlir/Dialect/Torch/IR/GeneratedTorchOps.td
         git diff --cached --exit-code || (git commit -m "update PyTorch version to ${{ env.PT_RELEASE }}" && git push --set-upstream origin main)
 
     - name: Update PyTorch Build Cache (if running on main branch)

diff --git a/build_tools/python_deploy/build_linux_packages.sh b/build_tools/python_deploy/build_linux_packages.sh
@@ -53,8 +53,8 @@ TM_PACKAGES="${TM_PACKAGES:-torch-mlir}"
 TM_USE_PYTORCH_BINARY="${TM_USE_PYTORCH_BINARY:-ON}"
 # Skip running tests if you want quick iteration
 TM_SKIP_TESTS="${TM_SKIP_TESTS:-OFF}"
-# Update ODS and shape library files
-TM_UPDATE_ODS_AND_SHAPE_LIB="${TM_UPDATE_ODS_AND_SHAPE_LIB:-OFF}"
+# Update ODS and abstract interpretation library files
+TM_UPDATE_ODS_AND_ABSTRACT_INTERP_LIB="${TM_UPDATE_ODS_AND_ABSTRACT_INTERP_LIB:-OFF}"
 
 PKG_VER_FILE="${repo_root}"/torch_mlir_package_version ; [ -f "$PKG_VER_FILE" ] && . "$PKG_VER_FILE"
 TORCH_MLIR_PYTHON_PACKAGE_VERSION="${TORCH_MLIR_PYTHON_PACKAGE_VERSION:-0.0.1}"
@@ -119,7 +119,7 @@ function run_on_host() {
     -e "TM_PYTHON_VERSIONS=${TM_PYTHON_VERSIONS}" \
     -e "TM_PACKAGES=${package}" \
     -e "TM_SKIP_TESTS=${TM_SKIP_TESTS}" \
-    -e "TM_UPDATE_ODS_AND_SHAPE_LIB=${TM_UPDATE_ODS_AND_SHAPE_LIB}" \
+    -e "TM_UPDATE_ODS_AND_ABSTRACT_INTERP_LIB=${TM_UPDATE_ODS_AND_ABSTRACT_INTERP_LIB}" \
     -e "TM_USE_PYTORCH_BINARY=${TM_USE_PYTORCH_BINARY}" \
     -e "TORCH_MLIR_SRC_PYTORCH_REPO=${TORCH_MLIR_SRC_PYTORCH_REPO}" \
     -e "TORCH_MLIR_SRC_PYTORCH_BRANCH=${TORCH_MLIR_SRC_PYTORCH_BRANCH}" \
@@ -164,10 +164,10 @@ function run_in_docker() {
         in-tree)
           setup_venv "$python_version"
           build_in_tree "$TM_USE_PYTORCH_BINARY" "$python_version"
-          if [ "${TM_UPDATE_ODS_AND_SHAPE_LIB}" == "ON" ]; then
+          if [ "${TM_UPDATE_ODS_AND_ABSTRACT_INTERP_LIB}" == "ON" ]; then
             pushd /main_checkout/torch-mlir
             ./build_tools/update_torch_ods.sh
-            ./build_tools/update_shape_lib.sh
+            ./build_tools/update_abstract_interp_lib.sh
             popd
           fi
           if [ "${TM_SKIP_TESTS}" == "OFF" ]; then
@@ -253,8 +253,8 @@ function test_in_tree() {
   cd /main_checkout/torch-mlir/
   export PYTHONPATH="/main_checkout/torch-mlir/build/tools/torch-mlir/python_packages/torch_mlir"
 
-  echo ":::: Check that update_shape_lib.sh has been run"
-  _check_file_not_changed_by ./build_tools/update_shape_lib.sh lib/Dialect/Torch/Transforms/ShapeLibrary.cpp
+  echo ":::: Check that update_abstract_interp_lib.sh has been run"
+  _check_file_not_changed_by ./build_tools/update_abstract_interp_lib.sh lib/Dialect/Torch/Transforms/AbstractInterpLibrary.cpp
 
   echo ":::: Check that update_torch_ods.sh has been run"
   _check_file_not_changed_by ./build_tools/update_torch_ods.sh include/torch-mlir/Dialect/Torch/IR/GeneratedTorchOps.td
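
The two checks above guard against a stale generated file: if rerunning the generator script changes the checked-in file, the script was not run before committing. The implementation of `_check_file_not_changed_by` is not shown in this diff, so the following is only a minimal Python sketch of the idea. The helper name `file_not_changed_by` and the hashing approach are assumptions made for illustration, not the project's code.

```python
import hashlib
import subprocess
import sys
from pathlib import Path
from typing import List


def file_not_changed_by(command: List[str], path: Path) -> bool:
    """Hypothetical helper: run `command` and report whether `path` kept its contents."""
    # Fingerprint the file before running the generator.
    before = hashlib.sha256(path.read_bytes()).hexdigest()
    # Run the generator; a non-zero exit status raises CalledProcessError.
    subprocess.run(command, check=True)
    # Fingerprint again and compare.
    after = hashlib.sha256(path.read_bytes()).hexdigest()
    return before == after
```

In a CI job, a `False` result would be treated as a failure, prompting the contributor to rerun the update script and commit the regenerated file.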

diff --git a/build_tools/update_shape_lib.sh b/build_tools/update_abstract_interp_lib.sh
@@ -1,5 +1,6 @@
 #!/bin/bash
-# Updates auto-generated shape library files for the `torch` dialect.
+# Updates auto-generated abstract interpretation library files for the
+# `torch` dialect.
 #
 # Environment variables:
 #   TORCH_MLIR_EXT_MODULES: comma-separated list of python module names
@@ -41,6 +42,6 @@ if [ ! -z ${TORCH_MLIR_EXT_MODULES} ]; then
 fi
 
 PYTHONPATH="${pypath}" python \
-  -m torch_mlir.dialects.torch.importer.jit_ir.build_tools.shape_lib_gen \
+  -m torch_mlir.dialects.torch.importer.jit_ir.build_tools.abstract_interp_lib_gen \
   --pytorch_op_extensions=${ext_module:-""} \
   --torch_transforms_cpp_dir="${torch_transforms_cpp_dir}"
diff --git a/docs/abstract_interp_lib.md b/docs/abstract_interp_lib.md
@@ -0,0 +1,130 @@
+# Torch-MLIR Abstract Interpretation Library Infrastructure
+
+## Overview
+
+The Torch-MLIR project has an infrastructure for maintaining a library of
+calculation functions for different Torch operators, which supply extra
+information such as result dtypes and shapes as well as decompositions. These
+functions are fully executable specifications of the shape/dtype/decomposition
+functions for each operator and can be authored and tested from Python for
+convenience. These are then brought into the compiler and can be manipulated /
+transformed for various purposes.  Additionally, in the case of shape functions,
+this effort is synergistic with upstream PyTorch efforts to maintain a library
+of shape functions.
+
+The two main use cases are:
+
+- Refinement / inference. The `torch-shape-refinement-pipeline` and
+  `torch-dtype-refinement-pipeline` pass pipelines orchestrate a series of
+  passes that use the available information in the program to further refine the
+  types in the program.
+
+- Error guard insertion for backends (Not Yet Implemented). The executable
+  functions can include error guards / assertions that abort the program in case
+  of invalid input (such as a matmul with a mismatching contracting dimension).
+
+## Architecture
+
+Functions are defined as TorchScript-able Python functions in
+`python/torch_mlir/dialects/torch/importer/jit_ir/build_tools/abstract_interp_lib_gen.py`.
+The signatures of the functions are systematically derived from the Torch JIT
+operator registry. Most shape functions are expected to reuse the upstream
+helper functions
+[`torch/jit/_shape_functions.py`](https://github.com/pytorch/pytorch/blob/279634f384662b7c3a9f8bf7ccc3a6afd2f05657/torch/jit/_shape_functions.py#L1),
+and any new shape functions should be added there.
+
+The `build_tools/update_abstract_interp_lib.sh` script invokes
+`abstract_interp_lib_gen.py` to generate an MLIR module containing the functions,
+which is currently embedded as a string literal in
+`lib/Dialect/Torch/Transforms/AbstractInterpLibrary.cpp`.
+
+The function `StringRef mlir::torch::Torch::getAbstractInterpLibrary()` is
+available for use inside the compiler any time that the library is needed.
+
+## Shape and Dtype Refinement Pipeline Architecture
+
+One of the main services that Torch-MLIR provides for backends is to normalize
+all Torch frontends into a common form which includes tensor shapes and dtypes
+that are as precise as possible. This alleviates the need for backends to solve
+this problem themselves. This process of shape and dtype refinement is
+accomplished in Torch-MLIR through a pipeline of passes which uses the abstract
+interpretation library combined with abstract interpretation of the calculation
+functions to calculate shapes and dtypes that are as precise as possible.
+
+The pipeline works as follows:
+
+1. Calculations are reified. The `torch-reify-shape-calculations` and
+   `torch-reify-dtype-calculations` passes reify (i.e., materialize into the
+   IR) the functions for each op with a function in the calculation library. To
+   do this, the passes wrap those ops in a `torch.shape.calculate` or
+   `torch.dtype.calculate` op, respectively, which has two regions: 1) a body
+   with the op itself, and 2) the shape or dtype calculation, which calculates
+   the shapes or dtypes of the tensors yielded by the body.
+
+2. Simplifying the functions and propagating the shapes and dtypes. After the
+   functions are reified, we then attempt to "optimize hard enough" until the
+   shapes and dtypes yielded by the calculation regions become obvious in the IR.
+   Those results are propagated through the IR, which usually reveals more
+   opportunities for simplification.
+
+   a. After reification, the functions are just a loose collection of
+   functions, which are difficult to analyze. The first step is to inline them.
+
+   b. After inlining, the `torch-simplify-shape-calculations` and
+   `torch-simplify-dtype-calculations` passes are used to simplify the
+   calculations. These passes bring in a number of targeted canonicalization
+   patterns and folds, along with a few specific patterns such as unrolling
+   fixed-trip-count loops and abstractly interpreting list operations (an
+   example is turning a series of "append" operations into a list
+   literal). These passes also look at the values yielded by the calculation
+   regions, and if the resulting shape or dtype can be deduced by looking at the
+   IR (for example, the shape is the list literal `[1, 2, 3]`), it will refine
+   the types of the `torch.shape.calculate` and `torch.dtype.calculate`
+   ops. This usually yields more opportunities for simplification. This process
+   runs to a fixed-point.
+
+3. Dropping the calculations. Once all the types in the program have been
+   refined as much as possible, the ops that were originally wrapped in
+   `torch.shape.calculate` and `torch.dtype.calculate` are unwrapped by the
+   `torch-drop-abstract-interp-calculations` pass which drops the reified
+   calculations, leaving behind the shape and dtype refined program.
+
+Inferring precise shapes and dtypes is often needed for correctness by
+backends. That said, requiring "optimizing hard enough" for correctness is
+usually considered quite brittle in a compiler flow. In this case, the saving
+grace is that we are only optimizing the functions, which are authored by
+compiler developers (not users), and thus there is some give-and-take in terms
+of understanding the optimizable constructs while authoring the functions, or
+improving the optimizations to enable easier authoring. Some brittleness is
+likely to escape to users, unfortunately, since there will always be situations
+where, for example, a statically shaped program allows the shape functions to be
+simplified to a greater extent than in a dynamically shaped program (for
+example, if the shape function checks "is this dimension of size 1"). We hope
+that this is minimal.
+
+## Adding to the abstract interpretation library
+
+See [Adding a Shape and Dtype Function](adding_a_shape_and_dtype_function.md)
+for details on how to add a shape and dtype function for an operator.
+
+## Rationale
+
+### Use of full operator signatures
+
+The use of the full operator signature such as
+`def aten〇add〇Tensor(self: List[int], other: List[int], alpha: float = 1) -> List[int]:`
+for defining calculation functions is somewhat verbose and repetitive, especially when
+there are multiple identical functions. Upstream uses a map with key-value
+pairs like `"aten.add.Tensor": upstream_shape_functions.broadcast`, which is
+more compact and less repetitive in some ways (upstream also allows trailing
+arguments beyond those accepted by the shape function to be ignored, allowing
+further deduplication). The decision to do it the more verbose way in Torch-MLIR
+was based on the following goals:
+
+- To make the system very easy to debug and test.
+
+- To make the system maximally consistent between functions that are
+  implemented with the upstream shape helpers and the ones that are manually
+  written, which are still a fairly large and non-trivial set.
+
+- To make it as mechanical as possible to add a new function.
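
To make the trade-off in the Rationale section concrete, here is a small Python sketch that contrasts the two styles. The `broadcast` helper is a simplified stand-in written for this example, not the upstream `upstream_shape_functions.broadcast`, and the `SHAPE_FUNCTIONS` map is a hypothetical name. Only the `aten〇add〇Tensor` signature is taken from the document.

```python
from typing import Callable, Dict, List


def broadcast(a: List[int], b: List[int]) -> List[int]:
    """Simplified stand-in for an upstream helper: broadcast two shapes."""
    result: List[int] = []
    for i in range(max(len(a), len(b))):
        # Read dimensions from the trailing end; a missing dimension counts as 1.
        dim_a = a[len(a) - 1 - i] if i < len(a) else 1
        dim_b = b[len(b) - 1 - i] if i < len(b) else 1
        if dim_a != dim_b and dim_a != 1 and dim_b != 1:
            raise ValueError(f"cannot broadcast {a} with {b}")
        result.append(dim_b if dim_a == 1 else dim_a)
    result.reverse()
    return result


# Full-signature style: one function per operator, mirroring every operator
# argument (including `alpha`, which does not affect the result shape).
def aten〇add〇Tensor(self: List[int], other: List[int], alpha: float = 1) -> List[int]:
    return broadcast(self, other)


# Map style (hypothetical name): more compact, but the helper's signature no
# longer matches the operator's, so extra arguments such as `alpha` must be
# dropped by the caller.
SHAPE_FUNCTIONS: Dict[str, Callable[..., List[int]]] = {
    "aten.add.Tensor": broadcast,
}
```

Both styles compute the same shape for the same inputs. The full-signature function can be called directly with the operator's own arguments, which is what makes it straightforward to test and debug in isolation.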
diff --git a/docs/adding_a_shape_function.md b/docs/adding_a_shape_function.md