Commit 3d421b3

ljk53 authored and facebook-github-bot committed
[pytorch] rewrite of the python binding codegen with the v2 API (pytorch#46244)
Summary:
Pull Request resolved: pytorch#46244

- What does the generated binding code do?

The Python binding codegen produces code that takes the input list of PyObjects, finds the matching ATen C++ function using PythonArgParser, converts the PyObjects into C++ types, and calls the ATen C++ function:

```
+--------+  parsing   +------------------------+  binding   +-----------------------+
| PyObjs | ---------> | PythonArgParser Output | ---------> | Cpp Function Dispatch |
+--------+            +------------------------+            +-----------------------+
```

- Are Python arguments 1-1 mapped to C++ arguments?

Python arguments might be reordered, packed, or unpacked when binding to C++ arguments, as illustrated below:

```
// Binding - Reorder & Packing
// aten::empty.names(int[] size, *, Dimname[]? names, ScalarType? dtype=None, Layout? layout=None, Device? device=None, bool? pin_memory=None, MemoryFormat? memory_format=None) -> Tensor

            Python Args               Cpp Args
-----------------------------------------------------------
         0: size                      size
         1: names                     names
         2: memory_format -------+
         3: dtype         -----+-|--> options
         4: layout            /  |
         5: device           /   +--> memory_format
         6: pin_memory      /
         7: requires_grad -+

// Binding - Unpacking
// aten::max.names_dim(Tensor self, Dimname dim, bool keepdim=False) -> (Tensor values, Tensor indices)

            Python Args               Cpp Args
-----------------------------------------------------------
                               +----> max
                              /-----> max_values
         0: input            /        self
         1: dim             /         dim
         2: keepdim        /          keepdim
         3: out      -----+
```

- Why do we want to rewrite the python binding codegen?

The old codegen takes Declarations.yaml as input. It doesn't distinguish between Python arguments and C++ arguments; they are all mixed together as a bag of untyped dict objects. Different methods process these arg objects and add new attributes for various purposes, so the semantics of these attributes are hard to work out. The complicated binding logic happens implicitly and is scattered across the code.
```
+--------------------+
|  Native Functions  |
+--------------------+
  |
  |
  v
+--------------------+
|   Cpp Signatures   |
+--------------------+
  |
  |
  v
+--------------------+
| Declarations.yaml  |
+--------------------+
  |                        +-------------------------------------+
  |              +-------> |       PythonArgParser Schema        |
  |              |         +-------------------------------------+
  |              |                            .
  |              |                            .
  v              |                            .
+--------------------+     +-------------------------------------+
| NonTyped Args Objs | --> | PythonArgParser -> Cpp Args Binding |
+--------------------+     +-------------------------------------+
                 |                            .
                 |                            .
                 |                            .
                 |         +-------------------------------------+
                 +-------> |        Cpp Function Dispatch        |
                           +-------------------------------------+
```

This PR leverages the new immutable data models introduced in the new aten codegen. It introduces dedicated data models for the python schema. This way, we can not only avoid subtle Declarations.yaml conversions but also decouple the generation of the python schema, the python to c++ binding, and the c++ function call.

The end state will look like the following diagram:

```
            +-------------------+     +-------------------------------------+
  +-------> | Python Signatures | --> |       PythonArgParser Schema        |
  |         +-------------------+     +-------------------------------------+
  |                         |                            .
  |                         |                            .
  |                         |                            .
+------------------+        |         +-------------------------------------+
| Native Functions |        +-------> | PythonArgParser -> Cpp Args Binding |
+------------------+        |         +-------------------------------------+
  |                         |                            .
  |                         |                            .
  |                         |                            .
  |         +-------------------+     +-------------------------------------+
  +-------> |  Cpp Signatures   | --> |        Cpp Function Dispatch        |
            +-------------------+     +-------------------------------------+
```

This PR has migrated the core binding logic from tools/autograd/gen_python_functions.py to tools/codegen/api/python.py.

It produces byte-for-byte identical results (tested with pytorch#46243).

The rest of gen_python_functions.py will be migrated in subsequent PRs.
Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D24388874

Pulled By: ljk53

fbshipit-source-id: f88b6df4e917cf90d868a2bbae2d5ffb680d1841
1 parent 8f12c0e commit 3d421b3
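
The reorder, packing, and unpacking steps described in the commit message can be sketched in plain Python. This is only an illustration of the two binding tables, not the generated C++ and not the code in tools/codegen/api/python.py; `TensorOptions`, `bind_empty_names`, and `bind_max_names_dim_out` are hypothetical names, and the parser output is assumed to be a flat list indexed as in the tables.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class TensorOptions:
    """Hypothetical stand-in for the packed C++ `options` argument."""
    dtype: Optional[str]
    layout: Optional[str]
    device: Optional[str]
    pin_memory: Optional[bool]
    requires_grad: bool


def bind_empty_names(parsed: List[Any]) -> Dict[str, Any]:
    """Reorder & packing: map parser output (Python order) to C++ order."""
    (size, names, memory_format,
     dtype, layout, device, pin_memory, requires_grad) = parsed
    # Python args 3..7 are packed into a single `options` value.
    options = TensorOptions(dtype, layout, device, pin_memory, bool(requires_grad))
    # memory_format moves from Python index 2 to the last C++ position.
    return {
        "size": size,
        "names": names,
        "options": options,
        "memory_format": memory_format,
    }


def bind_max_names_dim_out(parsed: List[Any]) -> Dict[str, Any]:
    """Unpacking: the single Python `out` tuple becomes two leading C++ args."""
    input_, dim, keepdim, out = parsed
    max_, max_values = out
    return {
        "max": max_,
        "max_values": max_values,
        "self": input_,  # Python `input` binds to C++ `self`
        "dim": dim,
        "keepdim": keepdim,
    }
```

The returned dicts preserve insertion order, so iterating over them yields the arguments in the C++ order shown in the right-hand column of each table.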

9 files changed: +1159 -750 lines changed
.circleci/scripts/cpp_doc_push_script.sh

Lines changed: 1 addition & 0 deletions
@@ -57,6 +57,7 @@ cp torch/_utils_internal.py tools/shared
 # Generate PyTorch files
 time python tools/setup_helpers/generate_code.py \
   --declarations-path build/aten/src/ATen/Declarations.yaml \
+  --native-functions-path aten/src/ATen/native/native_functions.yaml \
   --nn-path aten/src/
 
 # Build the docs

.github/workflows/lint.yml

Lines changed: 1 addition & 0 deletions
@@ -138,6 +138,7 @@ jobs:
 # Generate PyTorch files.
 time python tools/setup_helpers/generate_code.py \
   --declarations-path build/aten/src/ATen/Declarations.yaml \
+  --native-functions-path aten/src/ATen/native/native_functions.yaml \
   --nn-path aten/src
 fi

BUILD.bazel

Lines changed: 2 additions & 1 deletion
@@ -213,9 +213,10 @@ genrule(
     name = "all_generated_code",
     srcs = [
         "aten/src/ATen/Declarations.yaml",
+        "aten/src/ATen/native/native_functions.yaml",
     ],
     outs = libtorch_cpp_generated_sources + libtorch_python_generated_sources,
-    cmd = "$(location :generate_code) --install_dir `dirname $(location torch/csrc/autograd/generated/variable_factories.h)`/../.. --declarations-path $(location aten/src/ATen/Declarations.yaml) --nn-path aten/src",
+    cmd = "$(location :generate_code) --install_dir `dirname $(location torch/csrc/autograd/generated/variable_factories.h)`/../.. --declarations-path $(location aten/src/ATen/Declarations.yaml) --native-functions-path $(location aten/src/ATen/native/native_functions.yaml) --nn-path aten/src",
     tools = [":generate_code"],
 )

caffe2/CMakeLists.txt

Lines changed: 2 additions & 0 deletions
@@ -391,11 +391,13 @@ if(NOT INTERN_BUILD_MOBILE OR NOT BUILD_CAFFE2_MOBILE)
     COMMAND
     "${PYTHON_EXECUTABLE}" tools/setup_helpers/generate_code.py
       --declarations-path "${CMAKE_BINARY_DIR}/aten/src/ATen/Declarations.yaml"
+      --native-functions-path "aten/src/ATen/native/native_functions.yaml"
       --nn-path "aten/src"
       $<$<BOOL:${INTERN_DISABLE_AUTOGRAD}>:--disable-autograd>
       $<$<BOOL:${SELECTED_OP_LIST}>:--selected-op-list-path="${SELECTED_OP_LIST}">
       --force_schema_registration
     DEPENDS
+    "${TORCH_ROOT}/aten/src/ATen/native/native_functions.yaml"
     "${CMAKE_BINARY_DIR}/aten/src/ATen/Declarations.yaml"
     "${TOOLS_PATH}/autograd/templates/VariableType.h"
     "${TOOLS_PATH}/autograd/templates/VariableType.cpp"

docs/cpp/source/check-doxygen.sh

Lines changed: 1 addition & 0 deletions
@@ -20,6 +20,7 @@ python -m tools.codegen.gen
 
 python tools/setup_helpers/generate_code.py \
   --declarations-path build/aten/src/ATen/Declarations.yaml \
+  --native-functions-path aten/src/ATen/native/native_functions.yaml \
   --nn-path aten/src
 
 popd
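
Each build entry point above now passes `--native-functions-path` alongside `--declarations-path`. As a rough sketch of how a driver script could accept these flags with `argparse`, consider the following. The flag names are taken from the diffs, but the parser itself, `build_parser`, and `parse_args` are assumptions for illustration and do not reproduce tools/setup_helpers/generate_code.py.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags seen in the build scripts (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of a codegen driver CLI")
    parser.add_argument("--declarations-path", help="path to Declarations.yaml")
    parser.add_argument("--native-functions-path", help="path to native_functions.yaml")
    parser.add_argument("--nn-path", help="path to the ATen source root")
    parser.add_argument("--install_dir", help="root directory for generated files")
    parser.add_argument("--disable-autograd", action="store_true")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse argv; argparse exposes `--native-functions-path` as `native_functions_path`."""
    return build_parser().parse_args(argv)
```

With this shape, the new path is available as `args.native_functions_path` and can be threaded through to the generator in the same way as the declarations path.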

tools/autograd/gen_autograd.py

Lines changed: 4 additions & 2 deletions
@@ -254,8 +254,7 @@ def is_operator_selected_for_training(decl):
     gen_variable_factories(out, full_aten_decls, template_path)
 
 
-def gen_autograd_python(aten_path, out, autograd_dir):
-
+def gen_autograd_python(aten_path, native_functions_path, out, autograd_dir):
     # TODO Deduplicate these four variable assignments
 
     aten_decls = load_aten_declarations(aten_path)
@@ -278,6 +277,9 @@ def gen_autograd_python(aten_path, out, autograd_dir):
 
     # Generate Python bindings
     from . import gen_python_functions
+    # TODO: change gen_python_functions to process native functions directly.
+    gen_python_functions.init(native_functions_path)
+
     gen_python_functions.gen_py_variable_methods(
         out, aten_decls + deprecated, template_path)
     gen_python_functions.gen_py_torch_functions(
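
The second hunk calls `gen_python_functions.init(native_functions_path)` once before any generator function runs. The body of `init` is not shown on this page, so the following is only a guess at the general pattern: load the native function schemas into module-level state up front, then let later calls look them up. `_SCHEMAS_BY_NAME` and `lookup_schema` are hypothetical, and the line-based scan for `- func:` entries is a simplification; a real loader would presumably use a YAML parser.

```python
import re
from typing import Dict, Optional

# Module-level state filled in by init(); None means "not initialised yet".
_SCHEMAS_BY_NAME: Optional[Dict[str, str]] = None

# Matches lines such as "- func: max.names_dim(Tensor self, ...) -> ...".
_FUNC_LINE = re.compile(r"^- func:\s*(([\w.]+)\(.*)$")


def init(native_functions_path: str) -> None:
    """Load function schemas once, keyed by operator name (with overload)."""
    global _SCHEMAS_BY_NAME
    schemas: Dict[str, str] = {}
    with open(native_functions_path, encoding="utf-8") as f:
        for line in f:
            match = _FUNC_LINE.match(line.rstrip())
            if match:
                schemas[match.group(2)] = match.group(1)
    _SCHEMAS_BY_NAME = schemas


def lookup_schema(name: str) -> str:
    """Return the schema string for `name`; requires a prior init() call."""
    if _SCHEMAS_BY_NAME is None:
        raise RuntimeError("init() must be called before lookup_schema()")
    return _SCHEMAS_BY_NAME[name]
```

Failing loudly when `init` has not been called makes the ordering requirement explicit, which matters while the old Declarations.yaml path and the new native functions path coexist.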
