[pytorch] continue to rewrite gen_python_functions.py with typed models #46978

ljk53 · 2020-10-28T09:07:08Z

Stack from ghstack:

[pytorch] split out trace type generator and migrate to new codegen model #47438 [pytorch] split out trace type generator and migrate to new codegen model
[pytorch] factor out return_names api #47437 [pytorch] factor out return_names api
[pytorch] TraceType codegen tweak - newline before redispatch call #47436 [pytorch] TraceType codegen tweak - newline before redispatch call
[pytorch] continue to rewrite gen_python_functions.py with typed models #46978 [pytorch] continue to rewrite gen_python_functions.py with typed models

Refactored and added type annotations to the most part of the file.

Some top-level codegen functions are called by other codegen scripts.
Will migrate them in subsequent PRs.

Differential Revision: D24589210

Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. [ghstack-poisoned]

… typed models" Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. [ghstack-poisoned]

Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. ghstack-source-id: 602db8dcd1a03ef69bb36dc3028ca10faffa5d59 Pull Request resolved: #46978

dr-ci · 2020-10-28T09:22:44Z

💊 CI failures summary and remediations

As of commit 402ffac (more details on the Dr. CI page):

✅ None of the CI failures appear to be your fault 💚

1/1 broken upstream at merge base 5a5258c since Nov 06

🚧 1 ongoing upstream failure:

These were probably caused by upstream breakages that are not fixed yet:

binary_linux_libtorch_3_7m_cpu_devtoolset7_shared-with-deps_build since Nov 06
- 🔁 rerun

This comment was automatically generated by Dr. CI (expand for details).

Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions on the GitHub issue tracker or post in the (internal) Dr. CI Users group.

See how this bot performed.

This comment has been revised 20 times.

ljk53 · 2020-10-28T09:22:53Z

tools/autograd/gen_python_functions.py

+    """
+    if overload.outplace is not None:
+        # dispatch output and no-output variants, branch on _r.isNone(<out_idx>)
+        return PY_VARIABLE_OUT.substitute(


I noticed that the new codegen is moving away from CodeTemplate - but the auto-indent feature is useful in this case as the emit_single_dispatch can have different indents: inside switch-case, inside if-else, inside both or none.

Yes, it is troublesome. For now, I have have been opting for code gen readability (f-strings) at the expense of generated code readability (yolo indentation) but generated code readability is getting worse and worse. I'm kind of pissed that Python doesn't have some sort of pluggable f-string mechanism for this case.

I can only really think of a few options for this problem:

Run some sort of formatter (clang-format, maybe?) on the generated code after we generate it. Sounds slow.

Manually annotate multiline string formats with an indent level that triggers indentation, e.g.,

void f() {{ {multiline(thing, 4)} }}

Stop using f-strings

for now I'll slow down converting multiline CodeTemplate to f-string for my future codegen rewrite PRs.

… typed models" Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. Differential Revision: [D24589210](https://our.internmc.facebook.com/intern/diff/D24589210) [ghstack-poisoned]

Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. ghstack-source-id: 6d2c50a599534000e784050c1fb7416baf7c177a Pull Request resolved: #46978

… typed models" Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. Differential Revision: [D24589210](https://our.internmc.facebook.com/intern/diff/D24589210) [ghstack-poisoned]

Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. ghstack-source-id: fbd1b53f795e79882fae3a77bb588defa464d081 Pull Request resolved: #46978

ezyang · 2020-10-29T20:50:35Z

What's the general approach I should take for reviewing this PR?

… typed models" Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. Differential Revision: [D24589210](https://our.internmc.facebook.com/intern/diff/D24589210) [ghstack-poisoned]

Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. ghstack-source-id: 89625535a931b6ff5f3390fb26d916cf4d8f9268 Pull Request resolved: #46978

Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. ghstack-source-id: 89625535a931b6ff5f3390fb26d916cf4d8f9268 Pull Request resolved: pytorch#46978

… typed models" Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. Differential Revision: [D24589210](https://our.internmc.facebook.com/intern/diff/D24589210) [ghstack-poisoned]

ljk53 · 2020-11-05T11:37:07Z

tools/autograd/gen_python_functions.py

+    for sig, out in outplaces.items():
+        if sig not in bases:
+            candidates: List[str] = []
+            for overload in overloads:


merged with #47087 and tested by hacking the native_functions.yaml:

RuntimeError: While identifying overloads, we found an out schema addmv(Tensor input, Tensor mat, Tensor vec, *, Scalar beta=1, Scalar alpha=1, Tensor out=None) without a corresponding non-out variant. We expected the non-out variant to have schema: - addmv(Tensor input, Tensor mat, Tensor vec, *, Scalar beta=1, Scalar alpha=1) Please check that you spelled the schema correctly in native_functions.yaml. We discovered the following candidate(s): - addmv(Tensor input, Tensor mat, Tensor vec, *, Scalar beta=1)

ezyang · 2020-11-05T15:27:32Z

Last week, @ljk53 and @bhosmer and I had a brief discussion about how to go about reviewing this PR. Big take away: Phabricator does a better job of generating diff than GitHub, so go look at the diff there (it identifies code motion).

ezyang · 2020-11-05T15:36:08Z

tools/autograd/gen_python_functions.py

+        # TODO: should use some canonical form instead of 'str(arg.type)' - see comments
+        # above. The old codegen used the deprecated 'dynamic_type(arg.type)', which
+        # ignores the optional annotation, i.e. 'Scalar' and 'Scalar?'.
+        any_unequal = any(str(arg1.type) != str(arg2.type)


You know you can just test these for equality right? So no need to stringify them first. (Stringifying is poor man's pattern matching.)

In fact, you can't you literally just do args1 != args2 because you've already checked that the lengths are the same above?

Oh you can't, because you only want to check the types

You know you can just test these for equality right? So no need to stringify them first. (Stringifying is poor man's pattern matching.)

oh, I thought it has to implement __eq__ explicitly... thanks for pointing out!

ezyang · 2020-11-05T15:55:36Z

tools/autograd/gen_python_functions.py

                                   for arg1, arg2 in zip(args1, args2))
-        return any_smaller and all_smaller_or_equal
+        return any_unequal and all_smaller_or_equal


I understand this is the fault of the original code, but I just want to remark: the use of "any" and "all" here obscures the essential logical property going on here: if you know how to compute lte and eq, you can compute lt = lte && !eq. The swapping between any/all is the result of pushing the negation inside of the all quantifier, but the quantifier doesn't really have anything to do with the logical computation going on here.

Also, (once again, artifact of the old code) I'm a little perplexed as to why equality matters here, because I don't feel like there really should be any overloads in the list with exactly the same types, so given two distinct signatures they should always be unequal.

removed 'any' and 'all'

why equality check? I think the nested loops below can compare one to itself (which can be modified, of course)

ezyang · 2020-11-05T16:00:26Z

tools/autograd/gen_python_functions.py

-                format(declaration['name']))
-        return name
+                del larger_than[j]
+                sorted_ids.append(j)


Also the old code's fault, but mutating lists as I iterate over them always gives me the jeebies

Also the old code's fault, but mutating lists as I iterate over them always gives me the jeebies

ok - I've simply changed it to iterating through index indirectly instead of sorted_ids directly - not sure it hurts readability :-)

ezyang · 2020-11-05T16:32:20Z

tools/autograd/gen_python_functions.py

-    """
-    grouped = defaultdict(dict)
+def group_overloads(
+    overloads: Tuple[PythonSignatureNativeFunctionPair, ...]


Nit: this is probably OK just being a Sequence

ok, changed all Tuple[PythonSignatureNativeFunctionPair, ...] to Sequence[PythonSignatureNativeFunctionPair]

ezyang · 2020-11-05T16:36:19Z

tools/autograd/gen_python_functions.py

+    for overload in overloads:
+        sig = overload.signature.signature_str(skip_outputs=True)
+        if overload.function.func.is_out_fn():
+            outplaces[sig] = overload


assert sig not in outplaces? Similar with bases below.

done:

RuntimeError: Found duplicated function definition: - addmv.out2(Tensor self, Tensor mat, Tensor vec, *, Scalar beta=1, Scalar alpha=1, Tensor(a!) out, Tensor(b!) out2) -> (Tensor(a!), Tensor(b!)). Existing definition: - addmv.out(Tensor self, Tensor mat, Tensor vec, *, Scalar beta=1, Scalar alpha=1, Tensor(a!) out) -> Tensor(a!).

RuntimeError: Found duplicated function definition: - addmv.dup(Tensor self, Tensor mat, Tensor vec, *, Scalar beta=1, Scalar alpha=2) -> Tensor. Existing definition: - addmv(Tensor self, Tensor mat, Tensor vec, *, Scalar beta=1, Scalar alpha=1) -> Tensor.

ezyang · 2020-11-05T16:40:15Z

tools/autograd/gen_python_functions.py

+    for sig, base in bases.items():
+        outplace = outplaces.get(sig)
+        grouped.append(PythonSignatureGroup(
+            # prefer the signature with optional out=... arguments


Nothing to do here, but I hate this. Will be kind of hard to fix though.

The opposite invariant (use the signature with functional arguments) would be easier to get to; I doubt there are many out= functions that don't have a functional version. Something that isn't so clear just looking at the comment here or in PythonSignatureGroup comments is why we need to prefer the out= variant.

Improved the comments at both places: because the one with the optional out arguments is the superset that can be used to parse input for both base and outplace.

ezyang · 2020-11-05T16:41:31Z

tools/autograd/gen_python_functions.py

-        def_template = PY_VARIABLE_METHOD_BINOP_DEF
+        # PyMethodDef entry for binary op, throws not implemented error
+        return f"""\
+{{"{name}", {pyfunc_cast}(TypeError_to_NotImplemented_<{pycname}>), {flags}, NULL}},"""


Thanks for inlining these, I appreciate it.

ezyang · 2020-11-05T16:45:43Z

tools/autograd/gen_python_functions.py

-        py_forwards.extend(forward_decls(name, overload_decls, is_python_method, module))
+        overloads = tuple(decl_to_signature_function_pair(decl, method=is_python_method)
+                          for decl in python_functions[name])
+        noarg = len(overloads) == 1 and overloads[0].signature.arguments_count() == 0


Note to self: wondering what a noarg is and why we have to compute here, rather than inside the functions.

no particular reason - just that this way we can avoid passing 'overloads' to 'method_def' / 'forward_decls' because this bit is what all they need.
I've changed it back to passing overloads instead to make it more uniform.

ezyang · 2020-11-05T16:48:52Z

tools/autograd/gen_python_functions.py

+
+@with_native_function
+def gen_namedtuple_typename_key(f: NativeFunction) -> str:
+    name = cpp.name(f.func)


cpp.name? Really?

Hmm, this is probably an internal name, judging from the algo below

ezyang · 2020-11-05T16:52:06Z

tools/autograd/gen_python_functions.py

+
+def emit_namedtuple_typedefs(
+    overloads: Tuple[PythonSignatureNativeFunctionPair, ...]
+) -> Tuple[List[str], Dict[str, str]]:


Getting pretty bad str blindness here

ezyang · 2020-11-05T16:54:24Z

tools/autograd/gen_python_functions.py

-    for decl in declarations:
-        fieldnames = namedtuple_fieldnames(decl)
-        if fieldnames == []:
-            decl['namedtuple_typeref'] = ''


GOOD RIDDANCE, ha!

ezyang · 2020-11-05T17:05:49Z

tools/autograd/gen_python_functions.py

-        typename = typenames.get(key)
+        name = cpp.name(overload.function.func)  # use @with_native_function?
+        tn_key = gen_namedtuple_typename_key(overload.function)
+        typename = typenames.get(tn_key)


It would make me feel better if there was some asserts here that would tell me that order of overloads doesn't matter for these initializations but I can't think of an easy way to actually do this.

ezyang · 2020-11-05T17:09:57Z

tools/autograd/gen_python_functions.py

-    method_footer = ['END_HANDLE_TH_ERRORS']
+    # NOTE: we type the unpacked self as Tensor not Variable to avoid return type
+    # discrepancies on method resolution (e.g. Variable::detach_ returns void
+    # rather than Tensor &)


BTW, this comment is defunct now, because Tensor and Variable are the same in C++ now. (No action on this PR, I'm pretty sure; this would be a codegen output change).

ok, I deleted this comment - not worth calling out anymore?

ezyang · 2020-11-05T18:22:26Z

tools/autograd/gen_python_functions.py

+if(check_has_torch_function(self_)) {{
+  return handle_torch_function(self_, "{name}");
+}}
+""" if method else ''


nit: the if method has the same indentation as if noarg so I nearly thought it was acutally on the next line haha. Might be easier to read if we just have another nested if block inside

… typed models" Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. Differential Revision: [D24589210](https://our.internmc.facebook.com/intern/diff/D24589210) [ghstack-poisoned]

facebook-github-bot · 2020-11-08T11:09:59Z

@ljk53 merged this pull request in 16c72a5.

[pytorch] continue to rewrite gen_python_functions.py with typed models

1f4c1ae

Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. [ghstack-poisoned]

This was referenced Oct 28, 2020

[pytorch] add script to run all codegen #46243

Closed

[pytorch] simplify tensor options logic in pybinding codegen #46976

Closed

[pytorch] get rid of cpp_type_str from pybind codegen #46977

Closed

Update on "[pytorch] continue to rewrite gen_python_functions.py with…

fae7216

… typed models" Refactored and added type annotations to the most part of the file. Some top-level codegen functions are called by other codegen scripts. Will migrate them in subsequent PRs. [ghstack-poisoned]

ljk53 requested review from bhosmer and ezyang October 28, 2020 09:14

ljk53 commented Oct 28, 2020

View reviewed changes

facebook-github-bot added the cla signed label Oct 30, 2020

This was referenced Nov 5, 2020

[pytorch] TraceType codegen tweak - newline before redispatch call #47436

Closed

[pytorch] factor out return_names api #47437

Closed

[pytorch] split out trace type generator and migrate to new codegen model #47438

Closed

ljk53 commented Nov 5, 2020

View reviewed changes

ljk53 mentioned this pull request Nov 5, 2020

Codegen - error when an argument that looks like an out argument isn't a kwarg (fix #43273) #47284

Closed

ezyang reviewed Nov 5, 2020

View reviewed changes

ezyang approved these changes Nov 6, 2020

View reviewed changes

ljk53 added 2 commits November 6, 2020 16:35

facebook-github-bot closed this in 16c72a5 Nov 8, 2020

facebook-github-bot added the Merged label Nov 8, 2020

facebook-github-bot deleted the gh/ljk53/183/head branch November 11, 2020 15:17

[pytorch] continue to rewrite gen_python_functions.py with typed models #46978

[pytorch] continue to rewrite gen_python_functions.py with typed models #46978

Conversation

ljk53 commented Oct 28, 2020 • edited Loading

dr-ci bot commented Oct 28, 2020 • edited Loading

💊 CI failures summary and remediations

🚧 1 ongoing upstream failure:

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ezyang commented Oct 29, 2020

Choose a reason for hiding this comment

ezyang commented Nov 5, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ezyang Nov 5, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

facebook-github-bot commented Nov 8, 2020

ljk53 commented Oct 28, 2020 •

edited

Loading

dr-ci bot commented Oct 28, 2020 •

edited

Loading

ezyang Nov 5, 2020 •

edited

Loading