Change the elementwise broadcasting contract from graph to kernel #3894
Summary:
Currently, a graph-level pass handles limited broadcasting of elementwise ops when the input tensors are not the same size. This diff moves that responsibility down to the kernels, which is how ET and the portable ops do it. For now, only `add`, `sub`, `mul`, and `div` are affected, but more will follow. We retain the implementations for the reference kernels, because we want to avoid linking the portable ops directly, which takes forever at compile time. We can also use a much smaller set of types (basically only `float`).

This change lets us remove a hack in the RNNT Joiner and run it natively. It takes a huge hit in performance, which will be fixed by getting broadcast-friendly kernels from Cadence.
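To make the contract concrete, here is a minimal Python sketch of the kind of limited, right-aligned broadcasting an elementwise kernel now has to implement itself (compute the broadcast output shape, then map each output index back to a source element, clamping size-1 dims). All function names are hypothetical and do not reflect the actual ET kernel API; real kernels work over flat buffers and strides in C++.

```python
from itertools import product, zip_longest

def broadcast_shape(a_shape, b_shape):
    """Right-align the two shapes; each dim pair must match or one must be 1."""
    out = []
    for x, y in zip_longest(reversed(a_shape), reversed(b_shape), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"cannot broadcast {a_shape} with {b_shape}")
        out.append(max(x, y))
    return tuple(reversed(out))

def _flat_offset(shape, out_idx):
    """Map an output index to a flat offset into a (possibly smaller) input."""
    pad = len(out_idx) - len(shape)  # input shape is right-aligned
    strides, stride = [], 1
    for d in reversed(shape):        # row-major strides
        strides.append(stride)
        stride *= d
    strides.reverse()
    off = 0
    for s, d, i in zip(strides, shape, out_idx[pad:]):
        off += s * (i if d != 1 else 0)  # clamp broadcast (size-1) dims
    return off

def elementwise_add(a_flat, a_shape, b_flat, b_shape):
    """Broadcasting add over flat float buffers, as a kernel would do it."""
    out_shape = broadcast_shape(a_shape, b_shape)
    out = [a_flat[_flat_offset(a_shape, idx)] + b_flat[_flat_offset(b_shape, idx)]
           for idx in product(*(range(d) for d in out_shape))]
    return out, out_shape
```

For example, adding a `(3,)` tensor to a `(2, 1)` tensor yields a `(2, 3)` result, with each row of the first operand offset by one element of the second.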
We finally remove the binop tests in `test_aten_ops.py`, which also used strange types and had been on the chopping block for a while.

Differential Revision: D58207691