[ET-VK][7/n] Slice, with lots of codegen improvements #3171
Conversation
1. Add the slice operation. Instead of using copy as in LI, we implement a simple shader with offsets.
2. Improvements in codegen:
   - add support for optional variables
   - improve the indentation of the generated code, for better readability
   - allow the user to specify how tensor values are generated, so sequential values can be produced for easier debugging of index operations
   - improve the sample code's test-case specification, particularly for long and optional values

Differential Revision: [D56295985](https://our.internmc.facebook.com/intern/diff/D56295985/)
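As a rough illustration of the index mapping the slice shader applies (a minimal Python sketch; the function name and host-side code are hypothetical, not taken from the actual shader), each output position along the sliced dimension maps back to an input position via an offset and a step:

```python
# Minimal sketch of the index mapping the slice shader applies along the
# sliced dimension (illustrative Python; the real implementation is a shader).
def slice_input_index(out_idx: int, offset: int, step: int) -> int:
    # Output element out_idx reads the input at offset + out_idx * step.
    return offset + out_idx * step

# Example: offset=1, step=2 over a length-6 dimension selects indices 1 and 3.
assert [slice_input_index(i, offset=1, step=2) for i in range(2)] == [1, 3]
```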
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/3171. Note: links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 73379b8 with merge base fa433cb. This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D56295985
This pull request has been merged in 7469a28.
## The Operator

`nn.Module` invocations of [`torch.index_select`](https://pytorch.org/docs/stable/generated/torch.index_select.html) are compiled to `aten.index_select.default` in the Edge Dialect, which carries the following signature.

```
- func: index_select(Tensor self, int dim, Tensor index) -> Tensor
```

## Implementation

This is a C-packing-only implementation. It is very similar to `aten.slice` (#3171):

```
- func: slice.Tensor(Tensor(a) self, int dim=0, SymInt? start=None, SymInt? end=None, SymInt step=1) -> Tensor(a)
```

It features a similar split between a shader for N, H, W and a shader for C, because copying from the C-dimension is more difficult due to C-packing.

Both `index_select` and `slice` copy specific indices across one dimension; the difference is in how those indices are specified:

- `slice` specifies them with three scalars, e.g. `start=1`/`end=5`/`step=2` selects indices `1,3`.
- `index_select` lists the exact indices in a tensor, e.g. `index=torch.tensor([1,3])`.

Hence, `slice` uses an offset (`offset=1`) and a step (`step=2`) to compute each input position, whereas `index_select` reads the index tensor to compute the input position.

Differential Revision: [D57745489](https://our.internmc.facebook.com/intern/diff/D57745489/)
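For illustration, a minimal PyTorch sketch of the two indexing styles described above; the values follow the example in the description (`start=1`/`end=5`/`step=2` versus `index=torch.tensor([1,3])`):

```python
import torch

x = torch.arange(6.0)  # tensor([0., 1., 2., 3., 4., 5.])

# slice: indices are implied by the start/end/step scalars.
# The input position for output element i is start + i * step.
sliced = x[1:5:2]
print(sliced)  # tensor([1., 3.])

# index_select: the exact indices are listed in a tensor,
# which must be read to compute each input position.
idx = torch.tensor([1, 3])
selected = torch.index_select(x, 0, idx)
print(selected)  # tensor([1., 3.])
```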