add BroadcastIndexesRange #8864
Stack from ghstack (oldest at bottom):

🔗 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8864
❌ 1 New Failure as of commit 8ade738 with merge base 5814a3b.
// TODO: add optimization for particular input tensors not being
// broadcasted?
for (auto ii = output_dim_ - 1; ii >= 0; --ii) {
  // You might wonder what happens if output_shape_[ii] == 0. In that case,
Shouldn't we check for this before starting to iterate at all?
This comment is meant to explain why every caller already checks for that: in that case begin() == end(), and any loop over this range won't be entered. I'll see if I can make that a bit clearer.
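For illustration, a minimal sketch of the invariant being relied on (the test name and zero-size shapes here are hypothetical, not from this PR):

```cpp
// If any output dimension is 0, the output has zero elements, so the
// range is empty and the decrement loop above is never executed.
TEST(BroadcastIndexesRangeTest, ZeroSizeDimYieldsEmptyRange) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 0, 3});  // numel == 0
  Tensor in = tf.zeros({2, 1, 3});
  auto range = BroadcastIndexesRange<1>(out, in);
  EXPECT_EQ(range.begin(), range.end());
}
```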
  result[idx] = 0;
}
const auto t_sizes = t.sizes();
const auto t_strides = t.strides();
Does this take dim order into account?
I don't recall how dim_order affects strides and sizes. If the tests pass, either it works or we have no tests for dim_order support (which would mean it didn't work before this diff).
At a minimum, you should add checks for the dim order assumptions being made here; that is, it assumes whatever the default dim order is, nothing fancy. If tests with dim order are added in the future, this will at least fail gracefully rather than requiring a trip down the debugging rabbit hole. A sketch of such a check follows.
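A minimal sketch of such a guard, assuming a raw dim-order array; the helper name is illustrative, and ExecuTorch may already provide an equivalent:

```cpp
#include <cstddef>
#include <cstdint>

// Returns true iff dim_order is the identity permutation, i.e. the
// default (contiguous) dim order that the sizes/strides math assumes.
bool is_default_dim_order(const std::uint8_t* dim_order, std::size_t ndim) {
  for (std::size_t d = 0; d < ndim; ++d) {
    if (dim_order[d] != static_cast<std::uint8_t>(d)) {
      return false;  // permuted dim order: the assumption is violated
    }
  }
  return true;
}
```

A check like this could run once per tensor at construction time, so a non-default dim order fails loudly instead of silently producing wrong indexes.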
// output_dim. This is straightforwardly implementable with an
// adjusted stride array that contains 0s where the padded input
// shape would contain 1s.
std::array<ShapeType, kNumInputs> effective_input_broadcast_strides_ = {
I love this kNumInputs generalization. This is great!
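For readers following the thread, a minimal sketch of the padding-and-zero-stride idea described in the code comment above (the name and fixed-size signature are illustrative, not the PR's exact code):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Left-pad the input shape with 1s out to output_dim, then store
// stride 0 for every (padded) size-1 dimension so that broadcast
// dimensions never advance the input's linear index.
template <std::size_t kMaxDim>
std::array<std::int64_t, kMaxDim> effective_broadcast_strides(
    const std::array<std::int64_t, kMaxDim>& in_sizes,
    const std::array<std::int64_t, kMaxDim>& in_strides,
    std::size_t in_dim,        // number of real dims in the input
    std::size_t output_dim) {  // number of dims in the output, >= in_dim
  std::array<std::int64_t, kMaxDim> result{};  // leading pad dims stay 0
  const std::size_t pad = output_dim - in_dim;
  for (std::size_t d = 0; d < in_dim; ++d) {
    result[pad + d] = (in_sizes[d] == 1) ? 0 : in_strides[d];
  }
  return result;
}
```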
// [1, W] -> [H, W]
// [H, 1] -> [H, W]
// [H, W] -> [H, W]
// Cover all these at the same time to also exercise multiple input tensors.
You are also covering [1, 1] -> [H, W] and [W] -> [H, W] here; it would be good to mention those as well.
Good catch; I'll rename it to OneAndTwoDExhaustive.
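A hedged sketch of what such an exhaustive test might look like, assuming the range yields an index array with the output index first (per the class description):

```cpp
TEST(BroadcastIndexesRangeTest, OneAndTwoDExhaustive) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({3, 4});  // [H, W]
  // Every broadcastable input shape for an [H, W] output:
  Tensor in_w = tf.zeros({4});  // [W]
  Tensor in_1_1 = tf.zeros({1, 1});
  Tensor in_1_w = tf.zeros({1, 4});
  Tensor in_h_1 = tf.zeros({3, 1});
  Tensor in_h_w = tf.zeros({3, 4});
  for (const auto& indexes : BroadcastIndexesRange<5>(
           out, in_w, in_1_1, in_1_w, in_h_1, in_h_w)) {
    // The fully broadcast [1, 1] input never advances...
    EXPECT_EQ(indexes[2], 0);
    // ...and the same-shape input tracks the output index exactly.
    EXPECT_EQ(indexes[5], indexes[0]);
  }
}
```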
}

// Here we assume that the previous tests established that padding
// with leading 1s is working, and test:
You are testing 5 out of 8 possibilities here. You might as well add the remaining 3 (sketched after this list):
[1, 1, 1] -> [C, H, W]
[1, 1, W] -> [C, H, W] (this one in particular would be good to have, i.e. multiple leading ones)
[1, H, W] -> [C, H, W]
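A hedged sketch of those remaining cases (shapes from the list above; the modular assertions assume contiguous strides):

```cpp
TEST(BroadcastIndexesRangeTest, ThreeDRemainingCases) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 3, 4});  // [C, H, W]
  Tensor in_1_1_1 = tf.zeros({1, 1, 1});
  Tensor in_1_1_w = tf.zeros({1, 1, 4});
  Tensor in_1_h_w = tf.zeros({1, 3, 4});
  for (const auto& indexes : BroadcastIndexesRange<3>(
           out, in_1_1_1, in_1_1_w, in_1_h_w)) {
    EXPECT_EQ(indexes[1], 0);                     // [1, 1, 1] never advances
    EXPECT_EQ(indexes[2], indexes[0] % 4);        // [1, 1, W] cycles over W
    EXPECT_EQ(indexes[3], indexes[0] % (3 * 4));  // [1, H, W] cycles over H*W
  }
}
```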
EXPECT_EQ(expected, actual);
}

// 4-D should generalize, but we will go ahead and test:
this is great!
@swolchok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
// [1, C, 1, W] -> [N, C, H, W]
TEST(BroadcastIndexesRangeTest, FourDBroadcasting) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 3, 4, 5});
What about out = {2, 3, 1, 5} with a = {1, 3, 1, 5} and b = {2, 1, 1, 5}? Mainly I'm highlighting: if there is a size-1 dim in the output, is it taken care of? From a cursory look I presume the answer is yes, but I'm not sure.
> if there is a size-1 dim in the output, is it taken care of? From a cursory look I presume the answer is yes, but I'm not sure

Why would H == 1 be special? I'll add an exhaustive test for that for 1- and 2-D in a follow-up, just in case.
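For reference, a hedged sketch of the shapes raised above (assertions assume contiguous strides; the arithmetic just spells out where each input's linear index should land):

```cpp
TEST(BroadcastIndexesRangeTest, SizeOneOutputDim) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 3, 1, 5});
  Tensor a = tf.zeros({1, 3, 1, 5});
  Tensor b = tf.zeros({2, 1, 1, 5});
  for (const auto& indexes : BroadcastIndexesRange<2>(out, a, b)) {
    // a repeats every 3 * 1 * 5 = 15 output elements.
    EXPECT_EQ(indexes[1], indexes[0] % 15);
    // b advances by its 5-element W block once per 15 output elements
    // (the N coordinate), plus the W offset within the block.
    EXPECT_EQ(indexes[2], (indexes[0] / 15) * 5 + indexes[0] % 5);
  }
}
```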
See class comment. In brief, this adds an iterable range to make broadcasting ops convenient and efficient to implement.
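To make that concrete, a hedged usage sketch based on the description (the element layout, output index first followed by one index per input, is assumed from the class comment; data-pointer accessors may differ):

```cpp
// Inside a hypothetical binary op kernel, after the usual
// broadcast-compatibility and dtype checks:
const float* a_ptr = a.const_data_ptr<float>();
const float* b_ptr = b.const_data_ptr<float>();
float* out_ptr = out.mutable_data_ptr<float>();
for (const auto& indexes : BroadcastIndexesRange<2>(out, a, b)) {
  // indexes[0] walks out linearly; indexes[1] and indexes[2] are the
  // broadcast-aware linear indexes into a and b.
  out_ptr[indexes[0]] = a_ptr[indexes[1]] + b_ptr[indexes[2]];
}
```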