add BroadcastIndexesRange #8864
Stack from ghstack (oldest at bottom):

🔗 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8864
❌ 1 New Failure as of commit 8ade738 with merge base 5814a3b.
// TODO: add optimization for particular input tensors not being
// broadcasted?
for (auto ii = output_dim_ - 1; ii >= 0; --ii) {
  // You might wonder what happens if output_shape_[ii] == 0. In that case,
Shouldn't we check for this before starting to iterate at all?
This comment is meant to explain why every caller already checks for that: in that case begin() == end(), and any loop over this range won't be entered. I'll see if I can make that a bit clearer.
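For illustration, a minimal sketch of the invariant being relied on (the test name and zero-size shapes here are hypothetical, not from this PR):

```cpp
// If any output dimension is 0, the output has zero elements, so the
// range is empty and the decrement loop above is never executed.
TEST(BroadcastIndexesRangeTest, ZeroSizeDimYieldsEmptyRange) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 0, 3});  // numel == 0
  Tensor in = tf.zeros({2, 1, 3});
  auto range = BroadcastIndexesRange<1>(out, in);
  EXPECT_EQ(range.begin(), range.end());
}
```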
  result[idx] = 0;
}
const auto t_sizes = t.sizes();
const auto t_strides = t.strides();
Does this take dim order into account?
I don't recall how dim_order affects strides and sizes. If the tests pass, either it works or we have no tests for dim_order support (which would mean it didn't work before this diff).
At a minimum, you should add checks for the dim order assumptions being made here; that is, it assumes whatever the default dim order is, nothing fancy. If tests with dim order are added in the future, this will at least fail gracefully rather than requiring a trip down the debugging rabbit hole. A sketch of such a check follows.
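A minimal sketch of such a guard, assuming a raw dim-order array; the helper name is illustrative, and ExecuTorch may already provide an equivalent:

```cpp
#include <cstddef>
#include <cstdint>

// Returns true iff dim_order is the identity permutation, i.e. the
// default (contiguous) dim order that the sizes/strides math assumes.
bool is_default_dim_order(const std::uint8_t* dim_order, std::size_t ndim) {
  for (std::size_t d = 0; d < ndim; ++d) {
    if (dim_order[d] != static_cast<std::uint8_t>(d)) {
      return false;  // permuted dim order: the assumption is violated
    }
  }
  return true;
}
```

A check like this could run once per tensor at construction time, so a non-default dim order fails loudly instead of silently producing wrong indexes.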
// output_dim. This is straightforwardly implementable with an
// adjusted stride array that contains 0s where the padded input
// shape would contain 1s.
std::array<ShapeType, kNumInputs> effective_input_broadcast_strides_ = {
I love this kNumInputs generalization. This is great!
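For readers following the thread, a minimal sketch of the padding-and-zero-stride idea described in the code comment above (the name and fixed-size signature are illustrative, not the PR's exact code):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Left-pad the input shape with 1s out to output_dim, then store
// stride 0 for every (padded) size-1 dimension so that broadcast
// dimensions never advance the input's linear index.
template <std::size_t kMaxDim>
std::array<std::int64_t, kMaxDim> effective_broadcast_strides(
    const std::array<std::int64_t, kMaxDim>& in_sizes,
    const std::array<std::int64_t, kMaxDim>& in_strides,
    std::size_t in_dim,        // number of real dims in the input
    std::size_t output_dim) {  // number of dims in the output, >= in_dim
  std::array<std::int64_t, kMaxDim> result{};  // leading pad dims stay 0
  const std::size_t pad = output_dim - in_dim;
  for (std::size_t d = 0; d < in_dim; ++d) {
    result[pad + d] = (in_sizes[d] == 1) ? 0 : in_strides[d];
  }
  return result;
}
```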
// [1, W] -> [H, W]
// [H, 1] -> [H, W]
// [H, W] -> [H, W]
// Cover all these at the same time to also exercise multiple input tensors.
You are also covering [1, 1] -> [H, W] and [W] -> [H, W] here; it would be good to mention those as well.
Good catch; I'll rename it to OneAndTwoDExhaustive.
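A hedged sketch of what such an exhaustive test might look like, assuming the range yields an index array with the output index first (per the class description):

```cpp
TEST(BroadcastIndexesRangeTest, OneAndTwoDExhaustive) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({3, 4});  // [H, W]
  // Every broadcastable input shape for an [H, W] output:
  Tensor in_w = tf.zeros({4});  // [W]
  Tensor in_1_1 = tf.zeros({1, 1});
  Tensor in_1_w = tf.zeros({1, 4});
  Tensor in_h_1 = tf.zeros({3, 1});
  Tensor in_h_w = tf.zeros({3, 4});
  for (const auto& indexes : BroadcastIndexesRange<5>(
           out, in_w, in_1_1, in_1_w, in_h_1, in_h_w)) {
    // The fully broadcast [1, 1] input never advances...
    EXPECT_EQ(indexes[2], 0);
    // ...and the same-shape input tracks the output index exactly.
    EXPECT_EQ(indexes[5], indexes[0]);
  }
}
```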
}

// Here we assume that the previous tests established that padding
// with leading 1s is working, and test:
You are testing 5 out of 8 possibilities here. You might as well add the remaining 3 (sketched after this list):
[1, 1, 1] -> [C, H, W]
[1, 1, W] -> [C, H, W] (this one in particular would be good to have, i.e. multiple leading ones)
[1, H, W] -> [C, H, W]
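A hedged sketch of those remaining cases (shapes from the list above; the modular assertions assume contiguous strides):

```cpp
TEST(BroadcastIndexesRangeTest, ThreeDRemainingCases) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 3, 4});  // [C, H, W]
  Tensor in_1_1_1 = tf.zeros({1, 1, 1});
  Tensor in_1_1_w = tf.zeros({1, 1, 4});
  Tensor in_1_h_w = tf.zeros({1, 3, 4});
  for (const auto& indexes : BroadcastIndexesRange<3>(
           out, in_1_1_1, in_1_1_w, in_1_h_w)) {
    EXPECT_EQ(indexes[1], 0);                     // [1, 1, 1] never advances
    EXPECT_EQ(indexes[2], indexes[0] % 4);        // [1, 1, W] cycles over W
    EXPECT_EQ(indexes[3], indexes[0] % (3 * 4));  // [1, H, W] cycles over H*W
  }
}
```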
EXPECT_EQ(expected, actual);
}

// 4-D should generalize, but we will go ahead and test:
this is great!
@swolchok has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
// [1, C, 1, W] -> [N, C, H, W]
TEST(BroadcastIndexesRangeTest, FourDBroadcasting) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 3, 4, 5});
What about out = {2, 3, 1, 5} with a = {1, 3, 1, 5} and b = {2, 1, 1, 5}? Mainly I'm highlighting: if there is a size-1 dim in the output, is it taken care of? From a cursory look I presume the answer is yes, but I'm not sure.
> if there is a size-1 dim in the output, is it taken care of? From a cursory look I presume the answer is yes, but I'm not sure

Why would H == 1 be special? I'll add an exhaustive test for that for 1- and 2-D in a follow-up, just in case.
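For reference, a hedged sketch of the shapes raised above (assertions assume contiguous strides; the arithmetic just spells out where each input's linear index should land):

```cpp
TEST(BroadcastIndexesRangeTest, SizeOneOutputDim) {
  TensorFactory<ScalarType::Int> tf;
  Tensor out = tf.zeros({2, 3, 1, 5});
  Tensor a = tf.zeros({1, 3, 1, 5});
  Tensor b = tf.zeros({2, 1, 1, 5});
  for (const auto& indexes : BroadcastIndexesRange<2>(out, a, b)) {
    // a repeats every 3 * 1 * 5 = 15 output elements.
    EXPECT_EQ(indexes[1], indexes[0] % 15);
    // b advances by its 5-element W block once per 15 output elements
    // (the N coordinate), plus the W offset within the block.
    EXPECT_EQ(indexes[2], (indexes[0] / 15) * 5 + indexes[0] % 5);
  }
}
```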
See class comment. In brief, this adds an iterable range to make broadcasting ops convenient and efficient to implement.
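To make that concrete, a hedged usage sketch based on the description (the element layout, output index first followed by one index per input, is assumed from the class comment; data-pointer accessors may differ):

```cpp
// Inside a hypothetical binary op kernel, after the usual
// broadcast-compatibility and dtype checks:
const float* a_ptr = a.const_data_ptr<float>();
const float* b_ptr = b.const_data_ptr<float>();
float* out_ptr = out.mutable_data_ptr<float>();
for (const auto& indexes : BroadcastIndexesRange<2>(out, a, b)) {
  // indexes[0] walks out linearly; indexes[1] and indexes[2] are the
  // broadcast-aware linear indexes into a and b.
  out_ptr[indexes[0]] = a_ptr[indexes[1]] + b_ptr[indexes[2]];
}
```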