[Impeller] Fix 1-d grid computation for compute #42516
It looks like this pull request may not have tests. Please make sure to add tests before merging. If you need an exemption to this rule, contact Hixie on the #hackers channel in Chat (don't just cc him here, he won't see it! He's on Discord!). If you are not sure if you need tests, consider this rule of thumb: the purpose of a test is to make sure someone doesn't accidentally revert the fix. Ask yourself, is there anything in your PR that you feel it is important we not accidentally revert back to how it was before your fix? Reviewers: Read the Tree Hygiene page and make sure this patch meets those guidelines before LGTMing.
@@ -258,8 +258,10 @@ static bool Bind(ComputePassBindingsCache& pass,

   // Special case for linear processing.
   if (height == 1) {
-    int64_t threadGroups =
-        std::max(width / maxTotalThreadsPerThreadgroup, 1LL);
+    int64_t threadGroups = std::max(
Forgot these were ints so we were capped at 1.
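To illustrate the truncation issue, here is a minimal sketch comparing integer division with ceiling division for a 1-d grid. The helper names `ThreadGroupsTruncating` and `ThreadGroupsCeil` are hypothetical and are not taken from the patch; the replacement expression in the diff above is cut off, so this is not necessarily how the fix is written.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper mirroring the removed lines: integer division
// truncates, so any width below 2 * max_threads yields a single threadgroup.
int64_t ThreadGroupsTruncating(int64_t width, int64_t max_threads) {
  return std::max(width / max_threads, int64_t{1});
}

// Hypothetical helper: ceiling division, so a partially filled trailing
// threadgroup is still dispatched. Assumes width >= 0 and max_threads > 0.
int64_t ThreadGroupsCeil(int64_t width, int64_t max_threads) {
  return std::max((width + max_threads - 1) / max_threads, int64_t{1});
}
```

With a width of 1500 and 1024 threads per threadgroup, the truncating version dispatches one threadgroup and leaves 476 elements unprocessed, while the ceiling version dispatches two.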
Can you add a test for this?
Yes. Doing so required me to update the Vulkan compute shaders to provide a specialization constant that contains the actual workgroup size, as these need to agree with the value in compute_pass.vk for anything to work correctly.
…e into fix_grid_computation
…128158)
flutter/engine@8769e9c...5429372
2023-06-03 jonahwilliams@google.com [Impeller] Fix 1-d grid computation for compute (flutter/engine#42516)
2023-06-02 skia-flutter-autoroll@skia.org Roll Fuchsia Linux SDK from PuYA-6NVHeHPlkCdk... to VtLnfLmVda1_h1AtM... (flutter/engine#42529)
2023-06-02 chris@bracken.jp [macOS] Top-left origin for PlatformView container (flutter/engine#42523)
2023-06-02 skia-flutter-autoroll@skia.org Manual roll Dart SDK from 9d8df2a5210b to d198f84f5e4e (1 revision) (flutter/engine#42527)
2023-06-02 flar@google.com Revert "Reland "add non-rendering operation culling to DisplayListBuilder" (#41463)" (flutter/engine#42525)
2023-06-02 godofredoc@google.com Move benchmarks no upload to staging. (flutter/engine#42524)
2023-06-02 mdebbar@google.com [web] Support platform view creation params (flutter/engine#42255)
2023-06-02 goderbauer@google.com MultiView changes for dart:ui (flutter/engine#42493)
Also rolling transitive DEPS: fuchsia/sdk/core/linux-amd64 from PuYA-6NVHeHP to VtLnfLmVda1_
If this roll has caused a breakage, revert this CL and stop the roller using the controls here: https://autoroll.skia.org/r/flutter-engine-flutter-autoroll
Please CC jonahwilliams@google.com,rmistry@google.com,zra@google.com on the revert to ensure that a human is aware of the problem.
To file a bug in Flutter: https://github.com/flutter/flutter/issues/new/choose
To report a problem with the AutoRoller itself, please file a bug: https://bugs.chromium.org/p/skia/issues/entry?template=Autoroller+Bug
Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+doc/main/autoroll/README.md
Note that the 2d grid case is still incorrect. Consider: the grid size should be the number of compute units required, but the threadgroup size is a minimum number of compute units.
If I need to process a 50x50 image, I should be able to set a grid size of 50x50. Since the minimum threadgroup size is probably bigger (say 1024), this should turn into one dispatch of size (1, 1, 1). However, with the current implementation, we will make a dispatch of (50, 50, 1), which essentially squares the amount of work by launching one threadgroup per unit of compute.
The correct implementation for 2d compute should take the mod of each grid dimension with the threadgroup size in that dimension. I did not fix this case as we do not have a use for 2d compute yet.
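One possible reading of the 2d case described above is to divide each grid dimension by the threadgroup size in that dimension, rounding up. The sketch below assumes that interpretation; `GridSize`, `CeilDiv`, and `ThreadGroupCount` are hypothetical names, the 32x32 threadgroup shape is an assumption, and none of this is code from the engine.

```cpp
#include <cstdint>

// Hypothetical 2-d extent, used for both the grid and the threadgroup shape.
struct GridSize {
  int64_t x;
  int64_t y;
};

// Ceiling division for non-negative n and positive d.
constexpr int64_t CeilDiv(int64_t n, int64_t d) {
  return (n + d - 1) / d;
}

// Number of threadgroups needed per dimension to cover the whole grid.
// This is an assumed approach, not the engine's implementation.
GridSize ThreadGroupCount(GridSize grid, GridSize threadgroup) {
  return {CeilDiv(grid.x, threadgroup.x), CeilDiv(grid.y, threadgroup.y)};
}
```

For a 50x50 grid with an assumed 32x32 threadgroup (1024 threads), this yields 2x2 threadgroups rather than the 50x50 threadgroups produced by dispatching one threadgroup per grid element. The shader would still need to bounds-check, since the last threadgroup in each dimension is only partially filled.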