The logic was misguided. It was based on the idea that if
using max-work-group-size can lead to launching just a
single work-group, then we can reduce everything within
the work-group and avoid atomics altogether.
This led to problems on CPU, where max-work-group-size is 8192:
max-work-group-size was selected, but the total number of
work-groups launched was still high due to the large iteration
space size, and this resulted in severe underutilization of the
device (low occupancy).
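
The arithmetic behind this can be sketched roughly as follows. This is
an illustration only, not the code touched by this commit: it assumes a
reduction in which each of iter_nelems independent outputs reduces
reduction_nelems elements, and all function and parameter names are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: work-groups needed per output element when each
// work-group reduces up to wg_size input elements (ceiling division).
constexpr std::size_t groups_per_output(std::size_t reduction_nelems,
                                        std::size_t wg_size)
{
    return (reduction_nelems + wg_size - 1) / wg_size;
}

// The condition the removed heuristic is assumed to have relied on:
// a single work-group covers the whole reduction, so no atomics are needed.
constexpr bool single_group_per_output(std::size_t reduction_nelems,
                                       std::size_t wg_size)
{
    return groups_per_output(reduction_nelems, wg_size) == 1;
}

// Total work-groups launched over the whole iteration space.
constexpr std::size_t total_groups(std::size_t iter_nelems,
                                   std::size_t reduction_nelems,
                                   std::size_t wg_size)
{
    return iter_nelems * groups_per_output(reduction_nelems, wg_size);
}

// Example with made-up sizes: the reduction fits in one work-group of
// 8192 items, yet a large iteration space still launches many groups.
inline void example()
{
    assert(single_group_per_output(8000, 8192));
    assert(total_groups(100000, 8000, 8192) == 100000);
}
```

With these made-up sizes, the "single work-group" condition holds for
each output, yet 100000 work-groups of 8192 work-items are launched.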