Commit c7cb4ac
committed
[Fix][TIR] LowerThreadAllreduce with correct thread mask
This PR fixes a bug in the LowerThreadAllreduce pass.
Prior to this PR, in multi-group settings, the thread mask is not
correctly set: when the reduction extent is 32, the thread mask will
always be 0. This bug was not spotted because even when the mask is 0,
the CUDA program still gives correct result. But in any way, having
the zero mask is dangerous and should be fixed.1 parent b6502f4 commit c7cb4ac
File tree
2 files changed
+70
-5
lines changed- src/tir/transforms
- tests/python/unittest
2 files changed
+70
-5
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
333 | 333 | | |
334 | 334 | | |
335 | 335 | | |
336 | | - | |
337 | | - | |
| 336 | + | |
| 337 | + | |
338 | 338 | | |
339 | 339 | | |
340 | 340 | | |
| |||
392 | 392 | | |
393 | 393 | | |
394 | 394 | | |
395 | | - | |
| 395 | + | |
396 | 396 | | |
397 | 397 | | |
398 | 398 | | |
| |||
405 | 405 | | |
406 | 406 | | |
407 | 407 | | |
408 | | - | |
| 408 | + | |
409 | 409 | | |
410 | 410 | | |
411 | 411 | | |
| |||
669 | 669 | | |
670 | 670 | | |
671 | 671 | | |
672 | | - | |
| 672 | + | |
673 | 673 | | |
674 | 674 | | |
675 | 675 | | |
| |||
Lines changed: 65 additions & 0 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
235 | 235 | | |
236 | 236 | | |
237 | 237 | | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
| 264 | + | |
| 265 | + | |
| 266 | + | |
| 267 | + | |
| 268 | + | |
| 269 | + | |
| 270 | + | |
| 271 | + | |
| 272 | + | |
| 273 | + | |
| 274 | + | |
| 275 | + | |
| 276 | + | |
| 277 | + | |
| 278 | + | |
| 279 | + | |
| 280 | + | |
| 281 | + | |
| 282 | + | |
| 283 | + | |
| 284 | + | |
| 285 | + | |
| 286 | + | |
| 287 | + | |
| 288 | + | |
| 289 | + | |
| 290 | + | |
| 291 | + | |
| 292 | + | |
| 293 | + | |
| 294 | + | |
| 295 | + | |
| 296 | + | |
| 297 | + | |
| 298 | + | |
| 299 | + | |
| 300 | + | |
| 301 | + | |
| 302 | + | |
238 | 303 | | |
239 | 304 | | |
0 commit comments