Closed
Description
tvm/src/arith/rewrite_simplify.cc, line 857 in 9f29e2a:
    // If all possible indices in ramp are the same.
I have a counterexample to this rewrite. It takes this code:
floormod((((threadIdx.x*50) + (b.i.fused.j.fused.inner.outer.inner*8)) + b.i.fused.j.fused.inner.inner.s_1), 20)
to this code:
ramp((floormod(((threadIdx.x*50) + (b.i.fused.j.fused.inner.outer.inner*8)), 20)*32), 32, 8)
The code determines the modular set to have base 0 and coeff 2, but when it computes ramp_max for lanes=8, it incorrectly concludes that ramp_min == ramp_max. Perhaps this is always a problem when coeff*lanes < c2val?
I can't provide a test case because it requires a modified version of vectorize_loop.cc, which I can't yet share.
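The suspected failure mode can be illustrated outside TVM with a small standalone Python sketch. It is not the code in rewrite_simplify.cc. It assumes the ramp has stride 1 and that the check compares floordiv of the modular-set base only, which is my reading of the description above. The helper names are hypothetical. With base 0, coeff 2, lanes=8 and c2val=20, the base-only check passes, yet actual bases such as 14 wrap around the modulus inside the ramp.

```python
# Standalone illustration; names are hypothetical and not TVM APIs.
# Assumption: stride 1, and the check uses only the modular-set base.

def floormod_ramp(base, stride, lanes, modulus):
    """Lane-wise floormod(ramp(base, stride, lanes), modulus)."""
    return [(base + i * stride) % modulus for i in range(lanes)]

def ramp_of_floormod(base, stride, lanes, modulus):
    """The rewritten form: ramp(floormod(base, modulus), stride, lanes)."""
    start = base % modulus
    return [start + i * stride for i in range(lanes)]

def base_only_check(mod_base, stride, lanes, modulus):
    """ramp_min == ramp_max, computed from the modular-set base alone."""
    ramp_min = mod_base // modulus
    ramp_max = (mod_base + (lanes - 1) * stride) // modulus
    return ramp_min == ramp_max

coeff, mod_base = 2, 0              # modular set: base values are 2*k + 0
stride, lanes, modulus = 1, 8, 20   # modulus plays the role of c2val

check_passes = base_only_check(mod_base, stride, lanes, modulus)

# Enumerate concrete bases in the modular set and find where the rewrite differs.
counterexamples = [
    b for b in range(mod_base, 200, coeff)
    if floormod_ramp(b, stride, lanes, modulus)
    != ramp_of_floormod(b, stride, lanes, modulus)
]

print("base-only check passes:", check_passes)
print("first counterexamples:", counterexamples[:3])
```

Under these assumptions the check is only sound when every value in the modular set lands at the same offset modulo c2val, for example when coeff is a multiple of c2val. A coeff of 2 against a modulus of 20 does not guarantee that.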