-
Notifications
You must be signed in to change notification settings - Fork 247
Aggressively prune no-side-effect instructions during DCE. #691
Conversation
khyperia
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Neat! I think I'd normally be a little meh about this, since I'm trying (and partially failing) to keep rust-gpu's passes strictly for correctness rather than optimization, and leave the optimization passes to spirv-opt, but if you opt for my comment about unused in #690 then this is actually needed for correctness (to nuke the invalid composites).
I think the android CI failure should be fixed by #701, could you rebase?
| any |= root(inst, rooted); | ||
| } else if let Some(id) = inst.result_id { | ||
| if rooted.contains(&id) { | ||
| any |= root(inst, rooted); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hrm, this might be dreadfully inefficient, some profiling might be nice here! (how many times spread_roots is called before/after this change). It's probably way more efficient to iterate over instructions backwards instead of forwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, just checked, when running the multibuilder example on main, dce is called three times, and in each of those, spread_roots is called [3, 4, 1] times. With this PR, it's [8, 8, 5], a significant cost (definitely not zero-cost, especially with the last one being a no-op). But, reversing the iteration order of the function's instructions, it drops back down to [3, 4, 1].
It'll still likely be worse than the original in some cases (loops etc.), so it's not practically zero-cost, but still should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. So do the reversing of iteration and write a comment saying why it is so?
Since we're walking all the instructions anyway, it's practically zero-cost.
This allows to root more instructions per `spread_roots` invocation, becoming zero-cost in absence of loops.
4be655b to
120717c
Compare
Since we're walking all the instructions anyway, it's practically
zero-cost.