-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C++: Fix join order in cpp/missing-check-scanf
#12061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The issues were: * `revFlow`: `revFlow` joins `fwdFlow` on `vn`. * `Node.getASuccessor()`: `MkNode` self-join on `vn`. * `hasFlow/5`: `MkNode` self-join on `vn`.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment. LGTM otherwise! Do you have some before/after tuple counts?
@@ -100,7 +100,7 @@ private predicate fwdFlow(Instruction instr, ValueNumber vn) { | |||
*/ | |||
pragma[nomagic] | |||
predicate revFlow(Instruction instr, ValueNumber vn) { | |||
fwdFlow(instr, vn) and | |||
fwdFlow(instr, pragma[only_bind_out](vn)) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have imagined that the right join order here would be to always start with fwdFlow
since there are most likely a lot more sinks for this query than there are sources (although I might be wrong on this). But this still allows revFlow -> getASuccessor -> fwdFlow
, right? Is that the intention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way you get (revFlow#delta minus isBarrier) join getASuccessor on succ
, whereas before it was (revFlow#delta minus isBarrier) join fwdFlow on vn
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, my understanding was we wanted an early join with getASuccessor
, but we were happy to let the optimizer pick what it joined it with (fwdFlow
or revFlow
). As long as it isn't joining fwdFlow
with revFlow
any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can totally see why joining on vn
would be bad. But is it not even better to start with fwdFlow
? So that we get something like fwdFlow -> getASuccessor -> revFlow#delta
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you start optimizing join orders that aren't problematic any more (rather than let the optimizer do its job), you open a very, very large can of worms...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... though if you can do it still with just one pragma
, I will be happy with that. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just suggesting that we try the alternative join order and look at the tuple counts on the projects you've been checking today. Now that we're modifying the order we might as well get the right one the first time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I was too impatient to run the whole query "before" due to how long it took, so I don't have before/after tuple counts :P But for example, quick-eval'ing revFlow
went from 7 minutes down to 8 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally understandable! I often also just post a partial "before" run (for example here) since it's often not even possible to fully evaluate the problematic pipeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this alternative join order explored at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes were done in a pair programming session (though @d10c deserves far more credit than I do). I'm happy with the three fixes and saw their effects first hand.
We were focussed on two problematic projects in particular, it probably makes sense to do a DCA run to check performance on more typical projects as well.
The issues were:
revFlow
:revFlow
joinsfwdFlow
onvn
.Node.getASuccessor()
:MkNode
self-join onvn
.hasFlow/5
:MkNode
self-join onvn
.