-
Notifications
You must be signed in to change notification settings - Fork 1.9k
C++: Reuse even more DataFlow::Nodes
#14008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
15c3a9e to
99cc417
Compare
|
This PR is finally good to go. There's a couple of test changes in the internal repo that I need to accept as well. I'll do that once this PR has been reviewed. DCA looks good. I've verified that all the lost results are instances of the test added in f65fe34. |
b42b8c9 to
b092da4
Compare
geoffw0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The core change looks plausible to me. I can't comment on performance but the tests show considerable deduplication. 🎉
A few comments / questions...
cpp/ql/test/library-tests/dataflow/dataflow-tests/test_self_argument_flow.ql
Show resolved
Hide resolved
geoffw0
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
DCA shows a speedup.
I've verified that all the lost results are instances of the test added in f65fe34.
👍
In C/C++ dataflow we reuse some dataflow nodes that we know are sementically equivalent to gain a small amount of performance, and to make it easier to select the right dataflow node in sources, sinks, barriers, etc. For example, consider the following snippet of IR:
For dataflow we have a node that represents
&:r1(i.e, the value of typechar***), as well as nodes for the possible indirections of&:r1:*&:r1(i.e., the value of typechar**),**&:r1(i.e, the value of typechar*), and***&:r1(i.e., the value of typechar).Similarly, we have a node that represents the
r2operand (i.e., the value of typechar**), as well as nodes for the possible indirections ofr2:*r2(i.e., the value of typechar*), and**r2(i.e., the value of typechar).However, some of these nodes represent the semantically same value. For example, both
*&:r1andr2represent thechar**value (i.e., the value you get when you dereferenceVariableAddress[a]).The
getIRRepresentationOfIndirectOperandandgetIRRepresentationOfIndirectInstructionpredicates are used to identify such semantically identical representations.Currently, however,
getIRRepresentationOfIndirectOperandandgetIRRepresentationOfIndirectInstructiononly identified that*&:r1andr2represented the same value. But**&:r1and*r2, and***&:r1and**r2also represent the same values. This PR extends those predicates to identify these cases.This should (hopefully) give us a small speedup, a reduced memory footprint, and possible even remove some FPs in queries in cases where we selected a equivalent (but until now not identical!) node to the one on the actual dataflow path.