C++: Reuse even more `DataFlow::Node`s #14008

MathiasVP · 2023-08-21T12:09:01Z

In C/C++ dataflow we reuse some dataflow nodes that we know are semantically equivalent, both to gain a small amount of performance and to make it easier to select the right dataflow node in sources, sinks, barriers, etc. For example, consider the following snippet of IR:

r1(glval<char **>) = VariableAddress[a]          :
r2(char **)        = Load[a]                     : &:r1, m1
r3(glval<char *>)  = CopyValue                   : r2

For dataflow we have a node that represents &:r1 (i.e., the value of type char***), as well as nodes for the possible indirections of &:r1: *&:r1 (i.e., the value of type char**), **&:r1 (i.e., the value of type char*), and ***&:r1 (i.e., the value of type char).

Similarly, we have a node that represents the r2 operand (i.e., the value of type char**), as well as nodes for the possible indirections of r2: *r2 (i.e., the value of type char*), and **r2 (i.e., the value of type char).

However, some of these nodes represent semantically the same value. For example, both *&:r1 and r2 represent the char** value (i.e., the value you get when you dereference VariableAddress[a]).
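The equivalence can be illustrated in plain C++ rather than in IR. The following is only a sketch: the names `addr` and `loaded` are hypothetical stand-ins for &:r1 and r2, and the function is not part of the CodeQL library. It checks that each indirection of the address of `a` denotes the same value as the corresponding indirection of the loaded value.

```cpp
#include <cassert>

// Returns true when every indirection of the address of `a` agrees with the
// corresponding indirection of the value loaded from `a`.
bool indirections_agree() {
    char c = 'x';
    char *p = &c;
    char **a = &p;          // the variable `a` from the IR snippet
    char ***addr = &a;      // stand-in for &:r1 (type char***)
    char **loaded = *addr;  // stand-in for r2, the result of Load[a] (type char**)

    bool level1 = (*addr == loaded);        // *&:r1 vs r2: the same char**
    bool level2 = (**addr == *loaded);      // **&:r1 vs *r2: the same char*
    bool level3 = (&***addr == &**loaded);  // ***&:r1 vs **r2: the same char object
    return level1 && level2 && level3;
}

// Small self-check that can be called from any test driver.
void check_indirections() {
    assert(indirections_agree());
}
```

Comparing addresses at the last level shows that both expressions name the same `char` object, not merely two objects with equal contents.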

The getIRRepresentationOfIndirectOperand and getIRRepresentationOfIndirectInstruction predicates are used to identify such semantically identical representations.

Until now, however, getIRRepresentationOfIndirectOperand and getIRRepresentationOfIndirectInstruction only identified that *&:r1 and r2 represent the same value. But **&:r1 and *r2, as well as ***&:r1 and **r2, also represent the same values. This PR extends those predicates to identify these cases.
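As a rough mental model of the change (a toy sketch, not the QL implementation), a node can be thought of as a pair of an IR value and an indirection count. Everything below is hypothetical: `NodeKey`, `canonical`, and `max_shared_depth` do not exist in CodeQL, and choosing the &:r1-rooted node as the shared representative is an arbitrary assumption made for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical toy model: a node is an IR value name plus the number of
// indirections applied to it.
using NodeKey = std::pair<std::string, int>;

// Maps nodes rooted at the load result "r2" onto the equivalent node rooted
// at the loaded address "&:r1". `max_shared_depth` is the deepest indirection
// of r2 that may share a node: 0 models the behaviour before this PR, and a
// value covering all indirections models the behaviour after it.
NodeKey canonical(const NodeKey& node, int max_shared_depth) {
    if (node.first == "r2" && node.second <= max_shared_depth) {
        return {"&:r1", node.second + 1};
    }
    return node;
}

// Small self-check that can be called from any test driver.
void check_canonical() {
    // Before: only r2 itself is shared with *&:r1.
    assert(canonical({"r2", 0}, 0) == NodeKey("&:r1", 1));
    assert(canonical({"r2", 1}, 0) == NodeKey("r2", 1));
    // After: *r2 and **r2 are shared with **&:r1 and ***&:r1 as well.
    assert(canonical({"r2", 1}, 2) == NodeKey("&:r1", 2));
    assert(canonical({"r2", 2}, 2) == NodeKey("&:r1", 3));
}
```

In this model, fewer distinct keys survive canonicalization after the change, which corresponds to the deduplication of dataflow nodes that the PR aims for.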

This should (hopefully) give us a small speedup, a reduced memory footprint, and possibly even remove some FPs in queries in cases where we selected an equivalent (but until now not identical!) node to the one on the actual dataflow path.

MathiasVP · 2023-08-30T09:15:43Z

This PR is finally good to go. There are a couple of test changes in the internal repo that I need to accept as well. I'll do that once this PR has been reviewed.

DCA looks good. I've verified that all the lost results are instances of the test added in f65fe34.

geoffw0

The core change looks plausible to me. I can't comment on performance but the tests show considerable deduplication. 🎉

A few comments / questions...

cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaInternalsCommon.qll

cpp/ql/test/library-tests/dataflow/dataflow-tests/test_self_argument_flow.ql

cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaInternals.qll

geoffw0

LGTM.

DCA shows a speedup.

I've verified that all the lost results are instances of the test added in f65fe34.

👍

MathiasVP added 2 commits August 21, 2023 12:51

C++: Don't limit instruction and operand reuse to those cases where we have a result for 'isUseImpl'.

50190ef

C++: Don't consider additional loads when reusing dataflow operands.

c46f9e4

github-actions bot added the C++ label Aug 21, 2023

MathiasVP added the no-change-note-required This PR does not need a change note label Aug 21, 2023

MathiasVP added 4 commits August 21, 2023 14:02

C++: Accept more test changes.

ef9d342

Merge branch 'main' into reuse-even-more-nodes

bb1712b

C++: Add false positive caused by flowing back into a function after doing reverse reads.

f65fe34

C++: Fix FPs by making 'isArgumentOfCallable' more robust.

99cc417

MathiasVP force-pushed the reuse-even-more-nodes branch from 15c3a9e to 99cc417 Compare August 29, 2023 13:12

Merge branch 'main' into reuse-even-more-nodes

e4a11b8

MathiasVP marked this pull request as ready for review August 30, 2023 09:13

MathiasVP requested a review from a team as a code owner August 30, 2023 09:13

MathiasVP added the depends on internal PR This PR should only be merged in sync with an internal Semmle PR label Aug 30, 2023

C++: Accept more test changes.

b092da4

MathiasVP force-pushed the reuse-even-more-nodes branch from b42b8c9 to b092da4 Compare August 30, 2023 10:27

geoffw0 reviewed Aug 30, 2023

cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaInternalsCommon.qll

cpp/ql/test/library-tests/dataflow/dataflow-tests/test_self_argument_flow.ql

cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaInternals.qll

C++: Add QLDoc to 'isDereference'.

261ba8e

geoffw0 approved these changes Aug 30, 2023

MathiasVP merged commit 1159508 into github:main Aug 30, 2023

This was referenced Oct 4, 2023

C++: Use unique in hasIRRepresentationOfIndirectInstruction #14376

Merged

C++: Update for changes in frontend. #14135

Merged

MathiasVP mentioned this pull request Mar 7, 2025

C++: Share indirect dataflow nodes across CopyValue instructions #18955

Merged
