C++: Better value numbering support for loading fields in IR #2772

MathiasVP · 2020-02-05T16:45:36Z

This PR improves the IR value numbering library such that the two a->f expressions are assigned the same value number in the example below:

void foo(A* a) {
  int x = a->f;
  int y = a->f;
}
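A rough sketch of the idea, for readers unfamiliar with value numbering: two loads can share a number when their result type, the value number of their address, and the value number of the memory definition they read all agree. The `ToyValueNumbering` class and its integer keys below are hypothetical illustration only; they are not the QL implementation in this PR.

```cpp
#include <map>
#include <string>
#include <tuple>
#include <utility>

// Hypothetical toy model, NOT the CodeQL library.
// A load is keyed by (result type, value number of the address,
// value number of the memory definition being read).
using LoadKey = std::tuple<std::string, int, int>;

class ToyValueNumbering {
public:
    // Returns the number already assigned to a congruent load,
    // or allocates a fresh number if this key has not been seen.
    int numberLoad(const std::string& type, int addressVN, int memoryDefVN) {
        LoadKey key(type, addressVN, memoryDefVN);
        auto it = table_.find(key);
        if (it != table_.end()) {
            return it->second;
        }
        int fresh = next_++;
        table_.insert(std::make_pair(key, fresh));
        return fresh;
    }

private:
    std::map<LoadKey, int> table_;
    int next_ = 0;
};
```

In this toy model, the two `a->f` loads in `foo` would call `numberLoad` with the same three arguments (assuming nothing writes to the field between them) and so receive the same number. A load that follows an intervening store would see a different memory definition and get a fresh one.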

jbj

I've looked through the changes to the existing tests, and they look fantastic! I haven't looked through the new tests.

@dbartol or @rdmarsh2, can you vouch for the soundness of these changes?

jbj · 2020-02-05T18:13:45Z

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/gvn/ValueNumbering.qll

+  ValueNumber operand
+) {
+  instr.getEnclosingIRFunction() = irFunc and
+  instr.getResultIRType() = type and


Are there test cases where the type is required to avoid conflating values? If so, I'd be surprised if those cases can't be tweaked so the type is the same.

Off the top of my head:

double foo(void* p, bool b) {
  if (b) {
    return *(int*)p;
  } else {
    return *(float*)p;
  }
}

I'm not sure we'd treat the results of the two casts as congruent today, but we probably should. However, we would not want the results of the two indirect loads to be congruent, even though their address operands were congruent.

I included the type mainly because all the other constructors did so. I will take a further look tomorrow and see if it actually makes a difference.
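To make the concern in this thread concrete, here is a small sketch of why two loads from congruent addresses should not be treated as the same value when their result types differ. The helper names `loadAsInt` and `loadAsFloat` are made up for illustration, and the sketch assumes a 32-bit IEEE 754 `float`. It uses `memcpy` rather than the pointer casts above to stay clear of aliasing problems.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers: read the same four bytes under two different types.

// Interprets the bytes at p as a 32-bit integer, then widens to double.
inline double loadAsInt(const void* p) {
    std::int32_t i;
    std::memcpy(&i, p, sizeof i);
    return static_cast<double>(i);
}

// Interprets the bytes at p as a float, then widens to double.
inline double loadAsFloat(const void* p) {
    float f;
    std::memcpy(&f, p, sizeof f);
    return static_cast<double>(f);
}
```

Given a `float` holding `1.0f`, `loadAsFloat` yields `1.0` while `loadAsInt` yields the integer value of the underlying bit pattern, so the two loads produce different results even though they read the same address.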

cpp/ql/test/library-tests/valuenumbering/GlobalValueNumbering/ir_gvn.expected

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/gvn/ValueNumbering.qll

dbartol · 2020-02-05T20:49:35Z

This PR is great. This is exactly how I'd expect this to be implemented.

jbj · 2020-02-06T09:42:44Z

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/gvn/ValueNumbering.qll

+  instr.getEnclosingIRFunction() = irFunc and
+  instr.getResultIRType() = type and
+  valueNumber(instr.getAnOperand().(MemoryOperand).getAnyDef()) = memOperand and
+  valueNumberOfOperand(instr.getAnOperand().(AddressOperand)) = operand


What CopyInstruction types actually have an AddressOperand? I believe this PR is intended for LoadInstructions, but the implementation includes all CopyInstructions.

The CopyValueInstruction doesn't have an AddressOperand, so it'll be missing from this predicate even though it's present in numberableInstruction, which means it'll have no value number at all, not even a unique one. I'm guessing that's not a problem in practice since CopyValueInstructions read from register operands rather than memory operands, and so they must always exactly overlap.

The StoreInstruction has an AddressOperand, but it's used for a different purpose. It doesn't have a MemoryOperand, though, so it'll be missing from this predicate. That's okay in practice for the same reasons as CopyValueInstruction.

The QLDoc on LoadOperand says that this operand type is also used on ReturnValueInstruction and ThrowValueInstruction. Is there something to be done for those instructions too? I don't think there is, because those instructions don't return anything. Their loads are always exact, so they won't be relevant in this case. I wonder why those instructions have a load built into them instead of using a separate LoadInstruction.

A ReadSideEffectInstruction is not a CopyInstruction, but I think it behaves like a LoadInstruction: it has an address and a memory operand, and that memory operand can be (is always?) inexact. It seems to me like we'd want value numbers for the indirections that are accessed by these read-side-effect instructions, but I don't see how it's possible. First, a ReadSideEffectInstruction does not return a result; second, we can't tell without analysing or modelling the callee what portion of the data will be loaded.

Okay, most of this discussion was just me thinking out loud and confirming that for the way the IR is generated today we're fine. But I think the right thing to do, for future robustness, is to restrict CongruentCopyInstructionTotal to LoadInstructions only and rename it accordingly.

I've restricted it to extend LoadInstruction only now, and renamed it LoadTotalOverlapInstruction for lack of a better name. The test results are identical.
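The operand analysis in this thread can be summarised with a toy model. The `ToyInstr` struct and `matchesLoadPredicate` function are hypothetical and record only the two properties discussed above; they do not reflect the real IR classes.

```cpp
#include <vector>

// Hypothetical toy instruction: only the two properties discussed above.
struct ToyInstr {
    const char* opcode;
    bool hasAddressOperand;
    bool hasMemoryOperand;
};

// Mirrors the shape of the predicate body in the diff:
// both an address operand and a memory operand must exist.
inline bool matchesLoadPredicate(const ToyInstr& i) {
    return i.hasAddressOperand && i.hasMemoryOperand;
}

// The three instruction kinds as described in the review comment.
inline std::vector<ToyInstr> toyInstructions() {
    return {
        {"Load", true, true},        // address + memory operand
        {"Store", true, false},      // address operand, no memory operand
        {"CopyValue", false, false}, // reads a register operand only
    };
}
```

Under these assumptions only the load satisfies the predicate, which is why restricting the class to LoadInstruction leaves the test results unchanged while making the intent explicit.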

MathiasVP added the C++ label Feb 5, 2020

jbj reviewed Feb 5, 2020

MathiasVP marked this pull request as ready for review February 5, 2020 22:16

MathiasVP requested review from a team as code owners February 5, 2020 22:16

MathiasVP added 6 commits February 6, 2020 09:21

C++: Add testcase demonstrating unexpectly different value numbers

687dcb7

C++: Add more support for load instructions

54f0b4a

C++: Accept output

5e5bd92

C++/C#: Sync identical files

4f27750

C++: Add comment explaining buggy value number

cfcf087

C++: Update test annotations and accept output

ba395cf

MathiasVP force-pushed the more-gvn-loads branch from 7d7ed10 to ba395cf Compare February 6, 2020 08:27

jbj reviewed Feb 6, 2020

MathiasVP added 2 commits February 6, 2020 11:23

C++: Rename CongruentCopyInstructionTotal to LoadTotalOverlapInstruct…

527181b

…ion and extend LoadInstruction instead of CopyInstruction

C++/C#: Sync identical files

aaa6233

jbj approved these changes Feb 6, 2020

jbj merged commit 4997aa7 into github:master Feb 6, 2020

MathiasVP deleted the more-gvn-loads branch February 6, 2020 15:02

jbj mentioned this pull request Feb 14, 2020

C++: Value number performance fix #2835

Merged

MathiasVP mentioned this pull request Feb 19, 2020

C#: Fix compilation of ValueNumberingInternal #2870

Closed

snyk-bot mentioned this pull request Jun 23, 2021

[Snyk] Upgrade ramda from 0.25.0 to 0.27.1 aliscco/codeql#4

Open

snyk-bot mentioned this pull request Jan 13, 2022

[Snyk] Security upgrade ramda from 0.25.0 to 0.27.2 aliscco/codeql#19

Open

aliscco mentioned this pull request Sep 23, 2022

[Snyk] Fix for 3 vulnerabilities aliscco/codeql#33

Open

MathiasVP mentioned this pull request Sep 9, 2024

C++: Improve AliasedSSA performance #17225

Merged
