C++: Alias analysis follow-up to #16907 #16981

MathiasVP · 2024-07-15T14:25:31Z

This fixes the consistency problem with phi instructions with only one successor that we saw here.

On main we consider an allocation (e.g., a variable) to be escaped if one of its base instructions (e.g., a VariableAddress instruction whose variable is the the variable of the allocation) flows to a place where the alias analysis cannot reason about it. For example, in the example below:

1. int x;
2. int* p = &x;
3. int** pp = &p;

the VariableAddress instruction representing x flows to p, and p is unmodeled in the unaliased IR because it's address is taken on line 3.

However, it turns out that there's another way for the variable to escape: when the VariableAddress flows to the address operand with a corresponding MemoryOperand, and we cannot determine an appropriate MemoryLocation for the memory operand. For example, consider the last example in the PR description for #16907. To understand this, consider first an example that works as of #16907:

void test(bool b) {
  int x, y;
  int* p;
  if(b) {
    p = &x;
  } else {
    p = &y;
  }
  *p = 42;
  use(x);
}

in the above we precisely know that &x and &y flows to the definition of *p and nothing escapes so we:

Create a MemoryLocation that represents {x, y}
Conclude that *p may write to this location
Conclude that {x, y} totally overlaps with x
So we insert a Chi instruction after *p = 42, and that Chi instruction is loaded from when reading from x on the last line.

So all is good here. Now, consider this slight variation:

int global;
int* global_address() { return &global; }

void test(bool b) {
  int x;
  int* p = global_address();
  if(b) {
    p = &x;
  }
  *p = 42;
  return global;
}

here, only one allocation flows into *p (since we don't track alias analysis interprocedurally). Additionally, p doesn't escape according to the current rules for escape analysis so we conclude that *p always writes to x.
However, p can alias &global, so the write to *p must also write to {AllAliasedMemory}, and thus p need to be considered escaping.

To fix this, we add an extra case to the escape analysis that considers an allocation to be escaping when:

One of the base instructions (e.g., a VariableAddress) for the allocation (e.g., a variable) flows to an address operand, and
there exists another instruction not belonging to any allocation (for example, a call to global_address) that also flows to the address operand

cpp/ql/lib/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasAnalysis.qll

MathiasVP · 2024-07-24T16:40:56Z

@jketema I've fixed the problems with this PR, and DCA looks good now 🎉

jketema · 2024-07-25T09:44:07Z

cpp/ql/lib/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasConfiguration.qll

@@ -146,3 +146,8 @@ class DynamicAllocation extends Allocation, TDynamicAllocation {
 }

 predicate phaseNeedsSoundEscapeAnalysis() { none() }
+
+UnaliasedSsa::Allocation getOldAllocation(VariableAllocation allocation) {
+  UnaliasedSsa::canReuseSsaForVariable(allocation.getIRVariable()) and


What exactly does canReuseSsaForVariable do? I've read the comment on the predicate, but I don't really understand why this is important here.

As you know we do 2 SSA analyses as part of IR construction:

One when we construct the unaliased IR

One when we construct the aliased (i.e., "final") IR

The unaliased IR computed SSA information for those variables that never alias with anything else (hence the name), and the aliased IR computes everything else. Thus, the aliased IR needs to know which variables it is in charge of computing SSA for, and which variables it can just reuse SSA the unaliased SSA's computations for.

So, in this predicate, when we're asking for the allocation produced from by unaliased SSA (i.e., a UnaliasedSsa::Allocation) for a given aliased allocation (i.e., a VariableAllocation) we only return a result if the alias information for the aliased allocation can actually be reused from the unaliased SSA analysis.

And the reason for why we need the getOldAllocation predicate is that we need it for the base case in the new propagatedFromAllocationBase predicate in AliasAnalysis.qll:

private predicate propagatedFromAllocationBase(Operand operand, Configuration::Allocation allocation) { consumedAsAddressOperand(operand) and ( not exists(Configuration::getOldAllocation(allocation)) and operand.getDef() = allocation.getABaseInstruction() or exists(Operand address | operandIsPropagated(address, _, operand.getDef()) and propagatedFromAllocationBase(address, allocation) ) ) }

The main predicate that I'm adding in this PR is operandConsumesEscaped (which captures the situation expressed in the PR description). And we only want to reason about allocations that haven't already been resolved in the unaliased stage. So we want to insert the not exists(Configuration::getOldAllocation(allocation)) at the earliest point to restrict the size of the recursion, and the base case of this recursion is the earliest possible place to put it 😄

Thanks. That helps a lot.

github-actions bot added the C++ label Jul 15, 2024

MathiasVP mentioned this pull request Jul 17, 2024

C++: Add a new MemoryLocation to represent sets of Allocations #16907

Merged

MathiasVP added 2 commits July 17, 2024 15:44

C++: Fix more inconsistencies.

1c8e0c4

C++: Accept more test changes.

7b8301a

MathiasVP force-pushed the phi-escape-5-follow-up-2 branch from 25bfade to 7b8301a Compare July 17, 2024 14:45

MathiasVP marked this pull request as ready for review July 18, 2024 11:09

MathiasVP requested a review from a team as a code owner July 18, 2024 11:09

MathiasVP added the no-change-note-required This PR does not need a change note label Jul 18, 2024

jketema reviewed Jul 18, 2024

View reviewed changes

cpp/ql/lib/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasAnalysis.qll Outdated Show resolved Hide resolved

MathiasVP marked this pull request as draft July 18, 2024 16:28

C++: Respond to review comments.

a5efe9f

MathiasVP force-pushed the phi-escape-5-follow-up-2 branch from c5b682c to a5efe9f Compare July 24, 2024 14:18

MathiasVP added 2 commits July 24, 2024 15:20

C++: Sync identical files.

7ec3162

Merge branch 'main' into phi-escape-5-follow-up-2

b3bffb6

MathiasVP marked this pull request as ready for review July 24, 2024 16:36

jketema reviewed Jul 25, 2024

View reviewed changes

jketema approved these changes Jul 25, 2024

View reviewed changes

MathiasVP merged commit c5da43e into github:main Jul 25, 2024
16 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

C++: Alias analysis follow-up to #16907 #16981

C++: Alias analysis follow-up to #16907 #16981

Uh oh!

MathiasVP commented Jul 15, 2024 •

edited

Loading

Uh oh!

Uh oh!

MathiasVP commented Jul 24, 2024

Uh oh!

jketema Jul 25, 2024 •

edited

Loading

Uh oh!

MathiasVP Jul 25, 2024

Uh oh!

jketema Jul 25, 2024

Uh oh!

Uh oh!

Uh oh!

C++: Alias analysis follow-up to #16907 #16981

C++: Alias analysis follow-up to #16907 #16981

Uh oh!

Conversation

MathiasVP commented Jul 15, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

MathiasVP commented Jul 24, 2024

Uh oh!

jketema Jul 25, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MathiasVP Jul 25, 2024

Choose a reason for hiding this comment

Uh oh!

jketema Jul 25, 2024

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

MathiasVP commented Jul 15, 2024 •

edited

Loading

jketema Jul 25, 2024 •

edited

Loading