Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Alias analysis follow-up to #16907 #16981

Merged
merged 5 commits into from
Jul 25, 2024

Conversation

MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Jul 15, 2024

This fixes the consistency problem with phi instructions with only one successor that we saw here.

On main we consider an allocation (e.g., a variable) to be escaped if one of its base instructions (e.g., a VariableAddress instruction whose variable is the the variable of the allocation) flows to a place where the alias analysis cannot reason about it. For example, in the example below:

1. int x;
2. int* p = &x;
3. int** pp = &p;

the VariableAddress instruction representing x flows to p, and p is unmodeled in the unaliased IR because it's address is taken on line 3.

However, it turns out that there's another way for the variable to escape: when the VariableAddress flows to the address operand with a corresponding MemoryOperand, and we cannot determine an appropriate MemoryLocation for the memory operand. For example, consider the last example in the PR description for #16907. To understand this, consider first an example that works as of #16907:

void test(bool b) {
  int x, y;
  int* p;
  if(b) {
    p = &x;
  } else {
    p = &y;
  }
  *p = 42;
  use(x);
}

in the above we precisely know that &x and &y flows to the definition of *p and nothing escapes so we:

  1. Create a MemoryLocation that represents {x, y}
  2. Conclude that *p may write to this location
  3. Conclude that {x, y} totally overlaps with x
    So we insert a Chi instruction after *p = 42, and that Chi instruction is loaded from when reading from x on the last line.

So all is good here. Now, consider this slight variation:

int global;
int* global_address() { return &global; }

void test(bool b) {
  int x;
  int* p = global_address();
  if(b) {
    p = &x;
  }
  *p = 42;
  return global;
}

here, only one allocation flows into *p (since we don't track alias analysis interprocedurally). Additionally, p doesn't escape according to the current rules for escape analysis so we conclude that *p always writes to x.
However, p can alias &global, so the write to *p must also write to {AllAliasedMemory}, and thus p need to be considered escaping.

To fix this, we add an extra case to the escape analysis that considers an allocation to be escaping when:

  • One of the base instructions (e.g., a VariableAddress) for the allocation (e.g., a variable) flows to an address operand, and
  • there exists another instruction not belonging to any allocation (for example, a call to global_address) that also flows to the address operand

@MathiasVP MathiasVP marked this pull request as ready for review July 18, 2024 11:09
@MathiasVP MathiasVP requested a review from a team as a code owner July 18, 2024 11:09
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Jul 18, 2024
@MathiasVP MathiasVP marked this pull request as draft July 18, 2024 16:28
@MathiasVP MathiasVP marked this pull request as ready for review July 24, 2024 16:36
@MathiasVP
Copy link
Contributor Author

@jketema I've fixed the problems with this PR, and DCA looks good now 🎉

@@ -146,3 +146,8 @@ class DynamicAllocation extends Allocation, TDynamicAllocation {
}

predicate phaseNeedsSoundEscapeAnalysis() { none() }

UnaliasedSsa::Allocation getOldAllocation(VariableAllocation allocation) {
UnaliasedSsa::canReuseSsaForVariable(allocation.getIRVariable()) and
Copy link
Contributor

@jketema jketema Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly does canReuseSsaForVariable do? I've read the comment on the predicate, but I don't really understand why this is important here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you know we do 2 SSA analyses as part of IR construction:

  • One when we construct the unaliased IR
  • One when we construct the aliased (i.e., "final") IR

The unaliased IR computed SSA information for those variables that never alias with anything else (hence the name), and the aliased IR computes everything else. Thus, the aliased IR needs to know which variables it is in charge of computing SSA for, and which variables it can just reuse SSA the unaliased SSA's computations for.

So, in this predicate, when we're asking for the allocation produced from by unaliased SSA (i.e., a UnaliasedSsa::Allocation) for a given aliased allocation (i.e., a VariableAllocation) we only return a result if the alias information for the aliased allocation can actually be reused from the unaliased SSA analysis.

And the reason for why we need the getOldAllocation predicate is that we need it for the base case in the new propagatedFromAllocationBase predicate in AliasAnalysis.qll:

private predicate propagatedFromAllocationBase(Operand operand, Configuration::Allocation allocation) {
  consumedAsAddressOperand(operand) and
  (
    not exists(Configuration::getOldAllocation(allocation)) and
    operand.getDef() = allocation.getABaseInstruction()
    or
    exists(Operand address |
      operandIsPropagated(address, _, operand.getDef()) and
      propagatedFromAllocationBase(address, allocation)
    )
  )
}

The main predicate that I'm adding in this PR is operandConsumesEscaped (which captures the situation expressed in the PR description). And we only want to reason about allocations that haven't already been resolved in the unaliased stage. So we want to insert the not exists(Configuration::getOldAllocation(allocation)) at the earliest point to restrict the size of the recursion, and the base case of this recursion is the earliest possible place to put it 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That helps a lot.

@MathiasVP MathiasVP merged commit c5da43e into github:main Jul 25, 2024
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants