-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C++: Alias analysis follow-up to #16907 #16981
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
25bfade
to
7b8301a
Compare
cpp/ql/lib/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasAnalysis.qll
Outdated
Show resolved
Hide resolved
c5b682c
to
a5efe9f
Compare
@jketema I've fixed the problems with this PR, and DCA looks good now 🎉 |
@@ -146,3 +146,8 @@ class DynamicAllocation extends Allocation, TDynamicAllocation { | |||
} | |||
|
|||
predicate phaseNeedsSoundEscapeAnalysis() { none() } | |||
|
|||
UnaliasedSsa::Allocation getOldAllocation(VariableAllocation allocation) { | |||
UnaliasedSsa::canReuseSsaForVariable(allocation.getIRVariable()) and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly does canReuseSsaForVariable
do? I've read the comment on the predicate, but I don't really understand why this is important here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you know we do 2 SSA analyses as part of IR construction:
- One when we construct the unaliased IR
- One when we construct the aliased (i.e., "final") IR
The unaliased IR computed SSA information for those variables that never alias with anything else (hence the name), and the aliased IR computes everything else. Thus, the aliased IR needs to know which variables it is in charge of computing SSA for, and which variables it can just reuse SSA the unaliased SSA's computations for.
So, in this predicate, when we're asking for the allocation produced from by unaliased SSA (i.e., a UnaliasedSsa::Allocation
) for a given aliased allocation (i.e., a VariableAllocation
) we only return a result if the alias information for the aliased allocation can actually be reused from the unaliased SSA analysis.
And the reason for why we need the getOldAllocation
predicate is that we need it for the base case in the new propagatedFromAllocationBase
predicate in AliasAnalysis.qll
:
private predicate propagatedFromAllocationBase(Operand operand, Configuration::Allocation allocation) {
consumedAsAddressOperand(operand) and
(
not exists(Configuration::getOldAllocation(allocation)) and
operand.getDef() = allocation.getABaseInstruction()
or
exists(Operand address |
operandIsPropagated(address, _, operand.getDef()) and
propagatedFromAllocationBase(address, allocation)
)
)
}
The main predicate that I'm adding in this PR is operandConsumesEscaped
(which captures the situation expressed in the PR description). And we only want to reason about allocations that haven't already been resolved in the unaliased stage. So we want to insert the not exists(Configuration::getOldAllocation(allocation))
at the earliest point to restrict the size of the recursion, and the base case of this recursion is the earliest possible place to put it 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. That helps a lot.
This fixes the consistency problem with phi instructions with only one successor that we saw here.
On
main
we consider an allocation (e.g., a variable) to be escaped if one of its base instructions (e.g., aVariableAddress
instruction whose variable is the the variable of the allocation) flows to a place where the alias analysis cannot reason about it. For example, in the example below:the
VariableAddress
instruction representingx
flows top
, andp
is unmodeled in the unaliased IR because it's address is taken on line 3.However, it turns out that there's another way for the variable to escape: when the
VariableAddress
flows to the address operand with a correspondingMemoryOperand
, and we cannot determine an appropriateMemoryLocation
for the memory operand. For example, consider the last example in the PR description for #16907. To understand this, consider first an example that works as of #16907:in the above we precisely know that
&x
and&y
flows to the definition of*p
and nothing escapes so we:MemoryLocation
that represents{x, y}
*p
may write to this location{x, y}
totally overlaps withx
So we insert a
Chi
instruction after*p = 42
, and thatChi
instruction is loaded from when reading fromx
on the last line.So all is good here. Now, consider this slight variation:
here, only one allocation flows into
*p
(since we don't track alias analysis interprocedurally). Additionally,p
doesn't escape according to the current rules for escape analysis so we conclude that*p
always writes tox
.However,
p
can alias&global
, so the write to*p
must also write to{AllAliasedMemory}
, and thusp
need to be considered escaping.To fix this, we add an extra case to the escape analysis that considers an allocation to be escaping when:
VariableAddress
) for the allocation (e.g., a variable) flows to an address operand, andglobal_address
) that also flows to the address operand