Skip to content

C++: Alias analysis follow-up to #16907 #16981

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 25, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -338,6 +338,56 @@ private predicate resultEscapesNonReturn(Instruction instr) {
not instr.isResultModeled()
}

/** Holds if `operand` may (transitively) flow to an `AddressOperand`. */
private predicate consumedAsAddressOperand(Operand operand) {
operand instanceof AddressOperand
or
exists(Operand address |
consumedAsAddressOperand(address) and
operandIsPropagated(operand, _, address.getDef())
)
}

/**
* Holds if `operand` may originate from a base instruction of an allocation,
* and that operand may transitively flow to an `AddressOperand`.
*/
private predicate propagatedFromAllocationBase(Operand operand, Configuration::Allocation allocation) {
consumedAsAddressOperand(operand) and
(
not exists(Configuration::getOldAllocation(allocation)) and
operand.getDef() = allocation.getABaseInstruction()
or
exists(Operand address |
operandIsPropagated(address, _, operand.getDef()) and
propagatedFromAllocationBase(address, allocation)
)
)
}

private predicate propagatedFromNonAllocationBase(Operand operand) {
exists(Instruction def |
def = operand.getDef() and
not operandIsPropagated(_, _, def) and
not def = any(Configuration::Allocation allocation).getABaseInstruction()
)
or
exists(Operand address |
operandIsPropagated(address, _, operand.getDef()) and
propagatedFromNonAllocationBase(address)
)
}

/**
* Holds if we cannot see all producers of an operand for which allocation also flows into.
*/
private predicate operandConsumesEscaped(Configuration::Allocation allocation) {
exists(AddressOperand address |
propagatedFromAllocationBase(address, allocation) and
propagatedFromNonAllocationBase(address)
)
}

/**
* Holds if the address of `allocation` escapes outside the domain of the analysis. This can occur
* either because the allocation's address is taken within the function and escapes, or because the
Expand All @@ -346,12 +396,14 @@ private predicate resultEscapesNonReturn(Instruction instr) {
predicate allocationEscapes(Configuration::Allocation allocation) {
allocation.alwaysEscapes()
or
exists(IREscapeAnalysisConfiguration config |
config.useSoundEscapeAnalysis() and resultEscapesNonReturn(allocation.getABaseInstruction())
exists(IREscapeAnalysisConfiguration config | config.useSoundEscapeAnalysis() |
resultEscapesNonReturn(allocation.getABaseInstruction())
or
operandConsumesEscaped(allocation)
)
or
Configuration::phaseNeedsSoundEscapeAnalysis() and
resultEscapesNonReturn(allocation.getABaseInstruction())
(resultEscapesNonReturn(allocation.getABaseInstruction()) or operandConsumesEscaped(allocation))
}

/**
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -146,3 +146,8 @@ class DynamicAllocation extends Allocation, TDynamicAllocation {
}

predicate phaseNeedsSoundEscapeAnalysis() { none() }

UnaliasedSsa::Allocation getOldAllocation(VariableAllocation allocation) {
UnaliasedSsa::canReuseSsaForVariable(allocation.getIRVariable()) and
Copy link
Contributor

@jketema jketema Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly does canReuseSsaForVariable do? I've read the comment on the predicate, but I don't really understand why this is important here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you know we do 2 SSA analyses as part of IR construction:

  • One when we construct the unaliased IR
  • One when we construct the aliased (i.e., "final") IR

The unaliased IR computed SSA information for those variables that never alias with anything else (hence the name), and the aliased IR computes everything else. Thus, the aliased IR needs to know which variables it is in charge of computing SSA for, and which variables it can just reuse SSA the unaliased SSA's computations for.

So, in this predicate, when we're asking for the allocation produced from by unaliased SSA (i.e., a UnaliasedSsa::Allocation) for a given aliased allocation (i.e., a VariableAllocation) we only return a result if the alias information for the aliased allocation can actually be reused from the unaliased SSA analysis.

And the reason for why we need the getOldAllocation predicate is that we need it for the base case in the new propagatedFromAllocationBase predicate in AliasAnalysis.qll:

private predicate propagatedFromAllocationBase(Operand operand, Configuration::Allocation allocation) {
  consumedAsAddressOperand(operand) and
  (
    not exists(Configuration::getOldAllocation(allocation)) and
    operand.getDef() = allocation.getABaseInstruction()
    or
    exists(Operand address |
      operandIsPropagated(address, _, operand.getDef()) and
      propagatedFromAllocationBase(address, allocation)
    )
  )
}

The main predicate that I'm adding in this PR is operandConsumesEscaped (which captures the situation expressed in the PR description). And we only want to reason about allocations that haven't already been resolved in the unaliased stage. So we want to insert the not exists(Configuration::getOldAllocation(allocation)) at the earliest point to restrict the size of the recursion, and the base case of this recursion is the earliest possible place to put it 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. That helps a lot.

result = allocation.getIRVariable()
}
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,7 @@ private import semmle.code.cpp.ir.implementation.unaliased_ssa.internal.SSAConst
private import semmle.code.cpp.ir.internal.IntegerConstant as Ints
private import semmle.code.cpp.ir.internal.IntegerInterval as Interval
private import semmle.code.cpp.ir.implementation.internal.OperandTag
private import AliasConfiguration
import AliasConfiguration
private import codeql.util.Boolean

private class IntValue = Ints::IntValue;
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -338,6 +338,56 @@ private predicate resultEscapesNonReturn(Instruction instr) {
not instr.isResultModeled()
}

/** Holds if `operand` may (transitively) flow to an `AddressOperand`. */
private predicate consumedAsAddressOperand(Operand operand) {
operand instanceof AddressOperand
or
exists(Operand address |
consumedAsAddressOperand(address) and
operandIsPropagated(operand, _, address.getDef())
)
}

/**
* Holds if `operand` may originate from a base instruction of an allocation,
* and that operand may transitively flow to an `AddressOperand`.
*/
private predicate propagatedFromAllocationBase(Operand operand, Configuration::Allocation allocation) {
consumedAsAddressOperand(operand) and
(
not exists(Configuration::getOldAllocation(allocation)) and
operand.getDef() = allocation.getABaseInstruction()
or
exists(Operand address |
operandIsPropagated(address, _, operand.getDef()) and
propagatedFromAllocationBase(address, allocation)
)
)
}

private predicate propagatedFromNonAllocationBase(Operand operand) {
exists(Instruction def |
def = operand.getDef() and
not operandIsPropagated(_, _, def) and
not def = any(Configuration::Allocation allocation).getABaseInstruction()
)
or
exists(Operand address |
operandIsPropagated(address, _, operand.getDef()) and
propagatedFromNonAllocationBase(address)
)
}

/**
* Holds if we cannot see all producers of an operand for which allocation also flows into.
*/
private predicate operandConsumesEscaped(Configuration::Allocation allocation) {
exists(AddressOperand address |
propagatedFromAllocationBase(address, allocation) and
propagatedFromNonAllocationBase(address)
)
}

/**
* Holds if the address of `allocation` escapes outside the domain of the analysis. This can occur
* either because the allocation's address is taken within the function and escapes, or because the
Expand All @@ -346,12 +396,14 @@ private predicate resultEscapesNonReturn(Instruction instr) {
predicate allocationEscapes(Configuration::Allocation allocation) {
allocation.alwaysEscapes()
or
exists(IREscapeAnalysisConfiguration config |
config.useSoundEscapeAnalysis() and resultEscapesNonReturn(allocation.getABaseInstruction())
exists(IREscapeAnalysisConfiguration config | config.useSoundEscapeAnalysis() |
resultEscapesNonReturn(allocation.getABaseInstruction())
or
operandConsumesEscaped(allocation)
)
or
Configuration::phaseNeedsSoundEscapeAnalysis() and
resultEscapesNonReturn(allocation.getABaseInstruction())
(resultEscapesNonReturn(allocation.getABaseInstruction()) or operandConsumesEscaped(allocation))
}

/**
Expand Down
Original file line number Diff line number Diff line change
@@ -1,4 +1,5 @@
private import AliasConfigurationImports
private import codeql.util.Unit

/**
* A memory allocation that can be tracked by the SimpleSSA alias analysis.
Expand All @@ -16,3 +17,5 @@ class Allocation extends IRAutomaticVariable {
}

predicate phaseNeedsSoundEscapeAnalysis() { any() }

Unit getOldAllocation(Allocation allocation) { any() }
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
import AliasAnalysis
private import SimpleSSAImports
import SimpleSSAPublicImports
private import AliasConfiguration
import AliasConfiguration
private import codeql.util.Unit

private predicate isTotalAccess(Allocation var, AddressOperand addrOperand, IRType type) {
Expand Down
7 changes: 4 additions & 3 deletions cpp/ql/test/library-tests/ir/ir/aliased_ir.expected
Original file line number Diff line number Diff line change
Expand Up @@ -18827,22 +18827,23 @@ ir.cpp:
# 2667| r2667_1(glval<int>) = VariableAddress[intBuffer] :
# 2667| r2667_2(int) = Constant[8] :
# 2667| m2667_3(int) = Store[intBuffer] : &:r2667_1, r2667_2
# 2667| m2667_4(unknown) = Chi : total:m2663_4, partial:m2667_3
# 2668| r2668_1(glval<int>) = VariableAddress[intBuffer] :
# 2668| r2668_2(int *) = CopyValue : r2668_1
# 2668| r2668_3(glval<int *>) = VariableAddress[data] :
# 2668| m2668_4(int *) = Store[data] : &:r2668_3, r2668_2
#-----| Goto -> Block 2

# 2670| Block 2
# 2670| m2670_1(int) = Phi : from 1:m2667_3
# 2670| m2670_1(unknown) = Phi : from 0:~m2663_4, from 1:~m2667_4
# 2670| m2670_2(int *) = Phi : from 0:m2665_3, from 1:m2668_4
# 2670| r2670_3(glval<unknown>) = FunctionAddress[use_int] :
# 2670| r2670_4(glval<int *>) = VariableAddress[data] :
# 2670| r2670_5(int *) = Load[data] : &:r2670_4, m2670_2
# 2670| r2670_6(int) = Load[?] : &:r2670_5, ~m2670_1
# 2670| v2670_7(void) = Call[use_int] : func:r2670_3, 0:r2670_6
# 2670| m2670_8(unknown) = ^CallSideEffect : ~m2663_4
# 2670| m2670_9(unknown) = Chi : total:m2663_4, partial:m2670_8
# 2670| m2670_8(unknown) = ^CallSideEffect : ~m2670_1
# 2670| m2670_9(unknown) = Chi : total:m2670_1, partial:m2670_8
# 2671| v2671_1(void) = NoOp :
# 2663| v2663_7(void) = ReturnVoid :
# 2663| v2663_8(void) = AliasedUse : ~m2670_9
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@ missingOperand
unexpectedOperand
duplicateOperand
missingPhiOperand
| ir.cpp:2670:3:2670:9 | Phi: call to use_int | Instruction 'Phi: call to use_int' is missing an operand for predecessor block 'EnterFunction: phi_with_single_input_at_merge' in function '$@'. | ir.cpp:2663:13:2663:42 | void phi_with_single_input_at_merge(bool) | void phi_with_single_input_at_merge(bool) |
missingOperandType
duplicateChiOperand
sideEffectWithoutPrimary
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@ missingOperand
unexpectedOperand
duplicateOperand
missingPhiOperand
| ir.cpp:2670:3:2670:9 | Phi: call to use_int | Instruction 'Phi: call to use_int' is missing an operand for predecessor block 'EnterFunction: phi_with_single_input_at_merge' in function '$@'. | ir.cpp:2663:13:2663:42 | void phi_with_single_input_at_merge(bool) | void phi_with_single_input_at_merge(bool) |
missingOperandType
duplicateChiOperand
sideEffectWithoutPrimary
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,3 +23,4 @@
| test.cpp:424:2:425:2 | for(...;...;...) ... | test.cpp:424:18:424:23 | ... < ... | 1 | i | { ... } | i | return ... |
| test.cpp:433:2:434:2 | for(...;...;...) ... | test.cpp:433:18:433:22 | 0 | 0 | | { ... } | 0 | return ... |
| test.cpp:559:3:564:3 | while (...) ... | test.cpp:559:9:559:15 | call to getBool | | call to getBool | { ... } | call to getBool | ExprStmt |
| test.cpp:574:3:579:3 | while (...) ... | test.cpp:574:10:574:16 | call to getBool | | call to getBool | { ... } | call to getBool | ExprStmt |
Original file line number Diff line number Diff line change
Expand Up @@ -565,4 +565,19 @@ void test45() {

*rP = NULL;
use(r); // GOOD
}
}

void test46()
{
LinkedList *r, **rP = &r;

while (getBool())
{
LinkedList *s = nullptr;
*rP = s;
rP = &s->next;
}

*rP = nullptr;
use(r);
}
Loading