-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C++/C#: Alias analysis of indirect parameters #2696
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Instead of tracking `IRVariable`s directly, alias analysis now tracks instances of the `Allocation` type provided by its `Configuration` parameter. For unaliased SSA, an `Allocation` is just an `IRAutomaticVariable`. For aliased SSA, an `Allocation` is either an `IRVariable` or the memory pointed to by an indirect parameter.
This introduces a new type of `MemoryLocation`: `EntireAllocationMemoryLocation`, representing an entire contiguous allocation whose size is not known. This is used to model the memory accesses on `InitializeIndirection` and `ReturnIndirection`.
5f585c5
to
542579d
Compare
I've rebased onto |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main code changes and the IR dump and dataflow test changes look good to me. I'm not sure I fully understand the new test framework.
private string expectationCommentPattern() { result = "//\\s*(\\$(?:[^/]|/[^/])*)(?://.*)?" } | ||
|
||
private string expectationPattern() { | ||
result = "(?:(f(?:\\+|-)):)?((?:[A-Za-z-_]+)(?:\\s*,\\s*[A-Za-z-_]+)*)(?:=(.*))?" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These regexes need explanations, maybe in the class header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've documented each of them individually (with examples in the module header as part of the "how to use this" docs).
@@ -0,0 +1,168 @@ | |||
import cpp | |||
|
|||
abstract class InlineExpectationsTest extends string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs some qldoc explaining what this class is for and how to override it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a huge module comment with details and examples.
@rdmarsh2 I think I've addressed all of your feedback. |
sync-identical-files failed because #2713 touched the C++ copies of AliasAnalysis.qll |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Conflicts: cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasedSSA.qll csharp/ql/src/semmle/code/csharp/ir/implementation/unaliased_ssa/internal/AliasAnalysis.qll
This change enables more precise alias analysis the memory pointed to by pointer and reference parameters. At a high level, we now treat the memory pointed to by the value of a pointer/reference parameter to be its own unaliased allocation, independent of all other allocations. This is unsound, but produces much better analysis results in practice.
Overall, IR diffs (in aliased SSA) are what I would expect. Stores to an indirect parameter no longer pollute the all aliased memory virtual variable. Even though we don't know how big any given indirect parameter buffer is, if all accesses to that buffer are just scalars at offset zero, we'll still hook up uses to exact definitions just as if we knew the size all along.
The memory accessed by the
InitializeIndirection
andReturnIndirection
instructions now has a newMemoryAccessKind
:EntireAllocationMemoryAccess
. This kind represents an access to the entire contiguous allocation, without knowing exactly how big that allocation is.Alias analysis now operates on
Allocation
s, which are provided as module arguments to represent the memory locations to be tracked. In unaliased SSA, anAllocation
is just anIRAutomaticVariable
. In aliased SSA, there is anAllocation
for eachIRVariable
, plus one for each indirect parameter.Aliased SSA still works mostly the same, except that a new type of
MemoryLocation
has been added to represent an access to the entireAllocation
. The overlap computation has been updated accordingly.The existing
AliasAnalysis.qll
was shared between IR stages in C++, but was not automatically shared with C#. I brought the two languages in sync, so that now they will be kept up to date as they change in the future. This required adding a couple more predicates to the language interface. C# did not already have the same "models" system as C++, so for now, I implemented a trivialAliasFunction
class that never actually has any object instances.I've implemented a new module for writing analysis tests called
InlineExpectationTests.qll
. You extendInlineExpectationsTest
to specify what results you found on each line. Each result has a tag and a value, both strings. The tags and values are automatically compared to comments in the code under test to determine which results are expected, which are unexpected, and which are missing. The output file only shows missing and unexpected results, so a passing test should have an empty.expected
file. There's also a way to annotate a particular result as being either a false positive or false negative. This results in a different message if the actual result doesn't match the expected. For example, if the actual output is missing a result that was declared as a false positive, the output message reports that a false positive was fixed.The new
points_to.ql
test uses the above module.