Skip to content

Conversation

@MathiasVP
Copy link
Contributor

@MathiasVP MathiasVP commented Nov 10, 2022

Consider the IR for this simple function:

void test() {
  int x = source();
  sink(x);
}

this has the following IR:

4| void test()
4|   Block 0
4|     v4_1(void)           = EnterFunction           :
4|     m4_2(unknown)        = AliasedDefinition       :
4|     m4_3(unknown)        = InitializeNonLocal      :
4|     m4_4(unknown)        = Chi                     : total:m4_2, partial:m4_3
5|     r5_1(glval<int>)     = VariableAddress[x]      :
5|     r5_2(glval<unknown>) = FunctionAddress[source] :
5|     r5_3(int)            = Call[source]            : func:r5_2
5|     m5_4(unknown)        = ^CallSideEffect         : ~m4_4
5|     m5_5(unknown)        = Chi                     : total:m4_4, partial:m5_4
5|     m5_6(int)            = Store[x]                : &:r5_1, r5_3
6|     r6_1(glval<unknown>) = FunctionAddress[sink]   :
6|     r6_2(glval<int>)     = VariableAddress[x]      :
6|     r6_3(int)            = Load[x]                 : &:r6_2, m5_6
6|     v6_4(void)           = Call[sink]              : func:r6_1, 0:r6_3
6|     m6_5(unknown)        = ^CallSideEffect         : ~m5_5
6|     m6_6(unknown)        = Chi                     : total:m5_5, partial:m6_5
7|     v7_1(void)           = NoOp                    :
4|     v4_5(void)           = ReturnVoid              :
4|     v4_6(void)           = AliasedUse              : ~m6_6
4|     v4_7(void)           = ExitFunction            :

Currently, we generate an SSA write to the IndirectOperand of &:r5_1, and SSA then flows from that node to the IndirectOperand for &:r6_2. Flow then continues from that node, through the LoadInstruction and into the IR operand 0:r6_3.

That's kinda unnecessary. Instead of flowing to these IndirectOperand, we might as well just reuse the IR operands (and instructions) where appropriate. This is what this PR does. For the above example, that means that IndirectOperand(&:r6_2) is represented by the 0:r6_3 operand node.

Note that we still need the IndirectOperands for things like:

void test() {
  int x = source();
  int* px = &x;
  sink(*px);
}

which require some non-trivial flow through indirect dataflow nodes in order to reach *px.

A nice side-effect of this (which is really why the real goal of this PR) is that our existing InstructionBarrierGuard barrier now works as expected on the feature branch (since the nodes you want to select are now backed by instructions and operands in most cases). This makes our existing barrier guard tests pass again 🎉

Commit-by-commit review recommended!

I've checked the query result changes. They're all either:

  • Fixing of FPs (because we now correctly identify some barriers that we didn't identify before)
  • New TPs (🎉)
  • Result duplication (😭)

@MathiasVP MathiasVP requested a review from a team as a code owner November 10, 2022 17:55
@github-actions github-actions bot added the C++ label Nov 10, 2022
@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Nov 10, 2022
…d 'IndirectOperand' when the semantically identical value already exists in the IR.
…truction before we should pick the operand as the 'sink' in the call-target resolution recursion.
…he qualifier of a call (and not just when it's an argument of a call).
@MathiasVP MathiasVP force-pushed the merge-some-indirect-and-instruction-nodes branch from 282cf2f to 0c7f57e Compare November 11, 2022 11:14
@MathiasVP MathiasVP requested a review from rdmarsh2 November 11, 2022 15:15
@MathiasVP
Copy link
Contributor Author

I've looked through the result changes on DCA: the Samate ones are all TPs, and the lost results on cpp/uncontrolled-allocation-size query are because we're now correctly using the sanitizers in the query 🎉.

predicate hasIndirectOperand(Operand op, int indirectionIndex) {
/**
* Holds if the `(operand, indirectionIndex)` columns should be
* assigned an `RawIndirectOperand` value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor typo, should be a RawIndirectOperand

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b8fab9a.

predicate hasIndirectInstruction(Instruction instr, int indirectionIndex) {
/**
* Holds if the `(instr, indirectionIndex)` columns should be
* assigned an `RawIndirectInstruction` value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b8fab9a.

}

/**
* INTERNAL: Do not use.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this (and IndirectInstruction) go in DataFlowPrivate instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I've done this in b8fab9a.

@rdmarsh2
Copy link
Contributor

Looks like there's some failing tests for Coding Standards - we need to fix or accept those as well

@MathiasVP
Copy link
Contributor Author

Looks like there's some failing tests for Coding Standards - we need to fix or accept those as well

They're failing on the base branch already. There's a fairly lengthy process requires to change these tests, so we'll do those separately once the feature branch is ready to be merged in.

@rdmarsh2
Copy link
Contributor

oh, right. Does the same go for the CPP Language tests?

@jketema
Copy link
Contributor

jketema commented Nov 15, 2022

Does the same go for the CPP Language tests?

The internal ones are the next ones I'll be picking up.

There's a fairly lengthy process requires to change these tests

The process relatively straightforward, as we won't target the coding standards main branch with these changes, but only the next branch (this is different for the GVN deprecation). Unfortunately, the next branch is used everywhere, so whatever is on it also needs to work with the CodeQL and our internal main branches. So fixing needs to wait until we're ready to merge the feature branch.

@rdmarsh2 rdmarsh2 merged commit 70a9e49 into github:mathiasvp/replace-ast-with-ir-use-usedataflow Nov 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants