Skip to content

C++: Handle casts to void in IR #67

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 20, 2018
Merged

Conversation

dave-bartolomeo
Copy link
Contributor

Casts to void did not have a semantic conversion type in the AST, so they also weren't getting generated correctly in the IR. I've added a VoidConversion class to the AST, along with tests. I've also added IR translation for such conversions, using a new ConvertToVoid opcode. I'm not sure if it's really necessary to generate an instruction to represent this, but it may be useful for detecting values that are explicitly unused (e.g. return value from a call).

I added two new sanity queries for the IR to detect the following:

  • IR blocks with no successors, which usually indicates bad IR translation
  • Phi instruction without an operand for one of the predecessor blocks.

These sanity queries found another subtle IR translation bug. If an expression that is normally translated as a condition (e.g. &&, ||, or parens in certain contexts) has a constant value, we were not creating a TranslatedExpr for the expression at all. I changed it to always treat a constant condition as a non-condition expression.

Note that the ".expected" files for the IR and AST dumps are now human-readable.

Casts to `void` did not have a semantic conversion type in the AST, so they also weren't getting generated correctly in the IR. I've added a `VoidConversion` class to the AST, along with tests. I've also added IR translation for such conversions, using a new `ConvertToVoid` opcode. I'm not sure if it's really necessary to generate an instruction to represent this, but it may be useful for detecting values that are explicitly unused (e.g. return value from a call).

I added two new sanity queries for the IR to detect the following:
- IR blocks with no successors, which usually indicates bad IR translation
- Phi instruction without an operand for one of the predecessor blocks.

These sanity queries found another subtle IR translation bug. If an expression that is normally translated as a condition (e.g. `&&`, `||`, or parens in certain contexts) has a constant value, we were not creating a `TranslatedExpr` for the expression at all. I changed it to always treat a constant condition as a non-condition expression.
Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

@@ -33,6 +33,7 @@ private newtype TOpcode =
TPointerSub() or
TPointerDiff() or
TConvert() or
TConvertToVoid() or
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we already have a TConvert opcode, why is it necessary to add another one? Or could these conversions just have TConvert with a result type of VoidType?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, I've been trying to use a different opcode whenever two operations are "qualitatively different", for some fuzzy definition of "qualitatively different" that exists only in my head. The fact that ConvertToVoid has no result makes it different enough from Convert, which always has a result, makes it different enough in my view.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The void type is special in many ways in C/C++, but the main purpose of the IR is to paper over syntactic details. To me, void semantically corresponds to the unit type of functional programming languages. It's like a struct with no members. This is a useful type in template programming, and C++17 added it as std::monostate. There has even been a proposal to turn void into a proper object type in C++.

It's probably true that the translation of (void)e will never appear as an operand of an instruction, but isn't that just the consequence of an arbitrary syntactic restriction in the current standard?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm actually buying your argument here. A conversion from, say, int to void doesn't seem much different from a conversion from int to float, which is just a Convert instruction. I've removed ConvertToVoid and now just use Convert.
The only place I found in the IR that needed to change to support void as a first-class type was in Instruction.getResultSize(). For a void result, previously it did not hold, meaning the result had unknown size. I now just return sizeof(void), which is zero.

not exists(PhiOperand operand |
exists(instr.getOperand(operand)) and
operand.getPredecessorBlock() = pred
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This predicate doesn't implement what its QLDoc says. It checks to make sure phi instructions have operands for all of their predecessors. But why should that always hold? After int x = 0; if (b) { x = 1; } else { f(); } I'd expect a phi node with two operands, one for each x = but none for the f() block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PhiOperand.getPredecessorBlock() specifies the immediate predecessor block from which the value flowed, not the block that contains the definition. This is important for anything trying to ignore unreachable edges, or anything vaguely path sensitive, because we need to know which definitions flowed in on which edge, even if the same definition came via multiple edges.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Then I'm happy with the check but not with its QLDoc. Could you add a separate check that any block with a phi node has at least two predecessors?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed the comment and added unnecessaryPhiInstruction.

@jbj jbj merged commit b931e88 into github:master Aug 20, 2018
@dave-bartolomeo dave-bartolomeo deleted the dave/CastToVoid branch September 5, 2018 18:48
aibaars pushed a commit that referenced this pull request Oct 14, 2021
Rename describeQlClass to getAPrimaryQlClass
smowton pushed a commit to smowton/codeql that referenced this pull request Nov 1, 2021
Kotlin: Support for more type operators
dbartol pushed a commit that referenced this pull request Dec 18, 2024
feat(bash): Improve bash command parsing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants