C++: Add a predicate for getting dataflow nodes whose value has been constant folded #13895

Closed
wants to merge 3 commits into from

MathiasVP
Contributor

Consider the following program:

enum MY_ENUM
{
    A = 1,
    B = 2,
    C = 0xff00,
};

void sink(int x);

void test() {
  int x = 1 | C;
  sink(x);
}

If one writes the expected taint-tracking query to identify flow from the constant 1 to an argument of sink:

/**
 * @kind path-problem
 */

import cpp
import semmle.code.cpp.dataflow.new.TaintTracking
import TestFlow::PathGraph

module TestConfig implements DataFlow::ConfigSig {
  predicate isSource(DataFlow::Node source) { source.asExpr().getValue().toInt() = 1 }

  predicate isSink(DataFlow::Node sink) {
    sink.asExpr() = any(Call call | call.getTarget().hasName("sink")).getAnArgument()
  }
}

module TestFlow = TaintTracking::Global<TestConfig>;

from TestFlow::PathNode source, TestFlow::PathNode sink
where TestFlow::flowPath(source, sink)
select sink.getNode(), source, sink, ""

then this won't work, because the literal 1 is constant folded during IR construction into the value of the expression 1 | C. So even though taint-tracking allows flow through bitwise OR, the query produces no results.
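To illustrate why the literal disappears, here is a minimal C++ sketch showing that 1 | C is an integral constant expression that a compiler can reduce to the single value 0xff01 before any code is generated. The helper name `folded_value` is hypothetical and only there for illustration; the sketch says nothing about how the CodeQL IR performs the folding internally.

```cpp
#include <cassert>

enum MY_ENUM
{
    A = 1,
    B = 2,
    C = 0xff00,
};

// Hypothetical helper (not part of the PR): returns the same expression
// that initializes `x` in the example program.
constexpr int folded_value() { return 1 | C; }

// The whole expression is evaluated at compile time. Only the combined
// value 0xff01 remains; the individual operand 1 is no longer visible.
static_assert(folded_value() == 0xff01, "1 | C folds to a single constant");
```

An analysis that only sees the folded result therefore has the value 0xff01 to work with and no separate node for the literal 1, which matches the missing flow described above.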

This PR adds a new predicate on DataFlow::Nodes called asFoldedConstant. The expression node.asFoldedConstant() gives all the constants that have been folded into the node's underlying IntegerConstantInstruction. This allows one to find flow in the above example by replacing

predicate isSource(DataFlow::Node source) { source.asExpr().getValue().toInt() = 1 }

with

predicate isSource(DataFlow::Node source) { source.asFoldedConstant().toInt() = 1 }

@github-actions github-actions bot added the C++ label Aug 6, 2023
* since the value returned by this predicate will not represent the
* runtime value of the underlying expression (since the expression has been
* constant folded). For such cases `node.asExpr().getValue()` should be used
* instead.
Contributor

There's no (general) way to tell what operations were done to combine the constants at present, right? i.e. the | in 1 | MYVALUE.

Contributor Author

Not at the moment, no. We could of course do something more clever instead of using .getAChild*() (i.e., traverse the AST to find the set of operations and construct some useful type that gives this information), but I haven't yet seen a good reason to do this.
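For illustration only, the "more clever" traversal could look roughly like the C++ sketch below. It uses a hypothetical miniature expression tree (`Expr`, `lit`, `bin`, `FoldInfo`, and `describe` are all made-up names, not CodeQL or PR code) and walks it once, recording both the literal leaves and the operators that combine them.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <utility>
#include <vector>

// Hypothetical miniature expression tree; this is not the CodeQL AST.
struct Expr {
    char op = '\0';               // '\0' marks a literal leaf, otherwise e.g. '|'
    long value = 0;               // literal value (only meaningful for leaves)
    std::unique_ptr<Expr> lhs;    // operands (null for leaves)
    std::unique_ptr<Expr> rhs;
};

// Build a literal leaf.
std::unique_ptr<Expr> lit(long v) {
    std::unique_ptr<Expr> e(new Expr());
    e->value = v;
    return e;
}

// Build a binary operation node.
std::unique_ptr<Expr> bin(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    std::unique_ptr<Expr> e(new Expr());
    e->op = op;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// The "useful type": which constants were folded, and through which operators.
struct FoldInfo {
    std::vector<long> constants;  // leaves in left-to-right order
    std::set<char> operators;     // distinct operators seen on the way
};

// Pre-order walk over the tree, filling in `out`.
void collect(const Expr& e, FoldInfo& out) {
    if (e.op == '\0') {
        out.constants.push_back(e.value);
        return;
    }
    out.operators.insert(e.op);
    collect(*e.lhs, out);
    collect(*e.rhs, out);
}

FoldInfo describe(const Expr& root) {
    FoldInfo info;
    collect(root, info);
    return info;
}
```

Calling `describe` on a tree built as `bin('|', lit(1), lit(0xff00))` would report the constants 1 and 0xff00 together with the operator `|`, which is the extra information the current `.getAChild*()` approach does not expose.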

Contributor

I'm happy to start with the simple solution that solves the problem. 👍

@MathiasVP
Contributor Author

This shouldn't be needed anymore after #15969

@MathiasVP MathiasVP closed this Apr 22, 2024
2 participants