Skip to content

C++: wire up models library to DefaultTaintTracking #2657

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Jan 27, 2020

Conversation

jbj
Copy link
Contributor

@jbj jbj commented Jan 21, 2020

@jbj jbj added the C++ label Jan 21, 2020
@jbj jbj requested a review from geoffw0 January 21, 2020 11:06
@jbj jbj requested a review from a team as a code owner January 21, 2020 11:06
@@ -159,18 +161,64 @@ private predicate instructionTaintStep(Instruction i1, Instruction i2) {
// This is part of the translation of `a[i]`, where we want taint to flow
// from `a`.
i2.(PointerAddInstruction).getLeft() = i1
// TODO: robust Chi handling
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to remove this TODO comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because I don't think it belongs here. It probably belongs in the IR data-flow library and the IR itself, which means it should be in a Jira ticket -- I've opened CPP-490.

i2 = any(CallInstruction call |
exists(int indexIn |
modelTaintToReturnValue(call.getStaticCallTarget(), indexIn) and
i1 = call.getPositionalArgument(indexIn)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also need to account for indirect parameters via IndirectReadSideEffectInstruction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it should. I've added three commits to test and implement that functionality. The test I came up with is very contrived since there's almost no flow to indirections without #2667. Hopefully this will be better soon.

jbj added 3 commits January 22, 2020 13:27
There was already a `WriteSideEffectInstruction` class that served as a
superclass for all the specific write side effects. This new class
serves the same purpose for read side effects.
Until we have better tracking of indirections, these flow rules conflate
pointers and their contents.
@geoffw0
Copy link
Contributor

geoffw0 commented Jan 22, 2020

Can/should we update cpp\ql\test\library-tests\dataflow\taint-tests to show the results of DefaultTaintTracking, so that we can see how this works out on those tests?

(I'm not completely clear on the relationship between TaintTracking and DefaultTaintTracking, I think the latter is built on top of the former? Also I note DefaultTaintTracking.qll lacks a comment at the top of the file explaining itself, which is not the fault of this PR but something that ought to be addressed)

@jbj
Copy link
Contributor Author

jbj commented Jan 22, 2020

I'd like to have more tests of DefaultTaintTracking, but we can't easily use the tests in cpp\ql\test\library-tests\dataflow\taint-tests since those are for the configurable taint-tracking libraries. We have tests in semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/library/taint that could be used directly, but those are in the internal repo. You're very welcome to move that test to the external repo, along with other security tests, if there are no concerns about copyright.

@geoffw0
Copy link
Contributor

geoffw0 commented Jan 24, 2020

We have tests in semmlecode-cpp-tests/DO_NOT_DISTRIBUTE/security-tests/library/taint that could be used directly, but those are in the internal repo.

Done: https://git.semmle.com/Semmle/code/pull/36079 and #2688 .

(I've not actually added DefaultTaintTracking to it, but that should be easy to do now; I also note it's not the most extensive test, but every little helps)

The issue for that test is being tested and fixed on PR github#2686. Adding a
test here will cause a semantic merge conflict.
@rdmarsh2
Copy link
Contributor

Compared to tainted_ir.ql in #2695, this adds flow to the results of the calls to strcmp on line 40, strlen on line 64, and strcpy on line 70. It doesn't seem to add flow to the destination of the strcpy, but that might be because the destination is never dereferenced.

Copy link
Contributor

@rdmarsh2 rdmarsh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we wired the dataflow models into the AST-based dataflow but not the IR-based dataflow, so that's not caused by these changes. LGTM otherwise.

@rdmarsh2 rdmarsh2 merged commit c7975e8 into github:master Jan 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants