Don't run failed unit tests by default #5248


Merged
merged 7 commits into diffblue:develop on Mar 4, 2020

Conversation

@thk123 (Contributor) commented Feb 27, 2020

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • [na] The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • [na] Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • [na] My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

This replaces #2256. Tagging these tests with [.] means they are not run by default, so a failure in another unit test no longer produces useless output from the known-failing ones. Changing [!mayfail] to [!shouldfail] means these tests fail if they pass, which stops them being inadvertently fixed without anyone noticing. I used @owen-mc-diffblue's suggestion of xfail for an expected failure. Finally, I added a separate CI run for these expected-failure tests.
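
As a sketch of what a test looks like under this scheme, assuming a plain single-header Catch build (the test name and assertion here are hypothetical, purely for illustration):

#define CATCH_CONFIG_MAIN
#include <catch.hpp>

// [.] hides the test from the default run, so it only executes when its
// tag is requested explicitly; [!shouldfail] inverts the result, so the
// run goes red if this test ever starts passing.
TEST_CASE("documents a known bug", "[.][!shouldfail]")
{
  REQUIRE(1 + 1 == 3); // stands in for the behaviour the known bug breaks
}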

@thk123 thk123 force-pushed the dont-run-failed-tests branch from 000a799 to a665490 on February 27, 2020 11:24
Contributor

If you don't want to run [!shouldfail] tests you can just provide ~[!shouldfail] as a tag filter.
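
For illustration, assuming the test binary is called unit_tests (a stand-in name), that filter would be passed like this, running everything except the expected-failure tests:

./unit_tests "~[!shouldfail]"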

@@ -36,4 +36,7 @@ Author: Michael Tautschnig
#include <util/pragma_pop.def>
#endif

/// Add to the end of test tags to mark a test that is expected to fail
#define XFAIL "[.][shouldfail]"


Why?

Contributor Author

From the commit message:

These tests won't be run by default (de-cluttering the output with expected failures). When run, they will fail if they pass

In the previous PR, Owen and Michael complained that shouldfail was misleading: in a perfect world these tests shouldn't fail, but they do because of a known bug. We felt expected fail (or XFAIL, as Python calls it) was clearer. I can change it to EXPECTED_FAIL if you'd prefer?
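
(For what it's worth, since XFAIL expands to a string literal, appending it to a tag list relies on ordinary C++ adjacent-string-literal concatenation: with hypothetical tags, SCENARIO("some scenario", "[core]" XFAIL) produces the tag string "[core][.][shouldfail]".)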

@@ -12,7 +12,7 @@ Author: Diffblue Limited.

SCENARIO(
"Lazy load lambda methods",
"[core][java_bytecode][ci_lazy_methods][lambdas][!mayfail]")


mayfail doesn't stop a test from running; it just means a failure isn't counted.
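
(For comparison, a minimal hypothetical example: a test declared as TEST_CASE("flaky", "[!mayfail]") still runs in the default set; a failure is reported but doesn't count against the run.)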

Contributor Author

A test that can either pass or fail is bad, since nobody will notice if it changes state. I believe this one was erroneously marked [!mayfail], since it passes.

@@ -188,7 +188,7 @@ void validate_local_variable_lambda_assignment(
 SCENARIO(
   "Converting invokedynamic with a local lambda",
   "[core]"
-  "[lambdas][java_bytecode][java_bytecode_convert_method][!mayfail]")
+  "[lambdas][java_bytecode][java_bytecode_convert_method]" XFAIL)


This should be [!shouldfail] (it's shouldfail in the macro), and I still don't think this should be a macro.

Contributor Author

Fixed in a later commit.

.travis.yml Outdated
- env UBSAN_OPTIONS=print_stacktrace=1 make -C jbmc/regression test-parallel "CXX=${COMPILER} ${EXTRA_CXXFLAGS}" -j2 JOBS=2
- make -C jbmc/unit "CXX=${COMPILER} ${EXTRA_CXXFLAGS}" -j2
- make -C jbmc/unit test
- echo "Running expected failure tests"
- make TAGS="[shouldfail]" -C jbmc/unit test xfail


??? Why not just use the built-in !shouldfail mechanism?

Contributor Author

Not sure what you mean? The trailing xfail was a typo, but otherwise this is me passing the appropriate tags to the unit runner.


This should be [!shouldfail], not [shouldfail]. I believe it might be fixed by a later commit?
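
Presumably the corrected line, mirroring the cbmc unit job added elsewhere in this diff, would be:

- make TAGS="[!shouldfail]" -C jbmc/unit test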

-if ! ./$(CATCH_TEST) -l | grep -q "^$(N_CATCH_TESTS) test cases" ; then \
+if ! ./$(CATCH_TEST) *,[.] -l | grep -q "^$(N_CATCH_TESTS) matching test cases" ; then \
 ./$(CATCH_TEST) -l ; fi
 ./$(CATCH_TEST)


I'm not sure what this is supposed to mean.

Contributor Author

It lists all tests, even hidden ones, to check that the number of tests matches the number of test cases in the source. (The check is there to catch people who forget to add their unit tests to the Makefile.)
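
In Catch's command-line syntax, * matches every non-hidden test and the ,[.] alternative pulls in the hidden ones as well, while -l lists the matching tests (ending with an "N matching test cases" summary line) instead of running them. For example, with a hypothetical binary name:

./unit_tests "*,[.]" -l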

@thk123 thk123 marked this pull request as ready for review March 3, 2020 10:22
@thk123 (Contributor Author) commented Mar 3, 2020

If you don't want to run [!shouldfail] tests you can just provide ~[!shouldfail] as a tag filter.

This is also an option.

I prefer my approach, as it makes the common case easier: to run the unit tests, you just run the unit-test binary. You neither need to provide additional flags, nor do you get misleading output about failing tests.

Happy to change to that though, if that is preferred.

@@ -355,9 +355,13 @@ script:
- env PATH=$PATH:`pwd`/src/solvers UBSAN_OPTIONS=print_stacktrace=1 make -C regression/cbmc test-cprover-smt2
- make -C unit "CXX=${COMPILER} ${EXTRA_CXXFLAGS}" -j2
- make -C unit test
- echo "Running expected failure tests"
- make TAGS="[!shouldfail]" -C unit test
Contributor

What about CMake configurations? Does anything need to be done for those?

Contributor Author

Good spot, fixed

@thk123 thk123 force-pushed the dont-run-failed-tests branch from b1df747 to 5d92f1b on March 3, 2020 12:07
@chrisr-diffblue (Contributor) left a comment


Looks good, thanks for checking/fixing the CMake stuff.

@thk123 thk123 changed the title Dont run failed tests Don't run failed unit tests by default Mar 3, 2020
-if ! ./$(CATCH_TEST) -l | grep -q "^$(N_CATCH_TESTS) test cases" ; then \
+if ! ./$(CATCH_TEST) *,[.] -l | grep -q "^$(N_CATCH_TESTS) matching test cases" ; then \
 ./$(CATCH_TEST) -l ; fi
 ./$(CATCH_TEST)
Contributor Author

@hannes-steffenhagen-diffblue how about this?

Suggested change
-if ! ./$(CATCH_TEST) *,[.] -l | grep -q "^$(N_CATCH_TESTS) matching test cases" ; then \
+# Include hidden tests by specifying "*,[.]" for tests to count
+if ! ./$(CATCH_TEST) *,[.] -l | grep -q "^$(N_CATCH_TESTS) matching test cases" ; then \


Oh, OK. I thought this was using a glob pattern somehow. Would you mind putting quotes around it, to clarify that it's supposed to be passed to Catch as an argument as-is?
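
The quoted form of the line would then read:

if ! ./$(CATCH_TEST) "*,[.]" -l | grep -q "^$(N_CATCH_TESTS) matching test cases" ; then \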

Contributor Author

Done; will merge on green CI.

Contributor

I still don’t really like using a macro for this, but otherwise LGTM now.

thk123 added 4 commits March 4, 2020 10:08
  • These tests won't be run by default (de-cluttering the output with expected failures). When run, they will fail if they pass.
  • These don't fail, so no reason not to have them running.
  • This ensures that failed tests don't get inadvertently fixed, without cluttering the normal output.
@thk123 thk123 force-pushed the dont-run-failed-tests branch from 5d92f1b to 92dac11 on March 4, 2020 10:10
@thk123 thk123 force-pushed the dont-run-failed-tests branch from 92dac11 to 6176871 on March 4, 2020 10:12
@thk123 thk123 merged commit fb07ce0 into diffblue:develop Mar 4, 2020
@thk123 thk123 deleted the dont-run-failed-tests branch March 4, 2020 14:56