Derandomize std.algorithm #5202
Conversation
Hmmm... there's a good advantage to randomized unittesting; we don't want to give that up. How about non-random tests that ensure coverage, and then a randomized unittest? Would that be difficult?
Force-pushed from db6fb73 to a670227
So here's what I did:
Seems like it worked :)
From a practical position, I don't like random unit tests, because there can be weird interactions (e.g. some PR fails for a totally unrelated random unit test failure that cannot be repeated). However, I agree random testing has good benefits.

I would propose that we have true random testing, but not as part of the unit tests. Make it a separate testing system, such that random failures generate alerts (probably posts to the mailing list), with enough info to add the failures as non-random tests. That way they aren't creating spurious failures for PR testing. Something to think about; probably not a quick solution.

In a previous life I wrote problem sets for TopCoder and also tested others' problem sets. One thing I would do frequently is generate tons of random data to see if I could cause my solution or the other solutions to fail. If I found a test case that caused a failure, I would add it as a non-random test case for the testing phase of the competition (the hypothesis being that if one of our solutions got caught by some corner case exposed by the random data, someone else's might too). During the actual competition, all solutions were run against the exact same test cases to judge whether they were correct or not, so the results were objective, even if they might have been incomplete.

Such a stance is well founded.
While we are at it, could you please change them to print the seed when the tests fail, for reproducibility? IIRC we had already settled on that last time the discussion on randomised testing came up, so perhaps I'm just missing the respective bits of code.
Yah, printing the seed along with indications on how to set it for a repro is essential. BTW, at best we should have a systematic approach (one global seed, one easy way to set it, etc.) instead of each unittest doing its own.
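(For concreteness, a minimal sketch of what such a systematic approach could look like in D; the environment-variable name and all wiring below are assumptions for illustration, not existing Phobos machinery.)

// Hypothetical sketch: one seed per test run, settable from outside.
// The variable name PHOBOS_TEST_SEED is an assumption, not a real knob.
import std.conv : to;
import std.process : environment;
import std.random : rndGen, unpredictableSeed;

static this()
{
    // Seed the default thread-local RNG once, before any unittest runs.
    immutable raw = environment.get("PHOBOS_TEST_SEED", null);
    immutable seed = raw is null ? unpredictableSeed : raw.to!uint;
    rndGen.seed(seed);
    // A failure handler could then report `seed` so the run can be replayed.
}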
You can scratch the "can be weird interactions" - just have a look at the CodeCov CI project status of a recent PR - it's moving "randomly" ;-)
Could we please go step by step?
std/algorithm/sorting.d
Outdated
auto pieces = partition3(a, 25);
assert(pieces[0].length + pieces[1].length + pieces[2].length == a.length);
foreach (e; pieces[0])
uint[] seeds = [3923355730, 1927035882, unpredictableSeed];
Stylistic suggestion: immutable seeds = …
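(Applied to the snippet above, the suggested pattern might read as follows; this is a sketch with the actual assertions elided, not the exact PR code.)

// Run the test once per seed: known-bad seeds first, plus one random one.
import std.random : Random, uniform, unpredictableSeed;

unittest
{
    // `immutable` per the review suggestion; the last entry keeps a bit of
    // randomised coverage in every run.
    immutable uint[] seeds = [3923355730, 1927035882, unpredictableSeed];
    foreach (s; seeds)
    {
        auto r = Random(s);
        int[] a = new int[uniform(1, 10000, r)];
        foreach (ref e; a) e = uniform(-1000, 1000, r);
        // ... the partition3 assertions from the diff go here ...
    }
}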
std/algorithm/sorting.d
Outdated
assert(left <= a[k]);
}
if (k + 1 < a.length)
uint[] seeds = [90027751, 2709791795, 1374631933, 995751648, 3541495258, 984840953, unpredictableSeed];
Same here: const/immutable
std/algorithm/sorting.d
Outdated
auto r = Random(s);

int[] a = new int[uniform(1, 10000, r)];
foreach (ref e; a) e = uniform(-1000, 1000, r);
Wonky indentation.
That's from the existing source ;-)
Fair enough – lgtm, feel free to merge after fixing the broken indentation.
This would prohibit us from using … (Hmm, so does printing the seed, though, unless we use exception/assertion messages for that. Maybe the randomised tests should be a separate thing, although this doesn't seem like an elegant solution.)
Careful here: I think if you have a random test, you need the seed printed before each test, not one global seed. Otherwise, other factors (which modules are included, which ones test random data, etc.) can affect whether the seed is set correctly at the time of the test. If you just mean there needs to be a standardized way to set the seed for an individual test, then yes, that is ideal.

Yes, fix the issue that is causing random PR failures first, and then maybe we can have a bigger discussion on random testing elsewhere.
Done, but a PR needs to be approved via a GH review, otherwise merging is blocked.
I didn't realise one couldn't green-light their own PRs.
This is a rather new change due to us trying to catch up with the improvements GH makes ;-) http://forum.dlang.org/post/qoamackqqnkqrxkochdu@forum.dlang.org
I have seen (and read) your summaries, but the last comment still stands. ;)
Thanks @wilzbach, this is terrific work.
@schveiguy well, unless there's randomization in the order of running unittests, we can set things up so one seed sets up an entire test battery. At any rate, please let's not print random numbers with every successful run :). Agreed on taking the discussion to a higher level; maybe you could lead it. Thanks!
Right, I was thinking of printing only after a failure, but I meant you need to look at the seed for the specific test (and store it before running the test), not once at the beginning of all unit tests.
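(A per-test sketch along those lines, assuming nothing beyond std.random and std.stdio; the test body is a stand-in:)

// Capture the seed right before the test body and report it only on failure.
import std.random : Random, uniform, unpredictableSeed;
import std.stdio : stderr;

unittest
{
    immutable seed = unpredictableSeed; // stored before the test runs
    // scope(failure) runs when the scope exits via a thrown Throwable,
    // which in practice includes failed asserts; passing runs stay silent.
    scope(failure) stderr.writefln("failing seed: %s", seed);
    auto rng = Random(seed);
    auto n = uniform(1, 100, rng); // stand-in for the real randomised body
    assert(n >= 1 && n < 100);
}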
@andralex could you please have a look? (and maybe for the other repos as well)
Force-pushed from c26b85b to b46bd29
Auto-merge toggled on
The labelled PRs still don't have priority on the auto-tester. Hence I went for the faster option and squashed manually.
Alright, squash merges should be on now. Thanks for your diligence!
So it turns out that using a random seed leads to "interesting" effects. One example is that our coverage report jumps randomly up & down.
Don't believe me? Check it out for yourself:
https://codecov.io/gh/dlang/phobos/compare/77f1a9067b59395f8444cb2d01047680aebcfe5f...f2b58341725a6d195b388c1b6f31dcdaa9f000f7/changes
https://codecov.io/gh/dlang/phobos/compare/22331db571d874ba44e412777ed8310536a18dff...89e4014e3d89c7de4d2e2c50eca7924e465ff062/changes
https://codecov.io/gh/dlang/phobos/compare/86050a1c59eda327fa263549f0b53e56a4cfa9ed...2749175ceb24e4b780725677b7e233f65e4f791c/changes
https://codecov.io/gh/dlang/phobos/compare/77f1a9067b59395f8444cb2d01047680aebcfe5f...2e05f29d3c4cc7690198c3ad0d06eb23293da48a/changes
...
(endless list)
I think it's quite reasonable to fix the random tests to a constant value, such that we know which parts are actually covered and, if needed, add tests for the uncovered bits.
Note that there's also a floppy line in std.parallelism, but that one might be harder to get.

edit: for std.parallelism we should probably have a look at #4399
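(One hedged way to get deterministic coverage while keeping randomised runs available on demand; the version identifier below is invented for illustration and is not an existing flag:)

// Deterministic by default: CI and coverage always see the same seeds,
// and the extra random seed is opt-in via a hypothetical version flag.
import std.random : unpredictableSeed;

uint[] testSeeds()
{
    uint[] seeds = [3923355730, 1927035882]; // seeds that caught past bugs
    version (StdRandomizedTests)
        seeds ~= unpredictableSeed; // only randomise when explicitly enabled
    return seeds;
}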