-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing: add -shuffle and -shuffleseed to shuffle tests #28592
Comments
Interesting problem - reminds me of On the other hand, making I wonder if this is something we could enable when one runs That suggestion wouldn't apply to |
Might be a problem with #24929 😉
But how we can know an order that triggered a test run failure? Sort only success run? |
Oh, I thought that proposal would affect
Potentially. Or one could take the usual approach like with any other test flake - run the tests over and over again with |
More command-line flags seems problematic. |
@aclements and @dr2chase and I were talking about the problem of benchmarks being sensitive to earlier benchmarks, which this would partly address too. What if testing.M exposed a slice of test identifiers (strings or some object with a Name method) and a slice of benchmark identifiers, and you could reorder or filter or duplicate those, by rewriting the slices, before calling m.Run? We could still give an m.Shuffle helper for that common operation. |
Quick draft of what you're saying. Did I get it right?
|
@cristaloleg No, it would have to be fields in M (or have setters too), so that they could be rewritten before calling m.Run. |
For benchmarks, this is specifically in the context of iterated runs. Ideally, I'd want a way to take, say, N runs of a set of M benchmarks and shuffle the whole N*M sequence of runs. This would get much more robust benchmark results when benchmarks aren't totally isolated (which they never are, at least because of GC state). For reproducibility of order in the benchmark case, you could imagine printing the sequence randomization seed as one of the key/value headers in the output. |
Retitled per suggestion above. |
Seems like we'd like to do this, but we're blocked on an API definition. Is that correct? |
@andybons looks so |
The use case that interests me from this proposed API: the ease of partitioning a single test package, that contains many individual tests, that each take a long time to run. For example, if my codebase has one particular test package that takes several minutes when run with Without an API like this proposal, I would have to run |
The original problem identified was reordering tests to shake out unwanted dependencies between test functions. We expanded to being able to set up the order of repetitions as well. Those still seem worth doing. Sharding is possible for top-level tests but more difficult once subtests are involved (subtests basically can't get sharded because you don't know until you enter the outer test that they exist, and the outer test might have done expensive setup that you don't want to repeat on each shard). All those still seem like enough reason to do this. So what would the API look like? Currently testing.M says
The idea I think would be to add exported slice fields Tests and Benchmarks and maybe also a convenience method Shuffle. It is unclear what the types of those slice elements are. There are also examples. Right now we have:
Should we de-internalize these (we can type alias InternalBenchmark = Benchmark for compatibility) and then use them? The idea is that you'd reorder however you like and then call Run. Or maybe we could make the lists be []Test where you can only find out the name?
That would hide F, which may be preferable. (Calling one of these F's is non-trivial and perhaps impossible depending on what it needs.) Or should we just shuffle by default and not add any new API? That approach would require a new flag for -seed still, and it would not solve the benchmark iteration issue. |
My previous comment asked three questions and no one has suggested any answers. Do people care enough about this suggestion that we should continue with it? |
As a user in need of randomized order of tests, I do care about having the feature, though I have no preferences about the interface — any of the options above would allow me to achieve the result: shuffle the tests, and rerun them in the same order later if needed. |
It sounds like the generality we got to is overkill - no one seems to want it. The original proposal was to add Changing the title back to focus on those flags. Any thoughts, @mpvl, @robpike, @aclements? |
How about just one flag that handles a default value? -shuffle=seed. If seed is 0 (say), it means be random. |
Or, rather than a special value to indicate a random seed, |
Or just |
@josharian I think that means you couldn't write |
The There need to be three different settings here:
Using one flag is tough to get all three of those. I suppose you could have Another options is =-1 (default) means no shuffle, =0 means random seed, and =othernumber means that's the seed. But -shuffle=0 looks a lot like -shuffle=false. Or maybe a string: -shuffle=off (default) means no shuffle, -shuffle=on means random seed, and =N means that's the seed. That last one, the string, seems like the nicest one of the three. Thoughts? |
I definitely prefer the string-based option, since it avoids bestowing special meaning to a few integer values. |
Agreed with the string-based option. When using The original proposal said:
But it seems important that a user is able to reproduce a failure with a known Assuming the shuffle seed is printed in some circumstances, I'm not sure how that affects test2json output. Is it a new Action? |
I would like to work on it. |
@cristaloleg all yours, please go ahead and sorry if I might sound like am putting pressure on you, but the Go development tree will close down at the end of April, so please let us know if you don't have time to work on it. Thank you! |
Adds flag to go test which will randomize tests before they are run. Flag can have next values: off(default), on and N, where N is a random seed. Fixes golang#28592
Punting this to Go1.17 as we didn’t get any action during Go1.16 development. @cuonglm here is something we could tack onto our plate for Go1.17 if you are interested. |
@cristaloleg are you still picking this up? |
@strideynet for the 1.17 looks possible. Will try to make early March. |
Perfect, if you are unsure about it I am more than willing to take this ticket up @cristaloleg |
@strideynet feel free to pick it up, I'm out of time and motivation. Sorry for a late ping. |
@strideynet I could also take a stab if you're busy; let me know what you think! I'll wait for a few days and hunt for some more information in the meantime. |
@tpaschalis I'm just undergoing a change in employment so it'll be a while until I'm freed up, so it's all yours |
@strideynet I wish you the best of luck mate, hope you land a great offer! I've already got a POC which seems to work; I'll iron it out and send it for review this week. |
Change https://golang.org/cl/310033 mentions this issue: |
This is a revive of #10655
Motivation
Consider the following code & corresponding tests:
Those tests pass, everything looks fine, but they're order dependent. Running them in another order will fail.
To prevent such hidden and hard to debug mistakes we need to make the order of test random for each test build.
Current workarounds
Possible solution
We need to specify a test run to execute tests with a random/desired order:
-shuffle
to run tests with a random order (used random seed may or should be printed to output depending or not on a-v
flag).-seed <int>
to specify a user-defined seed to have an ability repeat a failed build.Open questions
PS. if/when it will be accepted - will be happy to work on it.
The text was updated successfully, but these errors were encountered: