[benchmark] Restore running benchmarks by ordinal numbers and related bugfixes #12415
Conversation
@jrose-apple Can you trigger a smoke benchmark, please? Thank you!
Heh, okay. @swift-ci Please smoke benchmark
@swift-ci Please Python lint
Thanks!
benchmark/README.md
Outdated
* `--tags`
* Run tests that are labeled with specified [tags]
(https://github.com/apple/swift/blob/master/benchmark/utils/TestsUtils.swift#L19)
(coma separated list)
Do you want to make it clear that a benchmark requires ALL tags?
Gladly, I'm just having trouble formulating a proper English sentence here… How about:
- Run tests that are labeled with specified tags (coma separated list); Multiple tags are interpreted as logical AND, i.e. run only test that are labeled with all the supplied tags
?
That sounds fine to me.
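For illustration, here is a minimal Swift sketch (with made-up tag values, not the suite's actual types) of the "all supplied tags must match" logical-AND semantics described above, using `Set.isSuperset(of:)`:

```swift
let benchmarkTags: Set<String> = ["String", "api", "validation"]
let requestedTags: Set<String> = ["String", "api"]

// The benchmark is selected only if it carries every requested tag,
// i.e. multiple --tags values combine with logical AND.
let shouldRun = benchmarkTags.isSuperset(of: requestedTags)  // true
```

The actual filtering code under review appears in the DriverUtils.swift excerpt below.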
benchmark/utils/DriverUtils.swift
Outdated
return true
registeredBenchmarks.sort()
let indices = Dictionary(uniqueKeysWithValues:
  zip(registeredBenchmarks.map{$0.name}, 1...registeredBenchmarks.count))
You can use just `1...`. It will be much shorter with the same result.
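As a hedged sketch of this suggestion (illustrative benchmark names, not the PR's exact code), `zip` stops at the shorter of its two sequences, so the open-ended range `1...` is enough:

```swift
let registeredBenchmarks = ["Ackermann", "AngryPhonebook", "ArrayAppend"]

// zip pairs each name with its 1-based ordinal; the open-ended `1...`
// is implicitly truncated to the length of the names array.
let indices = Dictionary(uniqueKeysWithValues:
  zip(registeredBenchmarks, 1...))
// ["Ackermann": 1, "AngryPhonebook": 2, "ArrayAppend": 3]
```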
benchmark/utils/DriverUtils.swift
Outdated
tests = registeredBenchmarks.filter { benchmark in
  if benchmarkNamesOrIndices.isEmpty {
    return benchmark.tags.isSuperset(of:_tags) &&
Please add a space after a colon.
@@ -80,7 +92,7 @@ public struct BenchmarkInfo {
/// A set of category tags that describe this benchmark. This is used by the
/// harness to allow for easy slicing of the set of benchmarks along tag
/// boundaries, e.x.: run all string benchmarks or ref count benchmarks, etc.
public var tags: [BenchmarkCategory]
public var tags: Set<BenchmarkCategory>
👍
Instead of using a Set, please use an OptionSet.
I'm not sure this can work: `BenchmarkCategory` is a String-backed enum (it is used for printing the categories) and IIRC `OptionSet` requires the enum's `rawValue` to be `FixedWidthInteger`. Am I missing something?
Sure.
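For context, a hedged sketch contrasting the two approaches discussed above (type and case names are illustrative, not the suite's actual declarations):

```swift
// An OptionSet needs a FixedWidthInteger rawValue, because tags are
// combined and tested with bitwise operations:
struct BenchmarkTagOptions: OptionSet {
    let rawValue: Int
    static let validation = BenchmarkTagOptions(rawValue: 1 << 0)
    static let api        = BenchmarkTagOptions(rawValue: 1 << 1)
    static let string     = BenchmarkTagOptions(rawValue: 1 << 2)
}

// A String-backed enum (the rawValue is what gets printed) cannot be an
// OptionSet, which is why the PR keeps a Set of enum cases instead:
enum Category: String { case validation, api, string = "String" }
let tags: Set<Category> = [.validation, .api, .string]
```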
benchmark/scripts/Benchmark_Driver
Outdated
continue
tests.append(m.group(1))
driver.append('--skip-tags=')
indices, names = zip(*[
I always thought `zip` returns a single list. How is that supposed to work?
I learned it over at Stack Overflow. Adding a star before the iterable turns the zip around and does an unzip. It says so right in the official documentation, too:
zip() in conjunction with the * operator can be used to unzip a list
I see now… It's a little too clever. How about we at least extract it into a separate variable?
indexNamePairs = [line.split('\t')[:2] for line in subprocess.check_output(driver).split('\n')[1:-1]]
indices, names = zip(*indexNamePairs)
At least `pairs` in the name indicates what's going to happen when `zip` is invoked this way.
Wow. I have to say I skimmed over this line without parsing it. If Python doesn't have an obvious syntax for this, it'd be nice to comment "# this unzips the array of pairs into two arrays".
benchmark/utils/DriverUtils.swift
Outdated
@@ -407,6 +377,11 @@ func runBench(_ test: Test, _ c: TestConfig) -> BenchResults {
} else {
  // Compute the scaling factor if a fixed c.fixedNumIters is not specified.
  scale = c.fixedNumIters
  if scale == 1 { // FIXME: dirty fix. This method severly needs a refactoring
It is not clear from the comment what the dirty fix is for...
I was just venting… I'll remove the comment. The change correctly fixes the previously broken case when you specify `--num-iters=1`. I just very much dislike the code repetition.
I hope to completely refactor it, right after demonstrating that my proposed approach works in `Benchmark_Driver`. If it works out, the next step would be to move that logic down to `DriverUtils`, implying a total rework of this part of the code.
Build comment file: Optimized (O): Regression (12), Improvement (16), No Changes (306). Unoptimized (Onone): Regression (1), Improvement (8), No Changes (325).
@moiseev I think I've fixed everything. Could you stamp your approval here and merge?
@palimondo can you please fix one last thing I just asked for? ;-)
@moiseev Done. Merge worthy?
Wait. I would like to review.
This specific commit should be split up into a cleanup commit, the numbering commit, and the list commit, and then I would like to re-review.
That being said, I have realized that I actively dislike the numbering and do not think we should re-add it because:
- It is redundant in the face of being able to invoke singular tests via their name. In fact it is weaker than test names since test names can be used as unique ids over time while these numbers change.
- I talked with all of the heavy users on the Swift team and most of them didn't use it or always thought it was a bug.
@@ -181,7 +176,7 @@ struct TestConfig {
// We support specifying multiple tags by splitting on comma, i.e.:
//
// --tags=array,set
// --tags=Array,Dictionary
Instead of doing this, can you rename the categories like Array so that they are lowercase? We should match Swift style here.
I'm not sure I follow. The categories seem logical to me at the moment: Capitals for classes, lowercase for descriptive categories. Here I have only updated the comment to correctly reflect the current categories.
I would rather you just change the names of Array, Dictionary in a separate commit, rather than update the comments.
To be clear, the reason why I am against using caps here is that this is against the Swift style guide. All enum cases should be camelCased. Why should the benchmark suite do something different from /all/ other Swift projects? We should be standard here.
I agree with @palimondo here: the categories which refer to concrete types in the stdlib should be named exactly as those types, everything else with lowercase.
As this is a user-visible name, the Swift enum case style is of minor interest.
Ok. I am fine with dropping this. But just to be clear, I am not saying that the "user-visible" name shouldn't be capitalized. You would do it like this:
enum BenchmarkTag : String {
  case array = "Array"
  case string = "String"
  case cpubench
}
I am just saying that IMO all code in the Swift compiler should follow the Swift style guide closely, because our Swift code is oftentimes read as example code by external developers. Whenever we have diverged in the past, our users have read the code and gotten confused. We are inviting that confusion here.
@gottesmm I don't understand what you mean...
@palimondo I reviewed the first commit.
Some more on the first commit.
I am using the indexes in my work on improving the robustness of the benchmark suite (to eliminate all the sporadic false reports of improvements and regressions). I have a pretty long-running branch that depends on that functionality at the moment. I have rebased it on this branch and was hoping to open the next PR as soon as this lands. Please have a quick look here. I have already spent 2 lengthy conflict-resolution rebase merges today to accommodate all the above comments from @atrick and @moiseev. Can we please not complicate this any more than is absolutely necessary?
I appreciate your side branch, but you are making a lot of changes here, very fast. Can we step back a second and talk about the overall direction things are going in on swift-dev? I would like to understand the overall design that you are following here, and we should look to the community for buy-in. Additionally, there are many users of the benchmark suite who can't fully engage in this discussion via a PR review. A design discussion on swift-dev will let these users engage with the design process without needing to follow a full-on PR review.
That followup PR from the capillary-dilation branch includes a parasitic re-implementation of most of the…
I've started that work on 12 Jun and my main issue is that I'm working too slow. The main branch started to shift from under me with the recent… What should I write to swift-dev about? (I'm mainly unsure how much of the backstory to share...)
Aside from overall direction, just some spelling nits.
benchmark/README.md
Outdated
* Print a list of available tests matching specified criteria
* `--tags`
* Run tests that are labeled with specified [tags](https://github.com/apple/swift/blob/master/benchmark/utils/TestsUtils.swift#L19)
(coma separated list);
- "Comma" not "coma"
- What is the trailing semicolon doing?
benchmark/README.md
Outdated
* Run tests that are labeled with specified [tags](https://github.com/apple/swift/blob/master/benchmark/utils/TestsUtils.swift#L19)
(coma separated list);
* `--skip-tags`
* Don't run tests that are labeled with specified tags (coma separated
Same here
benchmark/README.md
Outdated
(coma separated list);
* `--skip-tags`
* Don't run tests that are labeled with specified tags (coma separated
list); Default value: `skip,unstable`; To get complete list of tests,
Don't capitalize after a semicolon
Can I get replies to the inline comments in the source?
I don't see the reason to split this into 3 PRs, other than to increase the difficulty of contributing. Is that your intention here, @gottesmm? This PR contains synergistic changes to 4 files (one of them documentation). They restore functionality to the previously documented state. It seems you blocked this because you think running by indexes is unused. But I use them. Why isn't that enough?
@gottesmm Or did you really mean splitting the commit (not the PR), rewriting history? I didn't know that was even possible — just read about how to do that with interactive rebase. Still, why is such nitpicking necessary?
@gottesmm Re: what to discuss on swift-dev…
@palimondo I am currently at the LLVM Dev conference for the next day or two (and was for the past day or two), so review is slowing down. That being said, what I am saying is that just you using this feature is not enough. It needs to be able to hold its own weight. There is no reason that you couldn't use just test names and map those locally to whatever index you need in your script. Let's talk about this more after the LLVM Dev conference is over.
At the moment, my branch depends on this documented feature, which is why I'm contributing this PR to restore it after its accidental removal during the introduction of `BenchmarkInfo`. In addition to being a convenient shortcut to run tests manually, I'm using it to calibrate the… For the time being, I want to continue to use indices as a trivial compression scheme, to guard against hitting the command line length limit. At the moment, the list of 471 benchmark names is 14332 bytes long. Using indices, this goes down to 1776 bytes (91% savings). I know that the full benchmark names list is less than… I believe this is enough for this feature to hold its weight (3 lines of code!!!):
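(The snippet referenced above is not reproduced here. As a rough illustration only, under assumed names and types rather than the PR's actual three lines, resolving arguments that may be either names or ordinal numbers could look like this:)

```swift
// Stand-in for the suite's benchmark metadata (illustrative only).
struct Benchmark { let name: String }

let benchmarks = [Benchmark(name: "Ackermann"),
                  Benchmark(name: "AngryPhonebook"),
                  Benchmark(name: "ArrayAppend")].sorted { $0.name < $1.name }

// Look up benchmarks either by name or by 1-based ordinal number.
let byName  = Dictionary(uniqueKeysWithValues: zip(benchmarks.map { $0.name }, benchmarks))
let byIndex = Dictionary(uniqueKeysWithValues: zip(1..., benchmarks))

// Each command-line argument may be a test name or its ordinal number.
let arguments = ["2", "ArrayAppend"]
let selected = arguments.compactMap { arg in
    Int(arg).flatMap { byIndex[$0] } ?? byName[arg]
}
// selected.map { $0.name } == ["AngryPhonebook", "ArrayAppend"]
```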
If and when we move the functionality I developed in…
LGTM.
As @gottesmm already noted, there are some changes in the PR which have nothing to do with the PR's description. You did split the changes into multiple commits - that's good. So I'm fine if you just edit the title and description of the PR, e.g. add a list of independent changes.
elapsed_time = sampler.run(test.name, fn: testFn, num_iters: 1)
test.tearDownFunction?()
Please do not move this out of the loop. It is important that we have setup/teardown before/after each invocation of the benchmarking function. Benchmark writers should be able to assume that they will have a pristine state when the benchmark starts running. Otherwise, you could have one benchmark iteration affecting another benchmark iteration.
I believe this refactoring is safe, because multiple iterations are handled by passing the scale parameter to the `sampler.run` method, see line 402. This means that `setUp` and `tearDown` aren't invoked around every iteration.
Even if you're thinking about a case which measures some kind of a mutable operation (say, removing elements from a collection), the `run` function already has to handle the setup of the workload for each iteration. In such a case you'd do the expensive collection building in `setUp`, and do a cheaper copy at the beginning of each inner loop, releasing the collection in `tearDown`. All this implies that the proper names for these functions should be `beforeTest` and `afterTest`, but that's for another refactoring.
That's why I think that measuring multiple samples with a single `setUp` and `tearDown` hoisted outside of the sample loop is safe.
I agree, the setUp/tearDown should be inside the loop
@eeckstein The timing of our comments looks simultaneous. Did you read my explanation?
I'll move `setUp` and `tearDown` invocations inside the `numSamples` loop to get this in.
But I still think we should rename these to beforeTest and afterTest and move them outside of the loop in a future update.
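A hedged sketch of the placement agreed on here (simplified stand-in types and names, not the PR's exact code): `setUp`/`tearDown` wrap each sample, while the iteration scaling stays inside the single timed run.

```swift
import Dispatch

// Stand-ins for the driver's types (assumed, simplified).
struct Test {
    let name: String
    var setUpFunction: (() -> Void)?
    var tearDownFunction: (() -> Void)?
}

// One sample: the iteration count (scale) is applied inside the timed run,
// so setUp/tearDown do not wrap every inner iteration.
func runSample(_ testFn: () -> Void, numIters: Int) -> UInt64 {
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<numIters { testFn() }
    return DispatchTime.now().uptimeNanoseconds - start
}

func measure(_ test: Test, _ testFn: () -> Void,
             numSamples: Int, numIters: Int) -> [UInt64] {
    var samples: [UInt64] = []
    for _ in 0..<numSamples {
        test.setUpFunction?()       // pristine state before each sample
        samples.append(runSample(testFn, numIters: numIters))
        test.tearDownFunction?()    // clean up after each sample
    }
    return samples
}
```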
// REQUIRES: asserts
// REQUIRES: CMAKE_GENERATOR=Ninja

// Integration tests between Benchmark_Driver and Benchmark_O
Let me be more specific.
- If we /do/ want to put the test here, we need a "lit" feature that says benchmarks are available and, if this feature is not set, do not run this test. Otherwise, we will try to run this test when the benchmarks are not enabled and the test will fail. So no matter what, we need that.
- The benchmark's standalone configuration should be able to invoke these tests independently via its cmake. Now, I am not saying that this PR is the right place to teach the benchmark's cmake how to do that. Rather, I am suggesting that by doing the PR in this way, that becomes more difficult. Instead, we should follow the model of the validation-tests: despite validation-tests not being in the test directory itself, we have taught lit how to delegate to it. The way that this is done can be seen in the file ./test/CMakeLists.txt. I am happy to answer questions/give pointers here.
@gottesmm Ever since you offered to help me with my first Swift PR ("Got you covered, man!"), I've considered you a friend. You did a lot to help me understand how I can lend a hand and contribute to the project. But something changed with this PR (and in our following interactions)... First you swept in after two reviewers had already given their thumbs up (back in October 2017) and started asking for procedural changes:
I'll skip over the details of our resulting interchange, and focus on your most recent input to this PR:
This is after I have, per your original request, completely cleaned up the commit history to offer a fully documented set of minimal commits that implements the stated goal of this PR in my most recent push. While re-implementing the execution by ordinal numbers I had to fix a couple of bugs in the immediately surrounding code that got introduced after the introduction of… These escalating requests about previously unstated and undocumented project development procedures come across as quite arbitrary hindrances, and I am starting to doubt this is done in the spirit of friendly cooperation. I hope I just got a wrong impression! What is really going on here?
@palimondo There seems to be a misunderstanding. We, including gottesmm, want this PR to land. But there is one thing which has to be fixed: the test must not run if benchmarks are not built (build-script --skip-build-benchmarks). To do that, add a "feature" in lit.site.cfg.in, e.g. "benchmarks", based on the cmake variable SWIFT_BUILD_PERF_TESTSUITE. The other thing, making the test work in the standalone configuration, is not that important. You can ignore that for now. We can fix that later. Does this make sense?
@eeckstein I've updated the PR title and description, and moved `setUp` and `tearDown` back into the `numSamples` loop.
Added `lit` substitution for running the Benchmark_O binary. Introduced `lit` tests for `Benchmark_O` to demonstrate bugs:
* listing benchmarks' tags
* when running with `--num-iters=1`
When listing benchmarks with the `--list` parameter, present the tags in a format that is actually accepted by the `--tags` and `--skip-tags` parameters. Changes the `--list` output from
````
Enabled Tests,Tags
AngryPhonebook,[TestsUtils.BenchmarkCategory.validation, TestsUtils.BenchmarkCategory.api, TestsUtils.BenchmarkCategory.String]
...
````
into
````
Enabled Tests,Tags
AngryPhonebook,[String, api, validation]
…
````
Fixed bug where the `elapsed_time` was always 0 when `--num-iters=1` was specified.
Added test: It should be possible to run benchmark by name, even if its tags match the default skip-tags. Added verification tests for benchmark filtering with `tags` and `skip-tags` parameters.
Also updated benchmark documentation with more detailed description of tag handling.
Fixed failure in `get_tests` which depended on the removed `Benchmark_O --run-all` option for listing all tests (not just the pre-commit set). Fix: Restored the ability to run tests by ordinal number from `Benchmark_Driver` after the support for this was removed from `Benchmark_O`. Added tests that verify the output format of `Benchmark_O --list` and the support for the `--skip-tags=` option, which effectively replaced the old `--run-all` option. Other tools, like `Benchmark_Driver`, depend on it. Added integration tests for the dependency between `Benchmark_Driver` and `Benchmark_O`. Running the pre-commit test set isn't tested explicitly here. It would take too long and it is run fairly frequently by CI bots, so if that breaks, we'll know soon enough.
Reintroduced feature lost during `BenchmarkInfo` modernization: All registered benchmarks are ordered alphabetically and assigned an index. This number can be used as a shortcut to invoke the test instead of its full name. (Adding and removing tests from the suite will naturally reassign the indices, but they are stable for a given build.) The `--list` parameter now prints the test *number*, *name* and *tags* separated by a delimiter. The `--list` output format is modified from:
````
Enabled Tests,Tags
AngryPhonebook,[String, api, validation]
...
````
to this:
````
\#,Test,[Tags]
2,AngryPhonebook,[String, api, validation]
…
````
(There isn't a backslash before the #, git was eating the whole line without it.)
Note: Test number 1 is Ackermann, which is marked as “skip”, so it's not listed with the default `skip-tags` value.
Fixes the issue where running tests via `Benchmark_Driver` always reported each test as number 1. Each test is run independently, therefore every invocation was “first”. Restoring test numbers resolves this issue back to the original state: the number reported in the first column when executing the tests is its ordinal number in the Swift Benchmark Suite.
* Description of listing and running tests by ordinal numbers
@eeckstein I think I have the… I think I've addressed all the PR comments and this is ready to merge.
Thanks. Looks good now.
@swift-ci smoke test and merge
@eeckstein, @gottesmm and other reviewers: Thank you for your help and patience in guiding me through this. Swift's build and testing infrastructure is extremely complex and very sparsely documented. It seems like most of the know-how is in Apple devs' heads and there isn't much externally visible institutional knowledge to draw on. It really helps an outside contributor like me if you can point to concrete files to be modified and mention particular environment variables for dependencies. Vague requirements stated in general terms, though perfectly reasonable for those in the know, are insurmountable obstacles without easy access to you. I wish I could just drop by your computers and ask about the info I'm missing. Being responsive, proactive and caring in your communication is super helpful and fosters the atmosphere of a welcoming community. Thank you!
@@ -88,6 +88,9 @@ config.available_features.add("CMAKE_GENERATOR=@CMAKE_GENERATOR@")
if "@SWIFT_ENABLE_SOURCEKIT_TESTS@" == "TRUE":
    config.available_features.add('sourcekit')

if "@SWIFT_BUILD_PERF_TESTSUITE@" == "TRUE":
    config.available_features.add('benchmark')
Hm. This broke my local testing, where I usually configure the benchmarks to build (so I have the option) but don't actually always build them. Since `check-swift` doesn't depend on the benchmark targets, can you make this a dynamic check instead, based on the presence or absence of `Benchmark_O`?
I'm sorry for that! But I'm not sure how to properly fix it. I've added this condition following explicit guidance from @eeckstein. Any pointers for an analogous dynamic check?
Hm, well, you're already using `inferSwiftBinary` to check for the presence of `Benchmark_O`. Maybe you can see if that came back negative? (Meaning, it returned the unqualified `Benchmark_O` instead of finding the actual executable.)
I'm a bit thick today… so let me walk through this slowly, to check if I get your meaning: all these `lit.cfg` files look like configuration files, but are in fact Python scripts. Which means I can extend that offending `if` with an additional check for `config.benchmark_o`, which carries the result from `inferSwiftBinary`, and check that it is not just the plain `'Benchmark_O'`. Correct?
Yeah, basically. I actually think you should drop the current check and instead put one in the regular `lit.cfg`. IIRC the generated `lit.site.cfg` gets read very early on and shouldn't be doing much other than checking the configuration options.
This PR reintroduces functionality lost during the `BenchmarkInfo` refactoring to run benchmarks by numbers, and fixes bugs introduced in the subsequent move to the use of tags for filtering benchmarks by category. Also adds regression tests for the fixed bugs and updates the related documentation. The fixed bugs concern:
* `--num-iters=1`
* `Benchmark_Driver` running tests

See individual commit descriptions for more details.
The restored functionality assigns all registered benchmarks an ordinal number (in alphabetical order). This can be used as a shortcut to invoke the given tests by number. The test number is also reported when running benchmarks and when listing available benchmarks with the `--list` option; the updated list format is shown below.

Note: Test number 1 is Ackermann, which is marked as “skip”, so it's not listed with the default `skip-tags` value.
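For reference, the updated `--list` output format, as given in the commit message above:
````
#,Test,[Tags]
2,AngryPhonebook,[String, api, validation]
…
````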
This fixes a bug where running tests with
Benchmark_Driver
reported all tests as being number 1, because every independent test run was first during thatBenchmark_O
invocation.