-
Notifications
You must be signed in to change notification settings - Fork 10.5k
Remove interpolated strings from benchmark CheckResults #9298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@swift-ci Please smoke benchmark |
BTW, this looks more like "Remove |
@palimondo I don't understand why you say that. The signature of the old function was CheckResults(_ resultsMatch: Bool, _ message: @autoclosure () -> String) |
I was joking about the fact that almost half of the changed tests didn't use interpolated strings in the error message, but a static string. Never mind. The benchmark is sure taking its sweet time… |
Build comment file:Optimized (O) Regression (16)
Improvement (15)
No Changes (238)
Regression (3)
Improvement (7)
No Changes (259)
|
There were reported improvements from 6%-12% (O build) in DropFirstArrayLazy, DropWhileArrayLazy, DropWhileSequenceLazy and DropWhileArray -- tests that you did not change with this commit. IMO this points to the need to improve how we detect changes to better use measured statistics: SR-4600 Performance comparison should use MEAN and SD for analysis. These are test with very low runtimes per iteration and are more susceptible to noise during testing. See also SR-4669 Add a Benchmark_Driver --rerun N option. |
@dabrahams, @eeckstein - How about we try this? public func CheckResults(
_ resultsMatch: Bool,
file: String = #file,
function: String = #function,
line: Int = #line
) {
guard _fastPath(resultsMatch) else {
print("Incorrect result in \(function), \(file):\(line)")
abort()
}
} |
@palimondo Still, passing strings will probably generate some retain/release traffic. We could just pass the line number. IMO this is more than sufficient to locate the source of a fail. Most benchmarks (if not all) only have a single CheckResult anyway. |
Those should be We could be more specific here: public func CheckResults(
_ resultsMatch: Bool,
file: StaticString = #file,
function: StaticString = #function,
line: Int = #line
) {
guard _fastPath(resultsMatch) else {
print("Incorrect result in \(function), \(file):\(line)")
abort()
}
} |
Here's another benchmark run, that tested no changes in performance tests, only modifications to test harness generation. It also has improvements and regressions. StringHasSuffix and StringHasPrefix also appear in regressions there. I'm not disputing the changes in this commit are not showing these test need reevaluation -- just say they were spuriously showing as false positives / false negatives before. |
This is another benchmark result -- again, no change in performance tests. This one probably had a quiet machine during test and has no false changes reported (OK, just 1 😉). We definitely have an issue of false changes reported the whole time. See the two SR's above for suggested solutions. |
@palimondo wrote:
Whoa, that's unexpected. I didn't realize I had overlooked some. Let me go back and fix that.
No argument there. There is so much we need to improve about the way we benchmark, and we've known it for a long time... |
a4da270
to
e839525
Compare
@palimondo We have such a problem with reliable benchmarking that I want to start by stripping out everything that could possibly interfere with getting meaningful results. Ideally that means we should not do any work that we're not trying to measure, and adding strings of any kind is really not needed here. If it becomes useful, we can find a way to bring them back. As a concession to the people who work on the optimizer, we're keeping some result checking code because they say it sometimes finds optimizer bugs, but that code itself needs to be carefully reviewed to ensure it's not doing any significant work compared to what gets measured. Let me also say, thank you very much for your interest in improving this part of Swift. It's critical that we get it right, but the reality is that we have never been able to give it the attention it needs, so your participation is very much appreciated! |
This call was in many cases skewing the benchmark results. Note: Intentionally staging this in without removing the old overload initially.
e839525
to
e32fb8b
Compare
@swift-ci Please smoke benchmark |
@dabrahams Please have a look at #9330. I think I've done what you wanted while you were sleeping… 😉 |
Build comment file:Optimized (O) Regression (19)
Improvement (55)
No Changes (195)
Regression (5)
Improvement (1)
No Changes (263)
|
@swift-ci Please smoke test and merge |
I guess it might be time to start playing with the length of sequences that get processed in those Drop.* Suffix.* and Prefix.* tests, as I suggested to @lplarson in comments to PR 8641. He wanted @therealbnut's feedback on that, too. There I've argued only for balancing the I think we need to start with the slowest one, get the |
@palimondo Any attention you can give this is much appreciated. One thing we need to be aware of is that there is some tracking of these results going on (/cc @bob-wilson), and every time we make a change like this it will require a reset of the expectations used by that tracking. We may need to coordinate fixes somehow. |
@dabrahams Couldn't you have waited for the benchmark results on #9330 before deciding which one to merge? If this lands, that one will most definitely be in conflict and I've wasted quite a lot of time with it… Should I go play elsewhere? |
@palimondo I don't see why there would be a serious conflict; IIUC your patch also drops explicit string arguments and proposes a different implementation of CheckResults(?). The good thing is that once this is merged, we can benchmark yours with respect to it. If there are no costs, we can think about merging it. |
Is that tracking visible from the outside? |
@palimondo: I don't think there are externally-visible tracking results yet, unfortunately. [Sorry if my merge seemed premature; please don't go play elsewhere! As I've been trying to say, your participation here is very much appreciated.] |
Sorry, I'm getting a bit frustrated. Today is rough… the tip of the tree failed to build for me twice already today. Always after 3 hours of compilation. I'm running a third one now, to get a local build of #9310. Could you be so kind a trigger a smoke benchmark for me there? |
@swift-ci Please smoke test and merge |
This call was in many cases skewing the benchmark results.
Note: Intentionally staging this in without removing the old overload initially.