[benchmark] Added Benchmark Check Report #20807
Conversation
Promoting a previously DEBUG-level message to INFO.
`logging.Handler` that creates a nicely formatted report from `BenchmarkDoctor`’s `check`, as a Markdown table for display on GitHub.
Produce a Markdown-formatted report analyzing the quality of newly added benchmarks.
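A rough sketch of what such a handler can look like; the class name, icons and one-column table layout here are assumptions for illustration, not the PR's actual implementation:

```python
import logging
import sys


class MarkdownReportHandler(logging.Handler):
    """Render log records from a benchmark check as rows of a Markdown table.

    Sketch only: names, icons and table layout are assumptions.
    """

    ICONS = {logging.INFO: "ℹ️", logging.WARNING: "⚠️", logging.ERROR: "⛔️"}

    def __init__(self, stream=sys.stdout):
        super(MarkdownReportHandler, self).__init__(level=logging.INFO)
        self.stream = stream
        self.stream.write("| Benchmark Check Report |\n| --- |\n")

    def emit(self, record):
        # Escape pipes so a message can't break the table layout on GitHub.
        message = self.format(record).replace("|", "\\|")
        self.stream.write(
            "| {0} {1} |\n".format(self.ICONS.get(record.levelno, ""), message))


if __name__ == "__main__":
    log = logging.getLogger("BenchmarkDoctor")  # logger name is an assumption
    log.addHandler(MarkdownReportHandler())
    log.warning("`String.insert.EmojiChar.NearEnd` name doesn't conform to "
                "UpperCamelCase convention.")
```

With a handler like this attached to its logger, the findings of `check` come out as table rows like the ones in the example output at the end of this thread.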
This is the reason why I'd like to finish and merge #20334, [benchmark] Naming Convention.
Basically LGTM! (Please see my comment inline)
Also, can you add a temporary test-commit to this PR which introduces some new "bad" benchmarks, which violate several rules? So that we can test this change with "swift-ci benchmark".
Removed the `-check-added` option, as the check now runs by default. Replaced it with an option to skip checking added benchmarks: `-skip-check-added`.
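For illustration, a minimal sketch of how such a flag could be wired up with `argparse`; the parser setup is an assumption, not the actual `run_smoke_bench` code:

```python
import argparse


def parse_args(args=None):
    parser = argparse.ArgumentParser(description="Smoke-benchmark a pull request.")
    # Checking newly added benchmarks runs by default; this flag only opts out.
    parser.add_argument(
        "-skip-check-added",
        dest="skip_check_added",
        action="store_true",
        help="don't validate newly added benchmarks with `Benchmark_Driver check`")
    return parser.parse_args(args)


if __name__ == "__main__":
    options = parse_args()
    if not options.skip_check_added:
        print("checking newly added benchmarks...")
```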
@eeckstein I've added a …
Simplify the code by removing indirection: after removal of the `-check-added` option, only a single function remains.
@eeckstein Removed the indirection.
BTW, it looks like swift-ci doesn’t support combining PRs for …
Can you just temporarily add a test-commit to this PR, and then after the benchmark run, remove it again?
@swift-ci benchmark
I've noticed the …
Force-pushed from a6f26e2 to cd47f32.
I hope the CI bot managed to clone before I force-pushed 🙃
Build comment file: benchmark report with Performance: -O, Performance: -Osize and Performance: -Onone result tables, a “How to read the data” section (the tables contain differences in performance larger than 8% and differences in code size larger than 1%; unexpected regressions should be investigated, and the performance results, unlike code size, can contain noise) and a Hardware Overview.
@swift-ci smoke test
@swift-ci benchmark
Build comment file: No performance and code size changes
@milseman Can you help me understand the unusual memory variance seen in the demonstration? Similar variance issues manifested themselves over in Breadcrumbs. Without knowing any details about their implementation, I'd guess there's some kind of memory leak in the new Strings? Granted, the "constant memory use" rule is quite an unproven recommendation at this time: I asked for advice in the forums before…
That's weird, let's make sure it's not a formal memory leak. Which one seems to be the biggest offender?
Nothing stands out especially, i.e. they all look very strange. Let me explain the metric some more: … This means: … Looking at all of the above …
This might be because those benchmarks don't reset their state after every iteration:

```swift
@inline(__always)
func insertTowardsEndIndex(_ c: Character, in string: String, count: Int) {
  var workload = string
  var index = workload.endIndex
  for i in 0..<count {
    // `identity` keeps the optimizer from constant-folding the character.
    workload.insert(identity(c), at: index)
    if i % 1000 == 0 {
      // Re-anchor the insertion point every 1000 insertions.
      index = workload.endIndex
    }
  }
  // `blackHole` keeps the grown string alive so the loop isn't optimized away.
  blackHole(workload)
}

@inline(never)
func run_InsertCharacterTowardsEndIndex(_ N: Int) {
  insertTowardsEndIndex("s", in: str, count: N * 3000)
}
```

It seems like …
@milseman That's actually good to hear, we can fix that! And looking at the new …
Well, actually, what you were saying about … But the way …
I was assuming "iteration" means the count that's passed into the run function, so … Or, is that what you mean by "sample"? I thought multiple "samples" would be running the same benchmark on the same …
You are correct. The … That issue wasn't caught by the corresponding heuristic in …:

```python
normal_range = 15  # pages
…
if abs(min_i1 - min_i2) > max(range_i1, range_i2, normal_range):
```

Let's have a look at this one: …
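As an illustration, plugging in the `mem_pages` figures quoted in the example check report at the end of this thread (min=[12, 22], R=[15, 18]) shows how such a case slips past this heuristic; treating those numbers as the case discussed here is my assumption:

```python
# mem_pages [i1, i2]: min=[12, 22], delta=10, R=[15, 18] (from the example report)
min_i1, min_i2 = 12, 22
range_i1, range_i2 = 15, 18
normal_range = 15  # pages

# abs(12 - 22) = 10 is not greater than max(15, 18, 15) = 18,
# so the wide memory range is not flagged by this check.
print(abs(min_i1 - min_i2) > max(range_i1, range_i2, normal_range))  # False
```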
The runs within each i-series, i.e. …
We start with …
Is `mem_pages` a record of the number of pages used during overall execution, or peak memory usage? When the string grows in size, it can trigger an allocation and copy, so we'd expect to see many pages allocated and deallocated over the course of this benchmark.
The … Thinking about this more: the ideal case assumes the allocator will reuse the same memory pages for all samples, which might not be the case, since we are still running a lot of samples and the same page might get allocated to a different process, right? I'm not familiar with how the system's allocator does that in detail.
Hmm… that probably means that memory usage measurement always needs to be performed with a single sample only, and therefore separated from the runtime measurements. What do you think? Update: I ran this 10×: … So it's improved, but still not without variance… 🤷‍♂️. But R=5 is a lot better. So I guess I'll test that on the whole SBS and report back.
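A sketch of that idea, assuming a hypothetical `run_benchmark(name, num_samples)` helper that launches the harness once and returns its runtimes together with the measured `mem_pages`; the helper and its signature are made up for illustration:

```python
def measure(run_benchmark, name, runtime_samples=20, memory_runs=5):
    """Separate runtime measurement from memory measurement.

    Runtimes come from one multi-sample run; mem_pages is summarized over
    several independent single-sample runs, since a process that keeps
    sampling in a loop can accumulate dirtied pages across samples.
    """
    runtimes, _ = run_benchmark(name, num_samples=runtime_samples)

    # One sample per process launch, repeated a few times to estimate the spread.
    pages = [run_benchmark(name, num_samples=1)[1] for _ in range(memory_runs)]
    return {
        "runtimes": runtimes,
        "mem_pages_min": min(pages),
        "mem_pages_range": max(pages) - min(pages),  # the "R" in the report
    }
```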
Hmm…
Integrates the `Benchmark_Driver check` command into `run_smoke_bench`, producing a Markdown-formatted report table analyzing the quality of newly added benchmarks. Example output:

- `String.insert.EmojiChar.NearEnd` name doesn't conform to UpperCamelCase convention. See http://bit.ly/UpperCamelCase
- `String.insert.EmojiChar.NearEnd` execution took at least 1871 μs. Decrease the workload of `String.insert.EmojiChar.NearEnd` by a factor of 2 (10), to be less than 1000 μs.
- `String.insert.EmojiChar.EndIndex` name doesn't conform to UpperCamelCase convention. See http://bit.ly/UpperCamelCase
- `String.insert.EmojiChar.EndIndex` has very wide range of memory used between independent, repeated measurements. `String.insert.EmojiChar.EndIndex` mem_pages [i1, i2]: min=[12, 22] 𝚫=10 R=[15, 18]

If there are no issues with added benchmarks, the result looks like this: