[benchmark] Add some benchmarks for String breadcrumbs #20769
Conversation
The CopyUTF16CodeUnits benchmark here will crash until #20768 lands.
I haven't yet finished integration of BenchmarkDoctor into
(ignoring the naming complaints it will spew, as we have a hung jury back in #20334 😜)
Did you profile this and make sure all the time is spent in the corresponding methods on String? I just want to make sure there aren't any ARC or missed devirtualization opportunities that would change what we're measuring.
@palimondo The doctor complained about the names and the "very wide range of memory used", but I scaled things so that it likes the timings.
@milseman I did some spot checks -- the heavy stack frames were inside the target entry points and looked breadcrumby enough. I wasn't looking for missed optimization opportunities, though. I'll have a more detailed look tomorrow, just to make sure these are all measuring what I think they're measuring.
🧐 Interesting. Could you re-check those with
@palimondo Here's a quick sample of the memory-related output:
@swift-ci please smoke test os x platform
@swift-ci please benchmark
This is a very fresh approach to writing benchmarks! 👍
So far I have these comments:
- A documentation block at the beginning that explains what the breadcrumbs being tested here are would be very helpful.
- The test names would ideally fit under 40 characters; renames suggested here.
- The debug print should be removed.
@palimondo where is the 40-character limit coming from, and is there any way to get more? Is it GitHub's in-PR renderer that's constraining this?
Build comment file:
Performance: -O
Performance: -Osize
Performance: -Onone
How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR. Noise: Sometimes the performance results (not code size!) contain false alarms.
Hardware Overview
Yes, primarily GitHub's fixed table width. As the above report shows, these names at 48 chars are filling the entire available space. But this is a best-case scenario. Also, formatting the result table in the console is constrained in the same way.
The variance in these benchmarks is quite high.
Only the last one looks fine. This might be background process interference. @lorentey Could you please check locally that the variance isn't this crazy when measured properly?
🤨 These are the same tests that reported a very wide range of memory used. There's some issue with either the benchmarks or the tested functionality.
Force-pushed from 68e886b to ec150df
Rebased, applied review notes, cleaned up some things. @swift-ci please smoke test
@swift-ci benchmark
Ah, I forgot to push a comment typo fix. Let's wait out the benchmark run first.
@milseman I profiled all tests looking for anomalies; they seem to work as intended. I did not notice anything left unspecialized, but it seems noteworthy that the
LGTM. A few cosmetic nits in the comments. Thanks for hearing me out, Karoy!
@swift-ci please smoke check
@lorentey BTW, brace for
Build comment file:
Performance: -O
Performance: -Osize
Performance: -Onone
How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing the regressions before you merge the PR. Noise: Sometimes the performance results (not code size!) contain false alarms.
Hardware Overview
@lorentey It looks like it didn't run the benchmarks with the Benchmark Check Report from the other PR. I guess this only works for
🧐
@swift-ci please smoke test
@lorentey, you didn't answer my question from the older conversation:
How so? I only see the same string repeated multiple times. That seems to me just like a different way to scale the workload size, instead of setting the inner-loop multiplier. Am I mistaken? I don't understand what made you persist with the different names, instead of going with
It's just that repeating a measurement 100 times is not the same as making a single measurement with a 100-times longer input.
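To illustrate the distinction with a minimal sketch (this is not the benchmark code; `utf16RoundTrip` is a made-up stand-in for the conversions being measured):

```swift
// Sketch contrasting two ways of scaling a benchmark's work; the stand-in
// workload walks the UTF-16 view, which is what the breadcrumbs accelerate.
func utf16RoundTrip(_ s: String) -> Int {
  let utf16 = s.utf16
  var sum = 0
  var offset = 0
  while offset < utf16.count {
    let i = utf16.index(utf16.startIndex, offsetBy: offset)
    sum &+= utf16.distance(from: utf16.startIndex, to: i)
    offset += 7
  }
  return sum
}

let base = String(repeating: "abc 日本語 ", count: 64)
let longInput = String(repeating: base, count: 100)

// (a) Repeat the same small measurement 100 times: the breadcrumbs for
//     `base` are created once and then reused on every iteration.
var a = 0
for _ in 0 ..< 100 { a &+= utf16RoundTrip(base) }

// (b) A single measurement over a 100-times longer input: the same total
//     number of lookups, but against one much larger breadcrumb table and
//     at much larger offsets.
let b = utf16RoundTrip(longInput)

print(a, b)
```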
Upon further thought, the variance can be explained this way:

@lorentey @milseman There's something strange going on with these benchmarks and/or the breadcrumbs implementation. After Karoy ran the
Strangely, as early as the 3rd of December, some of these started to sporadically appear in the
There were no benchmark reports from @swift-ci in any PRs here on GitHub that would include the improvement as an intentional change… I noticed that something was strange here during my janitor duty, as even on my measly machine the
I've looked at it in Instruments.
ASCII workload's stack trace:
Mixed workload's stack trace:
It looks like the ASCII workload is not using any breadcrumbs. Is that normal or a bug?!
ASCII <-> UTF-16 can be calculated in constant time. It shouldn't use breadcrumbs. We can either have a benchmark demonstrating the triviality of this workload, or maybe a SIL-level test, as I think the ASCII check is in inlinable code.
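As a rough illustration of why that holds (a hypothetical sketch of the arithmetic, not the stdlib's implementation): in all-ASCII text every scalar is one UTF-8 byte and one UTF-16 code unit, so the two offset spaces coincide and the mapping is the identity.

```swift
// Hypothetical helper (not the stdlib's code) showing the arithmetic.
func utf16Offset(ofUTF8Offset offset: Int, in utf8: [UInt8]) -> Int {
  precondition(offset >= 0 && offset <= utf8.count)
  if utf8.allSatisfy({ $0 < 0x80 }) {
    // All ASCII: one byte per code unit, so the mapping is the identity.
    // The stdlib can answer this in constant time via its "is ASCII" flag.
    return offset
  }
  // Non-ASCII: the offsets diverge; this linear walk is what breadcrumbs
  // avoid by caching index positions at regular UTF-16 intervals.
  var utf16 = 0
  var i = 0
  while i < offset {
    switch utf8[i] {
    case ..<0x80: utf16 += 1; i += 1   // 1-byte scalar -> 1 UTF-16 code unit
    case ..<0xE0: utf16 += 1; i += 2   // 2-byte scalar -> 1 UTF-16 code unit
    case ..<0xF0: utf16 += 1; i += 3   // 3-byte scalar -> 1 UTF-16 code unit
    default:      utf16 += 2; i += 4   // 4-byte scalar -> surrogate pair
    }
  }
  return utf16
}

print(utf16Offset(ofUTF8Offset: 5, in: Array("hello".utf8)))   // 5
print(utf16Offset(ofUTF8Offset: 3, in: Array("日本語".utf8)))   // 1
```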
I don't understand why the performance demonstrated in the ASCII benchmarks changed between when @lorentey wrote them (as seen in the last benchmark report above) and later, given that StringBreadcrumbs.swift wasn't modified during that time. And as I said before, there are no publicly visible benchmark reports that would show this improvement was an intentional effect of some PR. Can you please check in your internal performance tracking when these changed? Let's say the
Oh! I have apparently searched through pulls without the proper filters... Thank you!
I guess this is fine as is; I'll keep an eye out to see if the sub-20 μs runtime will make these benchmarks show up as false changes. I'm sorry to have bothered you with this. And thanks again for digging up the relevant PR, @milseman!
This adds some benchmarks for `String` breadcrumbing, via low-level stdlib entry points exposed by `String`.

- `StringBreadcrumbs.UTF16ToIndex`: Convert some random UTF-16 offsets into String indices.
- `StringBreadcrumbs.IndexToUTF16`: Convert some random String indices into UTF-16 offsets.
- `StringBreadcrumbs.UTF16ToIndexRange`: Split a string into a bunch of random slices and convert their UTF-16 offsets into String index ranges.
- `StringBreadcrumbs.IndexToUTF16Range`: Do the same in the reverse direction.
- `StringBreadcrumbs.CopyUTF16CodeUnits`: Copy UTF-16 code units from random ranges.
- `StringBreadcrumbs.MutatedUTF16ToIndex`: Like `UTF16ToIndex`, but appends to the string after every index conversion. In effect, this tests breadcrumb creation performance.
- `StringBreadcrumbs.MutatedIndexToUTF16`: Like `IndexToUTF16`, but appends to the string after every index conversion. In effect, this tests breadcrumb creation performance.

https://bugs.swift.org/browse/SR-9226
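For readers unfamiliar with these operations, here is a rough sketch of the kind of work being measured, written against the public `UTF16View` API rather than the internal stdlib entry points the benchmarks actually drive; the sample text and offsets below are made up.

```swift
// Illustrative sketch only; the real benchmarks call internal stdlib entry
// points directly and use carefully sized random workloads.
let text = String(repeating: "Lörem ipsüm 🌍 dolor sit amet ", count: 100)
let utf16 = text.utf16

// UTF16ToIndex-style work: convert UTF-16 offsets into String indices.
let offsets = stride(from: 0, to: utf16.count, by: 13)
let indices = offsets.map { utf16.index(utf16.startIndex, offsetBy: $0) }

// IndexToUTF16-style work: convert the indices back into UTF-16 offsets.
let roundTripped = indices.map { utf16.distance(from: utf16.startIndex, to: $0) }
assert(Array(offsets) == roundTripped)

// MutatedUTF16ToIndex-style work: appending after each conversion forces the
// breadcrumbs to be recreated for the next conversion, which (per the PR
// description) is what the Mutated* benchmarks stress.
var copy = text
var sum = 0
for offset in stride(from: 0, to: 1_000, by: 97) {
  let i = copy.utf16.index(copy.utf16.startIndex, offsetBy: offset)
  sum += copy.utf16.distance(from: copy.utf16.startIndex, to: i)
  copy += "!"
}
print(indices.count, sum)
```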