Add strength reduction benchmarks #4317

jakobbotsch · 2024-07-17T19:48:22Z

This adds strength reduction benchmarks for arrays of a few different element sizes, motivated by the differences in codegen. The element size determines how the address of each element is computed. For x64, the current codegen for each element size (in bytes) looks like:

2: load
3: lea + load
4: load
8: load
12: lea + load
16: shl + load
29: imul + load
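
As a rough illustration of why the instruction sequences differ: x64 addressing modes only support index scales of 1, 2, 4 and 8, so other element sizes need an extra instruction to form the byte offset. The sketch below is written in C rather than the benchmarks' C#, the function names are hypothetical, and the decompositions in the comments are an assumption about how such offsets are typically formed, not a dump of the JIT's actual output.

```c
#include <stddef.h>

/* Size 8 (likewise 2 and 4): the scale fits the addressing mode,
   so the load can use [base + i*8] directly. */
size_t offset_s8(size_t i) { return i * 8; }

/* Size 3: assumed "lea t, [i + i*2]", then a load from [base + t]. */
size_t offset_s3(size_t i) { size_t t = i + i * 2; return t; }

/* Size 12: assumed "lea t, [i + i*2]", then a load from [base + t*4]. */
size_t offset_s12(size_t i) { size_t t = i + i * 2; return t * 4; }

/* Size 16: 16 is a power of two but not a valid scale, so shift first. */
size_t offset_s16(size_t i) { return i << 4; }

/* Size 29: no cheap decomposition, so a real multiply is needed. */
size_t offset_s29(size_t i) { return i * 29; }
```

Each function returns the same value as `i * size`; only the way the product is formed differs.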

Each size has three benchmark variants: an array version, a span version, and a manually written, fully strength-reduced version. The JIT is expected to be able to transform the array version into the strength-reduced version soon. The span version will also be transformed, but not quite all the way (the strength reduction will not be able to fold in the base byref of the span).
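
To make the difference between the indexed and the strength-reduced variants concrete, here is a minimal sketch in C (not the benchmarks' C#). The `S12` struct, its fields, and both function names are hypothetical; the real benchmark types are not shown in this PR text. The indexed loop recomputes `base + i * sizeof(S12)` on every iteration, while the strength-reduced loop replaces that multiply with a pointer that advances by `sizeof(S12)`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 12-byte element; the layout is an assumption. */
typedef struct { int32_t a, b, c; } S12;

/* Indexed form: the element address is derived from i each iteration. */
int32_t sum_indexed(const S12 *items, size_t count) {
    int32_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += items[i].a;
    return sum;
}

/* Strength-reduced form: a running pointer is bumped by sizeof(S12),
   so no per-iteration multiply (or lea/shl) is needed. */
int32_t sum_strength_reduced(const S12 *items, size_t count) {
    int32_t sum = 0;
    const S12 *end = items + count;
    for (const S12 *p = items; p != end; p++)
        sum += p->a;
    return sum;
}
```

An optimizing C compiler will usually perform this rewrite on its own, so the two functions are meant to show the shape of the transformation rather than to be benchmarked against each other.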

There is one current annoyance to work around in the JIT: we do not align the strength-reduced versions of the loops because they end up being "too small", meaning that they still fit within a single cache line. However, it turns out alignment is still beneficial in these cases, and this skews the results compared to the non-strength-reduced versions. I have opened dotnet/runtime#104665 about this. To work around the problem in these benchmarks, I have added a superfluous bitwise OR operation to the body of every loop.

On my Intel CPU the current results are:

| Method | Mean | Error | StdDev | Median | Min | Max | Ratio | RatioSD | Code Size | Allocated | Alloc Ratio |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SumS12Array | 4.813 us | 0.2538 us | 0.2923 us | 4.722 us | 4.445 us | 5.394 us | 1.00 | 0.08 | 73 B | - | NA |
| SumS12Span | 4.530 us | 0.1372 us | 0.1580 us | 4.467 us | 4.334 us | 4.844 us | 0.94 | 0.06 | 126 B | - | NA |
| SumS12ArrayStrengthReduced | 3.712 us | 0.0918 us | 0.1058 us | 3.653 us | 3.608 us | 3.913 us | 0.77 | 0.05 | 65 B | - | NA |
| SumS16Array | 4.557 us | 0.0569 us | 0.0532 us | 4.544 us | 4.482 us | 4.677 us | 1.00 | 0.02 | 73 B | - | NA |
| SumS16Span | 4.529 us | 0.0277 us | 0.0259 us | 4.532 us | 4.479 us | 4.572 us | 0.99 | 0.01 | 126 B | - | NA |
| SumS16ArrayStrengthReduced | 3.836 us | 0.0349 us | 0.0326 us | 3.828 us | 3.796 us | 3.920 us | 0.84 | 0.01 | 65 B | - | NA |
| SumS29Array | 5.260 us | 0.0623 us | 0.0583 us | 5.243 us | 5.181 us | 5.378 us | 1.00 | 0.02 | 84 B | - | NA |
| SumS29Span | 5.265 us | 0.0474 us | 0.0444 us | 5.258 us | 5.200 us | 5.349 us | 1.00 | 0.01 | 132 B | - | NA |
| SumS29ArrayStrengthReduced | 4.340 us | 0.0349 us | 0.0326 us | 4.339 us | 4.282 us | 4.384 us | 0.83 | 0.01 | 76 B | - | NA |
| SumS3Array | 4.324 us | 0.0318 us | 0.0298 us | 4.321 us | 4.287 us | 4.392 us | 1.00 | 0.01 | 74 B | - | NA |
| SumS3Span | 4.352 us | 0.0243 us | 0.0228 us | 4.360 us | 4.301 us | 4.382 us | 1.01 | 0.01 | 127 B | - | NA |
| SumS3ArrayStrengthReduced | 3.289 us | 0.0195 us | 0.0183 us | 3.287 us | 3.266 us | 3.339 us | 0.76 | 0.01 | 66 B | - | NA |
| SumS8Array | 3.794 us | 0.1145 us | 0.1319 us | 3.744 us | 3.691 us | 4.177 us | 1.00 | 0.05 | 69 B | - | NA |
| SumS8Span | 3.743 us | 0.0213 us | 0.0199 us | 3.738 us | 3.720 us | 3.805 us | 0.99 | 0.03 | 122 B | - | NA |
| SumS8ArrayStrengthReduced | 3.435 us | 0.0647 us | 0.0719 us | 3.425 us | 3.346 us | 3.713 us | 0.91 | 0.03 | 65 B | - | NA |
| SumIntsArray | 3.719 us | 0.1488 us | 0.1713 us | 3.631 us | 3.568 us | 4.032 us | 1.00 | 0.06 | 70 B | - | NA |
| SumIntsSpan | 3.621 us | 0.0188 us | 0.0176 us | 3.621 us | 3.589 us | 3.646 us | 0.98 | 0.04 | 121 B | - | NA |
| SumIntsArrayStrengthReduced | 3.447 us | 0.2373 us | 0.2733 us | 3.338 us | 3.286 us | 4.516 us | 0.93 | 0.08 | 65 B | - | NA |
| SumLongsArray | 3.785 us | 0.1116 us | 0.1285 us | 3.741 us | 3.669 us | 4.180 us | 1.00 | 0.05 | 69 B | - | NA |
| SumLongsSpan | 3.792 us | 0.1443 us | 0.1662 us | 3.695 us | 3.617 us | 4.123 us | 1.00 | 0.05 | 123 B | - | NA |
| SumLongsArrayStrengthReduced | 3.454 us | 0.0634 us | 0.0593 us | 3.443 us | 3.402 us | 3.656 us | 0.91 | 0.03 | 65 B | - | NA |
| SumShortsArray | 3.626 us | 0.0810 us | 0.0933 us | 3.619 us | 3.507 us | 3.869 us | 1.00 | 0.04 | 70 B | - | NA |
| SumShortsSpan | 3.586 us | 0.0416 us | 0.0389 us | 3.581 us | 3.526 us | 3.693 us | 0.99 | 0.03 | 122 B | - | NA |
| SumShortsArrayStrengthReduced | 3.317 us | 0.0541 us | 0.0506 us | 3.302 us | 3.263 us | 3.442 us | 0.92 | 0.03 | 66 B | - | NA |

jakobbotsch · 2024-07-17T19:53:15Z

cc @LoopedBard3 @DrewScoggins

jakobbotsch · 2024-07-18T10:42:36Z

I don't think the CI failures are related.

LoopedBard3 · 2024-07-18T17:03:38Z

Yes, the current CI failures do not seem to be related. Is this ready for review and merge?

jakobbotsch · 2024-07-18T17:06:32Z

> Yes, the current CI failures do not seem to be related. Is this ready for review and merge?

Yeah, this is ready.

LoopedBard3

Looks good to me, thank you for the benchmarks!

jakobbotsch added 3 commits July 17, 2024 21:55

Remove a couple of duplicative variants

a3cc1cd

Remove [MemoryRandomization] attribute

1ad3ab6

Delete struct declarations

2d58a55

jakobbotsch mentioned this pull request Jul 17, 2024

Improve JIT loop optimizations (.NET 9) dotnet/runtime#93144

Closed

21 tasks

build-analysis bot mentioned this pull request Jul 17, 2024

RestApiException`1: The response contained an invalid status code 500 Internal Server Error dotnet/dnceng#2298

Open

3 tasks

LoopedBard3 approved these changes Jul 18, 2024

LoopedBard3 merged commit 49f03f7 into dotnet:main Jul 18, 2024
45 of 63 checks passed

jakobbotsch deleted the strength-reduction-benchmarks branch July 18, 2024 17:26
