[ARM64] Possible perf regression: slicing #41704

@adamsitnik

After running benchmarks for 3.1 vs 5.0 on the "Ubuntu arm64 Qualcomm Machines" owned by the JIT Team, I found a few regressions related to slicing.

These look like ARM64-specific regressions; I was not able to reproduce them on ARM (the 32-bit variant).

Repro

git clone https://github.com/dotnet/performance.git
py ./performance/scripts/benchmarks_ci.py -f netcoreapp3.1 netcoreapp5.0 --architecture arm64 --filter 'System.Memory.Slice*'
BenchmarkDotNet=v0.12.1.1405-nightly, OS=ubuntu 16.04
Unknown processor
  [Host]     : .NET Core 3.1.8 (CoreCLR 4.700.20.41105, CoreFX 4.700.20.41903), Arm64 RyuJIT
  Job-PVNQZA : .NET Core 3.1.8 (CoreCLR 4.700.20.41105, CoreFX 4.700.20.41903), Arm64 RyuJIT
  Job-PXIHWO : .NET Core 5.0.0 (CoreCLR 5.0.20.41714, CoreFX 5.0.20.41714), Arm64 RyuJIT
| Type | Method | Toolchain | Mean | Ratio |
|---|---|---|---:|---:|
| Slice<Byte> | SpanStart | netcoreapp3.1 | 3.831 ns | 1.00 |
| Slice<Byte> | SpanStart | netcoreapp5.0 | 2.550 ns | 0.67 |
| Slice<String> | SpanStart | netcoreapp3.1 | 11.526 ns | 1.00 |
| Slice<String> | SpanStart | netcoreapp5.0 | 16.482 ns | 1.43 |
| Slice<Byte> | SpanStartLength | netcoreapp3.1 | 3.782 ns | 1.00 |
| Slice<Byte> | SpanStartLength | netcoreapp5.0 | 3.202 ns | 0.85 |
| Slice<String> | SpanStartLength | netcoreapp3.1 | 11.720 ns | 1.00 |
| Slice<String> | SpanStartLength | netcoreapp5.0 | 16.823 ns | 1.44 |
| Slice<Byte> | ReadOnlySpanStart | netcoreapp3.1 | 3.801 ns | 1.00 |
| Slice<Byte> | ReadOnlySpanStart | netcoreapp5.0 | 2.867 ns | 0.75 |
| Slice<String> | ReadOnlySpanStart | netcoreapp3.1 | 9.144 ns | 1.00 |
| Slice<String> | ReadOnlySpanStart | netcoreapp5.0 | 16.039 ns | 1.75 |
| Slice<Byte> | ReadOnlySpanStartLength | netcoreapp3.1 | 3.779 ns | 1.00 |
| Slice<Byte> | ReadOnlySpanStartLength | netcoreapp5.0 | 3.156 ns | 0.83 |
| Slice<String> | ReadOnlySpanStartLength | netcoreapp3.1 | 9.279 ns | 1.00 |
| Slice<String> | ReadOnlySpanStartLength | netcoreapp5.0 | 16.418 ns | 1.77 |
| Slice<Byte> | MemoryStart | netcoreapp3.1 | 3.779 ns | 1.00 |
| Slice<Byte> | MemoryStart | netcoreapp5.0 | 6.418 ns | 1.70 |
| Slice<String> | MemoryStart | netcoreapp3.1 | 12.952 ns | 1.00 |
| Slice<String> | MemoryStart | netcoreapp5.0 | 25.550 ns | 1.97 |
| Slice<Byte> | MemoryStartSpan | netcoreapp3.1 | 6.416 ns | 1.00 |
| Slice<Byte> | MemoryStartSpan | netcoreapp5.0 | 10.515 ns | 1.64 |
| Slice<String> | MemoryStartSpan | netcoreapp3.1 | 24.265 ns | 1.00 |
| Slice<String> | MemoryStartSpan | netcoreapp5.0 | 33.036 ns | 1.36 |
| Slice<Byte> | MemoryStartLength | netcoreapp3.1 | 3.805 ns | 1.00 |
| Slice<Byte> | MemoryStartLength | netcoreapp5.0 | 5.899 ns | 1.55 |
| Slice<String> | MemoryStartLength | netcoreapp3.1 | 12.285 ns | 1.00 |
| Slice<String> | MemoryStartLength | netcoreapp5.0 | 18.409 ns | 1.50 |
| Slice<Byte> | MemoryStartLengthSpan | netcoreapp3.1 | 6.245 ns | 1.00 |
| Slice<Byte> | MemoryStartLengthSpan | netcoreapp5.0 | 9.975 ns | 1.60 |
| Slice<String> | MemoryStartLengthSpan | netcoreapp3.1 | 23.963 ns | 1.00 |
| Slice<String> | MemoryStartLengthSpan | netcoreapp5.0 | 31.040 ns | 1.30 |
| Slice<Byte> | ReadOnlyMemoryStart | netcoreapp3.1 | 3.807 ns | 1.00 |
| Slice<Byte> | ReadOnlyMemoryStart | netcoreapp5.0 | 6.394 ns | 1.68 |
| Slice<String> | ReadOnlyMemoryStart | netcoreapp3.1 | 9.371 ns | 1.00 |
| Slice<String> | ReadOnlyMemoryStart | netcoreapp5.0 | 22.140 ns | 2.36 |
| Slice<Byte> | ReadOnlyMemoryStartSpan | netcoreapp3.1 | 6.379 ns | 1.00 |
| Slice<Byte> | ReadOnlyMemoryStartSpan | netcoreapp5.0 | 9.150 ns | 1.43 |
| Slice<String> | ReadOnlyMemoryStartSpan | netcoreapp3.1 | 22.247 ns | 1.00 |
| Slice<String> | ReadOnlyMemoryStartSpan | netcoreapp5.0 | 32.229 ns | 1.45 |
| Slice<Byte> | ReadOnlyMemoryStartLength | netcoreapp3.1 | 3.733 ns | 1.00 |
| Slice<Byte> | ReadOnlyMemoryStartLength | netcoreapp5.0 | 6.005 ns | 1.61 |
| Slice<String> | ReadOnlyMemoryStartLength | netcoreapp3.1 | 8.833 ns | 1.00 |
| Slice<String> | ReadOnlyMemoryStartLength | netcoreapp5.0 | 16.008 ns | 1.81 |
| Slice<Byte> | ReadOnlyMemoryStartLengthSpan | netcoreapp3.1 | 6.087 ns | 1.00 |
| Slice<Byte> | ReadOnlyMemoryStartLengthSpan | netcoreapp5.0 | 9.430 ns | 1.55 |
| Slice<String> | ReadOnlyMemoryStartLengthSpan | netcoreapp3.1 | 22.489 ns | 1.00 |
| Slice<String> | ReadOnlyMemoryStartLengthSpan | netcoreapp5.0 | 29.502 ns | 1.31 |
| Slice<Byte> | MemorySpanStart | netcoreapp3.1 | 8.985 ns | 1.00 |
| Slice<Byte> | MemorySpanStart | netcoreapp5.0 | 11.703 ns | 1.31 |
| Slice<String> | MemorySpanStart | netcoreapp3.1 | 23.013 ns | 1.00 |
| Slice<String> | MemorySpanStart | netcoreapp5.0 | 23.544 ns | 1.02 |
| Slice<Byte> | MemorySpanStartLength | netcoreapp3.1 | 8.289 ns | 1.00 |
| Slice<Byte> | MemorySpanStartLength | netcoreapp5.0 | 9.989 ns | 1.21 |
| Slice<String> | MemorySpanStartLength | netcoreapp3.1 | 23.611 ns | 1.00 |
| Slice<String> | MemorySpanStartLength | netcoreapp5.0 | 23.401 ns | 0.99 |
| Slice<Byte> | ReadOnlyMemorySpanStart | netcoreapp3.1 | 8.519 ns | 1.00 |
| Slice<Byte> | ReadOnlyMemorySpanStart | netcoreapp5.0 | 11.698 ns | 1.37 |
| Slice<String> | ReadOnlyMemorySpanStart | netcoreapp3.1 | 19.716 ns | 1.00 |
| Slice<String> | ReadOnlyMemorySpanStart | netcoreapp5.0 | 22.038 ns | 1.12 |
| Slice<Byte> | ReadOnlyMemorySpanStartLength | netcoreapp3.1 | 6.770 ns | 1.00 |
| Slice<Byte> | ReadOnlyMemorySpanStartLength | netcoreapp5.0 | 10.912 ns | 1.61 |
| Slice<String> | ReadOnlyMemorySpanStartLength | netcoreapp3.1 | 21.624 ns | 1.00 |
| Slice<String> | ReadOnlyMemorySpanStartLength | netcoreapp5.0 | 22.292 ns | 1.03 |
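
For anyone triaging these numbers: the Ratio column tracks the netcoreapp5.0 mean divided by the netcoreapp3.1 mean for the same Type/Method pair, so values above 1.00 are slowdowns. Below is a minimal sketch of how the regressed rows could be picked out. It uses only a handful of rows copied from the results above; the `find_regressions` helper and the 1.10 threshold are assumptions for illustration, not part of the dotnet/performance tooling.

```python
# Illustrative only: a few rows copied from the results above,
# as (type, method, toolchain, mean in ns).
ROWS = [
    ("Slice<Byte>", "SpanStart", "netcoreapp3.1", 3.831),
    ("Slice<Byte>", "SpanStart", "netcoreapp5.0", 2.550),
    ("Slice<String>", "SpanStart", "netcoreapp3.1", 11.526),
    ("Slice<String>", "SpanStart", "netcoreapp5.0", 16.482),
    ("Slice<Byte>", "MemoryStart", "netcoreapp3.1", 3.779),
    ("Slice<Byte>", "MemoryStart", "netcoreapp5.0", 6.418),
    ("Slice<String>", "ReadOnlyMemoryStart", "netcoreapp3.1", 9.371),
    ("Slice<String>", "ReadOnlyMemoryStart", "netcoreapp5.0", 22.140),
]


def find_regressions(rows, baseline="netcoreapp3.1",
                     candidate="netcoreapp5.0", threshold=1.10):
    """Return (type, method, ratio) for pairs slower than `threshold`.

    `threshold` is a hypothetical cut-off chosen for this sketch.
    """
    # Index the means by (type, method, toolchain) for direct lookup.
    means = {(t, m, tc): mean for t, m, tc, mean in rows}
    regressions = []
    for (t, m, tc), base_mean in means.items():
        if tc != baseline:
            continue
        cand_mean = means.get((t, m, candidate))
        if cand_mean is None:
            continue  # no matching candidate measurement
        ratio = cand_mean / base_mean
        if ratio > threshold:
            regressions.append((t, m, round(ratio, 2)))
    # Worst regression first.
    return sorted(regressions, key=lambda r: r[2], reverse=True)


if __name__ == "__main__":
    for type_name, method, ratio in find_regressions(ROWS):
        print(f"{type_name:15} {method:22} {ratio:.2f}")
```

On these sample rows the `Slice<Byte>` `SpanStart` case is filtered out (it got faster), while the `Memory`-based and `Slice<String>` cases remain.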

@kunalspathak, could you take a look at the generated assembly code and verify whether this is an actual codegen regression?

category:cq
theme:ssa
skill-level:expert
cost:large

Labels

arch-arm64
area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI)
tenet-performance (Performance related issue)
