Skip to content

Conversation

@tkp1n
Copy link
Contributor

@tkp1n tkp1n commented Jan 21, 2021

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following matches the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

In the context of #1476 I looked for low hanging fruit on the hot path of the JPEG encoding pipeline.
Using a benchmark (encoding a 4K image) with the ETW Profiler attached, I've noticed that Block8x8F.Scale16X16To8X8 is responsible for 9.8% of the total time taken. I've noticed that the method is the optimal candidate for vectorization (AVX2) and went for it.

  • The benchmarks below show a 12x improvement in Block8x8F.Scale16X16To8X8 🚀
  • This reduces the total encoding time of my sample image by ~9% from 475.9 ms to 434.1 ms

Benchmarks

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.19042
Intel Core i7-9750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=5.0.102
  [Host] : .NET Core 3.1.11 (CoreCLR 4.700.20.56602, CoreFX 4.700.20.56604), X64 RyuJIT
  main   : .NET Core 3.1.11 (CoreCLR 4.700.20.56602, CoreFX 4.700.20.56604), X64 RyuJIT
  PR     : .NET Core 3.1.11 (CoreCLR 4.700.20.56602, CoreFX 4.700.20.56604), X64 RyuJIT

Runtime=.NET Core 3.1  
Method Job Mean Error StdDev Ratio Gen 0 Gen 1 Gen 2 Allocated Allocated native memory Native memory leak
Scale16X16To8X8 main 189.83 ns 2.875 ns 2.689 ns 1.00 - - - - - -
Scale16X16To8X8 PR 14.97 ns 0.179 ns 0.159 ns 0.08 - - - - - -

@codecov
Copy link

codecov bot commented Jan 21, 2021

Codecov Report

Merging #1517 (4fb1859) into master (7eb5cc0) will decrease coverage by 0.02%.
The diff coverage is 89.65%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1517      +/-   ##
==========================================
- Coverage   83.53%   83.50%   -0.03%     
==========================================
  Files         742      742              
  Lines       32772    32801      +29     
  Branches     3669     3671       +2     
==========================================
+ Hits        27375    27392      +17     
- Misses       4680     4691      +11     
- Partials      717      718       +1     
Flag Coverage Δ
unittests 83.50% <89.65%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...rc/ImageSharp/Formats/Jpeg/Components/Block8x8F.cs 90.16% <89.28%> (-2.27%) ⬇️
...mageSharp/Common/Helpers/SimdUtils.HwIntrinsics.cs 99.56% <100.00%> (+<0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7eb5cc0...4fb1859. Read the comment docs.

@JimBobSquarePants
Copy link
Member

Ah fantastic @tkp1n !!

I had a feeling this might be a bottle neck... I wonder, do you have screenshots of your trace we could use to identify further points of interest?

Vector256<float> c = in2;
Vector256<float> d = Unsafe.Add(ref in2, 1);

Vector256<float> calc1 = Avx.Shuffle(a, c, 0b10_00_10_00);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish there was a way to produce constants via SimdUtils.Shuffle.MmShuffle() in a way that could be inlined.

Copy link
Member

@JimBobSquarePants JimBobSquarePants left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

12X... Can't really ask for more than that. Fantastic stuff!

@JimBobSquarePants JimBobSquarePants merged commit 954d233 into SixLabors:master Jan 21, 2021
@antonfirsov
Copy link
Member

antonfirsov commented Jan 21, 2021

9% from such a little addition, very nice!

@tkp1n tkp1n deleted the feature/vectorize-scale-to-dst-block branch January 22, 2021 14:55
@tkp1n tkp1n mentioned this pull request Feb 14, 2021
4 tasks
JimBobSquarePants added a commit that referenced this pull request Mar 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants