Skip to content

Conversation

@brianpopow
Copy link
Collaborator

@brianpopow brianpopow commented Nov 12, 2021

Prerequisites

  • I have written a descriptive pull-request title
  • I have verified that there are no overlapping pull-requests open
  • I have verified that I am following the existing coding patterns and practice as demonstrated in the repository. These follow strict Stylecop rules 👮.
  • I have provided test coverage for my change (where applicable)

Description

This PR adds a AVX version of CollectColorBlueTransforms and CollectColorRedTransforms, which is used during encoding of lossless webp images.
Related to: #1786

TODO:
- There are still 2 failing tests. Fixed by @JimBobSquarePants with 55f04f6

Profiling results with test image: Images\Input\Webp\earth_lossless.webp :

Master

CollectColor

PR

ColectColorAvx

@brianpopow
Copy link
Collaborator Author

brianpopow commented Nov 13, 2021

The AVX version does not produce the same results as the SSE41 version. Can't tell why that is so far, both versions look the same.

edit: I am guessing somethings wrong with the shuffle masks.

@JimBobSquarePants
Copy link
Member

@brianpopow I’ll try and have a look later. Are the masks a byte? If so there’s a method in the source MMShuffle you could use to compare to.

@brianpopow
Copy link
Collaborator Author

brianpopow commented Nov 14, 2021

@brianpopow I’ll try and have a look later. Are the masks a byte? If so there’s a method in the source MMShuffle you could use to compare to.

The masks are explicitly defined following the pattern from SSE41: Mask

Maybe it was naive from me to assume, I could just use the same pattern as in SSE41. This time there is no reference in libwebp, I could compare it to.

I was testing it with simple test pattern images, like only red, only blue, only green images and it seems to work, but not with more complex examples: See the failing ColorSpaceTransform_WithBikeImage_ProducesExpectedData

@brianpopow brianpopow closed this Nov 14, 2021
@brianpopow brianpopow reopened this Nov 14, 2021
@antonfirsov
Copy link
Member

@brianpopow my recommendation is to extract the AVX, the SSE and the scalar path to separate unit-testable methods and debug them side-by side to spot the issue.

@JimBobSquarePants
Copy link
Member

Are you sure your Sse version is correct?

image

@brianpopow
Copy link
Collaborator Author

Are you sure your Sse version is correct?

I am pretty sure that the SSE version produces the same results as the none vectorized version CollectColorBlueTransformsNoneVectorized, yes.

Here is an example with all pixels red (with alpha at 255):
red_example

uint[] bgrRed =
{
    4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760,
    4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760,
    4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760,
    4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760, 4294901760,
    4294901760
};

@JimBobSquarePants
Copy link
Member

JimBobSquarePants commented Nov 14, 2021

Ah right, my bad. My input was in byte ranges. Looks to me like you're not crossing the boundaries properly.

CollectColorBlueTransformsShuffleLowMask256 and CollectColorBlueTransformsShuffleHighMask256 should be something like.

private static readonly Vector256<byte> CollectColorBlueTransformsShuffleLowMask256 = Vector256.Create(255, 2, 255, 6, 255, 10, 255, 14, 255, 255, 255, 255, 255, 255, 255, 255, 255, 18, 255, 22, 255, 26, 255, 30, 255, 255, 255, 255, 255, 255, 255, 255);
private static readonly Vector256<byte> CollectColorBlueTransformsShuffleHighMask256 = Vector256.Create(255, 255, 255, 255, 255, 255, 255, 255, 255, 2, 255, 6, 255, 10, 255, 14, 255, 255, 255, 255, 255, 255, 255, 255, 255, 18, 255, 22, 255, 26, 255, 30);

image

@codecov
Copy link

codecov bot commented Nov 15, 2021

Codecov Report

Merging #1824 (c491cbb) into master (69c30f8) will decrease coverage by 0%.
The diff coverage is 97%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #1824    +/-   ##
=======================================
- Coverage      87%     87%    -1%     
=======================================
  Files         936     937     +1     
  Lines       48320   48385    +65     
  Branches     6038    6048    +10     
=======================================
- Hits        42225   42185    -40     
- Misses       5094    5194   +100     
- Partials     1001    1006     +5     
Flag Coverage Δ
unittests 87% <97%> (-1%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
.../ImageSharp/Formats/Webp/Lossless/LosslessUtils.cs 88% <ø> (-9%) ⬇️
.../Formats/Webp/Lossless/ColorSpaceTransformUtils.cs 97% <97%> (ø)
...ageSharp/Formats/Webp/Lossless/PredictorEncoder.cs 87% <100%> (-11%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 69c30f8...c491cbb. Read the comment docs.

@JimBobSquarePants
Copy link
Member

@brianpopow I updated the masks for you and everything is passing now. I won't approve/merge yet until you're happy that you've covered everything you need to.

@brianpopow
Copy link
Collaborator Author

@brianpopow I updated the masks for you and everything is passing now. I won't approve/merge yet until you're happy that you've covered everything you need to.

Awesome, thanks for your help!

I think Anton is right. The color transform methods are not easily unit testable. I want to do a favor to "future me" and separate them into a own class.

@brianpopow brianpopow changed the title WIP: Add AVX version of CollectColorBlueTransforms and CollectColorRedTransforms Add AVX version of CollectColorBlueTransforms and CollectColorRedTransforms Nov 15, 2021
@brianpopow
Copy link
Collaborator Author

I think this is now ready for a final review

brianpopow and others added 2 commits November 15, 2021 19:21
This was referenced Jul 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants