Refactor AVX2 blitters using macros #2055
Closed
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This moves code repetition out of the blitter functions themselves into the macros developed for running the new AVX2 alpha blitters. Also adjusted a couple set_ instructions (like
_mm256_set_epi16
) to use set1_ variants instead so the value can just be provided once.Makes simd_blitters_avx2 800 lines shorter.
I believe this has a small performance regression, which could be solved by rewiring the
RUN_AVX2_BLITTER
implementation strategy. Maybe we've got to put the DUFF loops back in there?