Update Parallel Suffix #103

Merged
merged 1 commit into from
Sep 15, 2024

Conversation

jamierpond
Collaborator

@jamierpond jamierpond commented Sep 15, 2024

----------------------------------------------------------------------------------------------
Benchmark                                                    Time             CPU   Iterations
----------------------------------------------------------------------------------------------
runCompressions<4, UseSWAR>                               1318 ns         1291 ns       541838
runCompressions<4, UseOldParallelSuffix>                  7044 ns         6934 ns       100607
runCompressions<4, CompareNewAndOldParallelSuffix>        8651 ns         8514 ns        82220
runCompressions<8, UseSWAR>                               7271 ns         7149 ns        96997
runCompressions<8, UseOldParallelSuffix>                 15037 ns        14796 ns        47364
runCompressions<8, CompareNewAndOldParallelSuffix>       20979 ns        20657 ns        33887
runCompressions<16, UseSWAR>                             12322 ns        12126 ns        57702
runCompressions<16, UseOldParallelSuffix>                27133 ns        26641 ns        26617
runCompressions<16, CompareNewAndOldParallelSuffix>      37861 ns        37270 ns        18767
runCompressions<32, UseSWAR>                             19148 ns        18784 ns        37240
runCompressions<32, UseOldParallelSuffix>                44419 ns        43650 ns        15744
runCompressions<32, CompareNewAndOldParallelSuffix>      62413 ns        61247 ns        11397
runCompressions<64, UseSWAR>                             26320 ns        25932 ns        27007
runCompressions<64, UseOldParallelSuffix>                67408 ns        66484 ns        10566
runCompressions<64, CompareNewAndOldParallelSuffix>      91202 ns        89833 ns         7829

@thecppzoo
Owner

Nice!
You doubled the speed of the SWAR PEXT.

@thecppzoo thecppzoo merged commit 5dd2758 into master Sep 15, 2024
2 checks passed
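
For readers who want to check the "doubled" remark against the numbers, here is a minimal sketch that computes the ratio of UseOldParallelSuffix to UseSWAR CPU time for each row of the table above. The `Row` struct, `kRows`, `speedup`, and `printSpeedups` are hypothetical names introduced only for this illustration and are not part of the repository. The meaning of the first template argument of `runCompressions` is not stated in the thread, so it is carried along as an opaque label.

```cpp
#include <array>
#include <cstdio>

// Hypothetical helper types for illustration; not part of the project.
struct Row {
    int param;      // first template argument of runCompressions (meaning not stated in the PR)
    double swarNs;  // CPU time of runCompressions<param, UseSWAR>, in ns
    double oldNs;   // CPU time of runCompressions<param, UseOldParallelSuffix>, in ns
};

// CPU column values copied verbatim from the benchmark output above.
constexpr std::array<Row, 5> kRows{{
    {4, 1291.0, 6934.0},
    {8, 7149.0, 14796.0},
    {16, 12126.0, 26641.0},
    {32, 18784.0, 43650.0},
    {64, 25932.0, 66484.0},
}};

// Ratio of old CPU time to new CPU time: values above 1 mean UseSWAR is faster.
constexpr double speedup(const Row &r) { return r.oldNs / r.swarNs; }

// Prints one line per benchmark parameter with the computed ratio.
inline void printSpeedups() {
    for (const auto &r : kRows) {
        std::printf("runCompressions<%d>: %.2fx\n", r.param, speedup(r));
    }
}
```

Calling `printSpeedups()` from a `main` lists the ratios. By these figures every row is above 2x, with the largest gap at the parameter value 4.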