Optimize find_nnz() using VBMI2 #6186

Closed
wants to merge 1 commit into from

mstembera (Contributor) commented Jul 27, 2025

I get about a 0.7% speedup for the x86-64-avx512icl ARCH. Hopefully someone else can also bench.

No functional change
bench: 2902492
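
For context, a find_nnz-style routine scans a block of values and collects the indices of the nonzero entries. The sketch below is a portable scalar reference under stated assumptions, not the code from this PR: the name `find_nnz_scalar`, the 32-bit element type, and the 16-bit index type are all hypothetical. AVX-512 VBMI2 adds byte and word compress instructions (VPCOMPRESSB/VPCOMPRESSW) that can pack selected lanes contiguously, which is the kind of operation such a loop could be vectorized with; whether the PR uses them in exactly this way is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference, NOT the implementation from this PR.
// Treats `input` as `count` 32-bit words and returns the indices of the
// nonzero words in ascending order.
std::vector<std::uint16_t> find_nnz_scalar(const std::int32_t* input, std::size_t count) {
    std::vector<std::uint16_t> indices;
    indices.reserve(count);  // worst case: every word is nonzero
    for (std::size_t i = 0; i < count; ++i) {
        if (input[i] != 0) {
            indices.push_back(static_cast<std::uint16_t>(i));
        }
    }
    return indices;
}
```

A vectorized variant would be expected to return the same index list as this reference for any input, which is consistent with the "No functional change" note and the unchanged bench signature.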

github-actions bot commented Jul 27, 2025

clang-format 20 needs to be run on this PR.
If you do not have clang-format installed, the maintainer will run it when merging.
For the exact version please see https://packages.ubuntu.com/plucky/clang-format-20.

(execution 16631793868 / attempt 1)

@mstembera mstembera force-pushed the nnzICL branch 2 times, most recently from 7819a4c to 6e92051 Compare July 28, 2025 18:34
No functional change
bench: 2902492
Labels: 🚀 gainer; to be merged (Will be merged shortly)
4 participants