-
Notifications
You must be signed in to change notification settings - Fork 100
AVX512 diff algorithm #131
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Not sure if it's my commit that broke windows build, but it reproduces for me locally and is fixed by 8adfb60. |
|
It didn't help on the CI, same error 🤷 |
|
Yes CI is clogged by unrelated reasons. I'll need to fix it first |
|
i am like 90% certain that this is just an issue with using x86_64-windows-gnu (mingw) for compilation while the vcpkg lib is compiled with msvc rebasing this on to #126 should work without any issues (my build zig is very different than the current head tho so i expect some nasty conflicts (sorry 😅) |
|
I did a rebase and also added avx enabled CI run. |
|
niceee!! btw can we do some sort of runtime detection here? like the primary way odiff will be used is from npm so most people will not build from source. adding a runtime flag that toggles this would probably be a better idea? @dmtrKovalenko what do you think?? |
|
Yeah I think it probably should be runtime enabled or once I test just become a default. Something like --enable-asm flag |
|
I changed the flag to be runtime flag. |
|
I approve of this pr looks based |
dmtrKovalenko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, tested it out on the real avx512 machine today seems like everything is working just fine and gives the actual perf benefit.
Committed a little change to not log the unsupported OS warning because it is obsolete.
Introduce diff algorithm in pure AVX512. A new build flag
avx512_diffis introduced that controls whether the new algorithm should be used.A new test file
test_avx.zigis added, because no existing tests were hitting the AVX diff case (default options, without output image). These tests are simply copies of other tests with changed options.Note that the new algorithm operates on floats, not ints.
Performance
Testing this on my local a few-years-old laptop. This machine seem to have 128 bits SSE units, but it claims AVX512 support.
The results here are not very optimistic. On this machine, the new algorithm as actually slower than normal zig version. The actual results will very likely vary greatly from CPU to CPU, depending on the AVX support.
Some of this difference might be due to operating on floats rather than ints.