-
Notifications
You must be signed in to change notification settings - Fork 711
Added AVX512 support for space_l2 and space_ip. #339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks you so much for the PR!
It seems like there is #endif missing. Can you please check?
Also, I wonder if you've tested it with sustained multithreaded load? I heard AVX512 can cause cloud instances to overheat and throttle, so they can even perform worse.
I've tried to run the code and noticed that |
…le time. Check for the __AVX512F__ define. All the AVX512 instructions that we are using are part of the Foundation set.
…, using PORTABLE_ALIGN32. However, we need to align it to 64 bytes becase of the use of unaligned store instruction _mm512_store_ps .
Hi - Sorry, this was a mistake on my part. For some reason, I did not realize that we are checking for the presence of AVX/AVX2/etc at compile time, and I assumed that the application using hnswlib is setting the defines. In highsight, that does not make sense... I fixed it now. We are now checking for AVX512F (which is the foundational set of AVX512 instructions.). All the instructions that we are using are part of AVX512F In addition, I fixed a bug which was causing a crash on some system due to not aligning the memory correctly for the store instruction (_mm512_store_ps). This instruction requires memory to be aligned to 64 bytes. |
hnswlib/space_l2.h
Outdated
|
||
// Favor using AVX512 if available. | ||
static float | ||
L2SqrSIMD16Ext(float *pVect1, float *pVect2, size_t qty) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for a long response. It seems float
type here obstructs the compilation.
It should be void *
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for a long response. It seems
float
type here obstructs the compilation. It should bevoid *
Thanks for catching this. I have fixed it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried and there other errors with constant void *
to void *
conversion (I guess it should be made const void * down the stack).
Can you please fix it and test it with pip install .
and python -m unittest discover --start-directory python_bindings/tests --pattern "*_test*.py"
(to run the tests) on a linux system.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the help! I fixed those errors. I was able to compile it and test it without errors following your instructions.
Thanks for the PR!! |
Add AVX512 support to L2 and IP algorithms.
Testing with Ann-Benchmarks shows measurable improvement on c5.4xlarge and m6i.4xlarge instances.
c5.4xlarge results




fashion-mnist-784-euclidean
mnist-784-euclidean
nytimes-256-angular
sift-128-euclidean
m6i.4xlarge results:




fashion-mnist-784-euclidean
mnist-784-euclidean
nytimes-256-angular
sift-128-euclidean