-
Notifications
You must be signed in to change notification settings - Fork 114
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Faiss SQFP16 Quantization and enable AVX2 Optimization #1421
Add support for Faiss SQFP16 Quantization and enable AVX2 Optimization #1421
Conversation
f6654db
to
4ef2548
Compare
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #1421 +/- ##
============================================
+ Coverage 84.90% 85.02% +0.11%
- Complexity 1273 1277 +4
============================================
Files 167 167
Lines 5186 5207 +21
Branches 491 493 +2
============================================
+ Hits 4403 4427 +24
+ Misses 574 572 -2
+ Partials 209 208 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enabling Faiss AVX2 SIMD optimization, which helps to boost the overall performance on x86 architecture (reduce the query latencies and improve indexing throughput but the vector dimension must be multiple of 8).
One concern I have with this change is that this could break bwc for users running machines without avx2. What is the plan for handling this?
Update the Faiss submodule.
What did we update the submodule to and why? Is this documented somewhere? I see there is mention of temporary patches - is there a comment on GH explaining this?
223bd84
to
e29fc2b
Compare
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
e29fc2b
to
45e4091
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice job. We should add a validation that with the index description we pass, we correctly create an SQ index (https://github.com/opensearch-project/k-NN/tree/main/jni/tests).
Also, its nice we didnt have to modify our jni wrapper at all.
jni/CMakeLists.txt
Outdated
@@ -111,7 +111,11 @@ endif () | |||
if (${CONFIG_FAISS} STREQUAL ON OR ${CONFIG_ALL} STREQUAL ON OR ${CONFIG_TEST} STREQUAL ON) | |||
set(BUILD_TESTING OFF) # Avoid building faiss tests | |||
set(BLA_STATIC ON) # Statically link BLAS | |||
set(FAISS_OPT_LEVEL generic) # Keep optimization level generic | |||
if(WIN32 OR ${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64" OR NOT ${ENABLE_SIMD}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wrap some variables in ${} and not WIN32? Lets be consistent here. Also, can you confirm that WIN32 will refer to the var and not a constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated the variable. WIN32
is a variable which is True on a Windows machine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right but why doesnt it need to be wrapped in ${} like ${WIN32}?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure. I took this reference, tried and it worked
https://stackoverflow.com/a/17845462
61d08c4
to
f793442
Compare
f793442
to
a39069c
Compare
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
a39069c
to
8e4f404
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM nice job!
5f9511b
into
opensearch-project:main
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-1421-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5f9511b6407691543b1ea0bedd30f3a11f076236
# Push it to GitHub
git push --set-upstream origin backport/backport-1421-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x Then, create a pull request where the |
opensearch-project#1421) * Add Support for Faiss SQFP16 and enable Faiss AVX2 Optimization Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Add Patch Script to fix build on Linux CI Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Disable AVX2 support on Windows Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Add CHANGELOG Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Update Faiss Submodule Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Address Review Comments Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Update UX Interface Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Add Parameter to enable SIMD Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Update DEVELOPER_GUIDE Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Address Review Comments Signed-off-by: Naveen Tatikonda <navtat@amazon.com> --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
opensearch-project#1421) * Add Support for Faiss SQFP16 and enable Faiss AVX2 Optimization Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Add Patch Script to fix build on Linux CI Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Disable AVX2 support on Windows Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Add CHANGELOG Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Update Faiss Submodule Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Address Review Comments Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Update UX Interface Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Add Parameter to enable SIMD Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Update DEVELOPER_GUIDE Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Address Review Comments Signed-off-by: Naveen Tatikonda <navtat@amazon.com> --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
#1421) (#1448) * Add Support for Faiss SQFP16 and enable Faiss AVX2 Optimization * Add Patch Script to fix build on Linux CI * Disable AVX2 support on Windows * Add CHANGELOG * Update Faiss Submodule * Address Review Comments * Update UX Interface * Add Parameter to enable SIMD * Update DEVELOPER_GUIDE * Address Review Comments --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
…imization (opensearch-project#1421) (opensearch-project#1448)" This reverts commit 0e17711. Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
* Revert "Set Default value for SIMD_ENABLED in CMakeList (#1455)" This reverts commit fc2c154. Signed-off-by: Naveen Tatikonda <navtat@amazon.com> * Revert "Add support for Faiss SQFP16 Quantization and enable AVX2 Optimization (#1421) (#1448)" This reverts commit 0e17711. Signed-off-by: Naveen Tatikonda <navtat@amazon.com> --------- Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Description
This PR includes the following changes:
sq
of typefp16
to the Faiss engine which quantizes 32 bit float vectors into 16 bit float values for bothhnsw
andivf
methods using the Faiss SQFP16 Scalar Quantizer which optimizes memory upto 48% with a very minimal loss of precision. The UX is shown below:8
).FAISS_ALWAYS_INLINE
directives which improves the performance of Scalar Quantizer. Soon, the submodule will be updated to include the SIMD NEON changes after fixing the conflict issues wrt to Multi vector.8
) to Scalar Quantizer for ARM architectureIssues Resolved
#1138
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.