Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Faiss SQFP16 Quantization and enable AVX2 Optimization #1421

Merged

Conversation

naveentatikonda
Copy link
Member

@naveentatikonda naveentatikonda commented Jan 25, 2024

Description

This PR includes the following changes:

  • Adds a new encoder sq of type fp16 to the Faiss engine which quantizes 32 bit float vectors into 16 bit float values for both hnsw and ivf methods using the Faiss SQFP16 Scalar Quantizer which optimizes memory upto 48% with a very minimal loss of precision. The UX is shown below:
    "encoder":{
      "name":"sq",
      "parameters":{
        "type": "fp16"
      }
    }
  • Enabling Faiss AVX2 SIMD optimization, which helps to boost the overall performance on x86 architecture (reduce the query latencies and improve indexing throughput but the vector dimension must be multiple of 8).
  • Added a patch script to fix CI build on Linux.
  • Update the Faiss submodule to the commit 0013c70 to include the FAISS_ALWAYS_INLINE directives which improves the performance of Scalar Quantizer. Soon, the submodule will be updated to include the SIMD NEON changes after fixing the conflict issues wrt to Multi vector.
  • Added a temporary patch file which includes the Faiss SIMD NEON optimization (vector dimension must be multiple of 8) to Scalar Quantizer for ARM architecture

Issues Resolved

#1138

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link

codecov bot commented Jan 25, 2024

Codecov Report

Attention: 4 lines in your changes are missing coverage. Please review.

Comparison is base (fe592f5) 84.90% compared to head (8e4f404) 85.02%.
Report is 1 commits behind head on main.

Files Patch % Lines
.../main/java/org/opensearch/knn/index/Parameter.java 69.23% 3 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1421      +/-   ##
============================================
+ Coverage     84.90%   85.02%   +0.11%     
- Complexity     1273     1277       +4     
============================================
  Files           167      167              
  Lines          5186     5207      +21     
  Branches        491      493       +2     
============================================
+ Hits           4403     4427      +24     
+ Misses          574      572       -2     
+ Partials        209      208       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabling Faiss AVX2 SIMD optimization, which helps to boost the overall performance on x86 architecture (reduce the query latencies and improve indexing throughput but the vector dimension must be multiple of 8).

One concern I have with this change is that this could break bwc for users running machines without avx2. What is the plan for handling this?

Update the Faiss submodule.

What did we update the submodule to and why? Is this documented somewhere? I see there is mention of temporary patches - is there a comment on GH explaining this?

scripts/linuxScript.sh Outdated Show resolved Hide resolved
src/main/java/org/opensearch/knn/common/KNNConstants.java Outdated Show resolved Hide resolved
@naveentatikonda naveentatikonda force-pushed the add_support_for_sqfp16 branch 3 times, most recently from 223bd84 to e29fc2b Compare January 31, 2024 22:42
jni/CMakeLists.txt Show resolved Hide resolved
jni/CMakeLists.txt Outdated Show resolved Hide resolved
jni/CMakeLists.txt Show resolved Hide resolved
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job. We should add a validation that with the index description we pass, we correctly create an SQ index (https://github.com/opensearch-project/k-NN/tree/main/jni/tests).

Also, its nice we didnt have to modify our jni wrapper at all.

build.gradle Outdated Show resolved Hide resolved
@@ -111,7 +111,11 @@ endif ()
if (${CONFIG_FAISS} STREQUAL ON OR ${CONFIG_ALL} STREQUAL ON OR ${CONFIG_TEST} STREQUAL ON)
set(BUILD_TESTING OFF) # Avoid building faiss tests
set(BLA_STATIC ON) # Statically link BLAS
set(FAISS_OPT_LEVEL generic) # Keep optimization level generic
if(WIN32 OR ${CMAKE_SYSTEM_PROCESSOR} MATCHES "aarch64" OR NOT ${ENABLE_SIMD})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wrap some variables in ${} and not WIN32? Lets be consistent here. Also, can you confirm that WIN32 will refer to the var and not a constant?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated the variable. WIN32 is a variable which is True on a Windows machine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right but why doesnt it need to be wrapped in ${} like ${WIN32}?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. I took this reference, tried and it worked
https://stackoverflow.com/a/17845462

jni/CMakeLists.txt Outdated Show resolved Hide resolved
src/test/java/org/opensearch/knn/index/FaissIT.java Outdated Show resolved Hide resolved
heemin32
heemin32 previously approved these changes Feb 1, 2024
src/test/java/org/opensearch/knn/jni/JNIServiceTests.java Outdated Show resolved Hide resolved
src/test/java/org/opensearch/knn/jni/JNIServiceTests.java Outdated Show resolved Hide resolved
src/test/java/org/opensearch/knn/jni/JNIServiceTests.java Outdated Show resolved Hide resolved
src/test/java/org/opensearch/knn/jni/JNIServiceTests.java Outdated Show resolved Hide resolved
src/test/java/org/opensearch/knn/index/FaissIT.java Outdated Show resolved Hide resolved
jni/tests/faiss_wrapper_test.cpp Show resolved Hide resolved
.github/workflows/CI.yml Outdated Show resolved Hide resolved
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Copy link
Member

@jmazanec15 jmazanec15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM nice job!

@naveentatikonda naveentatikonda merged commit 5f9511b into opensearch-project:main Feb 2, 2024
50 of 51 checks passed
@opensearch-trigger-bot
Copy link
Contributor

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-1421-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 5f9511b6407691543b1ea0bedd30f3a11f076236
# Push it to GitHub
git push --set-upstream origin backport/backport-1421-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-1421-to-2.x.

naveentatikonda added a commit to naveentatikonda/k-NN that referenced this pull request Feb 2, 2024
opensearch-project#1421)

* Add Support for Faiss SQFP16 and enable Faiss AVX2 Optimization

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add Patch Script to fix build on Linux CI

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Disable AVX2 support on Windows

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Update Faiss Submodule

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Update UX Interface

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add Parameter to enable SIMD

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Update DEVELOPER_GUIDE

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
naveentatikonda added a commit to naveentatikonda/k-NN that referenced this pull request Feb 2, 2024
opensearch-project#1421)

* Add Support for Faiss SQFP16 and enable Faiss AVX2 Optimization

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add Patch Script to fix build on Linux CI

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Disable AVX2 support on Windows

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add CHANGELOG

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Update Faiss Submodule

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Update UX Interface

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Add Parameter to enable SIMD

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Update DEVELOPER_GUIDE

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Address Review Comments

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
naveentatikonda added a commit that referenced this pull request Feb 5, 2024
#1421) (#1448)

* Add Support for Faiss SQFP16 and enable Faiss AVX2 Optimization



* Add Patch Script to fix build on Linux CI



* Disable AVX2 support on Windows



* Add CHANGELOG



* Update Faiss Submodule



* Address Review Comments



* Update UX Interface



* Add Parameter to enable SIMD



* Update DEVELOPER_GUIDE



* Address Review Comments



---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
naveentatikonda added a commit to naveentatikonda/k-NN that referenced this pull request Feb 6, 2024
…imization (opensearch-project#1421) (opensearch-project#1448)"

This reverts commit 0e17711.

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
naveentatikonda added a commit that referenced this pull request Feb 6, 2024
* Revert "Set Default value for SIMD_ENABLED in CMakeList (#1455)"

This reverts commit fc2c154.

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

* Revert "Add support for Faiss SQFP16 Quantization and enable AVX2 Optimization (#1421) (#1448)"

This reverts commit 0e17711.

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Features Introduces a new unit of functionality that satisfies a requirement v2.12.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants