Add documentation for Faiss encoder SQfp16
Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
naveentatikonda committed Jan 24, 2024
1 parent d41ccb8 commit a9255f8
Showing 1 changed file with 39 additions and 1 deletion.
40 changes: 39 additions & 1 deletion _search-plugins/knn/knn-index.md
@@ -116,7 +116,7 @@ Lucene HNSW implementation ignores `ef_search` and dynamically sets it to the v
### Supported faiss encoders

You can use encoders to reduce the memory footprint of a k-NN index at the expense of search accuracy. faiss has
several encoder types, but the plugin currently only supports *flat* and *pq* encoding.
several encoder types, but the plugin currently only supports *flat*, *pq*, and *SQfp16* encoding.


The following example method definition specifies the `hnsw` method and a `pq` encoder:

@@ -144,6 +144,10 @@ Encoder name | Requires training | Description
:--- | :--- | :---
`flat` | false | Encode vectors as floating point arrays. This encoding does not reduce memory footprint.
`pq` | true | An abbreviation for _product quantization_, it is a lossy compression technique that uses clustering to encode a vector into a fixed size of bytes, with the goal of minimizing the drop in k-NN search accuracy. At a high level, vectors are broken up into `m` subvectors, and then each subvector is represented by a `code_size` code obtained from a code book produced during training. For more information about product quantization, see [this blog post](https://medium.com/dotstar/understanding-faiss-part-2-79d90b1e5388).
`SQfp16` | false | Starting with k-NN plugin version 2.12, the `SQfp16` encoder quantizes 32-bit floating-point vectors into 16-bit floats using the built-in Faiss `ScalarQuantizer`, reducing the memory footprint with minimal loss of precision. In addition to the memory savings, overall performance is improved through SIMD optimization (`AVX2` on `x86` architectures and `NEON` on `ARM` architectures).

For the `SQfp16` encoder, SIMD optimization is supported only on Linux and macOS. The encoder can still be used on Windows, with reduced performance but the same memory savings. To enable SIMD support, the vector dimension must be a multiple of **8**.
{: .important}
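Conceptually, the memory/precision trade-off of fp16 scalar quantization can be sketched with NumPy. This is a hypothetical illustration only; the plugin performs the quantization internally through the Faiss `ScalarQuantizer`, not through NumPy:

```python
import numpy as np

# Illustrative only: quantize 32-bit float vectors to 16-bit floats,
# analogous to what the SQfp16 encoder does inside Faiss.
rng = np.random.default_rng(0)
vectors_fp32 = rng.random((1000, 8), dtype=np.float32)  # dimension 8: a multiple of 8, as SIMD requires

vectors_fp16 = vectors_fp32.astype(np.float16)  # quantize: half the bytes per value
restored = vectors_fp16.astype(np.float32)      # decode back to fp32 for distance computation

print(vectors_fp32.nbytes, vectors_fp16.nbytes)      # 32000 16000: memory footprint halves
print(float(np.abs(vectors_fp32 - restored).max()))  # small quantization error
```

For values in this range, the maximum round-trip error stays below `1e-3`, which is what "minimal loss of precision" means in practice.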

#### Examples

@@ -195,6 +199,40 @@ The following example uses the `hnsw` method without specifying an encoder (by d
}
```

The following example uses the `hnsw` method with a `SQfp16` encoder:

```json
"method": {
  "name": "hnsw",
  "engine": "faiss",
  "space_type": "l2",
  "parameters": {
    "encoder": {
      "name": "SQfp16"
    },
    "ef_construction": 256,
    "m": 8
  }
}
```

The following example uses the `ivf` method with a `SQfp16` encoder:

```json
"method": {
  "name": "ivf",
  "engine": "faiss",
  "space_type": "l2",
  "parameters": {
    "encoder": {
      "name": "SQfp16"
    },
    "nprobes": 2
  }
}
```
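A method definition like the ones above is embedded in a `knn_vector` field mapping when creating an index. The following sketch shows where it fits; the index and field names are illustrative, and the dimension is set to 8 to satisfy the multiple-of-8 SIMD requirement:

```json
PUT /my-sqfp16-index
{
  "settings": {
    "index": {
      "knn": true
    }
  },
  "mappings": {
    "properties": {
      "my_vector": {
        "type": "knn_vector",
        "dimension": 8,
        "method": {
          "name": "hnsw",
          "engine": "faiss",
          "space_type": "l2",
          "parameters": {
            "encoder": {
              "name": "SQfp16"
            }
          }
        }
      }
    }
  }
}
```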

#### PQ parameters

Parameter name | Required | Default | Updatable | Description
