
Commit a100d92

kolchfa-aws, Naarcha-AWS, and natebower authored and committed
Add quantization techniques and links to byte vector (#4893)
* Add quantization techniques and links to byte vector
* Added cosine similarity space type quantization
* Rewording
* Tech review feedback
* Tech review feedback
* Remove redundant line
* Update _field-types/supported-field-types/knn-vector.md
* Apply suggestions from code review
* Update _field-types/supported-field-types/knn-vector.md
* Editorial feedback

Signed-off-by: Fanit Kolchina <kolchfa@amazon.com>
Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Nathan Bower <nbower@amazon.com>
1 parent 732e775 · commit a100d92

2 files changed: +112 −4 lines changed

_field-types/supported-field-types/knn-vector.md

Lines changed: 108 additions & 4 deletions
@@ -4,11 +4,12 @@ title: k-NN vector
 nav_order: 58
 has_children: false
 parent: Supported field types
+has_math: true
 ---

 # k-NN vector

-The k-NN plugin introduces a custom data type, the `knn_vector`, that allows users to ingest their k-NN vectors
+The [k-NN plugin]({{site.url}}{{site.baseurl}}/search-plugins/knn/index/) introduces a custom data type, the `knn_vector`, that allows users to ingest their k-NN vectors
 into an OpenSearch index and perform different kinds of k-NN search. The `knn_vector` field is highly configurable and can serve many different k-NN workloads. In general, a `knn_vector` field can be built either by providing a method definition or specifying a model ID.

 ## Example
@@ -47,7 +48,7 @@ PUT test-index

 ## Method definitions

-Method definitions are used when the underlying Approximate k-NN algorithm does not require training. For example, the following `knn_vector` field specifies that *nmslib*'s implementation of *hnsw* should be used for Approximate k-NN search. During indexing, *nmslib* will build the corresponding *hnsw* segment files.
+[Method definitions]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index#method-definitions) are used when the underlying [approximate k-NN]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/) algorithm does not require training. For example, the following `knn_vector` field specifies that *nmslib*'s implementation of *hnsw* should be used for approximate k-NN search. During indexing, *nmslib* will build the corresponding *hnsw* segment files.

 ```json
 "my_vector": {
@@ -77,7 +78,7 @@ model contains the information needed to initialize the native library segment f
 }
 ```

-However, if you intend to just use painless scripting or a k-NN score script, you only need to pass the dimension.
+However, if you intend to use Painless scripting or a k-NN score script, you only need to pass the dimension.
 ```json
 "type": "knn_vector",
 "dimension": 128
@@ -91,6 +92,8 @@ By default, k-NN vectors are `float` vectors, where each dimension is 4 bytes. I
 Byte vectors are supported only for the `lucene` engine. They are not supported for the `nmslib` and `faiss` engines.
 {: .note}

+In [k-NN benchmarking tests](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool), the use of `byte` rather than `float` vectors resulted in a significant reduction in storage and memory usage as well as improved indexing throughput and reduced query latency. Additionally, precision on recall was not greatly affected (note that recall can depend on various factors, such as the [quantization technique](#quantization-techniques) and data distribution).
+
 When using `byte` vectors, expect some loss of precision in the recall compared to using `float` vectors. Byte vectors are useful in large-scale applications and use cases that prioritize a reduced memory footprint in exchange for a minimal loss of recall.
 {: .important}
@@ -163,4 +166,105 @@ GET test-index/_search
 }
 }
 ```
-{% include copy-curl.html %}
+{% include copy-curl.html %}
+
+### Quantization techniques
+
+If your vectors are of the type `float`, you need to first convert them to the `byte` type before ingesting the documents. This conversion is accomplished by _quantizing the dataset_---reducing the precision of its vectors. There are many quantization techniques, such as scalar quantization or product quantization (PQ), which is used in the Faiss engine. The choice of quantization technique depends on the type of data you're using and can affect the accuracy of recall values. The following sections describe the scalar quantization algorithms that were used to quantize the [k-NN benchmarking test](https://github.com/opensearch-project/k-NN/tree/main/benchmarks/perf-tool) data for the [L2](#scalar-quantization-for-the-l2-space-type) and [cosine similarity](#scalar-quantization-for-the-cosine-similarity-space-type) space types. The provided pseudocode is for illustration purposes only.
+
+#### Scalar quantization for the L2 space type
+
+The following example pseudocode illustrates the scalar quantization technique used for the benchmarking tests on Euclidean datasets with the L2 space type. Euclidean distance is shift invariant. If you shift both $$x$$ and $$y$$ by the same $$z$$, then the distance remains the same ($$\lVert x-y\rVert =\lVert (x-z)-(y-z)\rVert$$).
+
+```python
+import numpy as np
+
+# Random dataset (example of creating a random dataset)
+dataset = np.random.uniform(-300, 300, (100, 10))
+# Random query set (example of creating a random query set)
+queryset = np.random.uniform(-350, 350, (100, 10))
+# Number of buckets
+B = 256
+
+# INDEXING:
+# Get the dataset min and max
+dataset_min = np.min(dataset)
+dataset_max = np.max(dataset)
+# Shift coordinates to be non-negative
+dataset -= dataset_min
+# Normalize into [0, 1]
+dataset *= 1. / (dataset_max - dataset_min)
+# Bucket into 256 values
+dataset = np.floor(dataset * (B - 1)) - int(B / 2)
+
+# QUERYING:
+# Clip (in case the query set range is outside of the dataset range)
+queryset = queryset.clip(dataset_min, dataset_max)
+# Shift coordinates to be non-negative
+queryset -= dataset_min
+# Normalize into [0, 1]
+queryset *= 1. / (dataset_max - dataset_min)
+# Bucket into 256 values
+queryset = np.floor(queryset * (B - 1)) - int(B / 2)
+```
+{% include copy.html %}
+
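+As a quick sanity check (a continuation of the snippet above, not part of the benchmarking tool), the bucketed values now fall within the signed byte range and can be cast to `int8` before ingestion:
+
+```python
+# With B = 256, the bucketed values lie in [-128, 127]
+assert dataset.min() >= -128 and dataset.max() <= 127
+dataset_bytes = dataset.astype(np.int8)
+queryset_bytes = queryset.astype(np.int8)
+```
+{% include copy.html %}
+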
+#### Scalar quantization for the cosine similarity space type
+
+The following example pseudocode illustrates the scalar quantization technique used for the benchmarking tests on angular datasets with the cosine similarity space type. Cosine similarity is not shift invariant ($$cos(x, y) \neq cos(x-z, y-z)$$).
+
+The following pseudocode is for positive numbers:
+
+```python
+# For positive numbers
+
+# INDEXING and QUERYING:
+
+# Get the max of the training dataset
+max = np.max(dataset)
+min = 0
+B = 127
+
+# Normalize into [0, 1], then scale into [0, B]
+val = (val - min) / (max - min)
+val = (val * B)
+
+# Get the integer and fraction parts
+int_part = floor(val)
+frac_part = val - int_part
+
+# Round to the nearest bucket
+if 0.5 < frac_part:
+    bval = int_part + 1
+else:
+    bval = int_part
+
+return Byte(bval)
+```
+{% include copy.html %}
+
+The following pseudocode is for negative numbers:
+
+```python
+# For negative numbers
+
+# INDEXING and QUERYING:
+
+# Use the magnitude of the training dataset min as the scale
+min = 0
+max = -np.min(dataset)
+B = 128
+
+# Normalize into [-1, 0], then scale into [-B, 0]
+val = (val - min) / (max - min)
+val = (val * B)
+
+# Get the integer and fraction parts
+int_part = floor(val)
+frac_part = val - int_part
+
+# Round to the nearest bucket
+if 0.5 < frac_part:
+    bval = int_part + 1
+else:
+    bval = int_part
+
+return Byte(bval)
+```
+{% include copy.html %}
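+
+For reference, the following is a minimal runnable version that combines the positive- and negative-number snippets above into a single helper (the function name, the `int8` cast, and the example loop are illustrative assumptions, not part of the k-NN plugin):
+
+```python
+import numpy as np
+
+def quantize_for_cosine(val, dataset):
+    # Positive values scale into [0, 127]; negative values scale into [-128, 0],
+    # mirroring the two pseudocode snippets above.
+    if val >= 0:
+        B = 127
+        extreme = np.max(dataset)   # assumes the dataset contains positive values
+    else:
+        B = 128
+        extreme = -np.min(dataset)  # assumes the dataset contains negative values
+    scaled = (val / extreme) * B
+    int_part = np.floor(scaled)
+    frac_part = scaled - int_part
+    bval = int_part + 1 if frac_part > 0.5 else int_part
+    return np.int8(bval)
+
+# Example usage on a small random angular dataset
+dataset = np.random.uniform(-1, 1, (100, 10))
+quantized = np.array([[quantize_for_cosine(v, dataset) for v in row] for row in dataset])
+```
+{% include copy.html %}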

_search-plugins/knn/knn-index.md

Lines changed: 4 additions & 0 deletions
@@ -11,6 +11,10 @@ has_children: false
 The k-NN plugin introduces a custom data type, the `knn_vector`, that allows users to ingest their k-NN vectors
 into an OpenSearch index and perform different kinds of k-NN search. The `knn_vector` field is highly configurable and can serve many different k-NN workloads. For more information, see [k-NN vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector/).

+## Lucene byte vector
+
+Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine in order to reduce the amount of storage space needed. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).
+
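+A minimal sketch of creating such an index with the Python `opensearch-py` client follows (the client setup, index name, and dimension are illustrative assumptions; see the linked byte vector documentation for the authoritative mapping):
+
+```python
+from opensearchpy import OpenSearch
+
+client = OpenSearch(hosts=[{"host": "localhost", "port": 9200}])
+
+# Map a knn_vector field to the byte data type on the lucene engine
+client.indices.create(
+    index="byte-vector-index",  # hypothetical index name
+    body={
+        "settings": {"index": {"knn": True}},
+        "mappings": {
+            "properties": {
+                "my_vector": {
+                    "type": "knn_vector",
+                    "dimension": 3,
+                    "data_type": "byte",
+                    "method": {"name": "hnsw", "space_type": "l2", "engine": "lucene"},
+                }
+            }
+        },
+    },
+)
+
+# Each dimension must be a whole number within the signed byte range [-128, 127]
+client.index(index="byte-vector-index", body={"my_vector": [-126, 28, 127]}, refresh=True)
+```
+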
 ## Method definitions

 A method definition refers to the underlying configuration of the Approximate k-NN algorithm you want to use. Method definitions are used to either create a `knn_vector` field (when the method does not require training) or [create a model during training]({{site.url}}{{site.baseurl}}/search-plugins/knn/api#train-model) that can then be used to [create a `knn_vector` field]({{site.url}}{{site.baseurl}}/search-plugins/knn/approximate-knn/#building-a-k-nn-index-from-a-model).
