1 change: 1 addition & 0 deletions _search-plugins/knn/knn-index.md
@@ -368,6 +368,7 @@
Setting | Default | Updatable | Description
:--- | :--- | :--- | :---
`index.knn` | false | false | Whether the index should build native library indexes for the `knn_vector` fields. If set to false, the `knn_vector` fields will be stored in doc values, but approximate k-NN search functionality will be disabled.
`index.knn.algo_param.ef_search` | 100 | true | The size of the dynamic list used during k-NN searches. Higher values result in more accurate but slower searches. Only available for NMSLIB.
`index.knn.advanced.approximate_threshold` | 15,000 | true | The number of vectors a segment must have before creating specialized data structures for approximate search. Set to `-1` to disable building vector data structures and `0` to always build them.
`index.knn.algo_param.ef_construction` | 100 | false | Deprecated in 1.0.0. Instead, use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value.
`index.knn.algo_param.m` | 16 | false | Deprecated in 1.0.0. Use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value instead.
`index.knn.space_type` | l2 | false | Deprecated in 1.0.0. Use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value instead.
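
The semantics of `index.knn.advanced.approximate_threshold` described in the table can be sketched as a small predicate. This is only an illustration, not the plugin's code: the function name `builds_vector_structures` is hypothetical, and two details are assumptions rather than facts taken from the table (any negative value is treated like `-1`, and a segment whose vector count exactly equals the threshold is treated as qualifying).

```python
def builds_vector_structures(num_vectors: int, approximate_threshold: int = 15000) -> bool:
    """Hypothetical helper: would a segment get approximate-search structures?"""
    if approximate_threshold < 0:
        # -1 disables building vector data structures
        # (assumption: other negative values behave the same way).
        return False
    if approximate_threshold == 0:
        # 0 means the structures are always built.
        return True
    # Assumption: the boundary is inclusive, so a segment holding exactly
    # `approximate_threshold` vectors qualifies.
    return num_vectors >= approximate_threshold


for count in (100, 15000, 50000):
    print(count, builds_vector_structures(count))
```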
69 changes: 67 additions & 2 deletions _search-plugins/knn/performance-tuning.md
@@ -45,7 +45,7 @@ If your hardware has multiple cores, you can allow multiple threads in native li
Monitor CPU utilization and choose the correct number of threads. Because native library index construction is costly, choosing more threads than you need can cause additional CPU load.


### (Expert-level) Disable vector field storage in the source field
### (Expert level) Disable vector field storage in the source field

The `_source` field contains the original JSON document body that was passed at index time. This field is not indexed and is not searchable but is stored so that it can be returned when executing fetch requests such as `get` and `search`. When using vector fields within the source, you can remove the vector field to save disk space, as shown in the following example where the `location` vector is excluded:

@@ -95,9 +95,74 @@ In OpenSearch 2.15 or later, you can further improve indexing speed and reduce d
}
```

This is an expert-level setting. Disabling the `_recovery_source` may lead to failures during peer-to-peer recovery. Before disabling the `_recovery_source`, check with your OpenSearch cluster admin to determine whether your cluster performs regular flushes before starting the peer-to-peer recovery of shards before disabling the `_recovery_source`.
This is an expert-level setting. Disabling the `_recovery_source` may lead to failures during peer-to-peer recovery. Before disabling the `_recovery_source`, check with your OpenSearch cluster admin to determine whether your cluster performs regular flushes before starting the peer-to-peer recovery of shards.
{: .warning}

### (Expert level) Build vector data structures on demand

This approach is recommended only for workloads that involve a single initial bulk upload and that, after force merging to a single segment, are used exclusively for search.

During indexing, vector search builds a specialized data structure for a `knn_vector` field to enable efficient approximate k-NN search. However, these structures are rebuilt during [force merge]({{site.url}}{{site.baseurl}}/api-reference/index-apis/force-merge/) on k-NN indexes, so building them for every new segment during a bulk upload duplicates work. To optimize indexing speed, follow these steps:

1. **Disable vector data structure creation**: Disable vector data structure creation for new segments by setting [`index.knn.advanced.approximate_threshold`]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#index-settings) to `-1`.

To specify the setting at index creation, send the following request:

```json
PUT /test-index/
{
  "settings": {
    "index.knn.advanced.approximate_threshold": "-1"
  }
}
```
{% include copy-curl.html %}

To specify the setting after index creation, send the following request:

```json
PUT /test-index/_settings
{
  "index.knn.advanced.approximate_threshold": "-1"
}
```
{% include copy-curl.html %}

1. **Perform bulk indexing**: Index data in [bulk]({{site.url}}{{site.baseurl}}/api-reference/document-apis/bulk/) without performing any searches during ingestion:

```json
POST _bulk
{ "index": { "_index": "test-index", "_id": "1" } }
{ "my_vector1": [1.5, 2.5], "price": 12.2 }
{ "index": { "_index": "test-index", "_id": "2" } }
{ "my_vector1": [2.5, 3.5], "price": 7.1 }
```
{% include copy-curl.html %}

If searches are performed while vector data structures are disabled, they will run using exact k-NN search.

1. **Reenable vector data structure creation**: Once indexing is complete, enable vector data structure creation by setting `index.knn.advanced.approximate_threshold` to `0`:

```json
PUT /test-index/_settings
{
  "index.knn.advanced.approximate_threshold": "0"
}
```
{% include copy-curl.html %}

If you do not reset the setting to `0` before the force merge, you will need to reindex your data.
{: .note}

1. **Force merge segments into one segment**: Perform a force merge and specify `max_num_segments=1` to create the vector data structures only once:

```json
POST test-index/_forcemerge?max_num_segments=1
```
{% include copy-curl.html %}

After the force merge, new search requests will execute approximate k-NN search using the newly created data structures.
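
The four steps above can be summarized as an ordered request plan. The following Python sketch only builds the requests as `(method, path, body)` tuples and does not send them. The function name `on_demand_build_plan` and the tuple layout are hypothetical, chosen for illustration. Like the console examples above, it does not include the `knn_vector` mapping, which a real index would also need.

```python
import json

# Setting name taken from the steps above.
SETTING = "index.knn.advanced.approximate_threshold"


def on_demand_build_plan(index, docs):
    """Hypothetical helper: return the requests for the workflow, in order."""
    # The bulk API expects newline-delimited JSON: an action line followed
    # by the document source, with a trailing newline at the end.
    bulk_lines = []
    for doc_id, doc in docs.items():
        bulk_lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        bulk_lines.append(json.dumps(doc))
    bulk_body = "\n".join(bulk_lines) + "\n"

    return [
        # 1. Create the index with vector data structure creation disabled.
        ("PUT", f"/{index}", {"settings": {SETTING: "-1"}}),
        # 2. Bulk index without searching during ingestion.
        ("POST", "/_bulk", bulk_body),
        # 3. Reenable vector data structure creation before the force merge.
        ("PUT", f"/{index}/_settings", {SETTING: "0"}),
        # 4. Force merge to one segment so the structures are built once.
        ("POST", f"/{index}/_forcemerge?max_num_segments=1", None),
    ]


plan = on_demand_build_plan(
    "test-index",
    {
        "1": {"my_vector1": [1.5, 2.5], "price": 12.2},
        "2": {"my_vector1": [2.5, 3.5], "price": 7.1},
    },
)
for method, path, _body in plan:
    print(method, path)
```

Keeping the plan as plain data makes the ordering easy to check before any request is sent, in particular that the setting is reset to `0` ahead of the force merge.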

## Search performance tuning

Take the following steps to improve search performance: