
Commit 29d6661

Add new indexing parameter and update performance tuning instruction
Signed-off-by: Vijayan Balasubramanian <balasvij@amazon.com>
1 parent 95c8a8a commit 29d6661

2 files changed: +16 -0 lines changed


_search-plugins/knn/knn-index.md

Lines changed: 1 addition & 0 deletions
@@ -368,6 +368,7 @@ Setting | Default | Updatable | Description
 :--- | :--- | :--- | :---
 `index.knn` | false | false | Whether the index should build native library indexes for the `knn_vector` fields. If set to false, the `knn_vector` fields will be stored in doc values, but approximate k-NN search functionality will be disabled.
 `index.knn.algo_param.ef_search` | 100 | true | The size of the dynamic list used during k-NN searches. Higher values result in more accurate but slower searches. Only available for NMSLIB.
+`index.knn.advanced.approximate_threshold` | 15000 | true | The minimum number of vectors that a segment must contain before specialized vector data structures are built for approximate search. Set to `-1` to disable building vector data structures or to `0` to always build them.
 `index.knn.algo_param.ef_construction` | 100 | false | Deprecated in 1.0.0. Instead, use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value.
 `index.knn.algo_param.m` | 16 | false | Deprecated in 1.0.0. Use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value instead.
 `index.knn.space_type` | l2 | false | Deprecated in 1.0.0. Use the [mapping parameters](https://opensearch.org/docs/latest/search-plugins/knn/knn-index/#method-definitions) to set this value instead.
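
The three kinds of values described for `index.knn.advanced.approximate_threshold` (`-1`, `0`, and a positive count) can be illustrated with a minimal Python sketch. The function name `should_build_vector_structures` is hypothetical, and treating the threshold as inclusive (a segment holding exactly the threshold number of vectors qualifies) is an assumption drawn from the word "minimum" in the description. This is an illustration of the documented behavior, not the plugin's implementation.

```python
def should_build_vector_structures(num_vectors: int, approximate_threshold: int = 15000) -> bool:
    """Illustrative only: would a segment get approximate-search data structures?

    Assumes an inclusive threshold; the real plugin logic may differ in detail.
    """
    if approximate_threshold < -1:
        # Only -1, 0, and positive values are described in the setting's documentation.
        raise ValueError("approximate_threshold must be -1 or greater")
    if approximate_threshold == -1:
        # -1 disables building vector data structures.
        return False
    # 0 always builds (every count is >= 0); otherwise the segment must
    # contain at least `approximate_threshold` vectors.
    return num_vectors >= approximate_threshold
```

Under these assumptions, a segment with 100 vectors and the default threshold of 15000 would be served by exact search, while the same segment with the threshold set to `0` would get vector data structures.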

_search-plugins/knn/performance-tuning.md

Lines changed: 15 additions & 0 deletions
@@ -98,6 +98,21 @@ In OpenSearch 2.15 or later, you can further improve indexing speed and reduce d
 This is an expert-level setting. Disabling the `_recovery_source` may lead to failures during peer-to-peer recovery. Before disabling the `_recovery_source`, check with your OpenSearch cluster admin to determine whether your cluster performs regular flushes before starting the peer-to-peer recovery of shards.
 {: .warning}

+### (Expert-level) Build vector data structures on demand
+
+Consider this approach only for workloads in which all data is indexed in a single initial bulk upload and the index is made available for search only after being force merged to one segment.
+
+During indexing, the k-NN plugin builds a specialized data structure for each `knn_vector` field so that approximate nearest neighbor search can efficiently find the k nearest neighbors. However, a force merge rebuilds those data structures from scratch, so the work done during indexing is repeated. To reduce overall indexing time, you can skip building the data structures during indexing by updating the index settings as follows:
+
+
+* Create an index with, or update an existing index to set, [`index.knn.advanced.approximate_threshold`]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#index-settings) to `-1`. This disables building vector data structures for new segments.
+* Perform bulk indexing, and make sure that no searches are performed during ingestion. If a search is performed while vector data structures are disabled, exact search is executed instead.
+* Once indexing is complete, re-enable building by updating `index.knn.advanced.approximate_threshold` to `0`.
+* Perform a force merge with `max_num_segments=1` to build the vector data structures one time.
+* After the force merge, new search requests execute approximate k-NN search as expected.
+
+If you forget to update the setting to `0` before the force merge, you must reindex the data.
+
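
The on-demand build steps added in this hunk can be sketched as an ordered list of REST requests. The following Python snippet is a minimal sketch that only constructs the requests and sends nothing. The helper name `on_demand_build_plan`, the index name `my-vectors`, and the `<bulk payload>` placeholder are hypothetical; the paths assume the standard update settings, bulk, and force merge endpoints.

```python
import json

SETTING = "index.knn.advanced.approximate_threshold"


def on_demand_build_plan(index: str) -> list:
    """Return the ordered (method, path, body) requests for the workflow.

    Nothing is sent; the bulk body is a placeholder because it depends on your data.
    """
    return [
        # 1. Disable building vector data structures for new segments.
        ("PUT", f"/{index}/_settings", json.dumps({SETTING: -1})),
        # 2. Bulk index all documents (avoid searching during this phase).
        ("POST", f"/{index}/_bulk", "<bulk payload>"),
        # 3. Re-enable building so that the force merge creates the structures.
        ("PUT", f"/{index}/_settings", json.dumps({SETTING: 0})),
        # 4. Force merge to one segment to build the structures one time.
        ("POST", f"/{index}/_forcemerge?max_num_segments=1", ""),
    ]


if __name__ == "__main__":
    for method, path, body in on_demand_build_plan("my-vectors"):
        print(method, path, body)
```

Keeping the plan as data makes the ordering easy to review: the setting must return to `0` before the force merge, because otherwise the merged segment is written without vector data structures.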
 ## Search performance tuning

 Take the following steps to improve search performance:
