Skip to content

Integrate ANN search #78473

Closed
Closed
@jtibshirani

Description

@jtibshirani

Background

Currently Elasticsearch supports storing vectors through the dense_vector field type and using them when scoring documents. This allows users to perform an exact k-nearest neighbors (kNN) search by scanning all documents. This work builds on that functionality to support fast, approximate nearest neighbor search (ANN). The implementation will use Lucene's new ANN support, which is based on the HNSW algorithm. Since Lucene will ship ANN in its upcoming 9.0 release, this feature will only target Elasticsearch 8.x.

Our plan is to extend the dense_vector field type to support adding vectors to an ANN index. We'll then add a new REST endpoint focused on kNN search. This new endpoint will be marked 'experimental' in the first release, as we expect to make API improvements in response to feedback. At first the endpoint will only perform kNN, but we'll follow-up with support for filtering, hybrid retrieval, aggregations, and more. We are really looking forward to everyone's feedback, which will help define the feature and set its direction.

Implementation Plan

Phase 0: Help prepare Lucene's HNSW implementation

Phase 1: Basic ANN support

Future Plans: Improvements to functionality and performance

Metadata

Metadata

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions