Description
Background
Elasticsearch currently supports storing vectors through the dense_vector field type and using them when scoring documents. This allows users to perform an exact k-nearest neighbor (kNN) search by scanning all documents. This work builds on that functionality to support fast, approximate nearest neighbor (ANN) search. The implementation will use Lucene's new ANN support, which is based on the HNSW algorithm. Since Lucene will ship ANN in its upcoming 9.0 release, this feature will only target Elasticsearch 8.x.
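To make the "exact kNN by scanning all documents" baseline concrete, here is a minimal sketch in plain Python. It is not Elasticsearch or Lucene code: the function names, the in-memory `docs` dictionary, and the choice of cosine similarity are all assumptions made for illustration.

```python
import heapq
import math


def cosine_similarity(a, b):
    # Assumes both vectors are non-zero and have the same length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query, docs, k):
    """Score every document against the query and keep the k best.

    The cost grows linearly with the number of documents, which is the
    cost an ANN index is meant to avoid.
    """
    scored = ((cosine_similarity(query, vec), doc_id) for doc_id, vec in docs.items())
    return heapq.nlargest(k, scored)


# Hypothetical toy corpus: document id -> vector.
docs = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
}
top = exact_knn([1.0, 0.1], docs, k=2)
```

Every query touches every stored vector, so this approach is exact but slow on large indices. An HNSW-based index trades a small amount of recall for visiting only a fraction of the vectors.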
Our plan is to extend the dense_vector field type to support adding vectors to an ANN index. We'll then add a new REST endpoint focused on kNN search. This new endpoint will be marked 'experimental' in the first release, as we expect to make API improvements in response to feedback. At first the endpoint will only perform kNN, but we'll follow up with support for filtering, hybrid retrieval, aggregations, and more. We are really looking forward to everyone's feedback, which will help define the feature and set its direction.
Implementation Plan
Phase 0: Help prepare Lucene's HNSW implementation
- Run benchmarks to more deeply understand performance (https://issues.apache.org/jira/browse/LUCENE-9937)
- Ensure Lucene API has required features and plugin points
- Help resolve vector-related blockers to releasing Lucene 9.0
Phase 1: Basic ANN support
- Update dense_vector field type to support ANN indexing
- Add new API that supports ANN
- Fix issues that pop up in Lucene
- Performance testing and improvements
- Update documentation
Future Plans: Improvements to functionality and performance
- Support ANN with filtering #81788
- Support search timeouts and cancellation
- Support "hybrid retrieval", where kNN results are combined with matches from another query
- Narrow performance gap between Lucene HNSW and nmslib
- Support other vector element types (bfloat16, integers, etc.)
- Figure out best way to support "maximum inner product search" (dot product similarity with unnormalized vectors)
- Allow representing a document with multiple embeddings (dense vectors) #72068
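As a small illustration of why "maximum inner product search" needs separate treatment, the sketch below shows that ranking by dot product over unnormalized vectors can disagree with ranking by cosine similarity. It is plain Python with hypothetical names and toy data, not part of any planned implementation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Assumes non-zero vectors.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def rank(query, docs, score):
    """Return document ids ordered from best to worst under `score`."""
    return sorted(docs, key=lambda doc_id: score(query, docs[doc_id]), reverse=True)


# Hypothetical toy corpus with unnormalized vectors.
docs = {
    "short": [1.0, 0.0],  # same direction as the query, small magnitude
    "long": [3.0, 3.0],   # 45 degrees off the query, large magnitude
}
query = [1.0, 0.0]

by_dot = rank(query, docs, dot)
by_cosine = rank(query, docs, cosine)
```

Here the dot product favors the longer vector while cosine favors the better-aligned one, so the two similarities are not interchangeable once vectors are left unnormalized.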