Investigate various implementations of ann search for vector fields #42326

Closed
@mayya-sharipova

ANN (approximate nearest neighbour) search will be a licensed feature of Elasticsearch (not OSS).

We plan to prototype several ANN algorithms, covering different distance metrics:

  • LSH and multiprobe LSH for Euclidean distance
  • partition trees for Euclidean/cosine distance
  • clustering-based approaches, including product quantization
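
To make the first item more concrete, here is a minimal sketch of LSH for Euclidean distance using random projections, h(v) = floor((a·v + b) / w). It is only an illustration, not the prototype itself: the class name `EuclideanLSH` and the parameters `num_hashes`, `bucket_width` and `seed` are made up for this example, and it probes a single bucket (no multiprobe).

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    """Toy single-table LSH index for Euclidean distance (hypothetical example)."""

    def __init__(self, dim, num_hashes=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # One Gaussian projection vector and one uniform offset per hash function.
        self.projections = [
            [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_hashes)
        ]
        self.offsets = [rng.uniform(0.0, bucket_width) for _ in range(num_hashes)]
        self.buckets = defaultdict(list)  # hash key -> list of document ids
        self.vectors = {}                 # document id -> vector

    def _key(self, vec):
        # Concatenate the quantized projections into one bucket key.
        return tuple(
            math.floor((sum(a * x for a, x in zip(proj, vec)) + b) / self.w)
            for proj, b in zip(self.projections, self.offsets)
        )

    def add(self, doc_id, vec):
        self.vectors[doc_id] = vec
        self.buckets[self._key(vec)].append(doc_id)

    def query(self, vec, k=1):
        # Only vectors in the query's bucket are candidates, so results are approximate.
        candidates = self.buckets.get(self._key(vec), [])
        ranked = sorted(candidates, key=lambda d: math.dist(vec, self.vectors[d]))
        return ranked[:k]


index = EuclideanLSH(dim=2)
index.add("a", [1.0, 2.0])
index.add("b", [100.0, -50.0])
nearest = index.query([1.0, 2.0], k=1)
```

Nearby vectors tend to share a bucket and distant ones tend not to, which is where both the speed-up and the approximation come from. A multiprobe variant would also look into neighbouring buckets instead of adding more hash tables.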

We are interested in users' feedback about:

  • application domains
  • which distance metrics are used (Euclidean, cosine, Hamming, etc.)
  • which data types the vectors hold (integers, bits, longs, floats, etc.)
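
For reference when answering, the three metrics named above can be sketched as follows. These are plain textbook definitions written for illustration, not Elasticsearch or Lucene code, and the function names are made up for this example.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two numeric vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def cosine_distance(u, v):
    """1 - cosine similarity; depends on the angle only, not on vector length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def hamming_distance(a, b):
    """Number of differing bits between two bit vectors packed into integers."""
    return bin(a ^ b).count("1")


d_euclid = euclidean_distance([0.0, 0.0], [3.0, 4.0])
d_cosine = cosine_distance([1.0, 0.0], [0.0, 1.0])
d_hamming = hamming_distance(0b1011, 0b1101)
```

The data type matters here: Hamming distance is natural for bit vectors, while Euclidean and cosine distance are typically used with float or integer vectors.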

We have decided to adopt Lucene's implementations of ANN search, so development of ANN search has moved to Lucene. Relevant Lucene issues: https://issues.apache.org/jira/browse/LUCENE-9004, https://issues.apache.org/jira/browse/LUCENE-9322, https://issues.apache.org/jira/browse/LUCENE-9136
