[META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 #123

Closed
10 of 11 tasks
vamshin opened this issue Feb 23, 2023 · 6 comments
Assignees
Labels
Features Introduces a new unit of functionality that satisfies a requirement neural-search

Comments

@vamshin
Member

vamshin commented Feb 23, 2023

Is your feature request related to a problem?

BM25 works well for exact-match use cases, while k-NN scoring works well for understanding context and retrieving relevant documents. It is important to benefit from both of these relevance mechanisms, and one way to do so is to combine their scores. One caveat is that the two scores are on different scales, so some form of normalization is required before they can be combined.
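To make the scale problem concrete, here is a minimal sketch of one possible approach: min-max normalization of each result set followed by a weighted arithmetic mean. The technique, the function names (`min_max_normalize`, `combine_scores`), the weights, and the sample scores are all illustrative assumptions; this issue does not prescribe a specific method.

```python
from typing import Dict, List, Tuple


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] within a single result set (hypothetical helper)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a top hit.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_scores(
    bm25: Dict[str, float],
    knn: Dict[str, float],
    weights: Tuple[float, float] = (0.5, 0.5),
) -> List[Tuple[str, float]]:
    """Weighted arithmetic mean of normalized scores (assumed combination rule).

    A document missing from one result set contributes 0.0 for that set.
    """
    norm_bm25 = min_max_normalize(bm25)
    norm_knn = min_max_normalize(knn)
    w_bm25, w_knn = weights
    combined = {
        doc: (w_bm25 * norm_bm25.get(doc, 0.0) + w_knn * norm_knn.get(doc, 0.0))
        / (w_bm25 + w_knn)
        for doc in set(norm_bm25) | set(norm_knn)
    }
    # Sort by descending score, breaking ties by document id for determinism.
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up scores: BM25 is unbounded, while the k-NN scores here fall in (0, 1).
bm25_scores = {"a": 12.0, "b": 7.0, "c": 2.0}
knn_scores = {"a": 0.62, "b": 0.91, "d": 0.55}
ranked = combine_scores(bm25_scores, knn_scores)
```

Without normalization, the raw BM25 values would dominate any sum simply because they are numerically larger, regardless of how relevant the k-NN hits are.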

Older Issues and Discussions:

  1. [FEATURE] Hybrid search using keyword matching and kNN k-NN#717
  2. Discussion on combination of result sets from different query types OpenSearch#4557
  3. Science Benchmarks: https://opensearch.org/blog/semantic-science-benchmarks/

Tasks

High Level Tasks:

Community Requests

  1. https://forum.opensearch.org/t/normalisation-in-hybrid-search/12996
@vamshin vamshin changed the title [FEATURE] Score Combination and Normalization for Semantics Search Score Normalization for k-NN and BM25 [FEATURE] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 23, 2023
@vamshin vamshin added Features Introduces a new unit of functionality that satisfies a requirement neural-search and removed untriaged labels Feb 23, 2023
@vamshin vamshin changed the title [FEATURE] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 23, 2023
@navneet1v navneet1v changed the title Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 [META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 24, 2023
@navneet1v navneet1v changed the title [META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 24, 2023
@navneet1v navneet1v changed the title Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 [META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 24, 2023
@navneet1v
Collaborator

RFC created with HLD: #126

@martin-gaievski
Member

RFC with design for Query: #174
Created new Feature issue for Query: #175

@DarshitChanpura
Member

@vamshin Should this issue be moved to 2.11?

@vamshin vamshin added the v2.10.0 Issues targeting release v2.10.0 label Sep 8, 2023
@vamshin
Member Author

vamshin commented Sep 8, 2023

@DarshitChanpura we are releasing this feature for 2.10

@navneet1v navneet1v removed the v2.10.0 Issues targeting release v2.10.0 label Sep 22, 2023
@navneet1v
Collaborator

@martin-gaievski can we add the perf results to this issue and then resolve it?

@martin-gaievski
Member

martin-gaievski commented Sep 29, 2023

Summary for performance benchmark

| dataset | p50 bool (baseline) | p50 hybrid | p50 difference, ms | p90 bool (baseline) | p90 hybrid | p90 difference, ms | p99 bool (baseline) | p99 hybrid | p99 difference, ms |
|---|---|---|---|---|---|---|---|---|---|
| nfcorpus | 35.1 | 37 | 1.9 | 53 | 54 | 1 | 60.8 | 62.3 | 1.5 |
| trec-covid | 58.1 | 61.6 | 3.5 | 66.4 | 70 | 3.6 | 70.5 | 74.6 | 4.1 |
| scidocs | 54.9 | 57 | 2.1 | 66.4 | 68.6 | 2.2 | 81.3 | 83.2 | 1.9 |
| quora | 61 | 69 | 8 | 69 | 78 | 9 | 73.4 | 84 | 10.6 |
| amazon esci | 49 | 50 | 1 | 58 | 59.4 | 1.4 | 67 | 70 | 3 |
| dbpedia | 100.8 | 107.7 | 6.9 | 117 | 130.9 | 13.9 | 129.8 | 150.2 | 20.4 |
| fiqa | 53.9 | 56.9 | 3 | 61.9 | 65 | 3.1 | 64 | 67.7 | 3.7 |

% change vs. Boolean: p50 6.40%, p90 6.96%, p99 8.27%
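The "% change vs. Boolean" figures are not explained in the comment. They are consistent with dividing the summed hybrid-minus-baseline differences by the summed baseline latencies across all seven datasets, rather than averaging per-dataset percentages. The sketch below reproduces them under that assumption; the `latencies` layout and the `percent_change` helper are illustrative names, not part of the benchmark scripts.

```python
# Latencies copied from the benchmark table above, per dataset:
# (p50 bool, p50 hybrid, p90 bool, p90 hybrid, p99 bool, p99 hybrid)
latencies = {
    "nfcorpus":    (35.1, 37.0, 53.0, 54.0, 60.8, 62.3),
    "trec-covid":  (58.1, 61.6, 66.4, 70.0, 70.5, 74.6),
    "scidocs":     (54.9, 57.0, 66.4, 68.6, 81.3, 83.2),
    "quora":       (61.0, 69.0, 69.0, 78.0, 73.4, 84.0),
    "amazon esci": (49.0, 50.0, 58.0, 59.4, 67.0, 70.0),
    "dbpedia":     (100.8, 107.7, 117.0, 130.9, 129.8, 150.2),
    "fiqa":        (53.9, 56.9, 61.9, 65.0, 64.0, 67.7),
}


def percent_change(bool_idx: int, hybrid_idx: int) -> float:
    """Relative increase of summed hybrid latency over summed bool latency, in percent.

    Assumed formula: 100 * (sum(hybrid) - sum(bool)) / sum(bool).
    """
    total_bool = sum(row[bool_idx] for row in latencies.values())
    total_hybrid = sum(row[hybrid_idx] for row in latencies.values())
    return 100.0 * (total_hybrid - total_bool) / total_bool


p50_change = percent_change(0, 1)
p90_change = percent_change(2, 3)
p99_change = percent_change(4, 5)
print(f"p50 {p50_change:.2f}%  p90 {p90_change:.2f}%  p99 {p99_change:.2f}%")
```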

We took a bool query with neural and match sub-queries as a baseline.
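For readers unfamiliar with the two query shapes being compared, the following sketch builds both request bodies as Python dictionaries. It assumes the `neural` and `hybrid` query types and the `normalization-processor` search pipeline processor as documented for OpenSearch 2.10. The field names (`text`, `passage_embedding`), the query text, `k`, the weights, and the `<model_id>` placeholder are hypothetical, and the actual benchmark queries may differ.

```python
import json

QUERY_TEXT = "example query"   # hypothetical query text
MODEL_ID = "<model_id>"        # placeholder: ID of a deployed text-embedding model

# Two sub-queries shared by both variants (field names are assumptions).
match_subquery = {"match": {"text": {"query": QUERY_TEXT}}}
neural_subquery = {
    "neural": {
        "passage_embedding": {
            "query_text": QUERY_TEXT,
            "model_id": MODEL_ID,
            "k": 100,
        }
    }
}

# Baseline: a bool query whose "should" clauses add raw, unnormalized scores.
baseline_query = {
    "query": {"bool": {"should": [match_subquery, neural_subquery]}}
}

# Hybrid: the same sub-queries, with scores normalized and combined by a pipeline.
hybrid_query = {
    "query": {"hybrid": {"queries": [match_subquery, neural_subquery]}}
}

# Search pipeline body that a hybrid query would be run with (assumed settings).
search_pipeline = {
    "description": "Normalize and combine sub-query scores",
    "phase_results_processors": [
        {
            "normalization-processor": {
                "normalization": {"technique": "min_max"},
                "combination": {
                    "technique": "arithmetic_mean",
                    "parameters": {"weights": [0.5, 0.5]},
                },
            }
        }
    ],
}

print(json.dumps(hybrid_query, indent=2))
```

The latency differences in the table therefore reflect the cost of the hybrid query plus the normalization step, relative to a bool query over the same sub-queries.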

For the cluster configuration, we used 3 data nodes of type “r5.8xlarge” and 1 leader node of type “c4.2xlarge”. All scripts that we use for benchmarks can be found in this repository.
