[META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 #123

vamshin · 2023-02-23T19:19:07Z

navneet1v · 2023-02-28T01:08:16Z

RFC created with HLD: #126

martin-gaievski · 2023-05-19T21:49:43Z

RFC with design for Query: #174
Created new Feature issue for Query: #175

DarshitChanpura · 2023-09-08T17:07:22Z

@vamshin Should this issue be moved to 2.11?

vamshin · 2023-09-08T18:41:07Z

@DarshitChanpura we are releasing this feature for 2.10

navneet1v · 2023-09-22T21:10:21Z

@martin-gaievski can we add the perf results to this issue and then resolve it?

martin-gaievski · 2023-09-29T22:08:17Z

Summary for performance benchmark

| Dataset | p50 bool (baseline) | p50 hybrid | p50 difference, ms | p90 bool (baseline) | p90 hybrid | p90 difference, ms | p99 bool (baseline) | p99 hybrid | p99 difference, ms |
|---|---|---|---|---|---|---|---|---|---|
| nfcorpus | 35.1 | 37 | 1.9 | 53 | 54 | 1 | 60.8 | 62.3 | 1.5 |
| trec-covid | 58.1 | 61.6 | 3.5 | 66.4 | 70 | 3.6 | 70.5 | 74.6 | 4.1 |
| scidocs | 54.9 | 57 | 2.1 | 66.4 | 68.6 | 2.2 | 81.3 | 83.2 | 1.9 |
| quora | 61 | 69 | 8 | 69 | 78 | 9 | 73.4 | 84 | 10.6 |
| amazon esci | 49 | 50 | 1 | 58 | 59.4 | 1.4 | 67 | 70 | 3 |
| dbpedia | 100.8 | 107.7 | 6.9 | 117 | 130.9 | 13.9 | 129.8 | 150.2 | 20.4 |
| fiqa | 53.9 | 56.9 | 3 | 61.9 | 65 | 3.1 | 64 | 67.7 | 3.7 |
| % change vs. Boolean | | | 6.40% | | | 6.96% | | | 8.27% |
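The comment does not say how the "% change vs. Boolean" row was derived. As a rough check, dividing the summed per-dataset differences by the summed baseline latencies reproduces the reported figures. The sketch below assumes that aggregation; the variable and function names are illustrative, not taken from the benchmark scripts.

```python
# Latencies in ms copied from the benchmark table: (bool baseline, hybrid) per dataset.
P50 = {
    "nfcorpus": (35.1, 37.0), "trec-covid": (58.1, 61.6), "scidocs": (54.9, 57.0),
    "quora": (61.0, 69.0), "amazon esci": (49.0, 50.0), "dbpedia": (100.8, 107.7),
    "fiqa": (53.9, 56.9),
}
P90 = {
    "nfcorpus": (53.0, 54.0), "trec-covid": (66.4, 70.0), "scidocs": (66.4, 68.6),
    "quora": (69.0, 78.0), "amazon esci": (58.0, 59.4), "dbpedia": (117.0, 130.9),
    "fiqa": (61.9, 65.0),
}
P99 = {
    "nfcorpus": (60.8, 62.3), "trec-covid": (70.5, 74.6), "scidocs": (81.3, 83.2),
    "quora": (73.4, 84.0), "amazon esci": (67.0, 70.0), "dbpedia": (129.8, 150.2),
    "fiqa": (64.0, 67.7),
}


def pct_change(rows):
    """Assumed aggregation: total latency increase relative to total baseline latency."""
    baseline_total = sum(baseline for baseline, _ in rows.values())
    diff_total = sum(hybrid - baseline for baseline, hybrid in rows.values())
    return 100.0 * diff_total / baseline_total


for name, rows in (("p50", P50), ("p90", P90), ("p99", P99)):
    print(f"{name}: {pct_change(rows):.2f}%")
# prints:
# p50: 6.40%
# p90: 6.96%
# p99: 8.27%
```

Averaging the per-dataset percentage changes instead gives a different p50 figure (about 6.1%), so the summed form appears to be the one used.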

We took a bool query with neural and match sub-queries as a baseline.
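For readers unfamiliar with the two query shapes being compared, here is a minimal sketch of what the request bodies might look like. The field names, query text, model ID, `k` value, and the choice of a `should` clause for the baseline are all assumptions made for illustration; the actual benchmark queries live in the benchmark scripts and may differ.

```python
import json

# Hypothetical placeholders: none of these values come from the benchmark itself.
QUERY_TEXT = "example query text"
MODEL_ID = "<model-id>"
TEXT_FIELD = "text"
VECTOR_FIELD = "passage_embedding"

# Keyword (BM25) sub-query.
match_clause = {"match": {TEXT_FIELD: {"query": QUERY_TEXT}}}

# Neural (k-NN over embeddings) sub-query.
neural_clause = {
    "neural": {
        VECTOR_FIELD: {"query_text": QUERY_TEXT, "model_id": MODEL_ID, "k": 100}
    }
}

# Baseline: a bool query combining both sub-queries (assumed here to use `should`).
bool_baseline = {"query": {"bool": {"should": [match_clause, neural_clause]}}}

# Hybrid: the same sub-queries wrapped in a hybrid query.
hybrid_query = {"query": {"hybrid": {"queries": [match_clause, neural_clause]}}}

print(json.dumps(bool_baseline, indent=2))
print(json.dumps(hybrid_query, indent=2))
```

In practice the hybrid query is typically run through a search pipeline that has a normalization processor configured, which is where the score normalization and combination happen; the bool baseline skips that step.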

For the cluster configuration, we used 3 data nodes of type “r5.8xlarge” and 1 leader node of type “c4.2xlarge”. All scripts that we used for the benchmarks can be found in this repository.

vamshin added the untriaged label Feb 23, 2023

vamshin assigned navneet1v and martin-gaievski Feb 23, 2023

vamshin changed the title ~~[FEATURE] Score Combination and Normalization for Semantics Search Score Normalization for k-NN and BM25~~ [FEATURE] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 23, 2023

vamshin added the Features and neural-search labels and removed the untriaged label Feb 23, 2023

vamshin changed the title ~~[FEATURE] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25~~ Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 23, 2023

navneet1v changed the title ~~Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25~~ [META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 24, 2023

navneet1v changed the title ~~[META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25~~ Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 24, 2023

This was referenced Feb 24, 2023

[FEATURE] Hybrid search using keyword matching and kNN opensearch-project/k-NN#717

Closed

Discussion on combination of result sets from different query types opensearch-project/OpenSearch#4557

Closed

navneet1v changed the title ~~Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25~~ [META] Score Combination and Normalization for Semantics Search. Score Normalization for k-NN and BM25 Feb 24, 2023

navneet1v mentioned this issue Feb 28, 2023

[RFC] High Level Approach and Design For Normalization and Score Combination #126

Closed

navneet1v mentioned this issue Apr 5, 2023

Enabling Multiple QueryPhaseSearcher in OpenSearch opensearch-project/OpenSearch#7020

Open

martin-gaievski mentioned this issue May 19, 2023

[RFC] Low Level Design for Normalization and Score Combination Query #174

Closed

martin-gaievski mentioned this issue Jun 3, 2023

[RFC] Low Level Design for Normalization and Score Combination Query Phase Searcher #193

Closed

This was referenced Aug 3, 2023

Added Score Normalization and Combination feature #241

Merged

Reworked feature flag usage for Hybrid search feature #246

Merged

martin-gaievski mentioned this issue Aug 22, 2023

Changed format for hybrid query results to a single list of scores with delimiter #259

Merged

vamshin added this to Vector Search RoadMap Aug 22, 2023

vamshin moved this to 2.10 (September 11th) in Vector Search RoadMap Aug 22, 2023

martin-gaievski mentioned this issue Aug 28, 2023

Added validations for score combination weights in Hybrid Search #265

Merged

peterzhuamazon mentioned this issue Aug 31, 2023

[RELEASE] Release version 2.10.0 opensearch-project/opensearch-build#3743

Closed

martin-gaievski mentioned this issue Sep 4, 2023

[FEATURE] Implement parallel execution of sub-queries for hybrid search #279

Closed

vamshin added the v2.10.0 label Sep 8, 2023

navneet1v removed the v2.10.0 label Sep 22, 2023

martin-gaievski closed this as completed Sep 29, 2023

github-project-automation bot moved this from 2.10 (September 11th, 2023) to ✅ Done in Vector Search RoadMap Sep 29, 2023
