[Documentation] Indicate dense vs sparse search in HybridSearchConfig #232

@alberto-agudo

Description

The current HybridSearch documentation refers to a primary search and a secondary search:

primary_search_results: A list of (document, distance) tuples from
the primary search.
secondary_search_results: A list of (document, distance) tuples from
the secondary search.

You might assume that the primary search is dense (embedding-based) and the secondary search is sparse (keyword-based), but nothing in the class definition guarantees that. The function argument names don't need to say so, but I think the docstring should.

I didn't realize what primary and secondary actually meant until I found the following lines while digging through the code:

combined_results = hybrid_search_config.fusion_function(
    dense_results,
    sparse_results,
    **hybrid_search_config.fusion_function_parameters,
)

Given the current lack of documentation, simply updating these docstring lines in the HybridSearchConfig definition would greatly help others.
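
To make the request concrete, here is a rough sketch of how the docstring wording could read. The function `example_fusion`, its `k` parameter, and the reciprocal-rank scoring are all hypothetical and only serve to carry the docstring; they are not the library's actual fusion function. The sketch also assumes each result list is already ordered best-first and that documents are hashable.

```python
from typing import Any, Sequence


def example_fusion(
    primary_search_results: Sequence[tuple[Any, float]],
    secondary_search_results: Sequence[tuple[Any, float]],
    k: int = 60,
) -> list[tuple[Any, float]]:
    """Hypothetical fusion function, shown only to illustrate the wording.

    Args:
        primary_search_results: A list of (document, distance) tuples from
            the primary search, i.e. the dense (embedding-based) search.
        secondary_search_results: A list of (document, distance) tuples from
            the secondary search, i.e. the sparse (keyword-based) search.
        k: Smoothing constant for reciprocal rank fusion (an assumption of
            this sketch, not taken from the issue).

    Returns:
        A list of (document, fused_score) tuples, highest score first.
    """
    scores: dict[Any, float] = {}
    # Assumes each list is already sorted best-first, so position == rank.
    for results in (primary_search_results, secondary_search_results):
        for rank, (doc, _distance) in enumerate(results):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Dense results go first and sparse results second, matching the call above.
dense_results = [("doc_a", 0.10), ("doc_b", 0.35)]
sparse_results = [("doc_b", 0.20), ("doc_c", 0.50)]
combined_results = example_fusion(dense_results, sparse_results)
```

With the dense/sparse meaning spelled out in `Args`, a reader writing a custom fusion function would know which positional argument receives which result set without having to trace the call site.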
