feat: upgrade query adapter algorithm #157

lsorber · 2025-06-06T19:17:14Z

Changes:

Only insert missing evals in _bench.py.
Only compute the query adapter if missing in _bench.py.
Add a 'RAGLite with query adapter' entry to the benchmark command.
Output the current document id as a tqdm postfix in insert_documents.
Parallelise extraction of triplets in _query_adapter.py.
Improve the query adapter by introducing per-query target weights and optimizing them on a validation set of evals with L-BFGS.
Add tests to verify that the gradient formula is correct.
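
For readers unfamiliar with the last two items, here is a minimal sketch of the general pattern: minimize an objective with L-BFGS using an analytic gradient, and check that gradient against finite differences. It uses a toy least-squares objective as a stand-in, not the query adapter objective from this PR. The names `objective`, `gradient`, `A`, `b`, and `w0` are hypothetical and do not appear in RAGLite.

```python
import numpy as np
from scipy.optimize import check_grad, minimize


def objective(w: np.ndarray, A: np.ndarray, b: np.ndarray) -> float:
    """Toy stand-in objective: 0.5 * ||A w - b||^2 (not RAGLite's objective)."""
    r = A @ w - b
    return 0.5 * float(r @ r)


def gradient(w: np.ndarray, A: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Analytic gradient of the toy objective: A^T (A w - b)."""
    return A.T @ (A @ w - b)


rng = np.random.default_rng(42)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
w0 = rng.standard_normal(5)

# check_grad returns the 2-norm of the difference between the analytic
# gradient and a finite-difference approximation at w0.
grad_error = check_grad(objective, gradient, w0, A, b)

# Minimize with L-BFGS-B, passing the verified analytic gradient as `jac`.
result = minimize(objective, w0, args=(A, b), jac=gradient, method="L-BFGS-B")
```

A small `grad_error` suggests the gradient formula matches the objective. A test of this shape catches sign and indexing mistakes before they silently degrade the optimizer.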

Copilot

Pull Request Overview

This PR upgrades the query adapter algorithm: it refactors the implementation to extract triplets in multiple threads, optimizes weights with L-BFGS, and adds a unit test that checks the gradient. It also adds a type alias for float tensors, refines the CLI bench command with better error handling and reranker support, and improves the progress bar output during document insertion.

Refactored _query_adapter.py to introduce helper functions (_extract_triplets, _optimize_query_target, _compute_query_adapter_grad, etc.), multi-threading, and weight optimization with SciPy’s minimize.
Added FloatTensor alias in _typing.py and a test_query_adapter_grad in tests/test_query_adapter.py to validate the gradient.
Improved the CLI bench command (_cli.py, _bench.py), including a prescore step, optional reranker integration, and user-friendly import errors.
Minor enhancement: show inserted document IDs in the progress bar in _insert.py.
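
The multi-threaded extraction mentioned above could look roughly like the following. This is a sketch only: the overview states that extraction is multi-threaded, but the use of `ThreadPoolExecutor` here is an assumption, and `extract_triplet`, `extract_triplets`, and the placeholder triplet contents are hypothetical names that do not come from `_query_adapter.py`.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_triplet(eval_id: int) -> tuple[int, int, int]:
    """Hypothetical stand-in for the per-eval work (e.g. database lookups)."""
    return (eval_id, eval_id + 1, eval_id + 2)


def extract_triplets(eval_ids, max_workers: int = 4) -> list[tuple[int, int, int]]:
    """Run extract_triplet over all evals on a thread pool.

    Executor.map yields results in the order of the inputs, so the output
    lines up with eval_ids even though the work runs concurrently.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract_triplet, eval_ids))


triplets = extract_triplets(range(5))
```

Threads tend to help most when the per-eval work is I/O-bound or releases the GIL. For pure-Python CPU-bound work, a process pool may be the better fit.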

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
tests/test_query_adapter.py	Added gradient correctness test using `scipy.optimize.check_grad` and increased `num_evals` for stability
src/raglite/_typing.py	Introduced `FloatTensor` alias for 3-D float arrays
src/raglite/_query_adapter.py	Full refactor of query-adapter optimization logic, threading, and weight learning
src/raglite/_insert.py	Show document ID in insertion progress bar
src/raglite/_cli.py	Guard bench imports, expose reranker option
src/raglite/_bench.py	Added `prescore` hook and reranker support in evaluator

Comments suppressed due to low confidence (1)

src/raglite/_query_adapter.py:80

[nitpick] The _compute_query_adapter helper now implements the core transform logic for both 'dot' and 'cosine' metrics but lacks direct unit tests. Consider adding tests that verify its output under known inputs.

def _compute_query_adapter(

src/raglite/_query_adapter.py

feat: upgrade query adapter algorithm

f0cecc5

lsorber requested review from ThomasDelsart and Copilot June 6, 2025 19:17

lsorber self-assigned this Jun 6, 2025

Copilot AI reviewed Jun 6, 2025

src/raglite/_query_adapter.py (outdated, resolved)

src/raglite/_query_adapter.py (resolved)

lsorber mentioned this pull request Jun 6, 2025

Parallelize query adapter code #154

Closed

feat: apply silu activation in objective function

67029bf

lsorber marked this pull request as draft June 16, 2025 07:06
