Closed
Description
Prerequisites
- I am running the latest code. Mention the version if possible as well.
- I carefully followed the README.md.
- I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
- I reviewed the Discussions, and have a new and useful enhancement to share.
Feature Description
Support reranking API and models.
Motivation
Reranking is currently a very common technique used alongside embeddings in RAG systems. There are also models where the same model instance can be used for both embeddings and reranking, which is a great resource optimisation.
Possible Implementation
Reranking is relatively close to embeddings, and there are models that handle both embedding and reranking, like bge-m3, which llama.cpp supports with --embed. I'm guessing that one possible challenge/dilemma is the API schema: inference and embeddings use the OpenAI API schema, and OpenAI does not offer a rerank API. I think the Jina rerank API is currently the one commonly used in other projects.
I think that the actual reranking should not be very complex to implement, as it is closely related to embedding calls.
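To illustrate how close this is to embedding calls, here is a minimal sketch that scores documents against a query by cosine similarity of their embeddings and returns the results sorted by score. Everything in it is an assumption for illustration: `rerank`, `cosine`, and `toy_embed` are hypothetical names, `toy_embed` is a stand-in for a real embedding call, and the response shape (`results`, `index`, `relevance_score`, `top_n`) is only loosely modelled on Jina-style rerank responses. Dedicated reranker models usually score each query/document pair jointly rather than comparing separate embeddings, so this shows the simplest variant only.

```python
import math
from typing import Callable, Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank(
    query: str,
    documents: List[str],
    embed: Callable[[str], List[float]],
    top_n: Optional[int] = None,
) -> Dict[str, list]:
    """Score each document against the query and sort by descending score.

    `embed` is any function mapping text to a vector; in practice it would
    wrap a call to an embedding model (assumption, not shown here).
    """
    query_vec = embed(query)
    results = [
        {"index": i, "relevance_score": cosine(query_vec, embed(doc))}
        for i, doc in enumerate(documents)
    ]
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    if top_n is not None:
        results = results[:top_n]
    return {"results": results}


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in embedding: counts of the letters a-z."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


docs = ["llama", "zzz", "llama.cpp server"]
ranked = rerank("llama", docs, toy_embed)
top_two = rerank("llama", docs, toy_embed, top_n=2)
```

Each entry in `ranked["results"]` keeps the original position of the document in `index`, so a caller can map scores back to its own document list.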