Add reranking support #1794

Open
donguyen32 opened this issue Oct 14, 2024 · 4 comments

Comments

@donguyen32

According to ggerganov/llama.cpp#9510, llama.cpp now supports the reranking model https://huggingface.co/BAAI/bge-reranker-v2-m3.
Please add support for this.

@donguyen32
Author

@abetlen Sorry, but do you have any plans to implement this?

@yutyan0119

Hi @donguyen32
I had the same idea and have just submitted a PR that adds a rank method to the High-Level API.
I don't know whether it will be merged, but it may be helpful as a reference for how to do ranking with llama-cpp-python.

@donguyen32
Author

donguyen32 commented Nov 4, 2024

@yutyan0119 According to the original repo, the format of the rerank task is [BOS]query[EOS][SEP]doc[EOS]:
https://github.com/ggerganov/llama.cpp/blob/9f409893519b4a6def46ef80cd6f5d05ac0fb157/examples/server/utils.hpp#L185-L196
However, your inputs are [f"{query}</s><s>{doc}" for doc in documents].
Could you please check this?
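
To make the difference concrete, here is a small sketch that writes both layouts out as plain strings. It assumes XLM-RoBERTa-style special tokens, where BOS is <s> and both EOS and SEP are </s>; that mapping and the helper names are assumptions for illustration, and the server code itself works on token ids rather than strings.

```python
# Assumption: XLM-RoBERTa-style special tokens (not confirmed in this thread).
BOS, EOS, SEP = "<s>", "</s>", "</s>"


def server_style(query: str, doc: str) -> str:
    """Layout described in the server code: [BOS]query[EOS][SEP]doc[EOS]."""
    return f"{BOS}{query}{EOS}{SEP}{doc}{EOS}"


def pr_style(query: str, doc: str) -> str:
    """Layout used in the PR inputs: query</s><s>doc.

    Any leading BOS or trailing EOS would have to be added by the tokenizer.
    """
    return f"{query}</s><s>{doc}"


if __name__ == "__main__":
    query, doc = "what is panda?", "it's a bear"
    print("server:", server_style(query, doc))
    print("PR:    ", pr_style(query, doc))
```

Under these assumptions the two layouts differ in the token that follows the query's EOS: the server layout places SEP there, while the PR inputs place BOS there.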

@yutyan0119

@donguyen32
Thanks for your comment! Actually, I was looking at examples/embedding/embedding.cpp, so I think there are some differences from the server implementation.

I have verified that the output matches the original implementation with the following command:

./llama-embedding \
    -m models/bge-reranker-v2-m3/ggml-model-f16.gguf \
    -p "what is panda?</s><s>hi\nwhat is panda?</s><s>it's a bear\nwhat is panda?</s><s>The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China." \
    --pooling rank --embd-normalize -1 --verbose-prompt 

The same command also seems to be used for testing on CI.
https://github.com/ggerganov/llama.cpp/blob/a9e8a9a0306a8093eef93b0022d9f45510490072/ci/run.sh#L755
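
For anyone who wants to reproduce this with their own query and documents, the command line above can be assembled roughly as follows. This is only a sketch: the helper names are hypothetical, the flags are copied from the command above, and the entries are joined with the literal two characters \n exactly as typed there (how the binary splits them is taken from that command, not verified here).

```python
import shlex
from typing import List


def build_rank_prompt(query: str, documents: List[str]) -> str:
    """Join one "query</s><s>doc" entry per document with a literal backslash-n."""
    return "\\n".join(f"{query}</s><s>{doc}" for doc in documents)


def build_command(model_path: str, query: str, documents: List[str]) -> List[str]:
    """Return the argument list for the llama-embedding invocation shown above."""
    return [
        "./llama-embedding",
        "-m", model_path,
        "-p", build_rank_prompt(query, documents),
        "--pooling", "rank",
        "--embd-normalize", "-1",
        "--verbose-prompt",
    ]


if __name__ == "__main__":
    cmd = build_command(
        "models/bge-reranker-v2-m3/ggml-model-f16.gguf",
        "what is panda?",
        ["hi", "it's a bear"],
    )
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Documents that themselves contain newlines would need extra handling, since the entries are separated by \n.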

To be honest, I do not know how these special tokens affect reranking accuracy. If you know, please let me know.

If we want the return value to match the server's response format, I think it would be better to add a separate method such as create_rank, in the same way that create_embedding exists alongside embed.
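
A rough sketch of what such a wrapper could look like is below. Everything in it is hypothetical: create_rank does not exist in llama-cpp-python as far as this thread shows, rank_fn stands in for whatever method produces the raw scores, and the response field names (results, index, relevance_score) are an assumption about a server-like shape rather than something confirmed here.

```python
from typing import Any, Callable, Dict, List, Sequence


def create_rank(
    rank_fn: Callable[[str, Sequence[str]], Sequence[float]],
    query: str,
    documents: Sequence[str],
    model: str = "unknown",
) -> Dict[str, Any]:
    """Wrap raw rank scores in a server-like response dict (hypothetical shape)."""
    scores = rank_fn(query, documents)
    if len(scores) != len(documents):
        raise ValueError("rank_fn must return one score per document")
    results: List[Dict[str, Any]] = [
        {"index": i, "relevance_score": float(score)}
        for i, score in enumerate(scores)
    ]
    # Most relevant document first; "index" still points into `documents`.
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return {"object": "list", "model": model, "results": results}


if __name__ == "__main__":
    # Stand-in scorer for demonstration only: longer documents score higher.
    def fake_rank(query: str, docs: Sequence[str]) -> List[float]:
        return [float(len(d)) for d in docs]

    response = create_rank(fake_rank, "what is panda?", ["hi", "it's a bear"])
    print(response)
```

Keeping the raw-score method and the response-shaping method separate would mirror the existing split between embed and create_embedding.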
