@donguyen32
Thanks for your comment! I was actually looking at examples/embedding/embedding.cpp rather than the server code, so there may be some differences from the server implementation.

I have verified that the output matches the original implementation with the following command:

./llama-embedding \
    -m models/bge-reranker-v2-m3/ggml-model-f16.gguf \
    -p "what is panda?</s><s>hi\nwhat is panda?</s><s>it's a bear\nwhat is panda?</s><s>The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China." \
    --pooling rank --embd-normalize -1 --verbose-prompt

The same command also seems to be used for testing on CI.
https://github.com/ggerganov/llama.cpp/blob/a9e8a9a0306a8093eef93b0022d9f45510490072/ci/run.sh#L755
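For reference, the prompt passed to -p in the command above pairs the query with each document using </s><s> and puts one pair per line. Here is a minimal sketch of building that string in Python. The helper name build_rerank_prompt is hypothetical (it is not part of llama.cpp or llama-cpp-python), and it assumes the pairs are separated by newlines exactly as in the command.

```python
from typing import Sequence


def build_rerank_prompt(
    query: str,
    documents: Sequence[str],
    sep: str = "</s><s>",  # separator copied from the command above
) -> str:
    """Hypothetical helper: one 'query + sep + document' pair per line."""
    return "\n".join(f"{query}{sep}{doc}" for doc in documents)


prompt = build_rerank_prompt("what is panda?", ["hi", "it's a bear"])
```

Keeping the separator as a parameter makes it easy to try other values if it turns out to depend on the model.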

To be honest, I do not know how these symbols (such as </s><s>) affect reranking accuracy. If you know, please let me know.

If we want the return value in the same format as the server, I think it would be better to add a separate method such as create_rank, in the same way that create_embedding exists alongside embed.
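To make the idea concrete, here is a rough sketch of how such a wrapper could look. Everything in it is an assumption for illustration: create_rank does not exist yet, rank_fn stands in for whatever low-level scoring method ends up being implemented, and the response keys ("results", "index", "relevance_score") are my guess at a server-style shape, not a confirmed format.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical signature of the low-level scorer: (query, documents) -> one score per document.
RankFn = Callable[[str, Sequence[str]], List[float]]


def create_rank(rank_fn: RankFn, query: str, documents: Sequence[str]) -> Dict[str, list]:
    """Wrap raw scores in a server-style response (assumed shape)."""
    scores = rank_fn(query, documents)
    results = [
        {"index": i, "relevance_score": float(score)}
        for i, score in enumerate(scores)
    ]
    # Most relevant document first; "index" still points into the input list.
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return {"results": results}


def dummy_rank(query: str, documents: Sequence[str]) -> List[float]:
    """Placeholder scorer for the example only: longer documents score higher."""
    return [float(len(doc)) for doc in documents]


response = create_rank(dummy_rank, "what is panda?", ["hi", "it's a bear"])
```

This keeps the low-level method returning plain scores, while the wrapper is the only place that knows about the response format.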

Add reranking support #1794
