Background & Description
Since llama.cpp PR #9510, reranking has been supported through the rank pooling type (`--pooling rank`).
I also noticed that LLamaPoolingType has a Rank member. Does that mean LLamaSharp can be used for reranking?
API & Usage
Example using llama.cpp's llama-embedding tool:
./llama-embedding.exe --model jina-reranker-v1-tiny-en-FP16.gguf -p "what is panda?hi\nwhat is panda?it's a bear\nwhat is panda?The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China." -ngl 99 -c 0 --pooling rank --embd-normalize -1 --verbose-prompt
Output:
rerank score 0: 0.024
rerank score 1: 0.025
rerank score 2: 0.199
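Read as a ranking, the scores above put the third passage well ahead of the other two. A minimal sketch of that last step in plain C# follows; it makes no LLamaSharp calls, the variable names are only illustrative, and the passages are shortened stand-ins for the prompt lines:

```csharp
using System;
using System.Linq;

// Scores copied from the llama-embedding output above, in input order.
float[] scores = { 0.024f, 0.025f, 0.199f };

// Candidate passages in the same order as the prompt lines (shortened here).
string[] passages = { "hi", "it's a bear", "The giant panda (Ailuropoda melanoleuca) ..." };

// Pair each passage with its score, then sort from most to least relevant.
var ranked = passages
    .Select((text, index) => (Index: index, Text: text, Score: scores[index]))
    .OrderByDescending(item => item.Score)
    .ToArray();

foreach (var item in ranked)
    Console.WriteLine($"{item.Score:F3}  [{item.Index}] {item.Text}");
```

This assumes a higher score means a more relevant passage, which matches the example output: the only passage that actually answers the query gets the highest score.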
How to implement
Using LLamaSharp:
var parameters = new ModelParams(modelPath)
{
    Embeddings = true,
    PoolingType = LLama.Native.LLamaPoolingType.Rank,
    ContextSize = 0,
    BatchSize = 2048,
    UBatchSize = 2048,
    GpuLayerCount = gpuLayerCount // How many layers to offload to the GPU. Adjust this to fit your GPU memory.
};
var weights = LLamaWeights.LoadFromFile(parameters);
this._reranker = new LLamaEmbedder(weights, parameters);
var scores = (await this._reranker.GetEmbeddings(input, token)).Single();
result:
