
🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. Our toolkit integrates 40 pre-retrieved benchmark datasets and supports 7+ retrieval techniques, 24+ state-of-the-art reranking models, and multiple RAG methods.


[ English | 中文 ]

🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥

If you like our framework, don't hesitate to ⭐ star this repository ⭐. It helps us make the framework better and more scalable to different models and methods 🤗.

A modular and efficient framework for retrieval, reranking, and retrieval-augmented generation (RAG), designed to work with state-of-the-art models across all three tasks.


🚀 Demo

To run the demo locally:

# Rankify must already be installed; the demo UI additionally requires Streamlit
pip install streamlit

# Then run the demo
streamlit run demo.py


🎉 News

  • [2025-10-14] Updated installation with optional extras: retriever, reranking, rag, and all.

  • [2025-10-14] New CLI (rankify-index) syntax & examples for BM25, DPR, ANCE, Contriever, ColBERT, BGE.

  • [2025-06-11] Many thanks to @tobias124 for implementing Indexing for Custom Dataset.

  • [2025-06-01] Many thanks to @aherzinger for implementing and refactoring the Generator and RAG models.

  • [2025-05-30] Huge thanks to @baraayusry for implementing the Online Retriever using CrawAI and ReACT.

  • [2025-02-10] Released reranking-datasets and reranking-datasets-light on Hugging Face.

  • [2025-02-04] Our paper is released on arXiv.

🔧 Installation

Set up the virtual environment

First, create and activate a conda environment with Python 3.10:

conda create -n rankify python=3.10
conda activate rankify

Install PyTorch 2.5.1

We recommend installing Rankify with PyTorch 2.5.1. Refer to the PyTorch installation page for platform-specific installation commands.

If you have access to GPUs, we recommend installing a CUDA 12.4 or 12.6 build of PyTorch, as many of the evaluation metrics are optimized for GPU use.

To install PyTorch 2.5.1, run the following command:

pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
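
Optionally, you can verify the install with a quick check (a minimal sketch; it only uses PyTorch itself):

# Sanity check: confirm the PyTorch version and that CUDA is visible
import torch

print(torch.__version__)          # expected: 2.5.1
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is detected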

Basic Installation

To install Rankify, simply use pip (requires Python 3.10+):

pip install rankify

Recommended Installation

For full functionality, we recommend installing Rankify with all dependencies:

pip install "rankify[all]"

This ensures you have all necessary modules, including retrieval, re-ranking, and RAG support.

Optional Dependencies

If you prefer to install only specific components, choose from the following:

# Retrieval stack (BM25, dense retrievers, web tools)
pip install "rankify[retriever]"

# Install base re-ranking with vLLM support for `FirstModelReranker`, `LiT5ScoreReranker`, `LiT5DistillReranker`, `VicunaReranker`, and `ZephyrReranker`.
pip install "rankify[reranking]"

# RAG endpoints (OpenAI, LiteLLM, vLLM clients)
pip install "rankify[rag]"

Or, to install from GitHub for the latest development version:

git clone https://github.com/DataScienceUIBK/rankify.git
cd rankify
pip install -e .
# For full functionality we recommend installing Rankify with all dependencies:
pip install -e ".[all]"
# Install dependencies for retrieval only (BM25, DPR, ANCE, etc.)
pip install -e ".[retriever]"
# Install base re-ranking with vLLM support for `FirstModelReranker`, `LiT5ScoreReranker`, `LiT5DistillReranker`, `VicunaReranker`, and `ZephyrReranker`.
pip install -e ".[reranking]"
# RAG endpoints (OpenAI, LiteLLM, vLLM clients)
pip install -e ".[rag]"

Using ColBERT Retriever

If you want to use ColBERT Retriever, follow these additional setup steps:

# Install GCC and required libraries
conda install -c conda-forge gcc=9.4.0 gxx=9.4.0
conda install -c conda-forge libstdcxx-ng
# Export necessary environment variables
export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
export CC=gcc
export CXX=g++
export PATH=$CONDA_PREFIX/bin:$PATH

# Clear cached torch extensions
rm -rf ~/.cache/torch_extensions/*

🚀 Quick Start

1๏ธโƒฃ Pre-retrieved Datasets

We provide 1,000 pre-retrieved documents per dataset, which you can download from:

🔗 Hugging Face Dataset Repository

Dataset Format

The pre-retrieved documents are structured as follows:

[
    {
        "question": "...",
        "answers": ["...", "...", ...],
        "ctxs": [
            {
                "id": "...",         // Passage ID from database TSV file
                "score": "...",      // Retriever score
                "has_answer": true|false  // Whether the passage contains the answer
            }
        ]
    }
]
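
Because the files are plain JSON, you can also inspect them with the standard library before handing them to Rankify. A minimal sketch (the path is the same example file used later in this section):

import json

# Load a downloaded pre-retrieved file and peek at its structure
with open("./tests/out-datasets/bm25/web_questions/test.json") as f:
    data = json.load(f)

sample = data[0]
print(sample["question"])
print(sample["answers"][:3])
print(len(sample["ctxs"]), "retrieved passages")
print(sample["ctxs"][0]["id"], sample["ctxs"][0]["score"], sample["ctxs"][0]["has_answer"])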

Access Datasets in Rankify

You can easily download and use pre-retrieved datasets through Rankify.

List Available Datasets

To see all available datasets:

from rankify.dataset.dataset import Dataset 

# Display available datasets
Dataset.avaiable_dataset()

Retriever Datasets

from rankify.dataset.dataset import Dataset
# Download BM25-retrieved documents for nq-dev
dataset = Dataset(retriever="bm25", dataset_name="nq-dev", n_docs=100)
documents = dataset.download(force_download=False)

Load Pre-retrieved Dataset from File

If you have already downloaded a dataset, you can load it directly:

from rankify.dataset.dataset import Dataset

# Load pre-downloaded BM25 dataset for WebQuestions
documents = Dataset.load_dataset('./tests/out-datasets/bm25/web_questions/test.json', 100)

Now, you can integrate retrieved documents with re-ranking and RAG workflows! 🚀

🧱 Indexing via CLI

The CLI entrypoint is rankify-index with a subcommand index.

Common flags

  • corpus_path (positional): path to JSONL corpus.
  • --retriever {bm25,dpr,ance,contriever,colbert,bge}.
  • --output PATH (default: rankify_indices).
  • --index_type {wiki,msmarco} (default: wiki).
  • --threads INT (default: 32, sparse & some dense prep).
  • --device {cpu,cuda} (default: retriever-specific, typically cuda).
  • --batch_size INT (dense encoders / Faiss add batches).
  • --encoder MODEL (dense encoders only; sensible defaults used if omitted).

Index layout

  • BM25 → <output>/<stem>/bm25_index
  • DPR → <output>/<stem>/dpr_index_<index_type>
  • ANCE → <output>/<stem>/ance_index_<index_type>
  • BGE → <output>/<stem>/bge_index_<index_type>
  • Contriever → <output>/<stem>/contriever_index_<index_type>
  • ColBERT → <output>/<stem>/colbert_index_<index_type>
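
Given this layout, a small helper can predict where an index will land before you run the CLI. This is only a sketch of the path convention above (assuming <stem> is the corpus filename without its extension), not part of Rankify's API:

from pathlib import Path

def index_dir(corpus_path, retriever, output="rankify_indices", index_type="wiki"):
    """Reproduce the index-layout convention described above."""
    stem = Path(corpus_path).stem
    if retriever == "bm25":
        return Path(output) / stem / "bm25_index"
    return Path(output) / stem / f"{retriever}_index_{index_type}"

print(index_dir("data/wikipedia_10k.jsonl", "bm25", "./indices"))
# ./indices/wikipedia_10k/bm25_index
print(index_dir("data/msmarco_100.jsonl", "dpr", "./indices", "msmarco"))
# ./indices/msmarco_100/dpr_index_msmarco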

BM25

rankify-index index data/wikipedia_10k.jsonl \
  --retriever bm25 \
  --output ./indices

DPR (single-encoder by default)

# Wikipedia style
rankify-index index data/wikipedia_100.jsonl \
  --retriever dpr \
  --encoder facebook/dpr-ctx_encoder-single-nq-base \
  --batch_size 16 --device cuda \
  --output ./indices

# MS MARCO
rankify-index index data/msmarco_100.jsonl \
  --retriever dpr --index_type msmarco \
  --encoder facebook/dpr-ctx_encoder-single-nq-base \
  --batch_size 16 --device cuda \
  --output ./indices

ANCE

rankify-index index data/wikipedia_100.jsonl \
  --retriever ance \
  --encoder castorini/ance-dpr-context-multi \
  --batch_size 16 --device cuda \
  --output ./indices

Contriever

rankify-index index data/wikipedia_100.jsonl \
  --retriever contriever \
  --encoder facebook/contriever-msmarco \
  --batch_size 16 --device cuda \
  --output ./indices

ColBERT

rankify-index index data/wikipedia_100.jsonl \
  --retriever colbert \
  --batch_size 32 --device cuda \
  --output ./indices

BGE

rankify-index index data/wikipedia_100.jsonl \
  --retriever bge \
  --encoder BAAI/bge-large-en-v1.5 \
  --batch_size 16 --device cuda \
  --output ./indices

2๏ธโƒฃ Running Retrieval

To perform retrieval using Rankify, you can choose from various retrieval methods such as BM25, DPR, ANCE, Contriever, ColBERT, and BGE.

Example: Running Retrieval on Sample Queries

from rankify.dataset.dataset import Document, Question, Answer, Context
from rankify.retrievers.retriever import Retriever

# Sample Documents
documents = [
    Document(question=Question("the cast of a good day to die hard?"), answers=Answer([
            "Jai Courtney",
            "Sebastian Koch",
            "Radivoje Bukviฤ‡",
            "Yuliya Snigir",
            "Sergei Kolesnikov",
            "Mary Elizabeth Winstead",
            "Bruce Willis"
        ]), contexts=[]),
    Document(question=Question("Who wrote Hamlet?"), answers=Answer(["Shakespeare"]), contexts=[])
]
# BM25 retrieval on Wikipedia
bm25_retriever_wiki = Retriever(method="bm25", n_docs=5, index_type="wiki")

# BM25 retrieval on MS MARCO
bm25_retriever_msmarco = Retriever(method="bm25", n_docs=5, index_type="msmarco")


# DPR (multi-encoder) retrieval on Wikipedia
dpr_retriever_wiki = Retriever(method="dpr", model="dpr-multi", n_docs=5, index_type="wiki")

# DPR (multi-encoder) retrieval on MS MARCO
dpr_retriever_msmarco = Retriever(method="dpr", model="dpr-multi", n_docs=5, index_type="msmarco")

# DPR (single-encoder) retrieval on Wikipedia
dpr_retriever_wiki = Retriever(method="dpr", model="dpr-single", n_docs=5, index_type="wiki")

# DPR (single-encoder) retrieval on MS MARCO
dpr_retriever_msmarco = Retriever(method="dpr", model="dpr-single", n_docs=5, index_type="msmarco")

# ANCE retrieval on Wikipedia
ance_retriever_wiki = Retriever(method="ance", model="ance-multi", n_docs=5, index_type="wiki")

# ANCE retrieval on MS MARCO
ance_retriever_msmarco = Retriever(method="ance", model="ance-multi", n_docs=5, index_type="msmarco")


# Contriever retrieval on Wikipedia
contriever_retriever_wiki = Retriever(method="contriever", model="facebook/contriever-msmarco", n_docs=5, index_type="wiki")

# Contriever retrieval on MS MARCO
contriever_retriever_msmarco = Retriever(method="contriever", model="facebook/contriever-msmarco", n_docs=5, index_type="msmarco")


# ColBERT retrieval on Wikipedia
colbert_retriever_wiki = Retriever(method="colbert", model="colbert-ir/colbertv2.0", n_docs=5, index_type="wiki")

# ColBERT retrieval on MS MARCO
colbert_retriever_msmarco = Retriever(method="colbert", model="colbert-ir/colbertv2.0", n_docs=5, index_type="msmarco")


# BGE retrieval on Wikipedia
bge_retriever_wiki = Retriever(method="bge", model="BAAI/bge-large-en-v1.5", n_docs=5, index_type="wiki")

# BGE retrieval on MS MARCO
bge_retriever_msmarco = Retriever(method="bge", model="BAAI/bge-large-en-v1.5", n_docs=5, index_type="msmarco")


# Hyde retrieval on Wikipedia
hyde_retriever_wiki = Retriever(method="hyde", n_docs=5, index_type="wiki", api_key=OPENAI_API_KEY)

# Hyde retrieval on MS MARCO
hyde_retriever_msmarco = Retriever(method="hyde", n_docs=5, index_type="msmarco", api_key=OPENAI_API_KEY)

Running Retrieval

After defining the retriever, you can retrieve documents using:

retrieved_documents = bm25_retriever_wiki.retrieve(documents)

for i, doc in enumerate(retrieved_documents):
    print(f"\nDocument {i+1}:")
    print(doc)
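
To peek at individual passages instead of whole documents, you can iterate over each document's contexts. This is a rough sketch; the attribute names (contexts, id, text, score) follow the Context usage shown elsewhere in this README and may differ in your version:

# Print the top retrieved passages per query
for doc in retrieved_documents:
    for rank, ctx in enumerate(doc.contexts[:3], start=1):
        # passage id, retriever score, and a short text snippet
        print(f"{rank}. [{ctx.id}] score={ctx.score} :: {ctx.text[:80]}")
    print("-" * 40)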

3๏ธโƒฃ Running Reranking

Rankify provides support for multiple reranking models. Below are examples of how to use each model.

Example: Reranking a Document

from rankify.dataset.dataset import Document, Question, Answer, Context
from rankify.models.reranking import Reranking

# Sample document setup
question = Question("When did Thomas Edison invent the light bulb?")
answers = Answer(["1879"])
contexts = [
    Context(text="Lightning strike at Seoul National University", id=1),
    Context(text="Thomas Edison tried to invent a device for cars but failed", id=2),
    Context(text="Coffee is good for diet", id=3),
    Context(text="Thomas Edison invented the light bulb in 1879", id=4),
    Context(text="Thomas Edison worked with electricity", id=5),
]
document = Document(question=question, answers=answers, contexts=contexts)

# Initialize the reranker
reranker = Reranking(method="monot5", model_name="monot5-base-msmarco")

# Apply reranking
reranker.rank([document])

# Print reordered contexts
for context in document.reorder_contexts:
    print(f"  - {context.text}")

Examples of Using Different Reranking Models

# UPR
model = Reranking(method='upr', model_name='t5-base')

# API-Based Rerankers
model = Reranking(method='apiranker', model_name='voyage', api_key='your-api-key')
model = Reranking(method='apiranker', model_name='jina', api_key='your-api-key')
model = Reranking(method='apiranker', model_name='mixedbread.ai', api_key='your-api-key')

# Blender Reranker
model = Reranking(method='blender_reranker', model_name='PairRM')

# ColBERT Reranker
model = Reranking(method='colbert_ranker', model_name='Colbert')

# EchoRank
model = Reranking(method='echorank', model_name='flan-t5-large')

# First Ranker
model = Reranking(method='first_ranker', model_name='base')

# FlashRank
model = Reranking(method='flashrank', model_name='ms-marco-TinyBERT-L-2-v2')

# InContext Reranker
model = Reranking(method='incontext_reranker', model_name='llamav3.1-8b')

# InRanker
model = Reranking(method='inranker', model_name='inranker-small')

# ListT5
model = Reranking(method='listt5', model_name='listt5-base')

# LiT5 Distill
model = Reranking(method='lit5distill', model_name='LiT5-Distill-base')

# LiT5 Score
model = Reranking(method='lit5score', model_name='LiT5-Distill-base')

# LLM Layerwise Ranker
model = Reranking(method='llm_layerwise_ranker', model_name='bge-multilingual-gemma2')

# LLM2Vec
model = Reranking(method='llm2vec', model_name='Meta-Llama-31-8B')

# MonoBERT
model = Reranking(method='monobert', model_name='monobert-large')

# MonoT5
model = Reranking(method='monot5', model_name='monot5-base-msmarco')

# RankGPT
model = Reranking(method='rankgpt', model_name='llamav3.1-8b')

# RankGPT API
model = Reranking(method='rankgpt-api', model_name='gpt-3.5', api_key="gpt-api-key")
model = Reranking(method='rankgpt-api', model_name='gpt-4', api_key="gpt-api-key")
model = Reranking(method='rankgpt-api', model_name='llamav3.1-8b', api_key="together-api-key")
model = Reranking(method='rankgpt-api', model_name='claude-3-5', api_key="claude-api-key")

# RankT5
model = Reranking(method='rankt5', model_name='rankt5-base')

# Sentence Transformer Reranker
model = Reranking(method='sentence_transformer_reranker', model_name='all-MiniLM-L6-v2')
model = Reranking(method='sentence_transformer_reranker', model_name='gtr-t5-base')
model = Reranking(method='sentence_transformer_reranker', model_name='sentence-t5-base')
model = Reranking(method='sentence_transformer_reranker', model_name='distilbert-multilingual-nli-stsb-quora-ranking')
model = Reranking(method='sentence_transformer_reranker', model_name='msmarco-bert-co-condensor')

# SPLADE
model = Reranking(method='splade', model_name='splade-cocondenser')

# Transformer Ranker
model = Reranking(method='transformer_ranker', model_name='mxbai-rerank-xsmall')
model = Reranking(method='transformer_ranker', model_name='bge-reranker-base')
model = Reranking(method='transformer_ranker', model_name='bce-reranker-base')
model = Reranking(method='transformer_ranker', model_name='jina-reranker-tiny')
model = Reranking(method='transformer_ranker', model_name='gte-multilingual-reranker-base')
model = Reranking(method='transformer_ranker', model_name='nli-deberta-v3-large')
model = Reranking(method='transformer_ranker', model_name='ms-marco-TinyBERT-L-6')
model = Reranking(method='transformer_ranker', model_name='msmarco-MiniLM-L12-en-de-v1')

# TwoLAR
model = Reranking(method='twolar', model_name='twolar-xl')

# Vicuna Reranker
model = Reranking(method='vicuna_reranker', model_name='rank_vicuna_7b_v1')

# Zephyr Reranker
model = Reranking(method='zephyr_reranker', model_name='rank_zephyr_7b_v1_full')
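
Since every reranker is constructed through the same Reranking interface, you can sweep several of them over the same document and compare the resulting orderings. A minimal sketch (the two model choices are just examples from the list above):

from rankify.dataset.dataset import Document, Question, Answer, Context
from rankify.models.reranking import Reranking

document = Document(
    question=Question("When did Thomas Edison invent the light bulb?"),
    answers=Answer(["1879"]),
    contexts=[
        Context(text="Thomas Edison invented the light bulb in 1879", id=1),
        Context(text="Coffee is good for diet", id=2),
    ],
)

configs = [
    ("monot5", "monot5-base-msmarco"),
    ("flashrank", "ms-marco-TinyBERT-L-2-v2"),
]

for method, model_name in configs:
    reranker = Reranking(method=method, model_name=model_name)
    reranker.rank([document])
    # Compare which passage each reranker puts first
    print(f"{method}: top passage -> {document.reorder_contexts[0].text}")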

4๏ธโƒฃ Using Generator Module

Rankify provides a Generator module for retrieval-augmented generation (RAG), integrating retrieved documents with generative backends such as OpenAI, LiteLLM, vLLM, and Hugging Face. Its modular design makes it easy to add new RAG methods and endpoints, enabling seamless experimentation with approaches like zero-shot RAG, chain-of-thought RAG, and FiD-based RAG. Below are examples of how to use the different RAG methods and LLM endpoints.

Note that API-based endpoints (OpenAI, LiteLLM) require an API key. See the examples below for how to provide one.

Examples of Using Different RAG methods and backends

from rankify.generator.generator import Generator

# Zero-shot with Hugging Face endpoint
generator = Generator(method="zero-shot", model_name='meta-llama/Meta-Llama-3.1-8B-Instruct', backend="huggingface")

# Basic RAG with LiteLLM endpoint
generator = Generator(method="basic-rag", model_name='ollama/mistral', backend="litellm", api_key=api_key)

# Chain-of-Thought RAG with vLLM endpoint
generator = Generator(method="chain-of-thought-rag", model_name='mistralai/Mistral-7B-v0.1', backend="vllm")

# In-context-RALM with OpenAI endpoint
generator = Generator(method="in-context-ralm", model_name='gpt-3.5-turbo', backend="openai", api_keys=[api_key])

Usage example without API-inference

from rankify.dataset.dataset import Document, Question, Answer, Context
from rankify.generator.generator import Generator

# Define question and answer
question = Question("What is the capital of Austria?")
answers=Answer("")
contexts = [
    Context(id=1, title="France", text="The capital of France is Paris.", score=0.9),
    Context(id=2, title="Germany", text="Berlin is the capital of Germany.", score=0.5)
]

# Construct document
doc = Document(question=question, answers=answers, contexts=contexts)

# Initialize Generator (e.g., Meta Llama)
generator = Generator(method="basic-rag", model_name='meta-llama/Meta-Llama-3.1-8B-Instruct', backend="huggingface")

# Generate answer
generated_answers = generator.generate([doc])
print(generated_answers)  # Output: ["Paris"]

Usage example with API-inference

Save your API keys in a .env.local file; you can then access them via the helper methods shown below:

# in .env.local:
OPENAI_API_KEY=your-api-key
LITELLM_API_KEY=your-api-key

Usage

# load LiteLLM api-key
api_key = get_litellm_api_key()
# load OpenAI api-key
api_key = get_openai_api_key()

Full example using LiteLLM:

from rankify.dataset.dataset import Document, Question, Answer, Context
from rankify.generator.generator import Generator
from rankify.utils.models.rank_llm.rerank.api_keys import get_litellm_api_key

# Define question and answer
question = Question("What is the capital of France?")
answers = Answer([""])
contexts = [
    Context(id=1, title="France", text="The capital of France is Paris.", score=0.9),
    Context(id=2, title="Germany", text="Berlin is the capital of Germany.", score=0.5)
]

# Construct document
doc = Document(question=question, answers=answers, contexts=contexts)

# load api-key
api_key = get_litellm_api_key()

# Initialize Generator (e.g., Meta Llama)
generator = Generator(method="basic-rag", model_name='ollama/mistral', backend="litellm", api_key=api_key)

# Generate answer
generated_answers = generator.generate([doc])
print(generated_answers)  # Output: ["Paris"]

5๏ธโƒฃ Evaluating with Metrics

Rankify provides built-in evaluation metrics for retrieval, re-ranking, and retrieval-augmented generation (RAG). These metrics help assess the quality of retrieved documents, the effectiveness of ranking models, and the accuracy of generated answers.

Evaluating Generated Answers

You can evaluate the quality of retrieval-augmented generation (RAG) results by comparing generated answers with ground-truth answers.

from rankify.metrics.metrics import Metrics
from rankify.dataset.dataset import Dataset
from rankify.generator.generator import Generator

# Load dataset
dataset = Dataset('bm25', 'nq-test', 100)
documents = dataset.download(force_download=False)

# Initialize Generator
generator = Generator(method="in-context-ralm", model_name='meta-llama/Llama-3.1-8B')

# Generate answers
generated_answers = generator.generate(documents)

# Evaluate generated answers
metrics = Metrics(documents)
print(metrics.calculate_generation_metrics(generated_answers))

Evaluating Retrieval Performance

# Calculate retrieval metrics before reranking
metrics = Metrics(documents)
before_ranking_metrics = metrics.calculate_retrieval_metrics(ks=[1, 5, 10, 20, 50, 100], use_reordered=False)

print(before_ranking_metrics)

Evaluating Reranked Results

# Calculate retrieval metrics after reranking
after_ranking_metrics = metrics.calculate_retrieval_metrics(ks=[1, 5, 10, 20, 50, 100], use_reordered=True)
print(after_ranking_metrics)
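
Putting the pieces together, here is a compact end-to-end sketch: download a pre-retrieved dataset, rerank it, and compare retrieval metrics before and after (the reranker choice is illustrative):

from rankify.dataset.dataset import Dataset
from rankify.models.reranking import Reranking
from rankify.metrics.metrics import Metrics

# 1) Pre-retrieved BM25 results
documents = Dataset('bm25', 'nq-test', 100).download(force_download=False)

# 2) Rerank the retrieved contexts
reranker = Reranking(method="monot5", model_name="monot5-base-msmarco")
reranker.rank(documents)

# 3) Retrieval metrics before vs. after reranking
metrics = Metrics(documents)
print("Before:", metrics.calculate_retrieval_metrics(ks=[1, 5, 10], use_reordered=False))
print("After: ", metrics.calculate_retrieval_metrics(ks=[1, 5, 10], use_reordered=True))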

🧪 BEIR & TREC DL19/DL20 with BM25

Rankify ships convenient hooks to run BM25 baselines on BEIR tasks and TREC DL'19/20, and to evaluate with TREC-style metrics (nDCG, MAP, MRR).

Quick start (single dataset)

from rankify.dataset.dataset import Dataset
from rankify.metrics.metrics import Metrics

# Download pre-retrieved BM25 results (top-k per query)
docs = Dataset('bm25', 'dl19', n_docs=1000).download(force_download=False)

# Evaluate with TREC metrics (nDCG@10 and nDCG@100 shown here)
metrics = Metrics(docs)
print(metrics.calculate_trec_metrics(ndcg_cuts=[10, 100], use_reordered=False))

Notes

  • Supported names include dl19, dl20, and BEIR tasks with the beir- prefix, e.g.: beir-arguana, beir-covid, beir-dbpedia, beir-fever, beir-fiqa, beir-news, beir-nfc, beir-quora, beir-robust04, beir-scidocs, beir-scifact, beir-signal, beir-touche.
  • If you need explicit qrels selection, pass qrel=name.replace("beir-", "") to calculate_trec_metrics.
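
For example, for a BEIR task the explicit qrels call would look like this (a sketch, assuming calculate_trec_metrics accepts the qrel keyword as described in the note above):

from rankify.dataset.dataset import Dataset
from rankify.metrics.metrics import Metrics

name = "beir-scifact"
docs = Dataset('bm25', name, n_docs=100).download(force_download=False)
m = Metrics(docs)
# Explicitly select the qrels by stripping the "beir-" prefix
print(m.calculate_trec_metrics(ndcg_cuts=[10, 100], use_reordered=False, qrel=name.replace("beir-", "")))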

Batch over BEIR & DL datasets

from rankify.dataset.dataset import Dataset
from rankify.metrics.metrics import Metrics

BEIR_TASKS = [
    "beir-arguana", "beir-covid", "beir-dbpedia", "beir-fever", "beir-fiqa", "beir-news",
    "beir-nfc", "beir-quora", "beir-robust04", "beir-scidocs", "beir-scifact",
    "beir-signal", "beir-touche",
]

for name in ["dl19", "dl20", *BEIR_TASKS]:
    docs = Dataset('bm25', name, n_docs=100).download(force_download=False)
    m = Metrics(docs)
    res = m.calculate_trec_metrics(ndcg_cuts=[10, 100], use_reordered=False)
    print(name, res)

(Optional) Add a reranker, then evaluate

from rankify.models.reranking import Reranking
from rankify.dataset.dataset import Dataset
from rankify.metrics.metrics import Metrics

name = "beir-arguana"
docs = Dataset('bm25', name, n_docs=100).download(force_download=False)
reranker = Reranking(method='transformer_ranker', model_name='bge-reranker-base')
reranker.rank(docs)

m = Metrics(docs)
print("Before:", m.calculate_trec_metrics(ndcg_cuts=[10, 100], use_reordered=False))
print("After :", m.calculate_trec_metrics(ndcg_cuts=[10, 100], use_reordered=True))

๐Ÿ“ Evaluating RAG with RAGAS

Rankify ships a thin wrapper around ragas to make quality evaluation of generated answers simple and flexible, whether you judge with a local HF model or a hosted API like OpenAI. You can run fast defaults, pick specific metrics, or simulate predictions when compute is tight.

✅ Install

# core Rankify RAG deps
pip install bert-score
pip install ragas
pip install langchain_huggingface
pip install rouge-score

# End-to-end example: generate an answer, then judge it with RAGAS
import torch
from rankify.dataset.dataset import Document, Question, Answer, Context
from rankify.generator.generator import Generator
from rankify.metrics.generator_metrics import GeneratorMetrics
from rankify.metrics.ragas_bridge import RagasModels

# 1) Build a tiny document
question = Question("What is the capital of France?")
answers  = Answer(["Paris"])
contexts = [
    Context(id=1, title="France",   text="The capital of France is Paris.", score=0.9),
    Context(id=2, title="Germany",  text="Berlin is the capital of Germany.", score=0.5),
]
doc = Document(question=question, answers=answers, contexts=contexts)

# 2) Generate an answer (or skip and provide your own predictions list)
generator   = Generator(method="basic-rag",
                        model_name="meta-llama/Meta-Llama-3.1-8B-Instruct",
                        backend="huggingface",
                        torch_dtype=torch.float16)
predictions = generator.generate([doc])
print("Generated:", predictions)

# 3) Evaluate with RAGAS (HF judge)
gen_metrics = GeneratorMetrics([doc])

ragas_hf = RagasModels(
    llm_kind="hf",
    llm_name="meta-llama/Meta-Llama-3.1-8B-Instruct",
    embeddings_kind="hf",
    embeddings_name="sentence-transformers/all-MiniLM-L6-v2",
    torch_dtype="float16",
    max_new_tokens=256,  # shorter outputs = faster + cheaper
    timeout=180,         # seconds per metric call
    max_retries=1,
    max_workers=2,       # keep small on limited hardware
)

# (A) Fast defaults
scores_fast = gen_metrics.all(predictions, ragas_models=ragas_hf)
print("RAGAS (fast):", scores_fast)

# (B) Pick specific metrics
scores_specific = gen_metrics.ragas_generator(
    predictions,
    judge=ragas_hf,
    metrics=["faithfulness", "response_relevancy", "context_precision", "context_recall"],
)
print("RAGAS (specific):", scores_specific)

# (C) OpenAI judge (much faster if you have an API key)
ragas_openai = RagasModels(llm_kind="openai", llm_name="gpt-4o-mini", timeout=30)
scores_openai = gen_metrics.all(predictions, ragas_models=ragas_openai)
print("RAGAS (OpenAI):", {k: v for k, v in scores_openai.items() if k.startswith("ragas_")})

📜 Supported Models

1๏ธโƒฃ Index

  • โœ… Wikipedia
  • โœ… MS-MARCO
  • ๐Ÿ•’ Online Search

1๏ธโƒฃ Retrievers

  • โœ… BM25
  • โœ… DPR
  • โœ… ColBERT
  • โœ… ANCE
  • โœ… BGE
  • โœ… Contriever
  • โœ… BPR
  • โœ… HYDE
  • ๐Ÿ•’ RepLlama
  • ๐Ÿ•’ coCondenser
  • ๐Ÿ•’ Spar
  • ๐Ÿ•’ Dragon
  • ๐Ÿ•’ Hybird

2๏ธโƒฃ Rerankers


3๏ธโƒฃ Generator

RAG-Methods

  • ✅ Zero-shot
  • ✅ Basic-RAG
  • ✅ Chain-of-Thought-RAG
  • ✅ Fusion-in-Decoder (FiD) with T5
  • ✅ In-Context Learning RALM
  • 🕒 Self-Consistency RAG
  • 🕒 Retrieval Chain-of-Thought

LLM-Endpoints

  • ✅ Hugging Face
  • ✅ vLLM
  • ✅ LiteLLM
  • ✅ OpenAI

✨ Features

  • 🔥 Unified Framework: Combines retrieval, re-ranking, and retrieval-augmented generation (RAG) into a single modular toolkit.
  • 📚 Rich Dataset Support: Includes 40+ benchmark datasets with pre-retrieved documents for seamless experimentation.
  • 🧲 Diverse Retrieval Methods: Supports BM25, DPR, ANCE, BPR, ColBERT, BGE, and Contriever for flexible retrieval strategies.
  • 🎯 Powerful Re-Ranking: Implements 24 advanced models with 41 sub-methods to optimize ranking performance.
  • 🏗️ Prebuilt Indices: Provides Wikipedia and MS MARCO corpora, eliminating indexing overhead and speeding up retrieval.
  • 🔮 Seamless RAG Integration: Works with backends such as Hugging Face, OpenAI, vLLM, and LiteLLM, serving models like GPT, LLaMA, T5, and Fusion-in-Decoder (FiD) across multiple retrieval-augmented generation methods.
  • 🛠 Extensible & Modular: Easily integrates custom datasets, retrievers, ranking models, and RAG pipelines.
  • 📊 Built-in Evaluation Suite: Includes retrieval, ranking, and RAG metrics for robust benchmarking.
  • 📖 User-Friendly Documentation: Access detailed online docs, example notebooks, and tutorials for easy adoption.

๐Ÿ” Roadmap

Rankify is still under development, and this is our first release (v0.1.0). While it already supports a wide range of retrieval, re-ranking, and RAG techniques, we are actively enhancing its capabilities by adding more retrievers, rankers, datasets, and features.

📖 Documentation

For full API documentation, visit the Rankify Docs.


💡 Contributing

Follow these steps to get involved:

  1. Fork this repository to your GitHub account.

  2. Create a new branch for your feature or fix:

    git checkout -b feature/YourFeatureName
  3. Make your changes and commit them:

    git commit -m "Add YourFeatureName"
  4. Push the changes to your branch:

    git push origin feature/YourFeatureName
  5. Submit a Pull Request to propose your changes.

Thank you for helping make this project better!


๐ŸŒ Community Contributions

Chinese community resources available!

Special thanks to Xiumao for writing two exceptional Chinese blog posts about Rankify:

These articles were crafted with high-traffic optimization in mind and are widely recommended in Chinese academic and developer circles.

We updated the Chinese version (中文版本) to reflect these blog contributions while keeping the original content intact. Thank you, Xiumao, for your continued support!

🔖 License

Rankify is licensed under the Apache-2.0 License - see the LICENSE file for details.

๐Ÿ™ Acknowledgments

We would like to express our gratitude to the following libraries, which have greatly contributed to the development of Rankify:

  • Rerankers – A powerful Python library for integrating various reranking methods.
    🔗 GitHub Repository

  • Pyserini – A toolkit supporting BM25-based retrieval and integration with sparse/dense retrievers.
    🔗 GitHub Repository

  • FlashRAG – A modular framework for Retrieval-Augmented Generation (RAG) research.
    🔗 GitHub Repository

🌟 Citation

Please cite our paper if it helps your research:

@article{abdallah2025rankify,
  title={Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation},
  author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Ali, Mohammed and Jatowt, Adam},
  journal={arXiv preprint arXiv:2502.02464},
  year={2025}
}

Star History

Star History Chart
