Parameter-efficient learning can boost cross-domain and cross-topic generalization and calibration. [Paper]
Create the Python environment via Anaconda3:
conda create -n pt-retrieval -y python=3.8.13
conda activate pt-retrieval
Install the necessary Python packages. Change the cudatoolkit version according to your environment (11.3 in our experiments).
conda install -n pt-retrieval pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch
pip install -r requirements.txt
(Optional) To train with Adapter, install Adapter-transformers:
pip install -U adapter-transformers
We use the preprocessed data provided by DPR, which can be downloaded from the cloud using download_data.py. Specify the resource name to be downloaded; run python data/download_data.py to see all options.
python data/download_data.py \
--resource {key from download_data.py's RESOURCES_MAP}
The keys of the five datasets and the retrieval corpus used in our experiments are:
- data.retriever.nq and data.retriever.qas.nq
- data.retriever.trivia and data.retriever.qas.trivia
- data.retriever.squad1 and data.retriever.qas.squad1
- data.retriever.webq and data.retriever.qas.webq
- data.retriever.curatedtrec and data.retriever.qas.curatedtrec
- data.wikipedia_split.psgs_w100
NOTE: The resource name matching is prefix-based. So if you need to download all training data, just use --resource data.retriever.
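For example, to download only the NQ retriever training data, all retriever training data at once (via the prefix), or the Wikipedia passage corpus, the calls would look like the following (keys taken from the list above):

python data/download_data.py --resource data.retriever.nq
python data/download_data.py --resource data.retriever
python data/download_data.py --resource data.wikipedia_split.psgs_w100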
Run the evaluation script for DPR on BEIR and the script will download the dataset automatically. The decompressed data will be saved at ./beir_eval/datasets.
Or you can download the data manually via:
wget https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/$dataset.zip
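For example, using scifact (one of the BEIR dataset names; any other works the same way) and unpacking it to the directory the evaluation scripts expect:

wget https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/scifact.zip
unzip scifact.zip -d ./beir_eval/datasets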
OAG-QA is the largest public topic-specific passage retrieval dataset, which consists of 17,948 unique queries from 22 scientific disciplines and 87 fine-grained topics. The queries and reference papers are collected from professional forums such as Zhihu and StackExchange, and mapped to papers in the Open Academic Graph with rich meta-information (e.g. abstract, field-of-study (FOS)).
Download OAG-QA from the link: OAG-QA, and unzip it to ./data/oagqa-topic-v2
Five training modes are supported in the code and the corresponding training scripts are provided:
- Fine-tuning
- P-tuning v2
- BitFit
- Adapter
- Lester et al. & P-tuning
Run training scripts in ./run_scripts, for example:
bash run_scripts/run_train_dpr_multidata_ptv2.sh
P-tuning v2 and the original fine-tuning are supported for ColBERT.
cd colbert
bash scripts/run_train_colbert_ptv2.sh
Download the checkpoints we used in the experiments to reproduce the results in the paper. To load them, change --model_file ./checkpoints/$filename/$checkpoint in each evaluation script *.sh under ./eval_scripts (see the example below the table).
| Checkpoints | DPR | ColBERT |
| --- | --- | --- |
| P-tuning v2 | Download | Download |
| Fine-tune | Download | Download |
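For instance, after downloading and unpacking a checkpoint, the flag inside an evaluation script might look like the following (the folder and file names here are placeholders, not the actual archive contents):

--model_file ./checkpoints/ptv2_nq/dpr_biencoder.best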
Inference is divided into two steps.
- Step 1: Generate representation vectors for the static documents dataset via:
bash eval_scripts/generate_wiki_embeddings.sh
- Step 2: Retrieve for the validation question set from the entire set of candidate documents and calculate the top-k retrieval accuracy. To select the validation dataset, replace $dataset with nq, trivia, webq, or curatedtrec.
bash eval_scripts/evaluate_on_openqa.sh $dataset $top-k
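For example, to evaluate on Natural Questions with an illustrative cutoff of top-100 (both arguments are your choice):

bash eval_scripts/evaluate_on_openqa.sh nq 100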
Evaluate DPR on BEIR via:
bash eval_scripts/evaluate_on_beir.sh $dataset
You can choose $dataset from the 15 datasets in BEIR; refer to BEIR for the full list.
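For example, assuming scifact as the chosen BEIR dataset:

bash eval_scripts/evaluate_on_beir.sh scifact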
There are 87 topics in OAG-QA and you can choose any topic to run the evaluation via:
bash eval_scripts/evaluate_on_oagqa.sh $topic $top-k
Similar to the BEIR evaluation for DPR, run the script to evaluate ColBERT on BEIR:
cd colbert
bash scripts/evalute_on_beir.sh $dataset
Plot the calibration curve and calculate the ECE on OpenQA via:
bash calibration_on_openqa.sh $dataset
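For example, assuming nq as the dataset:

bash calibration_on_openqa.sh nq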
Plot the calibration curve and calculate the ECE on BEIR via:
bash eval_scripts/calibration_on_beir.sh $dataset
If you find this paper and repo useful, please consider citing us in your work:
@article{WLTam2022PT-Retrieval,
author = {Weng Lam Tam and
Xiao Liu and
Kaixuan Ji and
Lilong Xue and
Xingjian Zhang and
Yuxiao Dong and
Jiahua Liu and
Maodi Hu and
Jie Tang},
title = {Parameter-Efficient Prompt Tuning Makes Generalized and Calibrated
Neural Text Retrievers},
journal = {CoRR},
volume = {abs/2207.07087},
year = {2022},
url = {https://doi.org/10.48550/arXiv.2207.07087},
doi = {10.48550/arXiv.2207.07087},
eprinttype = {arXiv},
eprint = {2207.07087},
timestamp = {Tue, 19 Jul 2022 17:45:18 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2207-07087.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}