This repository provides the resources developed for the following article [PDF]:
M. Arustashvili and K. Balog. SciNUP: Natural Language User Interest Profiles for Scientific Literature Recommendation. arXiv e-prints, October 2025. https://doi.org/10.48550/arXiv.2510.21352
The use of natural language (NL) user profiles in recommender systems offers greater transparency and user control compared to traditional representations. However, there is a scarcity of large-scale, publicly available test collections for evaluating NL profile-based recommendation. To address this gap, we introduce SciNUP, a novel synthetic dataset for scholarly recommendation that leverages authors' publication histories to generate NL profiles and corresponding ground truth items. We use this dataset to conduct a comparison of baseline methods, ranging from sparse and dense retrieval approaches to state-of-the-art LLM-based rerankers. Our results show that while baseline methods achieve comparable performance, they often retrieve different items, indicating complementary behaviors. At the same time, considerable headroom for improvement remains, highlighting the need for effective NL-based recommendation approaches. The SciNUP dataset thus serves as a valuable resource for fostering future research and development in this area.
The SciNUP dataset contains NL profiles of 1,000 researchers for scientific literature recommendation and consists of the following components:
- NL profiles and candidate items - Download dataset.tgz and extract under data/SciNUP (see the extraction sketch after this list)
- Ground truth items - Download sampled_users.tgz and extract under data/SciNUP
- NL profile breadth categorization - breadth_classification.tsv
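Below is a minimal sketch for unpacking the two archives and peeking at the breadth file. The archive locations and the column layout of breadth_classification.tsv are assumptions made for illustration; see data/SciNUP/README.md for the actual file formats.

```python
# Minimal sketch: unpack the dataset archives and peek at the breadth file.
# Assumes the .tgz files were downloaded to the repository root; the actual
# layout inside the archives is documented in data/SciNUP/README.md.
import csv
import tarfile

for archive in ("dataset.tgz", "sampled_users.tgz"):
    with tarfile.open(archive) as tar:
        tar.extractall("data/SciNUP")  # extract under data/SciNUP as instructed

# breadth_classification.tsv is tab-separated; the column layout is an assumption.
with open("breadth_classification.tsv") as f:
    rows = list(csv.reader(f, delimiter="\t"))
print(f"{len(rows)} profiles with breadth labels")
```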
| Attribute | SciNUP |
|---|---|
| #Authors | 1,000 |
| #Authored papers (min / median / max) | 10 / 20 / 260 |
| #Candidate items per author | 1,000 |
| #Ground truth papers per author (min / median / max) | 1 / 27 / 438 |
| Profile length (words) | 117 ± 55 |
| #Narrow / Medium / Broad NL profiles | 679 / 256 / 65 |
Details about the dataset creation steps can be found in data/SciNUP/README.md.
We benchmarked sparse (BM25, RM3), dense (kNN-SciBERT, BGE-Large, BGE-v2-M3, BGE-v2-MiniCPM), and LLM-reranking (PRP-Llama-3-8B, PRP-Llama-3.3-70B, PRP-GPT-4o-mini) methods on our dataset. Additionally, we evaluated an ensemble that uses reciprocal rank fusion (RRF) to combine the best-performing run from each category (RM3, BGE-v2-MiniCPM, and PRP-GPT-4o-mini); a minimal RRF sketch follows the results table.
| Model | R@100 | MAP | MRR | NDCG@10 | Runfile |
|---|---|---|---|---|---|
| BM25 | 0.3491 | 0.1148 | 0.4661 | 0.2869 | data/retrieval_results/bm25.trec |
| RM3 | 0.3570 | 0.1391 | 0.5147 | 0.3251 | data/retrieval_results/rm3.trec |
| kNN-SciBERT | 0.1480 | 0.0232 | 0.2182 | 0.1019 | data/retrieval_results/knn_scibert.trec |
| BGE-Large | 0.2826 | 0.0783 | 0.3666 | 0.2072 | data/retrieval_results/bge_large.trec |
| BGE-v2-M3 | 0.3472 | 0.1152 | 0.4633 | 0.2763 | data/retrieval_results/bge_v2_m3.trec |
| BGE-v2-MiniCPM | 0.4203 | 0.1673 | 0.5393 | 0.3541 | data/retrieval_results/bge_v2_minicpm.trec |
| PRP-Llama-3 (8B) | 0.3491 | 0.1165 | 0.4774 | 0.2925 | data/retrieval_results/prp_llama_8b.trec |
| PRP-Llama-3.3 (70B) | 0.3491 | 0.1423 | 0.5378 | 0.3541 | data/retrieval_results/prp_llama_70b.trec |
| PRP-GPT-4o-mini | 0.3491 | 0.1405 | 0.5297 | 0.3542 | data/retrieval_results/prp_gpt.trec |
| Ensemble | 0.4136 | 0.2163 | 0.6333 | 0.4481 | data/retrieval_results/rrf_fused.trec |
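For reference, the following is a minimal sketch of reciprocal rank fusion over the three fused run files. It assumes standard six-column TREC run files; the constant k=60 is the common default from the RRF literature, not necessarily the setting used in the paper.

```python
# Hedged sketch of reciprocal rank fusion (RRF) over TREC run files.
from collections import defaultdict

def load_run(path):
    """Parse a TREC run file into {qid: [docid, ...]}, ordered by rank."""
    by_query = defaultdict(list)
    with open(path) as f:
        for line in f:
            qid, _, docid, rank, _, _ = line.split()
            by_query[qid].append((int(rank), docid))
    return {qid: [d for _, d in sorted(pairs)] for qid, pairs in by_query.items()}

def rrf(runs, k=60):
    """Fuse ranked lists: score(d) = sum over runs of 1 / (k + rank(d))."""
    fused = defaultdict(lambda: defaultdict(float))
    for run in runs:
        for qid, docids in run.items():
            for rank, docid in enumerate(docids, start=1):
                fused[qid][docid] += 1.0 / (k + rank)
    return {qid: sorted(scores, key=scores.get, reverse=True)
            for qid, scores in fused.items()}

runs = [load_run(p) for p in (
    "data/retrieval_results/rm3.trec",
    "data/retrieval_results/bge_v2_minicpm.trec",
    "data/retrieval_results/prp_gpt.trec",
)]
fused = rrf(runs)  # {qid: [docid, ...]} ranked by fused score
```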
Implementations of these methods can be found under src/models and scripts/. The main retrieval script, including sample usage, is scripts/run_retrieval.py.
The numbers shown in the table above are generated using trec_eval:

```
trec_eval -m recall.100 -m map -m recip_rank -m ndcg_cut.10 data/retrieval_results/ground_truth_qrels.txt PATH_TO_DESIRED_RUNFILE
```
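The same means can also be computed in Python with the pytrec_eval package; the sketch below parses the qrels and a run file manually, and the choice of bm25.trec is illustrative:

```python
# Alternative to the trec_eval CLI using pytrec_eval (pip install pytrec_eval).
import pytrec_eval

def read_qrels(path):
    """Parse TREC qrels lines: qid, iteration, docid, relevance."""
    qrels = {}
    with open(path) as f:
        for line in f:
            qid, _, docid, rel = line.split()
            qrels.setdefault(qid, {})[docid] = int(rel)
    return qrels

def read_run(path):
    """Parse TREC run lines: qid, Q0, docid, rank, score, tag."""
    run = {}
    with open(path) as f:
        for line in f:
            qid, _, docid, _, score, _ = line.split()
            run.setdefault(qid, {})[docid] = float(score)
    return run

qrels = read_qrels("data/retrieval_results/ground_truth_qrels.txt")
run = read_run("data/retrieval_results/bm25.trec")

evaluator = pytrec_eval.RelevanceEvaluator(
    qrels, {"recall.100", "map", "recip_rank", "ndcg_cut.10"})
per_query = evaluator.evaluate(run)

# trec_eval reports the mean of each measure over all queries.
for measure in ("recall_100", "map", "recip_rank", "ndcg_cut_10"):
    mean = sum(q[measure] for q in per_query.values()) / len(per_query)
    print(f"{measure}\t{mean:.4f}")
```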
If you use the resources presented in this repository, please cite:
```bibtex
@misc{Arustashvili:2025:arXiv,
  author        = {Mariam Arustashvili and Krisztian Balog},
  title         = {SciNUP: Natural Language User Interest Profiles for Scientific Literature Recommendation},
  year          = {2025},
  eprint        = {2510.21352},
  archivePrefix = {arXiv},
  primaryClass  = {cs.IR},
}
```
Should you have any questions, please contact Mariam Arustashvili at mariam.arustashvili[AT]uis.no (with [AT] replaced by @).