Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving

📄 Accepted to Findings of EMNLP 2025

📂 Repository

This repository will host the PhoPile dataset and benchmarking code for evaluating foundation models with retrieval-augmented generation (RAG) in Olympiad-level physics problem solving.

🚀 Data and code will be released soon.

📄 Citation

If you use this work, please cite:

@inproceedings{zheng2025phopile,
  title     = "Benchmarking Foundation Models with Retrieval-Augmented Generation in Olympic-Level Physics Problem Solving",
  author    = "Zheng, Shunfeng and Zhang, Yudi and Fang, Meng and Zhang, Zihan and Wu, Zhitan and Pechenizkiy, Mykola and Chen, Ling",
  booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2025",
  year      = "2025",
}
