AutoReproduce: Automatic AI Experiment Reproduction with Paper Lineage

This is the official repo of AutoReproduce and ReproduceBench.

Overview

AutoReproduce

We are still organizing the code and adding more automation. The current code is only a demo.

Quick Start

export OPENAI_API_KEY="<OPENAI_API_KEY>"
export BASE_URL="<BASE_URL>"  # if necessary

python reproduce.py  # default setting
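Before launching a run, it can help to confirm that the variables exported above are actually visible to Python. The following is a minimal sketch, assuming `reproduce.py` reads `OPENAI_API_KEY` (and optionally `BASE_URL`) from the environment; the helper `missing_env_vars` is hypothetical and not part of this repository.

```python
import os


def missing_env_vars(required=("OPENAI_API_KEY",), env=None):
    """Return the names in `required` that are unset or empty in `env`.

    `env` defaults to the current process environment; passing a dict
    makes the check easy to try out without touching real variables.
    """
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
print(missing_env_vars(env={"OPENAI_API_KEY": "placeholder-key"}))
```

Calling `missing_env_vars()` with no arguments checks the real environment; an empty list means the required key is set.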

Customized Reproduce

To reproduce a paper of your choice, download the paper content first (we are currently working on automating this step with MinerU).

If the data cannot be obtained directly, download it in advance and modify the instruction so that it specifies the local path.

python reproduce.py --paper-path xxx --dataloader-path xxx
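A customized run fails late if either path is wrong, so a small wrapper that checks both paths first may save time. This is a sketch only: `build_reproduce_command` is a hypothetical helper, and the only things taken from this README are the script name and the `--paper-path` and `--dataloader-path` flags.

```python
import sys
from pathlib import Path


def build_reproduce_command(paper_path, dataloader_path, script="reproduce.py"):
    """Build the argument list for a customized run.

    Raises FileNotFoundError early if either path does not exist,
    instead of letting the reproduction fail midway.
    """
    for label, path in (("paper", paper_path), ("dataloader", dataloader_path)):
        if not Path(path).exists():
            raise FileNotFoundError(f"{label} path does not exist: {path}")
    return [
        sys.executable,
        script,
        "--paper-path",
        str(paper_path),
        "--dataloader-path",
        str(dataloader_path),
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.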

TODO

In the default setting, the paper lineage is currently not employed. Downloading code from GitHub is rate-limited, so we recommend setting your own GitHub token by running the following command before reproduction.

export GITHUB_TOKEN="<GITHUB_TOKEN>"
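To check that the token is picked up, one option is to query GitHub's public `rate_limit` REST endpoint. The sketch below uses only the standard library; the helper names `github_headers` and `core_rate_limit` are hypothetical, and how AutoReproduce itself consumes `GITHUB_TOKEN` is not shown here.

```python
import json
import urllib.request


def github_headers(token=None):
    """Build request headers; the Authorization header is added only if a token is given."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def core_rate_limit(token=None, url="https://api.github.com/rate_limit"):
    """Return the 'core' rate-limit record (limit, remaining, reset, used)."""
    request = urllib.request.Request(url, headers=github_headers(token))
    with urllib.request.urlopen(request, timeout=10) as response:
        payload = json.load(response)
    return payload["resources"]["core"]
```

Calling `core_rate_limit(os.environ.get("GITHUB_TOKEN"))` (after `import os`) needs network access; a valid token should report a noticeably higher `limit` than an unauthenticated call.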

ReproduceBench

Download Datasets

All the datasets and the human-curated reference code are available at ReproduceBench.

pip install -U huggingface_hub
cd AutoReproduce
huggingface-cli download --repo-type dataset --resume-download ai9stars/ReproduceBench --local-dir ReproduceBench

Evaluation

All the evaluation code is in the evaluation directory. The current code is not well structured, and we are working on organizing it.

# First summarize the key points of the paper.
python evaluation/summarize_points.py

# Then run the following files to calculate align-score.
python evaluation/eval_high.py  # High-level score
python evaluation/eval_low.py   # Low-level score
python evaluation/eval_mixed.py # Mixed-level score
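The commands above can also be chained from a single script so that the summarization step always runs first. This is a minimal sketch, assuming the four scripts take no required arguments, as in the commands above; `run_evaluation` and its `runner` parameter are hypothetical names introduced for illustration.

```python
import subprocess
import sys

# Order matters: key points are summarized before any score is computed.
EVALUATION_STEPS = [
    "evaluation/summarize_points.py",
    "evaluation/eval_high.py",   # high-level score
    "evaluation/eval_low.py",    # low-level score
    "evaluation/eval_mixed.py",  # mixed-level score
]


def run_evaluation(steps=EVALUATION_STEPS, runner=subprocess.run):
    """Run each script in order with the current interpreter.

    check=True makes subprocess.run raise CalledProcessError on the first
    failing script, so later steps do not run on incomplete inputs.
    """
    for script in steps:
        runner([sys.executable, script], check=True)
    return list(steps)
```

Run `run_evaluation()` from the repository root so that the relative `evaluation/` paths resolve.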

Contact

For any questions, you can contact 2429527z@gmail.com.

Citation

If you find this work useful, consider giving this repository a star ⭐️ and citing 📝 our paper as follows:

@misc{zhao2025autoreproduceautomaticaiexperiment,
      title={AutoReproduce: Automatic AI Experiment Reproduction with Paper Lineage}, 
      author={Xuanle Zhao and Zilin Sang and Yuxuan Li and Qi Shi and Shuo Wang and Duzhen Zhang and Xu Han and Zhiyuan Liu and Maosong Sun},
      year={2025},
      eprint={2505.20662},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2505.20662}, 
}

Acknowledgement

The code is based on Agent Laboratory. Thanks to its authors for their great work and for open-sourcing it!
