The Random-Crypto Benchmark is a procedurally generated dataset of cryptographic CTF challenges. It is designed for training LLM-based agents with reinforcement learning.
The benchmark's website can be visited here.
- ✅ 50 human-verified challenges for evaluation (link)
- ⚙️ 5000 non-verified challenges for training (link)
Set up the environment:
# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
Make sure to set your OpenAI API key in a .env file at the root of this folder:
OPENAI_API_KEY=your-key-here
The following command generates 50 challenges, one of each type:
python main.py --variants 1 --output_folder my_generated_challenges
The following command generates 5000 challenges, one hundred of each type:
python main.py --variants 100 --output_folder my_generated_challenges
- Lajos Muzsai (muzsailajos@protonmail.com)
- David Imolai (david@imol.ai)
- András Lukács (andras.lukacs@ttk.elte.hu)
@article{muzsai2025improving,
title={Improving LLM Agents with Reinforcement Learning on Cryptographic CTF Challenges},
author={Muzsai, Lajos and Imolai, David and Luk{\'a}cs, Andr{\'a}s},
journal={arXiv preprint arXiv:2506.02048},
year={2025}
}
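Before starting a large generation run, it can help to confirm that the setup described above is in place. The following is a minimal pre-flight sketch using only the Python standard library, not part of the repository. The helper names (`read_env_file`, `has_api_key`, `expected_challenge_count`) are hypothetical, the `.env` parsing handles only plain `KEY=VALUE` lines, and the count of 50 challenge types is inferred from the statement that `--variants 1` yields 50 challenges. How `main.py` itself reads the key is not covered here.

```python
from pathlib import Path

# Assumption taken from the README: one variant per type yields 50 challenges.
NUM_CHALLENGE_TYPES = 50


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and '#' comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(path=".env"):
    """Return True if the file exists and sets a non-placeholder OPENAI_API_KEY."""
    env_path = Path(path)
    if not env_path.is_file():
        return False
    key = read_env_file(env_path).get("OPENAI_API_KEY", "")
    return bool(key) and key != "your-key-here"


def expected_challenge_count(variants):
    """Number of challenges produced for a given --variants value."""
    return NUM_CHALLENGE_TYPES * variants


if __name__ == "__main__":
    print("API key configured:", has_api_key())
    print("Challenges for --variants 100:", expected_challenge_count(100))
```

Run from the root of the folder, the script reports whether `.env` defines a real key and how many challenges a given `--variants` value should produce, which can then be compared against the contents of the output folder.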