Simulation Fallacy Benchmark - Initial Public Release
This is the first official release of the Simulation Fallacy benchmark, containing all data, analysis scripts, and figures from the paper.
What's Included
- ✅ Complete study data: Cross-domain and persistence study results (8 JSON files)
- ✅ Reproducible analysis: Scripts to regenerate both figures from the paper
- ✅ Colab notebook: One-click reproduction environment
- ✅ Inter-rater reliability: IRR validation data and reports
- ✅ Prompt transparency: All 11 prompt templates used in the study
- ✅ Comprehensive documentation: README, DATA_DICTIONARY, CITATION.cff
Key Findings
- GPT-5: ~98% silent refusal (epistemic boundary respected)
- Gemini 2.5 Pro: ~81% fabrication (high confabulation rate)
- Claude Sonnet 4: oscillates between admission and fabrication across turns
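The rates above are per-model label frequencies. A minimal sketch of how such percentages could be computed from a list of per-response labels (the label names and data shape here are illustrative, not the study's actual schema, which is defined in DATA_DICTIONARY):

```python
from collections import Counter

def label_percentages(labels):
    """Return each label's share of the total, as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}

# Hypothetical labels for one model's responses (not real study data).
labels = ["silent_refusal", "silent_refusal", "fabrication", "silent_refusal"]
print(label_percentages(labels))  # {'silent_refusal': 75.0, 'fabrication': 25.0}
```

The actual figures are produced by `scripts/compute_metrics.py` over the JSON files in `results/final`.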
Reproducibility
All figures and metrics can be regenerated from the included data:
```bash
python scripts/compute_metrics.py --in_dir results/final --out_csv results/final/label_counts_with_pct.csv
python scripts/plot_figures.py --tables_csv results/final/label_counts_with_pct.csv --figdir figures
python scripts/plot_transitions.py --in_dir results/final --figdir figures
```
Citation
See CITATION.cff or use the GitHub citation widget (top-right of repo page).
Related Work
- The Mirror Loop - Semantic drift in recursive LLM interaction
- Recursive Confabulation - Multi-turn hallucination benchmark
License: MIT
Maintained by: Course Correct Labs