This repository contains all code, configs, and data used in our paper “Born A Transformer -- Always a Transformer?”.
| Section of paper | Experiment category | Folder |
|---|---|---|
| § 4.1 | In-context learning on synthetic copy / retrieve tasks | [copying/](copying/) · [retrieval/](retrieval/) |
| § 4.2 | In-context learning on real-world tasks | [realistic/](realistic/) |
| § 4.3 | Supervised fine-tuning baselines | [finetuning/](finetuning/) |
| § 4.4 | Mechanistic probes (induction-head patching, attention alignment) | [mechanistic/](mechanistic/) |
| Appendix | From-scratch training ablations | [fromscratch/](fromscratch/) |
(Each link opens a dedicated README with setup & run instructions.)
Repository layout:

```
copying/         # synthetic copy-ICL experiments (§4.1)
retrieval/       # retrieval-ICL experiments (§4.1)
realistic/       # ArXiv, lorem-ipsum, code-assist tasks (§4.2)
finetuning/      # supervised fine-tuning baselines (§4.3)
mechanistic/     # induction-head probing & ablations (§4.4)
fromscratch/     # training tiny models from scratch (Appendix)
datasets/        # cached JSONL datasets
results/         # generated outputs (kept small in repo)
visualisations/  # plots for the paper
prompts/         # prompt templates
```
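The cached datasets in `datasets/` are stored as JSONL (one JSON object per line). Below is a minimal sketch for inspecting them; the file name and the `prompt` / `target` field names are assumptions for illustration, so check the per-folder READMEs for the actual schema.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSONL file into a list of dicts, one record per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical example: peek at the first cached copying record.
# "copying.jsonl" and the "prompt"/"target" keys are assumptions,
# not guaranteed to match the files shipped in datasets/.
records = load_jsonl(Path("datasets") / "copying.jsonl")
print(records[0].get("prompt"), "->", records[0].get("target"))
```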
To be added after the arXiv preprint is uploaded.