First install uv dependencies :
uv syncThen simply run :
uv run python training.py- Improving Language Understanding by Generative Pre-Training : https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
- Language Models are Unsupervised Multitask Learners : https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
- Layer Normalization : https://arxiv.org/pdf/1607.06450
- Attention Is All You Need : https://arxiv.org/pdf/1706.03762
- Positional embedding : https://huggingface.co/blog/designing-positional-encoding