Arithmetic Pretrained Transformer is a mechanistic interpretability (MechInterp) project investigating how transformers learn arithmetic.
This project is heavily inspired by Neel Nanda's "Progress Measures for Grokking via Mechanistic Interpretability". The finding that transformers can generalize very late in training by actually implementing an algorithm was fascinating to me. For now, I wanted to focus on simple arithmetic tasks (starting with addition and multiplication, hopefully moving on to subtraction and division), hence the name. Much of the original transformer implementation comes from following along with Andrej Karpathy's wonderful NanoGPT tutorial.
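To make the task concrete, here is a minimal sketch of how addition problems might be encoded as character-level token sequences for a small transformer. The function names and the `a+b=c` string format are illustrative assumptions, not the project's actual data pipeline.

```python
# Hypothetical sketch: encode addition problems as character-token sequences.
# The exact format used by this project may differ.
def encode_addition(a: int, b: int) -> list[str]:
    """Turn `a + b = c` into a flat list of character tokens."""
    c = a + b
    return list(f"{a}+{b}={c}")

def make_dataset(max_operand: int) -> list[list[str]]:
    """Enumerate every addition problem with operands in [0, max_operand]."""
    return [
        encode_addition(a, b)
        for a in range(max_operand + 1)
        for b in range(max_operand + 1)
    ]

examples = make_dataset(99)  # 100 * 100 = 10,000 problems
```

With a fixed, small operand range like this, the model sees every problem during training, so generalization shows up as learning the underlying algorithm rather than memorizing held-out pairs, which is the setting where grokking is typically studied.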
