# nanoEQXGPT

An implementation of Karpathy's excellent nanoGPT. The goal here is to reproduce the same GPT-2 model in Equinox, a neural network library written on top of JAX. JAX lets us exploit OpenXLA more effectively than Torch does, so we should be more efficient hardware-wise; the next step is to back that up with actual efficiency comparisons.
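For context, Equinox models are immutable PyTrees declared as dataclass-like modules. A minimal sketch of the style (the class below is illustrative, not the actual model code in this repo):

```python
import equinox as eqx
import jax
import jax.numpy as jnp

class MLP(eqx.Module):
    """A GPT-style feed-forward block, written the Equinox way."""
    fc: eqx.nn.Linear
    proj: eqx.nn.Linear

    def __init__(self, n_embd: int, key):
        k1, k2 = jax.random.split(key)
        self.fc = eqx.nn.Linear(n_embd, 4 * n_embd, key=k1)
        self.proj = eqx.nn.Linear(4 * n_embd, n_embd, key=k2)

    def __call__(self, x):
        return self.proj(jax.nn.gelu(self.fc(x)))

mlp = MLP(n_embd=768, key=jax.random.PRNGKey(0))
y = mlp(jnp.zeros(768))  # Equinox layers are unbatched; jax.vmap handles batching
```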
- TinyStories is added as a dataset.
- `out_dir` is replaced with `out_path`, which avoids hardcoding the name of the saved and loaded model.
- `tensorboard_log` is available, and `wandb_project` and `wandb_run_name` are renamed to `log_project` and `log_run_name` respectively.
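A rough illustration of how these options might be set (the file name and values are made up; check `train.py` for the actual defaults):

```python
# config/train_tinystories.py -- hypothetical override file
out_path = "out/tinystories.eqx"  # one path for the checkpoint, no hardcoded model name
tensorboard_log = True            # enable TensorBoard logging
log_project = "nanoeqxgpt"        # was wandb_project
log_run_name = "tinystories-run"  # was wandb_run_name
```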
- Compare speed to nanoGPT in PyTorch
- Provide checkpoints for people to test
- Fix the dataset download issues
- Fix scaling in the training script
- Implement multi-device training
- Mixed precision (see the sketch after this list)
- Model surgery if the context is greater than `block_size`
- Profile the code to avoid wasted time (MFU goes brr)
- Microbatching in JAX -> does it even make sense?
- Loading the optax state from the correct position
- Convert to bfloat16 where possible
- Check if this is useful: `os.environ["XLA_FLAGS"] = "--xla_gpu_enable_tf32=true"`
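On the mixed-precision and bfloat16 items: a common JAX pattern (a sketch of one option, not what this repo currently does) is to keep master weights in float32 and cast a bfloat16 copy of the model for the forward/backward pass:

```python
import equinox as eqx
import jax
import jax.numpy as jnp

def cast_model(model, dtype=jnp.bfloat16):
    """Cast every floating-point array leaf of a model PyTree to `dtype`."""
    return jax.tree_util.tree_map(
        lambda x: x.astype(dtype) if eqx.is_inexact_array(x) else x,
        model,
    )

# Hypothetical usage inside a training step (loss_fn and batch are placeholders):
# half = cast_model(model)                      # bfloat16 copy for compute
# loss, grads = eqx.filter_value_and_grad(loss_fn)(half, batch)
# grads = cast_model(grads, jnp.float32)        # back to float32 for the optax update
```

As for microbatching, `optax.MultiSteps` wraps any optimizer to accumulate gradients over several calls, which is the usual way to emulate larger batches in JAX, at the cost of holding an extra copy of the gradients in the optimizer state.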
To set up and train:

```bash
git clone git@github.com:TugdualKerjan/nanoEQXGPT.git
uv sync
uv run data/shakespear_char/prepare.py
uv run train.py
```
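If you want to poke at a saved model before checkpoints are published, Equinox's standard round-trip is `eqx.tree_serialise_leaves` / `eqx.tree_deserialise_leaves`; whether `train.py` uses exactly this API is an assumption, and the `eqx.nn.MLP` below just stands in for the GPT model:

```python
import equinox as eqx
import jax

# Deserialisation needs a "like" PyTree with the right shapes, whose leaves
# are then filled from disk. "model.eqx" is an example path (cf. out_path).
model = eqx.nn.MLP(in_size=8, out_size=8, width_size=16, depth=2,
                   key=jax.random.PRNGKey(0))
eqx.tree_serialise_leaves("model.eqx", model)
reloaded = eqx.tree_deserialise_leaves("model.eqx", model)
```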
It seems Karpathy has spent more time than me on optimization, because the model here is about 10x slower than the PyTorch version lol (around 300 ms vs. 30 ms) on the shakespear_char dataset.
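A first step toward closing that gap is the profiling item above. JAX ships a TensorBoard-compatible tracer; a minimal self-contained sketch (the matmul loop is just a stand-in for a real training step):

```python
import jax
import jax.numpy as jnp

# Capture a trace to see where time actually goes; view it in TensorBoard's
# profile tab (needs the tensorboard-plugin-profile package).
with jax.profiler.trace("/tmp/jax-trace"):
    a = jnp.ones((1024, 1024))
    out = a
    for _ in range(10):
        out = jnp.dot(out, a)
    out.block_until_ready()  # JAX dispatch is async; force completion inside the trace
```

Common culprits for a gap like this are a step function that isn't fully `jax.jit`-compiled and silent recompilation on every iteration.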