
Megakernels!

Installation

Clone this repo, then from its root run:

git submodule update --init --recursive
pip install uv
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
uv pip install -e .
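
A quick sanity check after installing (the megakernels package name is inferred from the script paths below; the first two lines just confirm a CUDA-enabled torch build):

import torch
print(torch.__version__)          # should report a cu128 build
print(torch.cuda.is_available())  # True if a GPU and driver are visible

import megakernels                # assumes the editable install exposes this package name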

Low-Latency Llama Demo

First, to compile the megakernel, run:

# from the repo root
export THUNDERKITTENS_ROOT=$(pwd)/ThunderKittens
export MEGAKERNELS_ROOT=$(pwd)
export PYTHON_VERSION=3.12 # adjust if yours is different
export GPU=H100 # one of {H100, A100, 4090}; any other value defaults to B200
cd demos/low-latency-llama
make
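
If you are unsure which GPU value to export, torch can report the device name:

import torch
print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA H100 80GB HBM3"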

To start an interactive chat session with the model, run:

# from the repo root
python megakernels/scripts/llama_repl.py

To benchmark the megakernel, run:

# from the repo root
python megakernels/scripts/generate.py mode=mk prompt="tell me a funny joke about cookies" ntok=100
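
The script above measures end-to-end generation. If you want to time individual GPU sections yourself, the standard CUDA-event pattern in torch looks like this (a generic sketch, not code from this repo; the matmul is a stand-in workload):

import torch

x = torch.randn(4096, 4096, device="cuda")
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

for _ in range(3):  # warm up so one-time setup costs don't skew the timing
    x @ x

start.record()
y = x @ x           # placeholder workload; substitute the call you want to time
end.record()
torch.cuda.synchronize()  # events record asynchronously; wait before reading
print(f"elapsed: {start.elapsed_time(end):.2f} ms")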
