
Megakernels!

Installation

Clone this repo, then from its root run:

git submodule update --init --recursive
pip install uv
uv pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
uv pip install -e .
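
A quick sanity check after installing (the megakernels package name is inferred from the script paths below; the first two lines just confirm a CUDA-enabled torch build):

import torch
print(torch.__version__)          # should report a cu128 build
print(torch.cuda.is_available())  # True if a GPU and driver are visible

import megakernels                # assumes the editable install exposes this package name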

Low-Latency Llama Demo

First, to compile the megakernel, run:

# from the repo root
export THUNDERKITTENS_ROOT=$(pwd)/ThunderKittens
export MEGAKERNELS_ROOT=$(pwd)
export PYTHON_VERSION=3.12 # adjust if yours is different
export GPU=H100 # one of {H100, A100, 4090}; any other value defaults to B200
cd demos/low-latency-llama
make
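
If you are unsure which GPU value to export, torch can report the device name:

import torch
print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA H100 80GB HBM3"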

To start an interactive chat session with the model, run:

# from the repo root
python megakernels/scripts/llama_repl.py

To benchmark the megakernel, run:

# from the repo root
python megakernels/scripts/generate.py mode=mk prompt="tell me a funny joke about cookies" ntok=100
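
The script above measures end-to-end generation. If you want to time individual GPU sections yourself, the standard CUDA-event pattern in torch looks like this (a generic sketch, not code from this repo; the matmul is a stand-in workload):

import torch

x = torch.randn(4096, 4096, device="cuda")
start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)

for _ in range(3):  # warm up so one-time setup costs don't skew the timing
    x @ x

start.record()
y = x @ x           # placeholder workload; substitute the call you want to time
end.record()
torch.cuda.synchronize()  # events record asynchronously; wait before reading
print(f"elapsed: {start.elapsed_time(end):.2f} ms")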
