This repository is still in the early stages of development and includes many experimental approaches. Please treat it as a place where I experiment with my ideas. Do not use it in production under any circumstances.
Caten = Compile+AbstracTENsor
Caten is an experimental deep learning compiler. Our goal is to implement a compiler that is as simple as tinygrad, and as flexible as TVM.
We're looking for collaborators! Please join our Discord and let me know if you'd like to contribute!
Caten is still under development, but it aims to support a wide range of models in the future, from image processing to text generation and vision language models! Some models are already up and running.
$ JIT=1 PARALLEL=8 ./roswell/caten.ros llm-example --model "gpt2" --prompt "Hello" --max-length 100
Give the GPT2 demo a try! You can pass compilation settings through environment variables.
For example, setting JIT=1 enables JIT compilation, while JIT_DEBUG >= 2 allows you to view the schedule and the generated kernels. Setting PARALLEL=8 divides the ScheduleGraph and compiles it in parallel.
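As a rough sketch of the settings described above, the same variables can be exported once in the shell session instead of being prefixed to each command. The values are the ones mentioned in the text; exporting them is only an illustrative choice, and which other values are accepted is not covered here.

```shell
# Sketch only: the variable names and values are the ones named in the text above.
export JIT=1        # enable JIT compilation
export JIT_DEBUG=2  # values >= 2 show the schedule and the generated kernels
export PARALLEL=8   # divide the ScheduleGraph and compile it in parallel
```

Commands started from the same shell afterwards inherit these variables, so the GPT2 demo can then be launched without the JIT=1 PARALLEL=8 prefix.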
You may still find the token/ms rate slow. We have not yet reached the stage of implementing an AutoScheduler to accelerate kernel performance, and GPU support is not implemented yet either. Once our IR is mature enough to handle a wide range of deep learning models, we plan to focus on speeding things up!
- Install Roswell and a suitable IDE. (If unsure, Emacs or Lem is recommended.)
- Install ISL (Integer Set Library) for fast kernel generation.
- Install Qlot
- Check out getting-started.lisp
$ git clone git@github.com:hikettei/Caten.git
$ cd Caten
$ qlot install
$ qlot exec ros run
> (ql:quickload :caten)
> (in-package :caten-user)
> (proceed (!randn `(3 3)))
- Join our Discord Server.
- Check out our roadmap.
- Create a PR
Caten is a project that started only a few months ago. We are currently in the stage of building a solid foundational library. Here’s what we’re looking for:
- Feature additions with tests (e.g., new activations, unimplemented matrix operations)
- Bug reports and additional tests
- Refactoring of the core compiler components
- Improving the documentation
etc...
Before contributing, please note that there is no linter here. Make an effort to adhere to the Google Common Lisp Style Guide. Changes that do not follow it should be rejected in review.
- Transformer
  - GPT2
  - Llama3 8B
  - TinyLLAMA
- Classification
  - MobileNetV2
  - MobileNetV3
  - ResNet18/ResNet34/ResNet50
  - VIT_B_16
- Segmentation
  - CenterNet
- Detection
  - YOLOv3
  - YOLOv7
- Common Lisp Frontend (caten/apis)
- ONNX (caten/onnx)
- GGUF (caten/gguf)
- Support Dequantization from GGUF
- Support QOPs
- Autodiff
- Fast Autodiff
- Support Training
- Distributed Training
- LISP VM
- CLANG JIT
- CLANG with Auto Scheduler
- METAL
- CUDA
- Vulkan
- Auto Scheduler
Before running the test suite, install Python, NumPy, and PyTorch by running make install_extra. Unless a version is specified, install the latest one.
$ make install_extra # extra dependencies for running tests
$ make test