Header-only library for CPU inference with RWKV v5
Todos and stuff
- AVX512
- AVX512-skylake
- AVX2
- NEON (Arm)
- Non-SIMD
- FP32
- BF16
- FP16
- INT8
- CUDA
- ROCm
- Vulkan
- Batch Inference
- Sequence Inference (state generation)
- Static memory usage via buffers
- Fixing memory leaks
- Example app
- Windows build .bat
- Mac build
- go to ./models/
- Download a model from https://huggingface.co/BlinkDL/rwkv-5-world/tree/main
- Edit convert.py to point to the downloaded model
- run convert.py (your converted model is placed into ./build/)
- run ./build.sh
- go to ./build
- from the terminal, run ./rwkv
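
The steps above could be scripted roughly as follows. This is only a sketch, not part of the repo: the function name build_and_run and the PYTHON variable are made up for illustration, and it assumes that convert.py sits in ./models/, that build.sh sits in the repo root, and that you have already downloaded a model and edited convert.py by hand.

```shell
#!/bin/sh
# Sketch of the build steps as one function (hypothetical helper, not shipped with the repo).
# Usage: build_and_run [repo_root]    (repo_root defaults to the current directory)
# The body runs in a subshell, so the caller's working directory is left unchanged.
build_and_run() (
    repo_root="${1:-.}"

    # Steps 1-3 are manual: download a model from
    # https://huggingface.co/BlinkDL/rwkv-5-world/tree/main
    # and edit convert.py so it points to the downloaded file.
    cd "$repo_root/models" || exit 1

    # Step 4: run the converter (assumed to live in ./models/).
    # PYTHON is a hypothetical override for the interpreter; it defaults to python3.
    "${PYTHON:-python3}" convert.py || exit 1

    # Step 5: build (build.sh is assumed to live in the repo root).
    cd .. || exit 1
    ./build.sh || exit 1

    # Steps 6-7: run the resulting binary from ./build.
    cd ./build || exit 1
    ./rwkv
)
```

Calling build_and_run from the repo root (or build_and_run /path/to/repo from elsewhere) stops at the first step that fails, so a missing model or a failed build does not go on to launch ./rwkv.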