Releases: IST-DASLab/qutlass
v0.0.1
Core features of QuTLASS v0.0.1:
- MXFP4 microscaling support, with:
  - Weight and activation quantization (W4A4).
  - Online rotations: a fused kernel for Hadamard transforms, quantization, and scale computation.
    - Hadamard sizes matching the microscaling group sizes (i.e., 32 for MXFP4).
    - Compatible with any rotation matrix, since the matrices are loaded at runtime.
  - Multiple quantization schemes:
    - Quartet (i.e., Quest-like).
    - Abs-Max.
  - Matmul kernels:
    - CUTLASS-backed MXFP4:MXFP4 kernel with block-scale reordering.
    - Prototype kernel for small batch sizes (no reordering required).
- Transformers Integration (PR #38696)
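
To make the rotation and Abs-Max items above more concrete, here is a minimal pure-Python reference sketch of what such a pipeline computes for one 32-element group: a normalized Hadamard rotation followed by abs-max MXFP4 quantization (FP4 E2M1 elements with one shared power-of-two scale). It is not the QuTLASS API or its fused CUDA kernel. The function names are hypothetical, the scale rule (largest power of two that keeps the group maximum within the FP4 range) is an assumption borrowed from the usual microscaling convention, and the Quartet scheme is not shown.

```python
import math

# Magnitudes representable by FP4 E2M1 (the element format used by MXFP4).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
GROUP = 32  # MXFP4 microscaling group size


def hadamard_rotate(block):
    """Normalized fast Walsh-Hadamard transform of a power-of-two-length list."""
    out = list(block)
    n = len(out)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    norm = 1.0 / math.sqrt(n)
    return [v * norm for v in out]


def quantize_absmax_mxfp4(block):
    """Return (shared_exp, fp4_values) for one group (hypothetical helper).

    Assumption: shared_exp = floor(log2(amax)) - 2, where 2 is the largest
    exponent of E2M1, so the scaled maximum lands in [4, 8) and is clamped to 6.
    """
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    _, e = math.frexp(amax)          # amax = m * 2**e with 0.5 <= m < 1
    shared_exp = (e - 1) - 2         # floor(log2(amax)) - emax(E2M1)
    scale = 2.0 ** shared_exp
    quantized = []
    for v in block:
        mag = min(abs(v) / scale, 6.0)  # clamp to the largest FP4 magnitude
        # Nearest grid point; ties resolve to the smaller magnitude here,
        # which is a simplification of hardware rounding.
        nearest = min(FP4_GRID, key=lambda g: abs(g - mag))
        quantized.append(math.copysign(nearest, v))
    return shared_exp, quantized


def dequantize(shared_exp, quantized):
    """Map FP4 values back to real numbers using the shared scale."""
    scale = 2.0 ** shared_exp
    return [v * scale for v in quantized]


if __name__ == "__main__":
    x = [1.0] + [0.0] * (GROUP - 1)      # a single outlier in the group
    rotated = hadamard_rotate(x)         # energy is spread over all 32 entries
    exp, q = quantize_absmax_mxfp4(rotated)
    print(exp, q[:4], dequantize(exp, q)[:4])
```

The example input is a group with a single outlier: the rotation spreads its energy evenly across all 32 entries, which is the usual motivation for rotating before quantizing to a 4-bit format.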