Releases: IST-DASLab/qutlass

v0.0.1

15 Jul 09:58

Core features of QuTLASS v0.0.1:

  • MXFP4 microscaling support with weight and activation quantization (W4A4).
  • Online rotations: fused kernel for Hadamard transforms, quantization, and scale computation.
    • Hadamard sizes matching the microscaling group sizes (i.e., 32 for MXFP4).
    • Compatible with any user-defined rotation matrix, since rotations are loaded at runtime.
  • Multiple quantization schemes:
  • Matmul kernels:
    • CUTLASS-backed MXFP4:MXFP4 kernel with block-scale reordering.
    • Prototype kernel for small batch sizes (no reordering required).
  • Transformers Integration (PR #38696)
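
For readers unfamiliar with the rotate-then-quantize pipeline listed above, here is a minimal NumPy reference sketch of what the fused step computes. It is not the QuTLASS API: the function names (`hadamard`, `rotate_and_quantize`, `dequantize`) are hypothetical, the scale rule (a power-of-two shared scale of 2^(floor(log2(amax)) - 2) per group of 32) is an assumption about the rounding convention, and ties are resolved to the lower grid point for simplicity. The rotation is passed in as an argument, mirroring the note that rotation matrices are loaded at runtime.

```python
import numpy as np

GROUP = 32  # MXFP4 group size; the Hadamard size matches it
# Magnitudes representable by a 4-bit E2M1 element
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def hadamard(n):
    """Orthonormal n x n Hadamard matrix (Sylvester construction, n a power of two)."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

def rotate_and_quantize(x, rot):
    """Rotate each group of 32 values, then snap it to the E2M1 grid.

    Returns the quantized values (in scaled units) and one power-of-two
    scale per group. Assumes len(x) is a multiple of GROUP.
    """
    g = x.reshape(-1, GROUP) @ rot
    amax = np.abs(g).max(axis=1, keepdims=True)
    # Shared exponent: floor(log2(amax)) minus 2, the largest E2M1 exponent.
    # All-zero groups fall back to amax = 1 to avoid log2(0).
    exp = np.floor(np.log2(np.where(amax > 0, amax, 1.0))) - 2
    scale = 2.0 ** exp
    scaled = np.clip(np.abs(g) / scale, 0.0, FP4_GRID[-1])
    # Pick the closest grid magnitude for every element
    idx = np.abs(scaled[..., None] - FP4_GRID).argmin(axis=-1)
    q = np.sign(g) * FP4_GRID[idx]
    return q, scale

def dequantize(q, scale, rot):
    """Undo the scaling, then the rotation (rot is orthonormal, so its inverse is rot.T)."""
    return ((q * scale) @ rot.T).reshape(-1)
```

A typical round trip would be `q, s = rotate_and_quantize(x, hadamard(32))` followed by `dequantize(q, s, hadamard(32))`. Any other orthonormal 32 x 32 matrix can be substituted for `hadamard(32)` without changing the rest of the sketch.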