Dive into advanced quantization techniques. Learn to implement and customize linear quantization functions, measure quantization error, and compress model weights using PyTorch for efficient and accessible AI models.
machine-learning quantization model-compression linear-quantization quantization-error ai-optimization advanced-quantization symmetric-quantization asymmetric-quantization per-tensor-granularity per-channel-granularity per-group-granularity pytorch-quantizer weight-packing 8-bit-compression 2-bit-weights
Updated May 22, 2024 - Jupyter Notebook
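
The description mentions implementing linear quantization and measuring quantization error. As a rough illustration only, here is a minimal sketch of asymmetric per-tensor linear quantization. It is written in dependency-free Python rather than PyTorch so it runs anywhere. The function names (`linear_quantize`, `linear_dequantize`, `mean_squared_error`) and the sample weights are assumptions made for this example, not code taken from the repository.

```python
def linear_quantize(values, bits=8):
    """Asymmetric per-tensor linear quantization of floats to signed integers."""
    q_min, q_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    r_min, r_max = min(values), max(values)
    # Scale maps the real range onto the integer range; fall back to 1.0
    # when the input is constant and the range would otherwise be zero.
    scale = (r_max - r_min) / (q_max - q_min) or 1.0
    # Zero point is the integer that represents the real value 0.0.
    zero_point = int(round(q_min - r_min / scale))
    zero_point = max(q_min, min(q_max, zero_point))
    quantized = [
        max(q_min, min(q_max, int(round(v / scale + zero_point))))
        for v in values
    ]
    return quantized, scale, zero_point


def linear_dequantize(quantized, scale, zero_point):
    """Map quantized integers back to approximate real values."""
    return [scale * (q - zero_point) for q in quantized]


def mean_squared_error(original, restored):
    """Quantization error measured as the mean squared difference."""
    return sum((a - b) ** 2 for a, b in zip(original, restored)) / len(original)


# Hypothetical weights, chosen only to exercise the functions.
weights = [-1.5, -0.3, 0.0, 0.7, 2.1]
q, scale, zero_point = linear_quantize(weights, bits=8)
restored = linear_dequantize(q, scale, zero_point)
error = mean_squared_error(weights, restored)
print(q, scale, zero_point, error)
```

A symmetric variant would fix the zero point at 0 and derive the scale from the largest absolute value. Per-channel or per-group granularity would apply the same steps to each row or group of weights separately.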