Stars
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, Qwen3, Llama 4, DeepSeek-R1, Gemma 3, TTS 2x faster with 70% less VRAM.
This repository contains the training code for ParetoQ, introduced in our work "ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantization"
⚖️ The First Coding Agent-as-a-Judge
Code repo for the paper "SpinQuant: LLM Quantization with Learned Rotations"
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024.
Code repo for the paper "LLM-QAT: Data-Free Quantization Aware Training for Large Language Models"