Independent ML research. Building efficient LLMs for edge deployment.
PentaNet — Native pentanary {-2,-1,0,+1,+2} quantization for LLMs
−6.4% PPL vs BitNet at 124M params · 3 seeds · WikiText-103
Paper · Code · Model
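A minimal sketch of pentanary weight quantization, for illustration only: the absmean scaling rule below is an assumption (borrowed from BitNet-style quantizers), not necessarily PentaNet's actual method.

```python
import numpy as np

def pentanary_quantize(w, levels=(-2, -1, 0, 1, 2)):
    # Per-tensor scale mapping weights onto the pentanary grid.
    # Assumption: absmean scaling, as in BitNet-style quantizers;
    # PentaNet's actual scaling rule is not specified here.
    scale = np.mean(np.abs(w)) + 1e-8
    grid = np.asarray(levels, dtype=w.dtype)
    # Round each scaled weight to its nearest grid level.
    q = grid[np.argmin(np.abs(w[..., None] / scale - grid), axis=-1)]
    return q, scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = pentanary_quantize(w)
# q holds only values from {-2, -1, 0, +1, +2}; dequantize as q * s
```

With all weights on a five-value grid, a matmul reduces to additions, subtractions, and doublings, which is what makes this attractive for edge inference.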
ShiftQuant — Systematic analysis of shift-based PTQ on existing LLMs
Proves no 7-value grid escapes the coverage/gap tradeoff · AWQ recovers 22%
Paper · Code
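A toy sketch of the kind of 7-value shift grid ShiftQuant analyzes: zero plus ±2^k for three exponents, so every multiply becomes a bit shift. The exponent choice below is an arbitrary example, not a grid from the paper.

```python
import numpy as np

def shift_quantize(w, exponents=(0, 1, 2)):
    # 7-value shift grid: {0} ∪ {±2^k : k in exponents}.
    # Assumption: exponents (0, 1, 2) are one illustrative choice;
    # the paper sweeps over all such 7-value grids.
    grid = np.array([0.0] + [s * 2.0**k for s in (-1, 1) for k in exponents])
    q = grid[np.argmin(np.abs(w[..., None] - grid), axis=-1)]
    return q

w = np.linspace(-5, 5, 11)
q = shift_quantize(w)
# Larger exponents extend coverage to outliers but widen the gaps
# between levels near zero: the coverage/gap tradeoff in one line.
```

Shifting the exponents up or down in this sketch makes the tradeoff concrete: you trade clipping error at the tails for rounding error in the bulk of the distribution.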
PyTorch · Triton · AVX2 · WikiText-103 · RTX 5080
