Wonderful Matrices to Build Small Language Models
-
Updated
Feb 15, 2025 - Python
Wonderful Matrices to Build Small Language Models
Code for paper "Achieving Sparse Activation in Small Language Models"
This repository provides everything you need to perform Supervised Fine-Tuning (SFT) of the Qwen2.5-Coder-1.5B-Instruct model—or any of its larger variants (7B, 14B, 32B)—on the Qwen Models, using the nvidia/OpenCodeReasoning dataset.
Add a description, image, and links to the small-language-model topic page so that developers can more easily learn about it.
To associate your repository with the small-language-model topic, visit your repo's landing page and select "manage topics."