Wonderful Matrices to Build Small Language Models
Updated Feb 15, 2025 - Python
Code for the paper "Achieving Sparse Activation in Small Language Models"
This repository provides everything you need to perform Supervised Fine-Tuning (SFT) of the Qwen2.5-Coder-1.5B-Instruct model, or any of its larger variants (7B, 14B, 32B), using the nvidia/OpenCodeReasoning dataset.
AtomMind is a lightweight Small Scientific Language Model (SSLM) for reasoning across Math, Physics, Chemistry, and Biology using domain experts, symbolic reasoning, and optimization modules. It supports optional memory and self-monitoring to improve problem-solving and accuracy.
Applied Decision Architecture Matrix - Small Language Model
GraphRAG-powered Small Language Model for experimentation and statistical analysis Q&A
A small language model in the early stages of development
A Streamlit demo using an SLM (Phi) and RAG to showcase how an AI can help users with learning functionality on a website
Transformer-based Calculator
Small Language Model implementation based on the Mamba (SSM) architecture. Applies the Muon optimizer to the 2D weight matrices while using the stable AdamW optimizer for all other parameters.
Excel-formula SLM that turns natural-language instructions into Excel formulas.
🚀 150M Language Model with Latent MoE architecture, trained from scratch on a single GPU.
Building GPTs from the ground up. A hands-on journey through attention mechanisms, tokenization, and training loops.
Small Language Models (SLMs) for medication Named Entity Recognition (NER)
Small Language Model trained to generate specialised visual novel scripts.
Small language model with a number of dense model presets, tool calling, and chain-of-thought reasoning, with an MoE architecture planned for the future. Optimized for on-device training on Apple Silicon Macs.