
MiniLLM: An implementation of a Qwen3-like small language model

This SLM (Small Language Model) was inspired by Karpathy's video on GPT-2, with a twist: the model has been made more production-ready and closer to trending models such as Alibaba's Qwen 3. In short, it starts from Karpathy's material, adds Qwen-style attention and embedding mechanisms on top, and is now a fully open-sourced pretrained model.

This project was started by Muhammadreza Haghiri (active on X under the handle @haghiri_ai), the founder of Mann-E, which was the first generative AI platform with pretrained/fine-tuned models in Iran. This model is an effort by Mann-E to make AI more accessible and democratized for everyone.

You can also take a look at the Hugging Face organization to access and download checkpoints.

Download checkpoints

Release notes

Since adding all the notes to this README.md would make it unnecessarily long, we've added a CHANGES.md file; all changelogs and release notes will go there.

How to contribute

We have a contribution guide that you can study. If you want to contribute to the project, please read that file first.

How to run

Prerequisites (For training)

  • A good high-end NVIDIA GPU with CUDA support (tested on Google Colab's T4 as the bare minimum and on B200s for faster training)
  • Linux operating system
  • Python

Prerequisites (For inference)

  • A consumer-level NVIDIA GPU with CUDA support (such as an RTX 2050)
  • Python
  • Linux is recommended. If you're a Windows user, you can run the code on WSL

Run training scripts

First, create a Python virtual environment like this:

python3 -m venv .venv

Then activate your environment:

source .venv/bin/activate

After activating it, install the required libraries by running the following command:

pip install -r requirements.txt

Once the libraries are installed, you may adjust the hyperparameters in the model_params.json file; you can use the params_calculator.py script to find out how big the resulting model will be (an illustrative configuration is shown after the training command below). Then you only need to run the training script:

python3 train.py
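
For reference, the hyperparameters in model_params.json correspond to the fields described in the parameter calculator guide below. The exact keys and default values are defined by the repository, so the snippet here is purely illustrative:

{
  "d_model": 768,
  "n_heads": 12,
  "n_layers": 12,
  "d_ff": 2048,
  "vocab": 49152
}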

After training is done, you will find a few .pt files in the working directory; these are your model files, ready for inference.

NOTE: The current model is built to support English only, and things may change in the future to add multilinguality to the model. This means the tokenizer and other parts borrowed from other models may change as well.

Run inference scripts

To run inference on the model you have created, use inference.py; the script comes with a few flags and options. Alternatively, you can download a model from Hugging Face.

  • --model-path : Path to the model (.pt) file.
  • --prompt : The text to be completed.
  • --tokenizer : Your desired tokenizer. Since the training script currently uses the SmolLM 135M tokenizer, the same default applies to inference. This may change in the future.
  • --max_new_tokens : The maximum number of new tokens to generate. Since the current training uses a limit of 512, the maximum is 512; if you change it during training, this flag can be tweaked accordingly.
  • --temperature : Controls how creative (random) the sampling is. Setting it to 0 makes decoding greedy and more likely to reproduce the training data. (A generic sampling sketch follows this list.)
  • --top_p : Nucleus sampling; only the smallest set of tokens whose cumulative probability reaches this threshold is considered.
  • --top_k : Restricts sampling to the k most probable next tokens.
  • --seed : Random seed controlling the sampling, for reproducible outputs.
  • --max_seq_length : The maximum number of tokens the model accepts as input (context length).
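
The sampling flags interact roughly as follows: temperature rescales the logits, top_k keeps only the k most probable tokens, and top_p keeps the smallest set of tokens whose cumulative probability reaches the threshold. The snippet below is a generic PyTorch sketch of this scheme, not the exact code in inference.py, and the function name is illustrative:

import torch

def sample_next_token(logits, temperature=0.8, top_k=50, top_p=0.9):
    # Generic temperature / top-k / top-p sampling over a 1-D logits tensor.
    if temperature == 0:
        return int(torch.argmax(logits))           # greedy decoding
    logits = logits / temperature                  # sharpen or flatten the distribution
    if top_k is not None and top_k > 0:
        kth_value = torch.topk(logits, min(top_k, logits.numel())).values[-1]
        logits = logits.masked_fill(logits < kth_value, float("-inf"))
    probs = torch.softmax(logits, dim=-1)
    if top_p is not None and top_p < 1.0:
        sorted_probs, sorted_idx = torch.sort(probs, descending=True)
        cumulative = torch.cumsum(sorted_probs, dim=-1)
        # Drop tokens outside the nucleus, keeping the token that crosses top_p.
        sorted_probs[cumulative - sorted_probs > top_p] = 0.0
        probs = torch.zeros_like(probs).scatter_(-1, sorted_idx, sorted_probs)
        probs = probs / probs.sum()
    return int(torch.multinomial(probs, num_samples=1))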

NOTE: All of the flags except --prompt have default values. You may change them to get the best results. A full example invocation is shown below.
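
Putting it together, a typical invocation might look like the following (the checkpoint filename and the sampling values are illustrative, not the script's defaults):

python3 inference.py --model-path model.pt --prompt "Once upon a time" --max_new_tokens 256 --temperature 0.8 --top_p 0.9 --top_k 50 --seed 42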

Parameter calculator guide

  • d_model : Embedding size, i.e., the dimensionality of each token representation.
  • n_heads : Number of attention heads per layer.
  • n_layers : Number of transformer layers in the model.
  • d_ff : Dimensionality of the feed-forward layer.
  • vocab : Vocabulary size (determined by the tokenization process).
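
For a rough sense of how these values translate into a parameter count, the sketch below estimates the size of a standard decoder-only transformer with a SwiGLU feed-forward block, ignoring biases, normalization weights and any grouped-query attention. It is a back-of-the-envelope approximation, not the formula used by params_calculator.py, and the example values are illustrative:

def estimate_params(d_model, n_heads, n_layers, d_ff, vocab):
    # n_heads does not change the total as long as head_dim = d_model // n_heads.
    embedding = vocab * d_model            # token embedding table
    attention = 4 * d_model * d_model      # q, k, v and output projections per layer
    feed_forward = 3 * d_model * d_ff      # gate, up and down projections (SwiGLU) per layer
    lm_head = vocab * d_model              # output projection (0 if tied with the embeddings)
    return embedding + n_layers * (attention + feed_forward) + lm_head

# Illustrative hyperparameters, not the repository's defaults.
print(f"{estimate_params(d_model=768, n_heads=12, n_layers=12, d_ff=2048, vocab=49152):,}")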

TODO List

  • requirements.txt file.
  • Add a license to this repository.
  • Upload the model to Hugging Face.
  • Make a more accurate model (on English).
  • Change the tokenizer to a better one.
  • Provide a fine-tuning script for instruction following.
  • Port the weights to safetensors.
  • Make the models compatible with Hugging Face transformers pipelines.
  • Make the training script work on multiple GPUs (this will make training bigger models possible).

Support The Project

You can support this project with donations. Donations are currently accepted in cryptocurrency at the following wallets:

  • Solana: GNJWgRmgRd7S9VrhCcJnVNTtAiQGTWrya9gcGb985x2m
  • Ethereum: 0xa2dd3D50DE0Fc12fAd946606cd853B2a972d8de8
  • Sui: 0x943c1190bae9a052879c1861833621e20545bc33a8c990d48cc3bb8e7b1ac00b
  • Polygon: 0xa2dd3D50DE0Fc12fAd946606cd853B2a972d8de8
  • Base: 0xa2dd3D50DE0Fc12fAd946606cd853B2a972d8de8
  • Bitcoin (Taproot): bc1pgtgd3uymvdxycelu06zz3sgrt47rccw2zk9u550e4de6tzgngz2s738gsn
  • Bitcoin (Native Segwit): bc1q85drn275ugetvleha6egp7a8u0ramyf39zg4wj
