This repository contains a simple training pipeline for a mini language model, along with basic notes on the fundamental concepts behind how a language model works, collected from various sources. The main file, `train_main.py`, walks through the following steps:
1. Text Preprocessing: Converting a sample text into tokenized inputs and targets using a vocabulary mapping dict (see the tokenization sketch after this list).
2. Embedding Layer: Mapping token indices to embedding vectors.
3. Positional Encoding: Adding learnable positional information to the embeddings.
4. Self-Attention: Computing self-attention over the sequence.
5. Transformer Block: Applying a transformer pipeline.
6. MiniLM Model: Combining the above components into the final model (see the model sketch after this list).
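As a rough illustration of step 1, the tokenization might look like the following sketch. The text, variable names, and whitespace-split vocabulary here are assumptions for illustration and not necessarily what `train_main.py` uses:

```python
# Build a toy vocabulary and turn text into (input, target) index pairs.
text = "the quick brown fox jumps over the lazy dog"
words = text.split()

# Vocabulary: word -> integer index
vocab = {word: idx for idx, word in enumerate(sorted(set(words)))}

# Token indices for the whole sequence
token_ids = [vocab[w] for w in words]

# Next-token prediction: inputs are tokens 0..n-2, targets are tokens 1..n-1
inputs = token_ids[:-1]
targets = token_ids[1:]

print(inputs, targets)
```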
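Steps 2 through 6 could fit together roughly as in the PyTorch sketch below. Class names, single-head attention, and the `max_len` argument are illustrative assumptions and may differ from the actual implementation:

```python
import torch
import torch.nn as nn

class SelfAttention(nn.Module):
    """Single-head scaled dot-product self-attention (illustrative)."""
    def __init__(self, embed_dim):
        super().__init__()
        self.query = nn.Linear(embed_dim, embed_dim)
        self.key = nn.Linear(embed_dim, embed_dim)
        self.value = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):                       # x: (batch, seq_len, embed_dim)
        q, k, v = self.query(x), self.key(x), self.value(x)
        scores = q @ k.transpose(-2, -1) / (x.size(-1) ** 0.5)
        weights = torch.softmax(scores, dim=-1)
        return weights @ v

class TransformerBlock(nn.Module):
    """Attention + feed-forward with residual connections and layer norm."""
    def __init__(self, embed_dim, hidden_dim):
        super().__init__()
        self.attn = SelfAttention(embed_dim)
        self.norm1 = nn.LayerNorm(embed_dim)
        self.ff = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )
        self.norm2 = nn.LayerNorm(embed_dim)

    def forward(self, x):
        x = self.norm1(x + self.attn(x))
        x = self.norm2(x + self.ff(x))
        return x

class MiniLM(nn.Module):
    """Token embedding + learnable positional encoding + transformer block + output head."""
    def __init__(self, vocab_size, embed_dim, hidden_dim, max_len=128):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, embed_dim)
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len, embed_dim))  # learnable positions
        self.block = TransformerBlock(embed_dim, hidden_dim)
        self.head = nn.Linear(embed_dim, vocab_size)

    def forward(self, token_ids):               # token_ids: (batch, seq_len)
        x = self.token_emb(token_ids) + self.pos_emb[:, :token_ids.size(1), :]
        x = self.block(x)
        return self.head(x)                     # logits over the vocabulary
```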
`train_main.py` uses argparse to accept the following command-line arguments (sketched below):

- `--embed_dim`: The size of the embedding vectors.
- `--hidden_dim`: The hidden dimension size used in the transformer block.
- `--lr`: The learning rate for the model.
- `--epochs`: Total number of epochs to train the transformer.
- `--dataset`: Whether to train on the full dataset or a subset.
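A minimal sketch of how this argument parsing might be set up, based only on the flags listed above; the help text, types, and `full`/`subset` choices are assumptions:

```python
import argparse

def parse_args():
    # Flags mirror those listed in the README; exact defaults may differ.
    parser = argparse.ArgumentParser(description="Train the mini language model")
    parser.add_argument("--embed_dim", type=int, required=True, help="Size of the embedding vectors")
    parser.add_argument("--hidden_dim", type=int, required=True, help="Hidden dimension of the transformer block")
    parser.add_argument("--lr", type=float, required=True, help="Learning rate")
    parser.add_argument("--epochs", type=int, required=True, help="Number of training epochs")
    parser.add_argument("--dataset", choices=["full", "subset"], required=True,
                        help="Train on the full dataset or a subset")
    return parser.parse_args()

if __name__ == "__main__":
    args = parse_args()
    print(args)
```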
Change the parameters as per your convenience:
```bash
python train_main.py --embed_dim 16 --hidden_dim 64 --lr 0.01 --epochs 100 --dataset subset
python generate.py --prompt *ENTER THE PROMPT HERE*
```
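For reference, a greedy decoding loop of the kind `generate.py` might use could look like the sketch below. This is only an illustration of the idea; the actual script's tokenization, checkpoint loading, and sampling strategy may differ:

```python
import torch

def generate(model, prompt_ids, max_new_tokens=20):
    """Greedy decoding: repeatedly append the most likely next token.

    `model` is assumed to return logits of shape (batch, seq_len, vocab_size),
    as in the MiniLM sketch above. `prompt_ids` is a list of token indices.
    """
    model.eval()
    token_ids = torch.tensor([prompt_ids])          # shape: (1, seq_len)
    with torch.no_grad():
        for _ in range(max_new_tokens):
            logits = model(token_ids)               # (1, seq_len, vocab_size)
            next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)
            token_ids = torch.cat([token_ids, next_id], dim=1)
    return token_ids[0].tolist()
```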