End-to-end implementation of Transformers using PyTorch from scratch
References:
- Transformers Blog: https://pastoral-cloudberry-567.notion.site/Transformers-fdc33a784ae64e138bd6bf1e19f2bbdf
- Attention Is All You Need: https://arxiv.org/pdf/1706.03762
- HuggingFace Course: https://huggingface.co/learn/nlp-course/en/chapter1/3?fw=pt
Implementing an end-to-end Transformer model using PyTorch from scratch, and training it to generate paragraphs given a keyword or phrase as input.
- TransformerModel.py --> Model class containing the full architecture and logic of the Transformer model
- train_beta.ipynb --> Jupyter notebook to train the model and run sample inference on the trained model
- trained-transformer_model.pth --> Trained model checkpoint (saved state dict)
- Articles.xlsx --> Dataset used to train the model (https://www.kaggle.com/datasets/asad1m9a9h6mood/news-articles)
- requirements.txt --> pip freeze of dependencies
The model takes a keyword or phrase, tokenizes it, and iteratively generates text by predicting the next token in the sequence. It uses embeddings, positional encoding, and an encoder-decoder architecture to produce coherent text, while sampling strategies such as temperature scaling and top-k sampling help produce varied and natural outputs.
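A minimal sketch of that generation loop, assuming the model's forward pass returns logits of shape (batch, seq_len, vocab_size); the call signature here is illustrative, not the exact API in TransformerModel.py (the actual encoder-decoder forward may take separate source and target tensors):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def generate(model, prompt_ids, max_new_tokens=100, temperature=0.8, top_k=50):
    """Iteratively sample next tokens from the model's logits."""
    model.eval()
    ids = prompt_ids  # (1, seq_len) tensor of token ids for the prompt
    for _ in range(max_new_tokens):
        logits = model(ids)                      # (1, seq_len, vocab_size)
        logits = logits[:, -1, :] / temperature  # temperature-scale the last step
        # top-k filtering: mask out everything below the k-th largest logit
        k = min(top_k, logits.size(-1))
        kth = torch.topk(logits, k).values[:, -1, None]
        logits = logits.masked_fill(logits < kth, float("-inf"))
        probs = F.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)  # sample, not argmax
        ids = torch.cat([ids, next_id], dim=1)
    return ids
```

Lowering the temperature sharpens the distribution toward greedy decoding, while top-k caps how many candidate tokens can be sampled at each step.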
Hardware used:
- CPU: Intel i7-10750H (2.60 GHz)
- RAM: 16 GB
- GPU: NVIDIA GeForce RTX 2060 (6 GB)
- Create virtual environment
virtualenv env
- Activate virtual environment
./env/Scripts/activate
- Install dependencies
pip install -r requirements.txt
Dashboard that generates a paragraph using the trained model when given a keyword or phrase as input.
- Run the Streamlit app
streamlit run app.py
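A minimal sketch of what app.py might contain; the `TransformerModel` constructor arguments and the `generate_text` helper are assumptions about this codebase, not its actual API:

```python
import streamlit as st
import torch

from TransformerModel import TransformerModel  # model class from this repo

@st.cache_resource  # load the checkpoint once and reuse it across reruns
def load_model():
    model = TransformerModel()  # hypothetical: real constructor args may differ
    state = torch.load("trained-transformer_model.pth", map_location="cpu")
    model.load_state_dict(state)
    model.eval()
    return model

st.title("Paragraph Generator")
prompt = st.text_input("Enter a keyword or phrase")
if st.button("Generate") and prompt:
    model = load_model()
    # generate_text is a hypothetical wrapper that tokenizes the prompt,
    # runs the sampling loop sketched earlier, and decodes the token ids
    st.write(generate_text(model, prompt))
```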