This repository contains an implementation of a Transformer model built from scratch in PyTorch. The model is trained for English-to-Arabic translation on the `news_commentary` dataset.
- Custom Transformer Architecture: Implemented from scratch in PyTorch, inspired by Mr. Umar Jamil's code and his YouTube video.
- Monitoring and Logging: Utilized PyTorch Ignite for efficient monitoring, logging, and checkpointing during training.
- Dataset Handling: Leveraged Hugging Face's `datasets` library for easy access to the `news_commentary` dataset and tokenization.
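For reference, rows in Hugging Face translation datasets follow a `translation` schema: each example is a dict mapping language codes to aligned sentences. Below is a minimal sketch of unpacking one English–Arabic pair; the `"ar-en"` config name in the comment is an assumption, so check the dataset card on the Hub before running the download:

```python
# To fetch the real data (downloads on first run), something like:
#   from datasets import load_dataset
#   raw = load_dataset("news_commentary", "ar-en", split="train")
# Each row then looks like the stub example below.

# Stub row following the `datasets` translation schema.
example = {"translation": {"en": "Economic growth slowed.",
                           "ar": "تباطأ النمو الاقتصادي."}}

def get_pair(ex, src="en", tgt="ar"):
    """Return the (source, target) sentence pair from one dataset row."""
    return ex["translation"][src], ex["translation"][tgt]

src_text, tgt_text = get_pair(example)
print(src_text, "->", tgt_text)
```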
- `model`: Contains the architecture of the Transformer model, including the implementation of its components.
- `config`: Holds configuration files and hyperparameters for training. Adjust these files to modify the training settings and parameters.
- `train`: Contains scripts and modules related to training the model. This is where the training process is defined and executed.
- `utils`: Includes utility functions and classes, particularly PyTorch Ignite handlers for monitoring, logging, and checkpointing.
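To give a flavor of what lives in `model`, the core operation of any from-scratch Transformer is scaled dot-product attention. The sketch below is illustrative only, not the repository's actual code:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions (mask == 0) get -inf so softmax gives them 0 weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

# Self-attention over a batch of 2 sequences, 4 tokens, dimension 8.
q = torch.randn(2, 4, 8)
out, attn = scaled_dot_product_attention(q, q, q)
print(out.shape, attn.shape)  # torch.Size([2, 4, 8]) torch.Size([2, 4, 4])
```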
To get started with this repository, follow these steps:
- Clone the Repository:

  ```bash
  git clone https://github.com/Mo-Ouail-Ocf/transformers-from-scratch.git
  cd transformers-from-scratch
  ```
- Set Up Your Environment:

  Create a Conda environment from the provided `env.yml` file, then activate it:

  ```bash
  conda env create -f env.yml
  conda activate transformers-env
  ```
Note: This code is compatible with Python 3.11.
- Results and Visualization: Implement visualization tools for attention scores to better understand model performance and behavior.
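One way such an attention-score visualization might look is a heatmap over source and target token positions. This is a hypothetical sketch with random weights standing in for real attention scores:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; no display required
import matplotlib.pyplot as plt

# Random weights stand in for a real decoder cross-attention matrix.
rng = np.random.default_rng(0)
attn = rng.random((5, 7))                       # (target_len, source_len)
attn = attn / attn.sum(axis=-1, keepdims=True)  # rows sum to 1, like softmax

fig, ax = plt.subplots()
im = ax.imshow(attn, cmap="viridis", aspect="auto")
ax.set_xlabel("source (English) token position")
ax.set_ylabel("target (Arabic) token position")
fig.colorbar(im, ax=ax, label="attention weight")
fig.savefig("attention.png", bbox_inches="tight")
```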
- Special thanks to Mr. Umar Jamil for his exceptional resources and tutorials, which provided valuable insights into implementing and understanding Transformer models.
- For more information and in-depth tutorials, visit Mr. Umar Jamil's YouTube channel.