This repository contains an implementation of a Transformer model built from scratch in PyTorch. The model is trained for English-to-Arabic translation on the `news_commentary` dataset.
- Custom Transformer Architecture: Implemented from scratch in PyTorch, inspired by Mr. Umar Jamil's code and his YouTube video.
- Monitoring and Logging: Utilized PyTorch Ignite for efficient monitoring, logging, and checkpointing during training.
- Dataset Handling: Leveraged Hugging Face's `datasets` library for easy access to the `news_commentary` dataset and tokenization.
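For reference, rows in Hugging Face translation datasets follow a `translation` schema: each example is a dict mapping language codes to aligned sentences. Below is a minimal sketch of unpacking one English–Arabic pair; the `"ar-en"` config name in the comment is an assumption, so check the dataset card on the Hub before running the download:

```python
# To fetch the real data (downloads on first run), something like:
#   from datasets import load_dataset
#   raw = load_dataset("news_commentary", "ar-en", split="train")
# Each row then looks like the stub example below.

# Stub row following the `datasets` translation schema.
example = {"translation": {"en": "Economic growth slowed.",
                           "ar": "تباطأ النمو الاقتصادي."}}

def get_pair(ex, src="en", tgt="ar"):
    """Return the (source, target) sentence pair from one dataset row."""
    return ex["translation"][src], ex["translation"][tgt]

src_text, tgt_text = get_pair(example)
print(src_text, "->", tgt_text)
```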
- `model`: Contains the architecture of the Transformer model, including the implementation of its components.
- `config`: Holds configuration files and hyperparameters for training. Adjust these files to modify the training settings and parameters.
- `train`: Contains scripts and modules related to training the model. This is where the training process is defined and executed.
- `utils`: Includes utility functions and classes, particularly PyTorch Ignite handlers for monitoring, logging, and checkpointing.
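To give a flavor of what lives in `model`, the core operation of any from-scratch Transformer is scaled dot-product attention. The sketch below is illustrative only, not the repository's actual code:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)
    if mask is not None:
        # Masked positions (mask == 0) get -inf so softmax gives them 0 weight.
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

# Self-attention over a batch of 2 sequences, 4 tokens, dimension 8.
q = torch.randn(2, 4, 8)
out, attn = scaled_dot_product_attention(q, q, q)
print(out.shape, attn.shape)  # torch.Size([2, 4, 8]) torch.Size([2, 4, 4])
```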
To get started with this repository, follow these steps:
- Clone the Repository:

  ```bash
  git clone https://github.com/Mo-Ouail-Ocf/transformers-from-scratch.git
  cd transformers-from-scratch
  ```
- Set Up Your Environment:

  Create a Conda environment from the provided `env.yml` file, then activate it:

  ```bash
  conda env create -f env.yml
  conda activate transformers-env
  ```
Note: This code is compatible with Python 3.11.
- Results and Visualization: Implement visualization tools for attention scores to better understand model performance and behavior.
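One way such an attention-score visualization might look is a heatmap over source and target token positions. This is a hypothetical sketch with random weights standing in for real attention scores:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; no display required
import matplotlib.pyplot as plt

# Random weights stand in for a real decoder cross-attention matrix.
rng = np.random.default_rng(0)
attn = rng.random((5, 7))                       # (target_len, source_len)
attn = attn / attn.sum(axis=-1, keepdims=True)  # rows sum to 1, like softmax

fig, ax = plt.subplots()
im = ax.imshow(attn, cmap="viridis", aspect="auto")
ax.set_xlabel("source (English) token position")
ax.set_ylabel("target (Arabic) token position")
fig.colorbar(im, ax=ax, label="attention weight")
fig.savefig("attention.png", bbox_inches="tight")
```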
- Special thanks to Mr. Umar Jamil for his exceptional resources and tutorials, which provided valuable insights into implementing and understanding Transformer models.
- For more information and in-depth tutorials, visit Mr. Umar Jamil's YouTube channel.