This is a step-by-step implementation of the "Attention Is All You Need" paper (2017). The model was originally coded by Umar Jamil, but I did my homework and reimplemented it step by step with debugging, which wasn't a simple task.
preview.mp4
- Automatic mixed precision for the forward pass in the training and validation loops (the points in this list are illustrated in the sketch below it).
- The ability to train on a smaller subset of the data instead of the entire dataset -> config['data_subset_ratio'].
- Set pin_memory=True in the DataLoader to enable faster data transfer to CUDA-enabled GPUs -> reference ['https://pytorch.org/docs/stable/data.html'].
- In the validation loop, the inference_mode context manager is used to squeeze out every bit of performance -> reference ['https://pytorch.org/docs/stable/notes/autograd.html#inference-mode'].
- With the Adam optimizer, the fused kernel flag is used to speed up computation -> reference ['https://pytorch.org/docs/stable/optim.html'].
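Roughly how the points above fit together, as a minimal sketch rather than the repo's actual training script. The model's forward signature, the helper names, and every config key except `data_subset_ratio` are placeholders:

```python
import torch
from torch.utils.data import DataLoader, Subset

def build_dataloader(dataset, config):
    # Optionally train on a smaller slice of the data (config['data_subset_ratio']).
    ratio = config.get('data_subset_ratio', 1.0)
    if ratio < 1.0:
        dataset = Subset(dataset, range(int(len(dataset) * ratio)))
    # pin_memory=True speeds up host-to-GPU transfers on CUDA machines.
    return DataLoader(dataset, batch_size=config['batch_size'],
                      shuffle=True, pin_memory=True)

def train_one_epoch(model, loader, optimizer, loss_fn, scaler, device):
    model.train()
    for src, tgt, labels in loader:  # placeholder batch layout
        src, tgt, labels = src.to(device), tgt.to(device), labels.to(device)
        optimizer.zero_grad(set_to_none=True)
        # Automatic mixed precision for the forward pass.
        with torch.autocast(device_type='cuda', dtype=torch.float16):
            logits = model(src, tgt)  # placeholder forward signature
            loss = loss_fn(logits.view(-1, logits.size(-1)), labels.view(-1))
        # GradScaler keeps fp16 gradients from underflowing.
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

@torch.inference_mode()  # cheaper than no_grad: no autograd bookkeeping at all
def validate(model, loader, device):
    model.eval()
    for src, tgt, _ in loader:
        with torch.autocast(device_type='cuda', dtype=torch.float16):
            logits = model(src.to(device), tgt.to(device))
        # ... compute validation loss / greedy-decode translations here ...

# fused=True runs Adam's parameter update as a single CUDA kernel (CUDA tensors only).
# optimizer = torch.optim.Adam(model.parameters(), lr=config['lr'], fused=True)
# scaler = torch.cuda.amp.GradScaler()
```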
Basically, it's an English-to-Italian (en-it) translator, and the dataset was imported from Hugging Face.
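For reference, a minimal way to pull an English-Italian parallel corpus from Hugging Face; `opus_books` and its `en-it` config are an assumption here, not necessarily the exact dataset this repo uses:

```python
from datasets import load_dataset

# Hypothetical corpus choice: Hugging Face's opus_books, English-Italian pair.
raw_ds = load_dataset("opus_books", "en-it", split="train")
print(raw_ds[0])  # e.g. {'id': '0', 'translation': {'en': '...', 'it': '...'}}
```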