This project implements the decoder part of the "Attention is All You Need" architecture, following Andrej Karpathy's video explanation.
View Colab Notebook »
Dataset • Paper • Karpathy's video
This project provides a step-by-step implementation of the decoder component from the "Attention is All You Need" transformer architecture, guided by Andrej Karpathy's insightful video. The goal is to offer a clear and accessible understanding of how transformers work by building the decoder from scratch.
Key Features:
- Educational Focus: Designed to help learners grasp the intricacies of transformers through a hands-on approach.
- Step-by-Step Implementation: Mirrors the structure and explanations in Karpathy's video, making it easy to follow along.
- Decoder-Centric: Focuses specifically on the decoder component, which is crucial for understanding sequence generation tasks.
- From-Scratch Approach: Emphasizes building the model without relying on high-level libraries, promoting a deeper understanding of the underlying mechanisms (a minimal sketch follows this list).
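To give a flavour of what "from scratch" means here, below is a minimal, hypothetical sketch of a single decoder block in the spirit of Karpathy's walkthrough: one layer of masked (causal) multi-head self-attention followed by a feed-forward network, with residual connections and layer norm. The names (`Head`, `DecoderBlock`, `n_embd`, `block_size`, ...) are illustrative assumptions and may differ from the identifiers used in the actual notebook.

```python
# Hypothetical sketch of a decoder block; names are illustrative, not the notebook's exact code.
import torch
import torch.nn as nn
from torch.nn import functional as F

class Head(nn.Module):
    """One head of masked (causal) self-attention."""
    def __init__(self, n_embd, head_size, block_size):
        super().__init__()
        self.key = nn.Linear(n_embd, head_size, bias=False)
        self.query = nn.Linear(n_embd, head_size, bias=False)
        self.value = nn.Linear(n_embd, head_size, bias=False)
        # lower-triangular mask so each position only attends to the past
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape
        k, q, v = self.key(x), self.query(x), self.value(x)
        # scaled dot-product attention scores
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5      # (B, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        return wei @ v                                           # (B, T, head_size)

class DecoderBlock(nn.Module):
    """Masked multi-head self-attention + feed-forward, with residuals and layer norm."""
    def __init__(self, n_embd, n_head, block_size):
        super().__init__()
        head_size = n_embd // n_head
        self.heads = nn.ModuleList(Head(n_embd, head_size, block_size) for _ in range(n_head))
        self.proj = nn.Linear(n_embd, n_embd)
        self.ffwd = nn.Sequential(
            nn.Linear(n_embd, 4 * n_embd), nn.ReLU(), nn.Linear(4 * n_embd, n_embd)
        )
        self.ln1 = nn.LayerNorm(n_embd)
        self.ln2 = nn.LayerNorm(n_embd)

    def forward(self, x):
        x = x + self.proj(torch.cat([h(self.ln1(x)) for h in self.heads], dim=-1))
        x = x + self.ffwd(self.ln2(x))
        return x

# quick shape check
x = torch.randn(4, 8, 32)                                        # (batch, time, n_embd)
print(DecoderBlock(n_embd=32, n_head=4, block_size=8)(x).shape)  # torch.Size([4, 8, 32])
```

In a full model, several such blocks are stacked on top of token and positional embeddings, followed by a final linear layer that projects back to vocabulary logits for next-token prediction.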
We assume no responsibility for improper use of this code or anything related to it, nor for any damage caused to people and/or property through its use.
By using this code, even in part, you release the developers from any liability. If the software relies on third-party components,
their individual licenses are listed in the following section.
Software list:
| Software | License owner | License type | Link |
|---|---|---|---|
| PyTorch | PyTorch | Multiple | here |
Copyright (C) by Pietrobon Andrea
Release date: 15-09-2024