Stars
transformers
3 repositories
Understanding the Difficulty of Training Transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
Toolkit for attaching, training, saving, and loading new heads for transformer models