A package containing popular learning-rate schedulers, implemented in Keras/TensorFlow.
Updated Oct 5, 2022
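Cosine decay anneals the learning rate from its initial value toward a floor following a half-cosine curve. A minimal pure-Python sketch of the schedule (mirroring the semantics of Keras's built-in `CosineDecay`, with `alpha` as the fraction of the initial rate to decay toward; the function name and defaults here are illustrative):

```python
import math

def cosine_decay(step, initial_lr, decay_steps, alpha=0.0):
    """Cosine decay from initial_lr toward alpha * initial_lr over decay_steps."""
    step = min(step, decay_steps)  # hold at the floor once decay_steps is reached
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
    decayed = (1.0 - alpha) * cosine + alpha
    return initial_lr * decayed
```

At step 0 this returns `initial_lr`; at `decay_steps` it returns `alpha * initial_lr`, and the curve is flat at both ends with the steepest decay in the middle.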
Clean-room GPT-2/GPT-3 implementation: tokenizers, architecture blocks, training loop with AdamW + cosine decay, CLI scripts, inference tools, and a pytest suite. Covers OpenWebText-10k and WikiText-103 workflows. Designed as an academic reference for understanding and scaling decoder-only transformers.
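GPT-style training typically pairs cosine decay with a linear warmup phase before annealing to a minimum rate. A hedged sketch of that combined schedule (function name and hyperparameter defaults are illustrative, loosely modeled on published GPT-3 settings, not taken from this repository):

```python
import math

def warmup_cosine_lr(it, max_lr=6e-4, min_lr=6e-5,
                     warmup_iters=2000, decay_iters=600_000):
    """Linear warmup to max_lr, then cosine decay to min_lr."""
    if it < warmup_iters:
        # linear warmup: ramp from ~0 up to max_lr
        return max_lr * (it + 1) / warmup_iters
    if it > decay_iters:
        # after the decay horizon, hold at the floor
        return min_lr
    # cosine decay between warmup_iters and decay_iters
    ratio = (it - warmup_iters) / (decay_iters - warmup_iters)
    coeff = 0.5 * (1.0 + math.cos(math.pi * ratio))
    return min_lr + coeff * (max_lr - min_lr)
```

The warmup stabilizes AdamW's second-moment estimates early in training; the cosine tail gives a smooth, non-abrupt approach to the final learning rate.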