A collection of memory efficient attention operators implemented in the Triton language.
Triton implementation of FlashAttention2 that adds Custom Masks.
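The repository's kernel isn't reproduced here, but as a reference for what a custom (arbitrary, not just causal) mask means semantically, here is a minimal plain-PyTorch sketch; the function name and the True-means-attend convention are illustrative assumptions, not the repo's API.

```python
import torch

def masked_attention_reference(q, k, v, mask):
    """Attention with an arbitrary boolean mask (reference semantics only).

    q, k, v: (batch, heads, seq_len, head_dim); mask is broadcastable to
    (batch, heads, seq_len, seq_len), with True = attend, False = block.
    """
    scores = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```

A fused FlashAttention2 kernel computes the same result tile by tile without materializing the full score matrix; the custom-mask extension applies the mask per tile inside that loop.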
Triton implementation of bi-directional (non-causal) linear attention.
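For context, bi-directional linear attention replaces softmax(QK^T)V with a kernel feature map so the sequence dimension can be summed out first, reducing the cost from O(N^2) to O(N d^2). A minimal plain-PyTorch sketch (the elu+1 feature map is one common choice, assumed here; this is not the repository's Triton code):

```python
import torch

def linear_attention(q, k, v, eps=1e-6):
    """Non-causal linear attention, q/k/v: (batch, heads, seq_len, head_dim)."""
    q = torch.nn.functional.elu(q) + 1  # positive feature map (assumed choice)
    k = torch.nn.functional.elu(k) + 1
    kv = torch.einsum("bhnd,bhne->bhde", k, v)           # sum_n phi(k_n) v_n^T
    z = torch.einsum("bhnd,bhd->bhn", q, k.sum(dim=2))   # per-query normalizer
    return torch.einsum("bhnd,bhde->bhne", q, kv) / (z + eps).unsqueeze(-1)
```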
ViT inference in Triton, because why not?
A "standard library" of Triton kernels.
LAMB go brrr
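LAMB's distinguishing idea is a layer-wise trust ratio scaling an Adam-style update. A minimal per-tensor sketch in plain PyTorch, with bias correction omitted for brevity (an assumed simplification; the repo presumably provides a fused Triton version instead):

```python
import torch

@torch.no_grad()
def lamb_update(p, grad, m, v, lr, beta1=0.9, beta2=0.999, eps=1e-6, wd=0.01):
    """One LAMB step for a single tensor (bias correction omitted)."""
    m.mul_(beta1).add_(grad, alpha=1 - beta1)             # first moment
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)   # second moment
    update = m / (v.sqrt() + eps) + wd * p                # Adam direction + weight decay
    trust = p.norm() / update.norm().clamp_min(eps)       # layer-wise trust ratio
    p.add_(update, alpha=-(lr * trust).item())
```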
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️💾️📜️ The sourceCode:Triton category for AI2001, containing Triton programming language datasets
🌳️🌐️#️⃣️ The Bliss Browser Triton (ClosedAI) language support module, allowing Triton (ClosedAI) programs to be written in and run within the browser.
Writing TensorRT plugins using Triton and Python
A collection of various PyTorch neural network modules written in Triton.
Fast GoLU activation in Triton.
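Assuming GoLU is the Gompertz Linear Unit, golu(x) = x * exp(-exp(-x)), a minimal (untuned) Triton element-wise kernel might look like the following sketch; it is illustrative, not the repository's optimized code:

```python
import torch
import triton
import triton.language as tl

@triton.jit
def golu_kernel(x_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements
    x = tl.load(x_ptr + offsets, mask=mask)
    y = x * tl.exp(-tl.exp(-x))   # assumed definition: x * Gompertz(x)
    tl.store(out_ptr + offsets, y, mask=mask)

def golu(x: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    grid = lambda meta: (triton.cdiv(x.numel(), meta["BLOCK_SIZE"]),)
    golu_kernel[grid](x, out, x.numel(), BLOCK_SIZE=1024)
    return out
```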
Triton implementation for FISTA (Experimental)
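As a baseline for what such a kernel accelerates: FISTA solves composite problems such as the LASSO via a proximal gradient step plus Nesterov momentum. A minimal plain-PyTorch sketch (illustrative only, not the repository's Triton implementation):

```python
import torch

def fista_lasso(A, b, lam, n_iters=200):
    """Minimize 0.5 * ||Ax - b||^2 + lam * ||x||_1 with FISTA."""
    L = torch.linalg.matrix_norm(A, ord=2) ** 2        # Lipschitz const of the gradient
    x = torch.zeros(A.shape[1], dtype=A.dtype)
    y, t = x.clone(), 1.0
    for _ in range(n_iters):
        grad = A.T @ (A @ y - b)
        z = y - grad / L
        x_new = torch.sign(z) * torch.clamp(z.abs() - lam / L, min=0.0)  # soft-threshold
        t_new = (1 + (1 + 4 * t * t) ** 0.5) / 2       # Nesterov momentum schedule
        y = x_new + ((t - 1) / t_new) * (x_new - x)
        x, t = x_new, t_new
    return x
```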