A collection of memory efficient attention operators implemented in the Triton language.
Updated Jun 5, 2024 - Python
Triton implementation of FlashAttention2 that adds custom masks.
ViT inference in Triton, because why not?
Triton implementation of bi-directional (non-causal) linear attention
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️💾️📜️ The sourceCode:Triton category for AI2001, containing Triton programming language datasets
LAMB go brrr
A container of various PyTorch neural network modules written in Triton.
Writing TensorRT plugins using Triton and Python
Triton implementation for FISTA (Experimental)