flash-attention-2

Flash Attention implementation with multiple backend support and sharding. This module provides a flexible implementation of Flash Attention with support for different backends (GPU, TPU, CPU) and platforms (Triton, Pallas, JAX).

pallas jax flash-attention flash-attention-2

Updated Nov 16, 2024
Python

BBC-Esq / WhisperS2T-transcriber

Uses the powerful WhisperS2T and Ctranslate2 libraries to batch transcribe multiple files

audio-recorder audio-recording transcription audio-transcribing transcriber audio-transcription transcr ctranslate2 flash-attention-2 whispers2t

Updated Sep 17, 2024
Python

graphcore-research / flash-attention-ipu

Poplar implementation of FlashAttention for IPU

deep-learning transformers pytorch ipu graphcore poplar flash-attention flash-attention-2

Updated Mar 12, 2024
C++

gietema / attention

Toy Flash Attention implementation in torch

torch flash-attention flash-attention-2 flash-attention-3

Updated Sep 22, 2024
Python

lalitdotdev / transcribeX

Transcribe audio in minutes with OpenAI's WhisperV3 and Flash Attention v2 + Transformers without relying on third-party providers and APIs. Host it yourself or try it out.

python modal transformers transcription wavesurfer-js nvidia-cuda bun nvidia-gpu virtual-environment fastapi huggingface-transformers flash-attention-2 next14 whisper- whisperv3

Updated Jun 18, 2024
TypeScript

Here are 10 public repositories matching this topic...

DefTruth / Awesome-LLM-Inference

DefTruth / CUDA-Learn-Notes

arihanv / Shush

alexzhang13 / flashattention2-custom-mask

Bruce-Lee-LY / flash_attention_inference

erfanzar / jax-flash-attn2

BBC-Esq / WhisperS2T-transcriber

graphcore-research / flash-attention-ipu

gietema / attention

lalitdotdev / transcribeX
