tenstorrent/tt-metal


TT-NN is a Python and C++ neural network operator (op) library.


LLMs

| Model | Batch | Hardware | ttft (s) | t/s/u | Target t/s/u | Release |
|---|---|---|---|---|---|---|
| Falcon7B-decode | 32 | e150 | | 135 | 140 | |
| Falcon7B | 32 | n150 | 0.08 | 16.7 | 26 | v0.51.0-rc24 |
| Mistral-7B | 32 | n150 | | 9.9 | 25 | v0.51.0-rc28 |
| Mamba-2.8B | 32 | n150 | 0.04 | 12.3 | 41 | v0.51.0-rc26 |
| LLaMA-3.1-8B | 32 | n150 | | 8.3 | 23 | v0.51.0-rc28 |
| Falcon7B (data parallel) | 32 | LoudBox | 0.11 | 13.4 | 26 | v0.51.0-rc36 |
| LLaMA-2-70B (tensor parallel) | 32 | LoudBox | | 10.4 | 20 | v0.51.0-rc36 |
| LLaMA-3.1-70B (tensor parallel) | 32 | LoudBox | | 10.4 | 20 | v0.51.0-rc36 |
| Falcon40B (tensor parallel) | 32 | LoudBox | | 5.3 | 36 | v0.51.0-rc35 |
| Mixtral7Bx8 (tensor parallel) | 32 | LoudBox | 0.19 | 15.7 | 33 | v0.51.0-rc33 |
| Falcon7B (data parallel) | 1024 | Galaxy | 0.30 | 4.0 | 26 | v0.51.0-rc30 |
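
The t/s/u column is a per-user figure, so it can help to convert it into an aggregate rate when comparing rows with different batch sizes. The sketch below is illustrative only: it assumes that t/s/u means tokens per second per user and that each batch slot serves one user, so aggregate throughput is simply batch × t/s/u. The helper names `aggregate_tokens_per_second` and `fraction_of_target` are hypothetical and not part of TT-NN; the numbers are copied from three rows of the LLM table.

```python
# Illustrative helpers (hypothetical names, not part of TT-NN).
# Assumption: one user per batch slot, so total throughput = batch * t/s/u.

# (model, batch, measured t/s/u, target t/s/u), copied from the LLM table.
rows = [
    ("Falcon7B on n150", 32, 16.7, 26),
    ("Mistral-7B on n150", 32, 9.9, 25),
    ("Falcon7B (data parallel) on Galaxy", 1024, 4.0, 26),
]


def aggregate_tokens_per_second(batch, tokens_per_second_per_user):
    """Total decode throughput across all users in the batch."""
    return batch * tokens_per_second_per_user


def fraction_of_target(measured, target):
    """Measured per-user rate as a fraction of the target per-user rate."""
    return measured / target


for model, batch, measured, target in rows:
    total = aggregate_tokens_per_second(batch, measured)
    share = fraction_of_target(measured, target)
    print(f"{model}: {total:.1f} t/s aggregate, {share:.0%} of target t/s/u")
```

Under this assumption, a large batch can deliver a high aggregate rate even when the per-user rate is low, which is why the two figures are worth reading together.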

CNNs

| Model | Batch | Hardware | fps | Target fps | Release |
|---|---|---|---|---|---|
| ResNet-50 (224x224) | 20 | e150 | 5,100 | 10,000 | |
| ResNet-50 (224x224) | 16 | n150 | 4,100 | 7,000 | |
| ResNet-50 (224x224) (data parallel) | 128 | LoudBox | 31,250 | 56,000 | |
| ViT | 8 | e150 | 860 | 2,000 | |
| Stable Diffusion 1.4 (512x512) | 1 | n150 | 0.167 | 0.3 | |

NLPs

| Model | Batch | Hardware | sen/sec | Target sen/sec | Release |
|---|---|---|---|---|---|
| BERT-Large | 12 | e150 | 370 | 410 | |
| BERT-Large | 8 | n150 | 270 | 400 | |
| T5 small | | e150 | 140 | | |
| Bloom | | e150 | 70 | | |

Model Updates

For the latest model updates and features, please see MODEL_UPDATES.md

TT-NN Tech Reports


TT-Metalium is our low-level programming model, enabling kernel development for Tenstorrent hardware.

Getting started

Get started with simple kernels.

TT-Metalium Tech Reports