From 1bbe9e678e06b329d172f64d691273aca63e2325 Mon Sep 17 00:00:00 2001
From: Paul Balanca
Date: Thu, 15 Aug 2024 09:35:24 +0100
Subject: [PATCH] wip

---
 docs/JAX FP8 matmul tutorial.ipynb | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/JAX FP8 matmul tutorial.ipynb b/docs/JAX FP8 matmul tutorial.ipynb
index 93e71a9..48616ce 100644
--- a/docs/JAX FP8 matmul tutorial.ipynb
+++ b/docs/JAX FP8 matmul tutorial.ipynb
@@ -15,7 +15,9 @@
    "source": [
     "## Quickstart: FP8 in deep learning\n",
     "\n",
-    "The latest generation of machine learning hardware (Nvidia H100, AMD MI300, Graphcore C600, ...) have integrated direct FP8 support in the hardware, improving energy efficiency and throughput.\n",
+    "The latest generation of machine learning hardware (Nvidia H100, AMD MI300, Graphcore C600, ... TODO links) has integrated direct FP8 support, improving energy efficiency and throughput.\n",
+    "\n",
+    "As shown in the low-precision ML literature, support for two distinct `float8` formats, `E4M3` and `E5M2`, is necessary to achieve accuracy similar to `bfloat16` (or `float16`) training. As presented below, the two formats differ in the trade-off between precision (i.e. mantissa bits) and dynamic range (i.e. exponent bits). In short, `E4M3` is used for storing weights and activations, whereas `E5M2` is used for representing backward gradients (which require a higher dynamic range).\n",
     "\n",
     "![image](img/fp-formats.webp)\n",
     "\n",
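
Note (not part of the patch above): the precision vs. dynamic-range trade-off described in the added paragraph can be inspected directly with `jax.numpy.finfo`. The snippet below is a minimal sketch, assuming a JAX install that exposes the `ml_dtypes`-backed dtypes `jnp.float8_e4m3fn` and `jnp.float8_e5m2`.

```python
# Minimal sketch: compare the two float8 formats referenced in the tutorial cell.
# Assumes a JAX version exposing the ml_dtypes-backed float8 dtypes.
import jax.numpy as jnp

for dtype in (jnp.float8_e4m3fn, jnp.float8_e5m2):
    info = jnp.finfo(dtype)
    # E4M3 (3 mantissa bits) offers more precision but a smaller range,
    # while E5M2 (2 mantissa bits) trades precision for a wider range.
    print(f"{dtype.__name__}: max={info.max}, smallest normal={info.tiny}, eps={info.eps}")
```

Running this shows `E4M3`'s maximum finite value near 448 versus `E5M2`'s near 57344, which is why the higher-dynamic-range `E5M2` format is preferred for backward gradients.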