
wip
balancap committed Aug 15, 2024
1 parent 56795c2 commit 1bbe9e6
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/JAX FP8 matmul tutorial.ipynb
@@ -15,7 +15,9 @@
"source": [
"## Quickstart: FP8 in deep learning\n",
"\n",
"The latest generation of machine learning hardware (Nvidia H100, AMD MI300, Graphcore C600, ...) have integrated direct FP8 support in the hardware, improving energy efficiency and throughput.\n",
"The latest generation of machine learning hardware (Nvidia H100, AMD MI300, Graphcore C600, ... TODO links) have integrated direct FP8 support in the hardware, improving energy efficiency and throughput.\n",
"\n",
"As shown the low precision ML literature, two distinct formats are necessary to support to achieve similar accuracy to `bfloat16` (or `float16`) training: `E4M3` and `E5M2` `float8` formats. As presented below, the two formats differ in the trade-off between precision (i.e. mantissa bits) and dynamic range (i.e. exponent bits). In short, `E4M3` is used for storing weights and activations whereas `E5M2` for representing backward gradients (which require a higher dynamic range).\n",
"\n",
"![image](img/fp-formats.webp)\n",
"\n",
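Not part of the commit above: a minimal sketch of the `E4M3` / `E5M2` trade-off described in the tutorial cell, assuming a JAX version recent enough to expose the `ml_dtypes`-backed float8 dtypes `jnp.float8_e4m3fn` and `jnp.float8_e5m2`.

```python
import jax.numpy as jnp

# Compare the dynamic range of the two float8 formats.
for dtype in (jnp.float8_e4m3fn, jnp.float8_e5m2):
    info = jnp.finfo(dtype)
    print(f"{dtype.__name__}: max={info.max}, smallest normal={info.tiny}")

# Round-trip a few float32 values through each format: E4M3 keeps more
# precision on moderate values, E5M2 covers a wider range of magnitudes.
x = jnp.array([0.01, 1.0, 3.14159, 300.0, 30000.0], dtype=jnp.float32)
print("e4m3:", x.astype(jnp.float8_e4m3fn).astype(jnp.float32))
print("e5m2:", x.astype(jnp.float8_e5m2).astype(jnp.float32))
```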
