
wip
balancap committed Aug 15, 2024
1 parent 56795c2 commit 1bbe9e6
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion docs/JAX FP8 matmul tutorial.ipynb
@@ -15,7 +15,9 @@
"source": [
"## Quickstart: FP8 in deep learning\n",
"\n",
"The latest generation of machine learning hardware (Nvidia H100, AMD MI300, Graphcore C600, ...) have integrated direct FP8 support in the hardware, improving energy efficiency and throughput.\n",
"The latest generation of machine learning hardware (Nvidia H100, AMD MI300, Graphcore C600, ... TODO links) have integrated direct FP8 support in the hardware, improving energy efficiency and throughput.\n",
"\n",
"As shown the low precision ML literature, two distinct formats are necessary to support to achieve similar accuracy to `bfloat16` (or `float16`) training: `E4M3` and `E5M2` `float8` formats. As presented below, the two formats differ in the trade-off between precision (i.e. mantissa bits) and dynamic range (i.e. exponent bits). In short, `E4M3` is used for storing weights and activations whereas `E5M2` for representing backward gradients (which require a higher dynamic range).\n",
"\n",
"![image](img/fp-formats.webp)\n",
"\n",
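Not part of the commit above: a minimal sketch of the `E4M3` / `E5M2` trade-off described in the tutorial cell, assuming a JAX version recent enough to expose the `ml_dtypes`-backed float8 dtypes `jnp.float8_e4m3fn` and `jnp.float8_e5m2`.

```python
import jax.numpy as jnp

# Compare the dynamic range of the two float8 formats.
for dtype in (jnp.float8_e4m3fn, jnp.float8_e5m2):
    info = jnp.finfo(dtype)
    print(f"{dtype.__name__}: max={info.max}, smallest normal={info.tiny}")

# Round-trip a few float32 values through each format: E4M3 keeps more
# precision on moderate values, E5M2 covers a wider range of magnitudes.
x = jnp.array([0.01, 1.0, 3.14159, 300.0, 30000.0], dtype=jnp.float32)
print("e4m3:", x.astype(jnp.float8_e4m3fn).astype(jnp.float32))
print("e5m2:", x.astype(jnp.float8_e5m2).astype(jnp.float32))
```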
