Commit

Merge pull request #208 from jakevdp:update-readme
PiperOrigin-RevId: 679140513
The ml_dtypes Authors committed Sep 26, 2024
2 parents 10f0272 + 7ac48c6 commit ecd6b68
Showing 1 changed file with 10 additions and 4 deletions.
14 changes: 10 additions & 4 deletions README.md
@@ -8,20 +8,26 @@

 - [`bfloat16`](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format):
   an alternative to the standard [`float16`](https://en.wikipedia.org/wiki/Half-precision_floating-point_format) format
-- `float8_*`: several experimental 8-bit floating point representations
-  including:
+- 8-bit floating point representations, parameterized by number of exponent and
+  mantissa bits, as well as the bias (if any) and representability of infinity,
+  NaN, and signed zero.
   * `float8_e3m4`
   * `float8_e4m3`
   * `float8_e4m3b11fnuz`
   * `float8_e4m3fn`
   * `float8_e4m3fnuz`
   * `float8_e5m2`
   * `float8_e5m2fnuz`
-- Microscaling (MX) sub-byte floating point representations including:
+  * `float8_e8m0fnu`
+- Microscaling (MX) sub-byte floating point representations:
   * `float4_e2m1fn`
   * `float6_e2m3fn`
   * `float6_e3m2fn`
-- `int2`, `int4`, `uint2` and `uint4`: low precision integer types.
+- Narrow integer encodings:
+  * `int2`
+  * `int4`
+  * `uint2`
+  * `uint4`

 See below for specifications of these number formats.
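
Note on the naming scheme: the new wording says the 8-bit formats are parameterized by exponent bits, mantissa bits, bias, and the treatment of infinity, NaN, and signed zero. The sketch below illustrates that parameterization with a pure-Python decoder. It is not part of this commit and does not use the library's API. The helper names (`decode_float8`, `decode`, `FORMATS`), the `variant` labels, and the bias values in the table are assumptions made for illustration; the specifications in the README remain authoritative.

```python
import math

def decode_float8(byte, exp_bits, man_bits, bias, variant="ieee"):
    """Decode an 8-bit pattern laid out as sign | exponent | mantissa.

    variant (hypothetical labels used only in this sketch):
      "ieee": inf and NaN live at the all-ones exponent
      "fn":   no inf; NaN only when all exponent and mantissa bits are set
      "fnuz": no inf, no negative zero; NaN is the single pattern 0x80
    """
    if not 0 <= byte <= 0xFF or 1 + exp_bits + man_bits != 8:
        raise ValueError("expected one byte split as 1 + exp_bits + man_bits")
    sign = -1.0 if byte & 0x80 else 1.0
    exp_mask = (1 << exp_bits) - 1
    man_mask = (1 << man_bits) - 1
    exp = (byte >> man_bits) & exp_mask
    man = byte & man_mask

    # Special values depend on the variant.
    if variant == "fnuz" and byte == 0x80:
        return math.nan
    if variant == "fn" and exp == exp_mask and man == man_mask:
        return math.nan
    if variant == "ieee" and exp == exp_mask:
        return sign * math.inf if man == 0 else math.nan

    if exp == 0:
        # Subnormal (or zero): no implicit leading 1.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    # Normal: implicit leading 1.
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

# (exponent bits, mantissa bits, bias, variant); biases are assumed here.
FORMATS = {
    "float8_e5m2": (5, 2, 15, "ieee"),
    "float8_e4m3fn": (4, 3, 7, "fn"),
    "float8_e4m3fnuz": (4, 3, 8, "fnuz"),
    "float8_e4m3b11fnuz": (4, 3, 11, "fnuz"),
}

def decode(name, byte):
    exp_bits, man_bits, bias, variant = FORMATS[name]
    return decode_float8(byte, exp_bits, man_bits, bias, variant)

if __name__ == "__main__":
    for name in FORMATS:
        # Show how the same bit patterns decode under each format.
        print(name, [decode(name, b) for b in (0x38, 0x7F, 0x80)])
```

The same byte decodes differently across formats: the exponent/mantissa split sets range versus precision, the bias shifts the range, and the suffix decides which patterns are reserved for NaN, infinity, or negative zero.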
