[aapcs64] Describe the FPMR register and the FP8 types #273

182 changes: 98 additions & 84 deletions aapcs64/aapcs64.rst
@@ -892,6 +892,11 @@ thread-local storage on platforms where multi-threaded code is
supported. The exact location of such information is platform
specific.

**(Alpha)**

The FPMR is a system register that controls the behavior of the FP8 instructions.
It is a temporary register.
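
As an illustration of what a temporary register means for callers, the following C sketch models the FPMR with an ordinary variable. It assumes the usual reading of "temporary": a callee may overwrite the register without restoring it. The names ``model_fpmr``, ``callee_with_own_mode`` and ``caller_needing_mode``, and the numeric values, are hypothetical and chosen only for this example; real code would access the system register itself.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the FPMR system register (assumption: a plain variable is
   enough to show the calling-convention consequence). */
static uint64_t model_fpmr;

/* Hypothetical callee: it selects its own FP8 behavior and, because the
   register is temporary, returns without restoring the caller's value. */
static void callee_with_own_mode(void)
{
    model_fpmr = 0x1; /* arbitrary illustrative value */
}

/* Hypothetical caller: it re-establishes the value it needs after the call
   instead of assuming that the callee preserved it. */
static uint64_t caller_needing_mode(uint64_t wanted)
{
    model_fpmr = wanted;            /* set the mode before the first FP8 use */
    callee_with_own_mode();         /* model_fpmr must now be treated as unknown */
    model_fpmr = wanted;            /* set it again before the next FP8 use */
    assert(model_fpmr == wanted);   /* holds only because it was just rewritten */
    return model_fpmr;
}
```

The point of the sketch is the second assignment in the caller: a value written before a call cannot be relied on afterwards.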

Scalable vector registers
^^^^^^^^^^^^^^^^^^^^^^^^^

@@ -2570,6 +2575,9 @@ The mapping of C arithmetic types to Fundamental Data Types is shown in `Table 3
|                                |                                         | significant bits of the type in a big-endian view. Non-significant     |
|                                |                                         | bits within the last quad-word are unspecified.                        |
+--------------------------------+-----------------------------------------+------------------------------------------------------------------------+
| **(Alpha)** ``__mfp8``         | unsigned byte                           | Arm extension. Values are interpreted as either E5M2 or E4M3,          |
|                                |                                         | depending on processor mode.                                           |
+--------------------------------+-----------------------------------------+------------------------------------------------------------------------+

A platform ABI may specify a different combination of primitive variants but we discourage this.
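
To illustrate why the interpretation of an ``__mfp8`` value depends on the selected format, the following C sketch decodes one byte under both formats. It assumes the OCP 8-bit floating-point encodings (E5M2: 5 exponent bits with bias 15, 2 fraction bits, IEEE-style infinities and NaNs; E4M3: 4 exponent bits with bias 7, 3 fraction bits, no infinities and a single NaN pattern per sign). The ``fp8_format`` enum and the ``fp8_decode`` and ``scale_pow2`` functions are hypothetical names, and the format is passed as an explicit argument in place of the processor mode.

```c
#include <assert.h>
#include <math.h>    /* NAN and INFINITY macros only */
#include <stdint.h>

/* Hypothetical selector; in hardware the choice comes from processor state. */
typedef enum { FP8_E5M2, FP8_E4M3 } fp8_format;

/* Multiply x by 2^e exactly, without depending on the math library. */
static double scale_pow2(double x, int e)
{
    for (; e > 0; e--) x *= 2.0;
    for (; e < 0; e++) x *= 0.5;
    return x;
}

/* Decode one 8-bit value under the chosen format (assumed OCP FP8 layout). */
static double fp8_decode(uint8_t bits, fp8_format fmt)
{
    assert(fmt == FP8_E5M2 || fmt == FP8_E4M3);
    const int frac_bits = (fmt == FP8_E5M2) ? 2 : 3;
    const int exp_bits = 7 - frac_bits;
    const int bias = (1 << (exp_bits - 1)) - 1;    /* 15 for E5M2, 7 for E4M3 */
    const int exp_max = (1 << exp_bits) - 1;
    const int frac_max = (1 << frac_bits) - 1;
    const int exponent = (bits >> frac_bits) & exp_max;
    const int fraction = bits & frac_max;
    const double sign = (bits & 0x80) ? -1.0 : 1.0;

    if (fmt == FP8_E5M2 && exponent == exp_max)    /* IEEE-style specials */
        return fraction ? NAN : sign * INFINITY;
    if (fmt == FP8_E4M3 && exponent == exp_max && fraction == frac_max)
        return NAN;                                /* E4M3 has no infinities */
    if (exponent == 0)                             /* zero and subnormals */
        return scale_pow2(sign * fraction, 1 - bias - frac_bits);
    /* Normal numbers: restore the implicit leading one, then scale. */
    return scale_pow2(sign * ((1 << frac_bits) | fraction),
                      exponent - bias - frac_bits);
}
```

For instance, the byte ``0x3C`` decodes to 1.0 under E5M2 and to 1.5 under E4M3, so the stored byte alone does not determine the value.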

@@ -2965,61 +2973,65 @@ The header file ``arm_neon.h`` also defines a number of intrinsic functions that

.. table:: Table 7: Short vector extended types

+-----------------+-------------------+----------------------------+-----------+
| Internal type   | arm\_neon.h type  | Base Type                  | Elements  |
+=================+===================+============================+===========+
| __Int8x8\_t     | int8x8\_t         | signed byte                | 8         |
+-----------------+-------------------+----------------------------+-----------+
| __Int16x4\_t    | int16x4\_t        | signed half-word           | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Int32x2\_t    | int32x2\_t        | signed word                | 2         |
+-----------------+-------------------+----------------------------+-----------+
| __Uint8x8\_t    | uint8x8\_t        | unsigned byte              | 8         |
+-----------------+-------------------+----------------------------+-----------+
| __Uint16x4\_t   | uint16x4\_t       | unsigned half-word         | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Uint32x2\_t   | uint32x2\_t       | unsigned word              | 2         |
+-----------------+-------------------+----------------------------+-----------+
| __Float16x4\_t  | float16x4\_t      | half-precision float       | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Float32x2\_t  | float32x2\_t      | single-precision float     | 2         |
+-----------------+-------------------+----------------------------+-----------+
| __Poly8x8\_t    | poly8x8\_t        | unsigned byte              | 8         |
+-----------------+-------------------+----------------------------+-----------+
| __Poly16x4\_t   | poly16x4\_t       | unsigned half-word         | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Int8x16\_t    | int8x16\_t        | signed byte                | 16        |
+-----------------+-------------------+----------------------------+-----------+
| __Int16x8\_t    | int16x8\_t        | signed half-word           | 8         |
+-----------------+-------------------+----------------------------+-----------+
| __Int32x4\_t    | int32x4\_t        | signed word                | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Int64x2\_t    | int64x2\_t        | signed double-word         | 2         |
+-----------------+-------------------+----------------------------+-----------+
| __Uint8x16\_t   | uint8x16\_t       | unsigned byte              | 16        |
+-----------------+-------------------+----------------------------+-----------+
| __Uint16x8\_t   | uint16x8\_t       | unsigned half-word         | 8         |
+-----------------+-------------------+----------------------------+-----------+
| __Uint32x4\_t   | uint32x4\_t       | unsigned word              | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Uint64x2\_t   | uint64x2\_t       | unsigned double-word       | 2         |
+-----------------+-------------------+----------------------------+-----------+
| __Float16x8\_t  | float16x8\_t      | half-precision float       | 8         |
+-----------------+-------------------+----------------------------+-----------+
| __Float32x4\_t  | float32x4\_t      | single-precision float     | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Float64x2\_t  | float64x2\_t      | double-precision float     | 2         |
+-----------------+-------------------+----------------------------+-----------+
| __Poly8x16\_t   | poly8x16\_t       | unsigned byte              | 16        |
+-----------------+-------------------+----------------------------+-----------+
| __Poly16x8\_t   | poly16x8\_t       | unsigned half-word         | 8         |
+-----------------+-------------------+----------------------------+-----------+
| __Poly64x2\_t   | poly64x2\_t       | unsigned double-word       | 2         |
+-----------------+-------------------+----------------------------+-----------+
| __Bfloat16x4\_t | bfloat16x4\_t     | half-precision Brain float | 4         |
+-----------------+-------------------+----------------------------+-----------+
| __Bfloat16x8\_t | bfloat16x8\_t     | half-precision Brain float | 8         |
+-----------------+-------------------+----------------------------+-----------+
+-----------------------------+-------------------+----------------------------+-----------+
| Internal type               | arm\_neon.h type  | Base Type                  | Elements  |
+=============================+===================+============================+===========+
| __Int8x8\_t                 | int8x8\_t         | signed byte                | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Int16x4\_t                | int16x4\_t        | signed half-word           | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Int32x2\_t                | int32x2\_t        | signed word                | 2         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Uint8x8\_t                | uint8x8\_t        | unsigned byte              | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Uint16x4\_t               | uint16x4\_t       | unsigned half-word         | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Uint32x2\_t               | uint32x2\_t       | unsigned word              | 2         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Float16x4\_t              | float16x4\_t      | half-precision float       | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Float32x2\_t              | float32x2\_t      | single-precision float     | 2         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Poly8x8\_t                | poly8x8\_t        | unsigned byte              | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Poly16x4\_t               | poly16x4\_t       | unsigned half-word         | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Int8x16\_t                | int8x16\_t        | signed byte                | 16        |
+-----------------------------+-------------------+----------------------------+-----------+
| __Int16x8\_t                | int16x8\_t        | signed half-word           | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Int32x4\_t                | int32x4\_t        | signed word                | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Int64x2\_t                | int64x2\_t        | signed double-word         | 2         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Uint8x16\_t               | uint8x16\_t       | unsigned byte              | 16        |
+-----------------------------+-------------------+----------------------------+-----------+
| __Uint16x8\_t               | uint16x8\_t       | unsigned half-word         | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Uint32x4\_t               | uint32x4\_t       | unsigned word              | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Uint64x2\_t               | uint64x2\_t       | unsigned double-word       | 2         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Float16x8\_t              | float16x8\_t      | half-precision float       | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Float32x4\_t              | float32x4\_t      | single-precision float     | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Float64x2\_t              | float64x2\_t      | double-precision float     | 2         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Poly8x16\_t               | poly8x16\_t       | unsigned byte              | 16        |
+-----------------------------+-------------------+----------------------------+-----------+
| __Poly16x8\_t               | poly16x8\_t       | unsigned half-word         | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Poly64x2\_t               | poly64x2\_t       | unsigned double-word       | 2         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Bfloat16x4\_t             | bfloat16x4\_t     | half-precision Brain float | 4         |
+-----------------------------+-------------------+----------------------------+-----------+
| __Bfloat16x8\_t             | bfloat16x8\_t     | half-precision Brain float | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| **(Alpha)** __Mfloat8x8\_t  | mfloat8x8\_t      | modal 8-bit float          | 8         |
+-----------------------------+-------------------+----------------------------+-----------+
| **(Alpha)** __Mfloat8x16\_t | mfloat8x16\_t     | modal 8-bit float          | 16        |
+-----------------------------+-------------------+----------------------------+-----------+

APPENDIX Support for Scalable vectors
=====================================
@@ -3054,35 +3066,37 @@ document.

.. table:: Table 8: Scalable Vector Types and Scalable Predicate Types

+---------------------+-----------------------+-------------------------------------------+----------------+
| Internal type       | ``arm_sve.h`` type    | Base type                                 | Elements       |
+=====================+=======================+===========================================+================+
| ``__SVInt8_t``      | ``svint8_t``          | signed byte                               | VG×8           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint8_t``     | ``svuint8_t``         | unsigned byte                             | VG×8           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVInt16_t``     | ``svint16_t``         | signed half-word                          | VG×4           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint16_t``    | ``svuint16_t``        | unsigned half-word                        | VG×4           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVFloat16_t``   | ``svfloat16_t``       | half-precision float                      | VG×4           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVBfloat16_t``  | ``svbfloat16_t``      | half-precision brain float                | VG×4           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVInt32_t``     | ``svint32_t``         | signed word                               | VG×2           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint32_t``    | ``svuint32_t``        | unsigned word                             | VG×2           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVFloat32_t``   | ``svfloat32_t``       | single-precision float                    | VG×2           |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVInt64_t``     | ``svint64_t``         | signed double-word                        | VG             |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint64_t``    | ``svuint64_t``        | unsigned double-word                      | VG             |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVFloat64_t``   | ``svfloat64_t``       | double-precision float                    | VG             |
+---------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVBool_t``      | ``svbool_t``          | single bit (fully packed into VG bytes)   | VG×8           |
+---------------------+-----------------------+-------------------------------------------+----------------+
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| Internal type                  | ``arm_sve.h`` type    | Base type                                 | Elements       |
+================================+=======================+===========================================+================+
| ``__SVInt8_t``                 | ``svint8_t``          | signed byte                               | VG×8           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint8_t``                | ``svuint8_t``         | unsigned byte                             | VG×8           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVInt16_t``                | ``svint16_t``         | signed half-word                          | VG×4           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint16_t``               | ``svuint16_t``        | unsigned half-word                        | VG×4           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVFloat16_t``              | ``svfloat16_t``       | half-precision float                      | VG×4           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVBfloat16_t``             | ``svbfloat16_t``      | half-precision brain float                | VG×4           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVInt32_t``                | ``svint32_t``         | signed word                               | VG×2           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint32_t``               | ``svuint32_t``        | unsigned word                             | VG×2           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVFloat32_t``              | ``svfloat32_t``       | single-precision float                    | VG×2           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVInt64_t``                | ``svint64_t``         | signed double-word                        | VG             |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVUint64_t``               | ``svuint64_t``        | unsigned double-word                      | VG             |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVFloat64_t``              | ``svfloat64_t``       | double-precision float                    | VG             |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| ``__SVBool_t``                 | ``svbool_t``          | single bit (fully packed into VG bytes)   | VG×8           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
| **(Alpha)** ``__SVMfloat8_t``  | ``svmfloat8_t``       | modal 8-bit float                         | VG×8           |
+--------------------------------+-----------------------+-------------------------------------------+----------------+
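
The Elements column of Table 8 is expressed in terms of VG. As a sketch, assuming VG denotes the number of 64-bit granules in a scalable vector (which is consistent with the rows of the table), the element count of a data vector type can be computed as follows. The function names are hypothetical, and predicate types such as ``svbool_t`` are deliberately left out because their elements are single bits.

```c
#include <assert.h>
#include <stddef.h>

/* VG: number of 64-bit granules in one scalable vector (assumed meaning). */
static size_t vg_from_vector_bits(size_t vector_bits)
{
    /* SVE vector lengths are multiples of 128 bits. */
    assert(vector_bits >= 128 && vector_bits % 128 == 0);
    return vector_bits / 64;
}

/* Element count of a data vector type: VG*8 for 8-bit elements, VG*4 for
   16-bit, VG*2 for 32-bit and VG for 64-bit elements. */
static size_t sv_elements(size_t vg, size_t element_bits)
{
    assert(element_bits == 8 || element_bits == 16 ||
           element_bits == 32 || element_bits == 64);
    return vg * (64 / element_bits);
}
```

With a 256-bit vector length, VG is 4, so an ``svmfloat8_t`` holds 32 elements and an ``svfloat64_t`` holds 4.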


APPENDIX C++ mangling