Math: Optimise 16-bit matrix multiplication functions. #9088

Open: wants to merge 7 commits into base: main

Commits on Aug 26, 2024

  1. Math: Add Doxygen documentation for matrix multiplication

    This patch introduces Doxygen-style documentation to the matrix
    multiplication functions. Clear descriptions and parameter details
    are provided to facilitate better understanding and ease of use.
    
    Signed-off-by: Shriram Shastry <malladi.sastry@intel.com>
    ShriramShastry committed Aug 26, 2024
    4a3f6b9
  2. Math: Error Checking Enhancements

    - Added checks for integer overflow during shifting.
    - Validated matrix dimensions to prevent mismatches.
    - Ensured non-null pointers before operating on matrices.
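The three checks listed above could look roughly like the following. This is a minimal sketch and not the code from this commit: the `struct mat16` descriptor, the function name `mat16_check_multiply`, the error codes, and the accepted shift range of 0 to 31 are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical matrix descriptor, for illustration only. */
struct mat16 {
	int16_t rows;
	int16_t columns;
	int16_t *data;	/* row-major, rows * columns elements */
};

/* Return 0 if c = a * b may be computed with the given right shift. */
static int mat16_check_multiply(const struct mat16 *a, const struct mat16 *b,
				const struct mat16 *c, int shift)
{
	/* Non-null descriptors and data buffers */
	if (!a || !b || !c || !a->data || !b->data || !c->data)
		return -EINVAL;

	/* Inner dimensions must agree, and c must be rows(a) x columns(b) */
	if (a->columns != b->rows || c->rows != a->rows ||
	    c->columns != b->columns)
		return -EINVAL;

	/* Shifting a 32-bit value by a negative count or by 32 or more
	 * is undefined behaviour in C (assumed 32-bit accumulator).
	 */
	if (shift < 0 || shift > 31)
		return -ERANGE;

	return 0;
}
```

A caller would run this once before entering the multiply loops, so the checks cost nothing per element.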
    
    Signed-off-by: Shriram Shastry <malladi.sastry@intel.com>
    ShriramShastry committed Aug 26, 2024
    1ac1b5d
  3. Math: Change accumulator data type to int32_t for matrix multiplication

    Changed the accumulator data type from `int64_t` to `int32_t` to reduce
    the instruction cycle count. This change yields an approximately 8.18%
    performance gain for matrix multiplication operations.
    
    Performance Results:
    Compiler Settings: -O2
    +------------+------+------+--------+-----------+-----------+----------+
    | Test Name  | Rows | Cols | Cycles | Max Error | RMS Error | Result   |
    +------------+------+------+--------+-----------+-----------+----------+
    | Test 1     | 3    | 5    | 6487   | 0.00      | 0.00      | Pass     |
    | Test 2     | 6    | 8    | 6106   | 0.00      | 0.00      | Pass     |
    +------------+------+------+--------+-----------+-----------+----------+
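A narrower accumulator is only safe while every partial sum stays inside the `int32_t` range. One product of two `int16_t` values always fits (the largest magnitude is 2^30), but a sum of such products may not. The helper below is a hypothetical sketch, not part of this commit, that uses a 64-bit reference accumulator to report whether a given dot product would have stayed within 32 bits at every step.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if every partial sum of the dot product x . y fits in
 * int32_t, i.e. a 32-bit accumulator would not overflow on this input.
 * Hypothetical helper for illustration.
 */
static bool dot16_fits_int32(const int16_t *x, const int16_t *y, int n)
{
	int64_t acc = 0;

	for (int i = 0; i < n; i++) {
		/* A single 16x16-bit product always fits in 32 bits */
		acc += (int64_t)x[i] * y[i];
		if (acc > INT32_MAX || acc < INT32_MIN)
			return false;
	}

	return true;
}
```

For example, two terms of `INT16_MIN * INT16_MIN` sum to 2^31, one more than `INT32_MAX`, so the inner dimension and the input range both matter when choosing the accumulator width.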
    
    Signed-off-by: Shriram Shastry <malladi.sastry@intel.com>
    ShriramShastry committed Aug 26, 2024
    1f4f10a
  4. Math: Enhance pointer arithmetic in matrix multiplication

    Reworked pointer arithmetic within the loops to improve readability and
    reduce per-iteration overhead. Cycle counts improve by around 8.23% for
    Test 1 and 16.00% for Test 2.
    
    Performance Results:
    Compiler Settings: -O3
    
    +------------+------+------+--------+-----------+-----------+----------+
    | Test Name  | Rows | Cols | Cycles | Max Error | RMS Error | Result   |
    +------------+------+------+--------+-----------+-----------+----------+
    | Test 1     | 3    | 5    | 5953   | 0.00      | 0.00      | Pass     |
    | Test 2     | 6    | 8    | 5128   | 0.00      | 0.00      | Pass     |
    +------------+------+------+--------+-----------+-----------+----------+
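To illustrate the kind of change described in this commit, the sketch below contrasts an indexed inner loop with one that walks pointers, which avoids recomputing `i * inner + k` and `k * cols_b + j` on every iteration. The function names and flat row-major layout are assumptions, and rounding and saturation are left out to keep the comparison short; this is not the code from the patch.

```c
#include <stdint.h>

/* c = a * b, row-major; a is rows_a x inner, b is inner x cols_b.
 * Indexed form: each access recomputes its offset.
 */
static void matmul16_indexed(const int16_t *a, const int16_t *b, int16_t *c,
			     int rows_a, int inner, int cols_b, int shift)
{
	for (int i = 0; i < rows_a; i++) {
		for (int j = 0; j < cols_b; j++) {
			int64_t acc = 0;

			for (int k = 0; k < inner; k++)
				acc += (int32_t)a[i * inner + k] * b[k * cols_b + j];

			c[i * cols_b + j] = (int16_t)(acc >> shift);
		}
	}
}

/* Pointer form: x walks along a row of a, y steps down a column of b.
 * Assumes inner >= 1 so y never moves past the end of b.
 */
static void matmul16_pointer(const int16_t *a, const int16_t *b, int16_t *c,
			     int rows_a, int inner, int cols_b, int shift)
{
	for (int i = 0; i < rows_a; i++) {
		const int16_t *a_row = a + i * inner;

		for (int j = 0; j < cols_b; j++) {
			const int16_t *x = a_row;
			const int16_t *y = b + j;
			int64_t acc = (int32_t)*x++ * *y;

			for (int k = 1; k < inner; k++) {
				y += cols_b;
				acc += (int32_t)*x++ * *y;
			}

			*c++ = (int16_t)(acc >> shift);
		}
	}
}
```

Both forms produce the same result; whether the pointer form is actually faster depends on the compiler and optimisation level, since many compilers perform this strength reduction themselves.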
    
    Signed-off-by: Shriram Shastry <malladi.sastry@intel.com>
    ShriramShastry committed Aug 26, 2024
    5ddb7c0
  5. Math: Update comments and apply cosmetic changes

    Updated comments for clarity. Made cosmetic changes such as
    reformatting code and renaming variables to enhance readability
    without impacting functionality. This resulted in approximately
    7.97% and 15.00% performance improvements for Test 1 and Test 2,
    respectively.
    
    Performance Results:
    Compiler Settings: -O2
    
    +------------+------+------+--------+-----------+-----------+----------+
    | Test Name  | Rows | Cols | Cycles | Max Error | RMS Error | Result   |
    +------------+------+------+--------+-----------+-----------+----------+
    | Test 1     | 3    | 5    | 5975   | 0.00      | 0.00      | Pass     |
    | Test 2     | 6    | 8    | 5192   | 0.00      | 0.00      | Pass     |
    +------------+------+------+--------+-----------+-----------+----------+
    
    Signed-off-by: Shriram Shastry <malladi.sastry@intel.com>
    ShriramShastry committed Aug 26, 2024
    53dbdef

Commits on Aug 27, 2024

  1. Math: Improve pointer manipulation in mat_multiply_elementwise

    - Improved handling of the data pointers for matrix elements
    - Streamlined loop iteration for element-wise matrix
      multiplication
    - Achieved a 0.09% performance improvement in cycle count
    
    | Rows | Cols | Cycles | Max Error | RMS Error | Result |
    +------+------+--------+-----------+-----------+--------+
    | 5    | 6    | 3359   | 0.00      | 0.00      | Pass   |
    
    Signed-off-by: Shriram Shastry <malladi.sastry@intel.com>
    ShriramShastry committed Aug 27, 2024
    65c21f0
  2. Math: Switch mat_multiply_elementwise product type to int32_t

    - Changed product variable from int64_t to int32_t
    - Improved performance by reducing data size
    - Achieved an 11.57% performance improvement in cycle count
    
    | Rows | Cols | Cycles | Max Error | RMS Error | Result |
    +------+------+--------+-----------+-----------+--------+
    | 5    | 6    | 2972   | 0.00      | 0.00      | Pass   |
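For the element-wise case a 32-bit product is always wide enough, because the product of two `int16_t` values has a magnitude of at most 2^30 and the rounding term added below keeps it under 2^31. The following is a minimal sketch of such a loop, assuming a fixed-point right shift of 1 to 30 with round-to-nearest and saturation; the function names are hypothetical and the rounding scheme is an assumption, not taken from this commit.

```c
#include <stdint.h>

/* Clamp a 32-bit value to the int16_t range. */
static int16_t sat_int16(int32_t v)
{
	if (v > INT16_MAX)
		return INT16_MAX;
	if (v < INT16_MIN)
		return INT16_MIN;
	return (int16_t)v;
}

/* c[i] = round((a[i] * b[i]) / 2^shift), saturated to 16 bits.
 * Assumes 1 <= shift <= 30 and an arithmetic right shift for negative
 * values (implementation-defined in C, but what common compilers do).
 */
static void mul16_elementwise(const int16_t *a, const int16_t *b, int16_t *c,
			      int n, int shift)
{
	const int32_t half = (int32_t)1 << (shift - 1);

	for (int i = 0; i < n; i++) {
		/* At most 2^30 in magnitude, so int32_t cannot overflow */
		int32_t p = (int32_t)*a++ * *b++;

		*c++ = sat_int16((p + half) >> shift);
	}
}
```

With Q1.15 inputs and `shift = 15`, multiplying 0.5 by 0.5 (16384 by 16384) gives 8192, which is 0.25, and `INT16_MIN * INT16_MIN` saturates to `INT16_MAX`.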
    
    Signed-off-by: Shriram Shastry <malladi.sastry@intel.com>
    ShriramShastry committed Aug 27, 2024
    c3eeab1