divert calls to oneDNN's gemm_api into ACL #174

fadara01 · 2023-04-06T14:09:15Z

In Transformer models, TensorFlow calls the MklMatMulOp layer, which requires a BLAS SGEMM interface that the Arm Compute Library (ACL) does not support (see #168). As a result, these layers fall back to oneDNN's sub-optimal reference gemm_api kernels.
This patch directs calls to MklMatMulOp into BatchMatMulMkl on aarch64. This avoids the need for the BLAS SGEMM interface and allows the ACL matmul kernels to be used instead of gemm_api.
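
As a rough illustration of why this redirection is safe (this is not the patch itself, and the function names below are hypothetical stand-ins rather than TensorFlow, oneDNN, or ACL APIs): a plain 2D matmul can be treated as a batched matmul with a batch size of one, so a 2D entry point can forward to a batched implementation without changing results.

```python
# Conceptual sketch only: the names here are hypothetical and do not
# correspond to the real MklMatMulOp / BatchMatMulMkl kernels.

def batch_matmul(a, b):
    """Reference batched matmul: a is [batch][m][k], b is [batch][k][n]."""
    if len(a) != len(b):
        raise ValueError("batch sizes differ")
    out = []
    for a_mat, b_mat in zip(a, b):
        k = len(b_mat)
        if any(len(row) != k for row in a_mat):
            raise ValueError("inner dimensions differ")
        n = len(b_mat[0])
        out.append([
            [sum(row[p] * b_mat[p][j] for p in range(k)) for j in range(n)]
            for row in a_mat
        ])
    return out


def matmul_2d(a, b):
    """Stand-in for the 2D entry point: wrap the operands as a batch of one
    and forward to the batched path instead of a separate SGEMM routine."""
    return batch_matmul([a], [b])[0]


result = matmul_2d([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

The real change happens at the TensorFlow kernel level for aarch64; the sketch only shows the shape-level equivalence that makes such a redirection possible.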

divert calls to oneDNN's gemm_api into ACL

ffaa8ff

nSircombe approved these changes Apr 6, 2023

nSircombe merged commit bbcfdbc into ARM-software:main Apr 6, 2023

nSircombe mentioned this pull request Apr 6, 2023

When oneDNN is enabled, an unoptimized matmul is called by Tensorflow on aarch64 #168

Open
