divert calls to oneDNN's gemm_api into ACL #174

@fadara01 commented Apr 6, 2023

In Transformer models, TensorFlow calls the MklMatMulOp layer, which requires a BLAS SGEMM interface that the Arm Compute Library (ACL) does not support (see #168). As a result, these layers fall back to oneDNN's sub-optimal reference gemm_api kernels.
This patch redirects calls to MklMatMulOp into BatchMatMulMkl on aarch64. This avoids the need for the BLAS SGEMM interface and allows the ACL matmul kernels to be used instead of gemm_api.
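
For readers unfamiliar with why such a redirect is safe, here is a minimal pure-Python sketch of the underlying idea: a 2-D matrix multiply gives the same result as a batched multiply with a batch of one. This is an illustration only, not the code in this patch. The function names `batch_matmul` and `matmul_via_batch` are hypothetical, and the flag names (`transpose_a`/`transpose_b`, `adj_x`/`adj_y`) are assumed here to mirror the attributes of TensorFlow's MatMul and BatchMatMul ops.

```python
def transpose(m):
    """Return the transpose of a 2-D list-of-lists matrix."""
    return [list(row) for row in zip(*m)]


def batch_matmul(a, b, adj_x=False, adj_y=False):
    """Multiply two batches of matrices, one pair per batch entry.

    Hypothetical stand-in for a batched kernel: `a` and `b` are lists of
    2-D matrices with the same batch size.
    """
    if len(a) != len(b):
        raise ValueError("batch sizes differ")
    out = []
    for x, y in zip(a, b):
        x = transpose(x) if adj_x else x
        y = transpose(y) if adj_y else y
        if len(x[0]) != len(y):
            raise ValueError("inner dimensions differ")
        cols = transpose(y)  # columns of y, so each product is a row-by-column dot
        out.append([[sum(p * q for p, q in zip(row, col)) for col in cols]
                    for row in x])
    return out


def matmul_via_batch(a, b, transpose_a=False, transpose_b=False):
    """2-D matmul routed through the batched path with a batch of one."""
    # Wrap each operand in a single-entry batch, then unwrap the result.
    return batch_matmul([a], [b], adj_x=transpose_a, adj_y=transpose_b)[0]
```

Calling `matmul_via_batch([[1, 2], [3, 4]], [[5, 6], [7, 8]])` takes the batched path and returns the ordinary 2-D product, which is why no separate SGEMM-style entry point is needed in this sketch.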
