-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new batch_gemm types #466
Add new batch_gemm types #466
Conversation
@Rbiessy, cc |
@AidanBeltonS Thanks for the PR. Before going through the review in more detail, what is your plan for this issue? Why openBLAS come into picture here? I would prefer to have all applicable backends working before adding these new APIs. "I have been unable to test the mkl backends as I was running into some problems regarding duplicate definitions between the mkl headers and the openBlas/CBlas headers." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the reference cblas implementation doesn't support some of the operations that are being added (as I understand it), is the new functionality actually tested?
Hey @mmeterel, I checked with Aidan about the issue with the MKL backends. The duplicate definitions seemed to be an issue with the setup or build commands used. We ran into another issue with undefined references with
Looking at
We can use 2024.0 for the tests for now. Aidan is running more tests. |
@Rbiessy @AidanBeltonS AFAIK, there should not be any issues with missing symbols with 2024.1. This version has been in CI for a while now. I would suspect it can be a rebase issue on your branch. We should make it functional with 2024.1. |
@andrewtbarker Will you be able to help with this review? |
Sure, I will take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR, there is a lot of good work here. Most of my comments are just about style and naming consistency.
Yes, this should have been fixed in #445 . If not we should fix it. |
Have you tested the PR with hipSYCL/AdaptiveSYCL? Can you please add the logs? |
e436f1c
to
20a7057
Compare
No I have not tested HIPsycl. I have attached the other backend tests below. Netlib and portblas are passing fine. MKL has some failing tests due to tolerating which I am investigating further. It seems it deviates more from the reference implementation in some cases. MKL tests error: |
It looks like |
The failures it Dot are due to error
DotU is an odd one, it does not appear to be related to my changes however
|
I have resolved all but one issue with GemmBatch's tests. The CPU MKL implementation has significant amounts of error compared to the GPU. I believe there may be a fundamental difference in the precision of the calculation for the CPU. One possible fix would be to increase the tolerance significantly just for the CPU. Im not a fan of this approach as it is a bit of a brute force solution. Does anyone have any recommendations on how they would like to see this handled?
|
Can you please test hipSYCL backend as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions to make interpreting failed test results easier - I flagged a few places but there are similar issues in most of the new tests.
If, as we suspect, the CPU backend is doing accumulation in double while the GPU backend does it in float, one option would be changing what reference gemm from |
What is the status here? As I see it we have three outstanding items:
(1) may be a larger issue with CI that in my opinion can be dealt with separately in another PR. (2) is minor and should be easy to fix. I hope (3) is also minor but I'm not sure, is there any progress understanding it? |
Hi @andrewtbarker, I have updated the status by email as it was easier to discuss issues with testing AdaptiveCpp on the CI. In short there are a few issues @AidanBeltonS will need to look at once he is back from Holiday next week! |
I have addressed items 2. and 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we're good to go. Thanks for sticking with this one!
6e70f68
to
7dee1c1
Compare
I have also disabled the in8, float combination for MKLCPU/GPU as I found similar precision issues. |
I have confirmed the tests pass with AdaptiveCpp on AMD and Nvidia HW. |
Add support for more batch_gemm types to follow the specification. Some combination using int8 are disabled on some backends due to precision issue.
Description
This adds new data types for the gemm_batch operation, to better be in line with the oneMKL spec. The types added are <half, half, float, float>, <int8, int8, float, float>, and <int8, int8, int32, float>.
New testing is added for these data types. Tests where the scalar type does not match the input type require a higher tolerance as the reference calculation is being performed at a much higher precision.
Test logs:
rocblas_test_log.txt
cublas_test_log.txt
I have been unable to test the mkl backends as I was running into some problems regarding duplicate definitions between the mkl headers and the openBlas/CBlas headers.
Fixes # (GitHub issue)
#446
Checklist
All Submissions
New interfaces
it was accepted? # (RFC)
New features