TL/MLX5: add gtest for mcast #1023
base: master
I don't think those tests are run. On both Azure and GitHub CI, tl/mlx5/bcast seems to have trouble initializing. Take a look at the logs.
1b5a340 to e5d263d
@samnordmann thanks, Sam, for the comments. I updated the PR.
I still think that those tests are not run. Am I missing something?
We have multiple Bcast algorithms in this file, and I added MCAST to the list as a new algorithm. I'm not sure what else I should do.
Can you please re-trigger the CI?
e5d263d to 1be54c8
@samnordmann done
This is probably not a change for this PR, but should we update the gtest to show which tests are run instead of just a number? For example,
This will make it so instead of the test run looking like:
it will look like this:
for every test.
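A minimal sketch of one way to get readable names, assuming the bcast tests are value-parameterized. The `BcastParam` struct, its fields, and `make_test_name` are hypothetical and only illustrate the idea; they are not taken from the UCC test sources.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical parameter bundle, for illustration only. The real
// parameters used by the UCC bcast tests are not shown in this thread.
struct BcastParam {
    std::string alg;     // algorithm label, e.g. "mcast"
    int         n_procs; // number of processes in the team
    std::size_t count;   // element count of the broadcast buffer
};

// Build a descriptive test name from the parameters. Every character that
// is not an ASCII alphanumeric or '_' is replaced with '_', since test
// names are restricted to a small character set.
std::string make_test_name(const BcastParam &p)
{
    std::string name = "alg_" + p.alg + "_np_" + std::to_string(p.n_procs) +
                       "_count_" + std::to_string(p.count);
    for (char &c : name) {
        if (!std::isalnum(static_cast<unsigned char>(c)) && c != '_') {
            c = '_';
        }
    }
    return name;
}
// {"mcast", 4, 1024} -> "alg_mcast_np_4_count_1024"
```

With GoogleTest, a function like this could be wrapped in a callable that takes `const testing::TestParamInfo<BcastParam>&` and returns `make_test_name(info.param)`, then passed as the optional last argument of `INSTANTIATE_TEST_SUITE_P` (`INSTANTIATE_TEST_CASE_P` in older releases). The generated names must be unique within the instantiation.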
The test in and of itself works as intended. The question is whether there is a fallback in the infra, which would explain why we're not seeing a failure.
@B-a-S, where can we access these machines? We need to verify whether we're actually running the MCAST-enabled test suite.
I thought the conclusion was that the CI on this PR should fail and reveal the bug fixed in #1022. Is there any update on this? Do we want to merge this PR regardless?
Hey Sam, you are correct: this PR should in fact fail, and the CI gives a false negative. The likely cause is that we're not running MCAST and are instead falling back or just ignoring TL_MLX5. I need to dig further and haven't had a chance to look at it yet. Outside of the CI, the test does fail.
1be54c8 to 212050b
@MamziB rebase please
212050b to 32ac551
Can one of the admins verify this patch?
32ac551 to 43c5cb4
@MamziB please fix the linter error.