[Unity][Dlight] GeMV rule skip cases of "outer dim being grouped" #284

Merged (1 commit) on Aug 16, 2023

MasterJH5574
Member

Prior to this PR, the GeMV rule assumed that "the grouped dimension of the largest tensor in GeMV must have the same iter type as the innermost dimension", and enforced this assumption with an assertion.

Since AutoGPTQ quantization only supports the KN GeMV layout, the decode-GeMV generated under AutoGPTQ quantization has a pattern where the reduction dimension is grouped but is not the innermost dimension. This pattern violates the assertion above, so an assertion error is thrown.
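
For illustration, the pattern can be sketched in plain Python. This is a simplified model, not the actual workload: the function name `decode_gemv_kn` is hypothetical, and bit-packing and zero points are omitted. It only shows that with a `(K, N)` weight and per-group scales along `K`, the grouped dimension `k` is a reduction axis while the innermost dimension `n` is spatial.

```python
def decode_gemv_kn(x, wq, scale, group_size):
    """Decode-GeMV with a KN-layout weight (illustrative sketch only).

    x:     activation vector of length K
    wq:    quantized weight, K rows by N columns (KN layout)
    scale: per-group scales, (K // group_size) rows by N columns
    """
    K, N = len(wq), len(wq[0])
    y = [0.0] * N
    for n in range(N):          # spatial axis: innermost dimension of wq
        for k in range(K):      # reduction axis: outer dimension of wq
            # The scale is indexed by k // group_size, so the *reduction*
            # dimension is the grouped one, and it is not innermost.
            w = wq[k][n] * scale[k // group_size][n]
            y[n] += x[k] * w
    return y
```

In an NK layout the weight would be indexed as `wq[n][k]` with scales `scale[n][k // group_size]`, so the grouped dimension and the innermost dimension would both be the reduction axis, which is the case the original assertion expected.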

This PR updates the logic to skip transforming such cases instead of asserting the assumption. One real AutoGPTQ decode-GeMV workload is added as a test case to guard against regressions.
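
The change from asserting to skipping can be sketched as follows. This is a minimal model and not the code in this PR: `plan_gemv`, the `"S"`/`"R"` labels for spatial and reduction iter types, and the `"transform"` return value are all hypothetical, and returning `None` is assumed here to mean "this rule does not apply".

```python
from typing import Optional, Sequence


def plan_gemv(iter_kinds: Sequence[str], grouped_dim: int) -> Optional[str]:
    """Decide whether a (hypothetical) GeMV rule handles a weight tensor.

    iter_kinds:  iter type per weight dimension, outermost first,
                 "S" for spatial and "R" for reduction
    grouped_dim: index of the dimension that is grouped
    """
    innermost = iter_kinds[-1]
    # Before: assert iter_kinds[grouped_dim] == innermost
    # After:  bail out so the caller can fall back to something else.
    if iter_kinds[grouped_dim] != innermost:
        return None
    return "transform"
```

For example, `plan_gemv(["S", "R"], 1)` returns `"transform"` (NK layout, grouped reduction is innermost), while `plan_gemv(["R", "S"], 0)` returns `None` (KN layout, grouped reduction is the outer dimension) rather than raising.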

@MasterJH5574 MasterJH5574 merged commit 08f4be7 into mlc-ai:mlc Aug 16, 2023
3 checks passed