[n, k//2] int4pack to [n, k//8] w/o transpose #1186

Open
wants to merge 2 commits into
base: main

Conversation

ZhiweiYan-96
Contributor

@ZhiweiYan-96 ZhiweiYan-96 commented Dec 20, 2024

Motivation

  1. The source tensor is [n, k//2] int4x2 (two int4 values per byte) and the target is [n, k//8] int4x8 (eight int4 values per 32-bit word). No transpose is needed, so the indexing logic changes.
  2. [To be decided] We should not need to consider big-endian vs. little-endian here, right? I think int4 values should be packed in the simplest way, i.e. just by shifting bits.
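
To make the layout change in item 1 concrete, here is a minimal host-side sketch of the repacking. It is not the kernel from this PR: the function name `repack_int4x2_to_int4x8`, the use of `std::vector`, and the nibble order (the low nibble of each byte holds the lower-indexed element, and element `j` of a row lands in bits `4*(j%8)` to `4*(j%8)+3` of word `j/8`) are all assumptions made for illustration. It also assumes `K` is divisible by 8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the PR's kernel.
// Repacks a row-major [N, K/2] buffer of int4x2 bytes into a row-major
// [N, K/8] buffer of int4x8 32-bit words. Rows stay rows: no transpose.
// Assumed bit layout: byte b of a group of four goes to bits [8*b, 8*b+8)
// of the output word, so nibble order inside each byte is preserved.
std::vector<uint32_t> repack_int4x2_to_int4x8(
    const std::vector<uint8_t>& src, int N, int K) {
  const int K_div_2 = K / 2;  // bytes per input row
  const int K_div_8 = K / 8;  // 32-bit words per output row
  assert(src.size() == static_cast<std::size_t>(N) * K_div_2);

  std::vector<uint32_t> dst(static_cast<std::size_t>(N) * K_div_8, 0u);
  for (int n = 0; n < N; ++n) {
    for (int w = 0; w < K_div_8; ++w) {
      uint32_t word = 0;
      // Four consecutive int4x2 bytes form one int4x8 word.
      for (int b = 0; b < 4; ++b) {
        const uint32_t byte = src[n * K_div_2 + w * 4 + b];
        word |= byte << (8 * b);
      }
      dst[n * K_div_8 + w] = word;
    }
  }
  return dst;
}
```

Under this assumed layout the result equals a little-endian reinterpretation of every four input bytes as one 32-bit word, but because it is built from explicit shifts it does not depend on host byte order, which bears on the open question in item 2.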

int K_div_2 = K_ / 2;
int K_div_8 = K_ / 8;

Although it might never happen, could you handle the case where K_ is not divisible by 8?

Contributor Author


good point, let me add an assert
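
One possible shape for such a check is sketched below. The helper name `checked_k_div_8` and the choice of `std::invalid_argument` are illustrative assumptions, not what the PR ends up adding. An exception is used here instead of a plain `assert` so the check stays active in release builds; inside PyTorch code a `TORCH_CHECK` on the same condition would be the more usual form.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns K / 8 only when K splits evenly into
// groups of eight int4 values, and throws otherwise.
inline int checked_k_div_8(int K) {
  assert(K >= 0);  // K is a dimension size
  if (K % 8 != 0) {
    throw std::invalid_argument(
        "K must be divisible by 8 for int4x8 packing, got " +
        std::to_string(K));
  }
  return K / 8;
}
```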

2 participants