Feat (mx): unpadding during dequantization #1134

Giuseppe5 · 2024-12-18T16:58:08Z

Reason for this PR

Groupwise quantization requires padding when the input channel dimension is not divisible by the group size.

Padding works in the common case, but the previous implementation did not cover some important edge cases.
For example, with weight-only quantization where padding was required, we also had to force activation quantization until now; otherwise we hit a shape mismatch.
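
To make the shape problem concrete, here is a minimal NumPy sketch of the situation described above. It is not Brevitas code: `pad_to_group_size` is a hypothetical helper, and the shapes are made up for illustration.

```python
import numpy as np


def pad_to_group_size(weight: np.ndarray, group_size: int) -> np.ndarray:
    """Zero-pad the last (input-channel) dim up to a multiple of group_size.

    Hypothetical helper for illustration; not part of the Brevitas API.
    """
    in_channels = weight.shape[-1]
    pad = (-in_channels) % group_size  # 0 when already divisible
    if pad == 0:
        return weight
    pad_width = [(0, 0)] * (weight.ndim - 1) + [(0, pad)]
    return np.pad(weight, pad_width)  # default mode pads with zeros


weight = np.ones((4, 10), dtype=np.float32)  # 10 input channels
padded = pad_to_group_size(weight, group_size=4)
print(padded.shape)  # (4, 12)

# An unpadded activation with 10 features no longer matches the padded weight.
x = np.ones((2, 10), dtype=np.float32)
try:
    x @ padded.T
except ValueError as err:
    print("shape mismatch:", err)
```

If the padded weight is what reaches the layer, the activation has to be padded as well, which is why activation quantization had to be forced in the weight-only case.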

Changes Made in this PR

With this implementation, we un-pad when dequantizing, which takes care of all the edge cases above.

A few TODOs:
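
The idea can be sketched as a pad, quantize, dequantize, un-pad round trip. This is a simplified illustration under stated assumptions, not the PR's implementation: it uses NumPy instead of PyTorch, symmetric integer quantization with one absmax scale per group instead of an MX float format, and a hypothetical function name `groupwise_quant_dequant`.

```python
import numpy as np


def groupwise_quant_dequant(weight: np.ndarray, group_size: int, bit_width: int = 8) -> np.ndarray:
    """Fake-quantize a 2D weight per group and return it at its ORIGINAL shape.

    Hypothetical sketch: symmetric integer quantization, one scale per group.
    """
    out_ch, in_ch = weight.shape
    pad = (-in_ch) % group_size
    # Pad the input-channel dim so it splits evenly into groups.
    padded = np.pad(weight, [(0, 0), (0, pad)])
    groups = padded.reshape(out_ch, -1, group_size)

    qmax = 2 ** (bit_width - 1) - 1
    absmax = np.abs(groups).max(axis=-1, keepdims=True)
    scale = np.where(absmax > 0, absmax / qmax, 1.0)  # avoid division by zero
    q = np.clip(np.round(groups / scale), -qmax, qmax)

    # Dequantize, flatten the groups back, then drop the padded columns.
    dequant = (q * scale).reshape(out_ch, -1)
    return dequant[:, :in_ch]


rng = np.random.default_rng(0)
w = rng.standard_normal((4, 10)).astype(np.float32)
w_hat = groupwise_quant_dequant(w, group_size=4)
print(w_hat.shape)  # (4, 10)
```

Because the dequantized tensor comes back at the original shape, downstream code never sees the padding, so an unquantized activation can be used with it directly.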

Consolidate dequantization for groupwise tensors + inference export
Fix typing in runtime activations
Testing

Testing Summary

Risk Highlight

This PR includes code from another work (please detail).
This PR contains API-breaking changes.
This PR depends on work in another PR (please provide links/details).
This PR introduces new dependencies (please detail).
There are coverage gaps not covered by tests.
Documentation updates required in subsequent PR.

Checklist

Code comments added to any hard-to-understand areas, if applicable.
Changes generate no new warnings.
Updated any relevant tests, if applicable.
No conflicts with destination dev branch.
I reviewed my own code changes.
Initial CI/CD passing.
1+ reviews given, and any review issues addressed and approved.
Post-review full CI/CD passing.

Giuseppe5 · 2024-12-19T10:48:47Z

src/brevitas/proxy/groupwise_float_parameter_quant.py

@@ -28,6 +28,7 @@ def apply_input_view(self, x):
        return x.flatten(start_dim, start_dim + 1)

    def create_quant_tensor(self, qt_args: Tuple[Any]) -> GroupwiseFloatQuantTensor:
+        shape = self.tracked_parameter_list[0].shape


We don't support weight quant sharing for groupwise anyway, so this is safe, but it is ugly.

Giuseppe5 · 2024-12-19T10:49:16Z

src/brevitas/quant_tensor/groupwise_float_quant_tensor.py

-            new_zp = self.zero_point_
-
-        return new_value, new_scale, new_zp
+        from brevitas.utils.quant_utils import groupwise_dequant


Also ugly, maybe the function should live somewhere else?

Giuseppe5 added 7 commits December 18, 2024 16:56

Feat (mx): unpadding during dequantization

eb0ba78

unpadding everything

8775ca2

fix for tensor

d7b5036

Fix zero point

e7ef917

Fix weight residual computation

33bd3f7

fix

ae52c79

fix

b8b08d1

Giuseppe5 commented Dec 19, 2024


Giuseppe5 added 2 commits December 19, 2024 11:09

Last fixes

2ac05a1

group dim

4dad221

Giuseppe5 requested a review from nickfraser December 19, 2024 14:19
