
Dev quant squeeze #941

Closed
wants to merge 17 commits

Conversation

@ScXfjiang commented on Apr 21, 2024

This PR adds the Squeeze/Unsqueeze operations to QuantTensor. Issue: #891

Changes

  • Add Squeeze/Unsqueeze operations to QuantTensor.
  • For metadata-sensitive operations in QuantTensor (e.g., reshape, flatten, transpose, permute, squeeze, unsqueeze), add per-channel granularity test cases (a sketch of such a test follows below).

See further discussion in #891 and #728.
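
For illustration, a rough sketch of what such a per-channel test could look like (pytest-style; the plain value/scale tensors standing in for QuantTensor and its metadata are assumptions, not the PR's actual test code):

import torch

def test_squeeze_keeps_per_channel_scale_aligned():
    # Stand-in for a per-channel-quantized tensor: a value plus a scale whose
    # rank matches the value, with the channel dim carrying the granularity.
    value = torch.randn(1, 8, 1, 4)
    scale = torch.rand(1, 8, 1, 1)

    # Squeeze the same singleton dims (0 and 2 in the original layout) out of
    # both tensors; after squeezing dim 0, the old dim 2 becomes dim 1.
    squeezed_value = value.squeeze(0).squeeze(1)  # shape (8, 4)
    squeezed_scale = scale.squeeze(0).squeeze(1)  # shape (8, 1)

    assert squeezed_value.shape == (8, 4)
    assert squeezed_scale.shape == (8, 1)
    # The dequantized view must be unchanged by the squeeze.
    assert torch.allclose((value * scale).squeeze(), squeezed_value * squeezed_scale)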

@ScXfjiang (Author)

I have formatted the code and added a quick fix. @Giuseppe5

@ScXfjiang (Author)

Hi @Giuseppe5,

I've updated the code to be compatible with PyTorch < 2.0, and it should now pass the CI checks.

In PyTorch < 2.0, torch.squeeze() doesn't support multiple dims.

From the PyTorch 2.0 changelog:
"Update torch.squeeze to allow squeezing multiple dimensions at once" (pytorch/pytorch#89017)

To make Brevitas flexible, QuantTensor.squeeze() always supports multiple dims, even when using PyTorch < 2.0.
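
As a minimal sketch (not the PR's actual code) of how multiple dims can still be squeezed on PyTorch < 2.0, one dim at a time; the helper name is made up and non-negative dims are assumed:

import torch

def multi_dim_squeeze(t, dims):
    # Squeeze from the highest dim down so earlier squeezes do not shift the
    # indices of the dims that are still to be squeezed.
    for d in sorted(dims, reverse=True):
        if t.shape[d] == 1:
            t = t.squeeze(d)
    return t

x = torch.randn(1, 3, 1, 5)
print(multi_dim_squeeze(x, (0, 2)).shape)  # torch.Size([3, 5])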

The method is more complex than the other metadata-sensitive methods because it must guarantee that the value tensor and the metadata tensors squeeze the same dims.

tensor_meta = {
    'scale': self.scale, 'zero_point': self.zero_point, 'bit_width': self.bit_width}
for k, tm in tensor_meta.items():
    if tm is not None and len(value.shape) == len(tm.shape) - len(sorted_target_dims):
Collaborator

The metadata fields cannot be None anymore.
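
For context, a hedged sketch of the dim alignment being discussed: since the metadata fields are never None, every metadata tensor whose rank matches the value (i.e., per-channel metadata) can simply be squeezed over the same dims. The function name, argument layout, and rank-matching heuristic below are assumptions, not the PR's implementation:

import torch

def squeeze_with_metadata(value, scale, zero_point, dims):
    # Normalize and sort dims high-to-low so earlier squeezes do not shift the
    # remaining indices (this also works on PyTorch < 2.0, where torch.squeeze()
    # takes a single dim at a time).
    dims = sorted((d % value.dim() for d in dims), reverse=True)
    out = {'value': value, 'scale': scale, 'zero_point': zero_point}
    for name, t in list(out.items()):
        if t.dim() == value.dim():  # only rank-matching (per-channel) metadata
            for d in dims:
                t = t.squeeze(d)
            out[name] = t
    return out['value'], out['scale'], out['zero_point']

v = torch.randn(1, 8, 1, 4)
s = torch.rand(1, 8, 1, 1)      # per-channel scale, same rank as the value
zp = torch.zeros(1, 8, 1, 1)
v2, s2, zp2 = squeeze_with_metadata(v, s, zp, (0, 2))
print(v2.shape, s2.shape)       # torch.Size([8, 4]) torch.Size([8, 1])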

@@ -46,6 +46,21 @@ def transpose_handler(inp, *args, **kwargs):
    return inp.transpose(*args, **kwargs)


@implements(torch.permute)
def permute_handler(inp, *args, **kwargs):
    return inp.permute(*args, **kwargs)
Collaborator

Don't we need a permute implementation that also permutes the metadata of the quant tensor?
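
A sketch of what the question is pointing at: when the value is permuted, per-channel metadata (rank-matching scale/zero_point) has to be permuted with the same dims, or it will no longer line up with the value's channels. The function below is a standalone illustration with made-up names, not Brevitas' actual handler:

import torch

def permute_quant(value, scale, zero_point, dims):
    def maybe_permute(t):
        # Per-tensor (scalar or lower-rank) metadata is left untouched;
        # per-channel metadata with the value's rank follows the permutation.
        return t.permute(dims) if t.dim() == value.dim() else t
    return value.permute(dims), maybe_permute(scale), maybe_permute(zero_point)

# Example: NCHW -> NHWC with a per-channel scale of shape (1, C, 1, 1)
v = torch.randn(2, 8, 4, 4)
s = torch.rand(1, 8, 1, 1)
v2, s2, _ = permute_quant(v, s, torch.zeros(1, 8, 1, 1), (0, 2, 3, 1))
print(v2.shape, s2.shape)  # torch.Size([2, 4, 4, 8]) torch.Size([1, 1, 1, 8])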

@ScXfjiang (Author)

Sorry for the late reply. I will update my code next week : )

@Giuseppe5 (Collaborator)

As you can see, in the meantime we merged another PR adding support for FloatQuantTensor.
The two have different metadata, so they would need slightly adjusted implementations. Feel free to keep this PR focused on IntQuantTensor, and I will open an issue to track the same problem for FloatQuantTensor.

The extension to that should be trivial, but I don't want to expand the scope too much.

@ScXfjiang closed this on Jul 22, 2024