Enhance: Quant Tensor Test #894
Conversation
Let's create a folder brevitas/quant_tensor for these tests
I'm inclined to leave it as is at the moment as the other tests in this area don't have their own folders and I want to keep this PR nice and simple. If it expands later we can create a folder.
All tests in brevitas/core refer to files in src/brevitas/core (we don't mirror all the subfolders, but that's the general idea).
QuantTensor does not belong there. Moreover, there will soon be more QuantTensor functionality to test, so we might as well go ahead and create a folder.
    MATMUL = 4

# QuantTensor isn't meant to be initialized directly, it'll be invalid if you do
I would say that it could be invalid if you construct it manually (conversely, it is also possible to manually construct a valid QuantTensor if you carefully pick the scale factors, bit_width, values, etc.).
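To make the point concrete, here is a minimal standalone sketch (plain Python, not the Brevitas API; the helper name and the grid check are invented for illustration) of what "valid" means for a manually built quant tensor: every value must sit on the quantization grid implied by its scale, and its integer representation must fit the bit width.

```python
# Hypothetical helper, NOT part of Brevitas: checks whether hand-picked
# values, scale, and bit_width form a coherent (fake-)quantized tensor.
def is_valid_quant(values, scale, bit_width, signed=True):
    qmin = -(2 ** (bit_width - 1)) if signed else 0
    qmax = (2 ** (bit_width - 1)) - 1 if signed else (2 ** bit_width) - 1
    for v in values:
        q = v / scale
        if abs(q - round(q)) > 1e-9:      # value not on the scale grid
            return False
        if not qmin <= round(q) <= qmax:  # integer out of range for bit_width
            return False
    return True

# Carefully picked values are valid; off-grid or out-of-range values are not.
print(is_valid_quant([0.5, -1.0, 2.5], 0.5, 8))  # True
print(is_valid_quant([0.3], 0.5, 8))             # False (not a multiple of 0.5)
print(is_valid_quant([100.0], 0.5, 8))           # False (200 > 127)
```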
# QuantTensor isn't meant to be initialized directly, it'll be invalid if you do
# so you need to create it indirectly via QuantIdentity for example
def to_quant_tensor(input: torch.Tensor) -> QuantTensor:
    mod = QuantIdentity(bit_width=8, quant_type=QuantType.INT, return_quant_tensor=True)
I believe the quant_type arg is not necessary.
        assert False

    # tolerance set to a high value as there is considerable loss of precision
    assert torch.isclose(normal, quant, atol=0.1).all().item()
Is this tolerance still required for all operators, or are some operators more troublesome than others?
I would try to keep a tighter bound where possible, if it's not too much of a headache.
I'll see if I can tighten up the tolerance
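One way to tighten it per operator is with a back-of-envelope bound (my own reasoning, not from this PR): round-to-nearest quantization errs by at most scale/2 per element, so an element-wise add can use atol = scale_a/2 + scale_b/2, while multiplication's error also grows with the operand magnitudes and genuinely needs more slack. A sketch, assuming per-tensor scales and round-to-nearest:

```python
# Worst-case per-element error bounds (illustrative, not Brevitas code).
def add_atol(scale_a, scale_b):
    # |(a + b) - (qa + qb)| <= |a - qa| + |b - qb| <= scale_a/2 + scale_b/2
    return scale_a / 2 + scale_b / 2

def mul_atol(scale_a, scale_b, max_a, max_b):
    # |a*b - qa*qb| <= |a|*|b - qb| + |qb|*|a - qa|, bounded via operand maxima
    return max_a * scale_b / 2 + (max_b + scale_b / 2) * scale_a / 2
```

This suggests the add/sub tests could use a much tighter atol than the multiply and matmul tests, rather than one loose bound for everything.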
        quant = qa * qb
    elif op == Operator.MATMUL:
        normal = a @ b
        # @ matmul operator not implemented for QuantTensor
What's the difference between @ and matmul? Also, in terms of implementation, what would we need to override to implement @?
I don't believe there is a difference, so it's probably something we should create an issue to implement.
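For reference, a generic Python sketch (nothing Brevitas-specific; the class is illustrative): `a @ b` dispatches to the `__matmul__`/`__rmatmul__` dunders, so supporting @ on a tensor wrapper should mostly be a matter of defining `__matmul__` to forward to the existing matmul logic.

```python
# Minimal illustration of the "@" protocol: the operator simply calls
# __matmul__ on the left operand (or __rmatmul__ on the right one).
class MiniTensor:
    def __init__(self, rows):
        self.rows = rows

    def __matmul__(self, other):
        # Forward "@" to a plain nested-loop matmul over the wrapped values.
        cols = list(zip(*other.rows))
        return MiniTensor([[sum(x * y for x, y in zip(row, col)) for col in cols]
                           for row in self.rows])

a = MiniTensor([[1, 2], [3, 4]])
b = MiniTensor([[5, 6], [7, 8]])
print((a @ b).rows)  # [[19, 22], [43, 50]]
```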
    assert torch.isclose(a.transpose(0, 1), b.transpose(0, 1), atol=0.01).all().item()


def test_quant_tensor_view():
View and transpose open up a broader topic: how to handle quant metadata under views and transposes, especially when we are doing per-channel or finer-granularity quantization.
For now, I would add a TODO in both test cases saying that we need to handle the quant metadata and test it.
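To make the metadata concern concrete, here is a standalone sketch (plain Python lists and hypothetical helpers, not Brevitas): with per-channel scales, transposing the values without also moving the axis the scale is attached to changes what the tensor dequantizes to.

```python
# Hypothetical 2-D helpers: scales[i] applies to slice i along `axis`.
def dequant(int_vals, scales, axis):
    return [[v * (scales[i] if axis == 0 else scales[j])
             for j, v in enumerate(row)]
            for i, row in enumerate(int_vals)]

def transpose(m):
    return [list(col) for col in zip(*m)]

q = [[1, 2], [3, 4]]
scales = [0.5, 2.0]  # per-channel: one scale per row (axis 0)

# Transposing the values AND the scale axis preserves the dequantized result...
ok = dequant(transpose(q), scales, axis=1) == transpose(dequant(q, scales, axis=0))
# ...but keeping the scale axis fixed silently scales the wrong channels.
bad = dequant(transpose(q), scales, axis=0) == transpose(dequant(q, scales, axis=0))
print(ok, bad)  # True False
```

The same issue applies to view/reshape, where a channel axis can disappear entirely, which is presumably why this needs its own design discussion.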
    x = torch.randn(4, 4)
    quant_tensor = to_quant_tensor(x)
    normal_tensor = torch.Tensor(x)
    assert torch.allclose(qdq(normal_tensor, quant_tensor), quant_tensor, rtol=0.01)
The difference between the qdq result and the quant tensor is extremely close, but some error is creeping in from the QuantTensor somewhere, so I added a relative tolerance.
    qb = to_quant_tensor(b)

    # to factor in quantisation error
    e_a = a - qa
I didn't use the qdq approach above as that should be covered by the init test; I just need the difference so I can incorporate it into the calculations below.
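The trick can be shown with a toy round-to-grid quantizer (a standalone sketch, not Brevitas): since e_a = a - quant(a), subtracting the error from the reference operands makes the float-side computation operate on exactly the quantized values, so the inputs' quantization error drops out of the comparison.

```python
# Toy round-to-nearest quantizer (illustrative; Brevitas does this via modules).
def quant(x, scale=0.25):
    return round(x / scale) * scale

a, b = 1.07, 2.93
qa, qb = quant(a), quant(b)
e_a, e_b = a - qa, b - qb  # per-operand quantization error

# (a - e_a) is exactly qa, so the "normal" reference matches the quantized
# product without needing extra tolerance for the inputs' quantization error.
print(qa * qb == (a - e_a) * (b - e_b))  # True
```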
        # @ matmul operator not implemented for QuantTensor
        quant = torch.matmul(qa, qb)
        normal = (a - e_a) @ (b - e_b)
    else:
        # unrecognised operator
        assert False

    # tolerance set to a high value as there is considerable loss of precision
I believe this comment is outdated.
Just a small clean-up of comments but otherwise it's ready to go
Issue
Implements #727
Details
Creates a test file for QuantTensor.
Has tests for:
It's not exhaustive but should form a good starting point.