Hello,
I am trying to implement your quantization method (Q(r) = Int(r/S) + Z) in my system, and I keep getting very strange behavior.
Going through your code, I saw that for weight quantization Z = 0, and S is computed as:
n = 2 ** (num_bits - 1) - 1
if per_channel:
    scale, _ = torch.max(torch.stack([saturation_min.abs(), saturation_max.abs()], dim=1), dim=1)
    scale = torch.clamp(scale, min=1e-8) / n
else:
    scale = max(saturation_min.abs(), saturation_max.abs())
    scale = torch.clamp(scale, min=1e-8) / n
What I didn't manage to figure out is how you calculate saturation_min and saturation_max.
They look like the min and max of the weights, but the weights are learned, so are they the min and max before the gradient-descent step or after the weight update?
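For context, here is a minimal sketch of what I am currently doing, assuming (as is typical in quantization-aware training) that saturation_min and saturation_max are recomputed from the current weight values on every forward pass. The function name, the reshape for per-channel scales, and the clamp range are my own choices, not taken from your code:

```python
import torch

def quantize_weights(w, num_bits=8, per_channel=True):
    # Assumption: saturation_min/max are the min/max of the CURRENT weight
    # tensor, recomputed each forward pass (before any gradient update).
    n = 2 ** (num_bits - 1) - 1
    if per_channel:
        # One (min, max) pair per output channel (dim 0).
        w_flat = w.reshape(w.shape[0], -1)
        saturation_min = w_flat.min(dim=1).values
        saturation_max = w_flat.max(dim=1).values
        scale, _ = torch.max(
            torch.stack([saturation_min.abs(), saturation_max.abs()], dim=1), dim=1
        )
        scale = torch.clamp(scale, min=1e-8) / n
        # Reshape so the per-channel scale broadcasts over the remaining dims.
        scale = scale.view(-1, *([1] * (w.dim() - 1)))
    else:
        saturation_min, saturation_max = w.min(), w.max()
        scale = torch.clamp(
            torch.max(saturation_min.abs(), saturation_max.abs()), min=1e-8
        ) / n
    # Symmetric weight quantization: Z = 0, so Q(r) = round(r / S).
    q = torch.clamp(torch.round(w / scale), -n - 1, n)
    return q, scale
```

Is this roughly what your code does, or are the saturation values tracked differently (e.g. as running statistics across steps)?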