Hi! I am trying to compress multiple CNN autoencoder models (I can't share them) and I have a question about quantization with accuracy control. The problem is that there is no task metric I can use for validation.
My idea is to use a validate function like the one below, which computes the negative MSE between the original model's output and the quantized model's output (negative so that a larger value means better accuracy):

```python
import numpy as np


def mse_max(true, pred):
    # Negative MSE, capped at -1e-5 so the metric never reaches zero
    return min(-np.mean((true - pred) ** 2), -1e-5)


def validate(model, validation_loader):
    mses = []
    for images in validation_loader:
        pred = model(images)[0]  # prediction from the quantized model
        # orig_model_comp is the compiled original OpenVINO model (defined outside this function)
        infer_request = orig_model_comp.create_infer_request()
        orig = infer_request.infer(images)[0]  # prediction from the original model
        mses.append(mse_max(orig, pred))
    return np.mean(mses)
```

In a couple of experiments this approach yielded some promising results when I set … Can this approach be any good, or are there better approaches (maybe a general approach for the case when no metrics are available)?
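For readers of this thread, here is a minimal sketch of how a validate function like the one above could be passed to NNCF's `quantize_with_accuracy_control`. The model path, the dummy data loaders, the input shape, and the `max_drop` value are illustrative assumptions and are not taken from this discussion.

```python
import nncf
import numpy as np
import openvino as ov

# Load and compile the original model; validate() above uses this compiled model
# to produce the reference outputs. The path is hypothetical.
core = ov.Core()
orig_model = core.read_model("autoencoder.xml")
orig_model_comp = core.compile_model(orig_model, "CPU")

# Dummy data for illustration only; in practice these would be real data loaders.
calibration_loader = [np.random.rand(1, 3, 64, 64).astype(np.float32) for _ in range(8)]
validation_loader = [np.random.rand(1, 3, 64, 64).astype(np.float32) for _ in range(8)]

# Wrap the loaders into NNCF datasets; the transform returns the model input as-is.
calibration_dataset = nncf.Dataset(calibration_loader, lambda images: images)
validation_dataset = nncf.Dataset(validation_loader, lambda images: images)

quantized_model = nncf.quantize_with_accuracy_control(
    orig_model,
    calibration_dataset=calibration_dataset,
    validation_dataset=validation_dataset,
    validation_fn=validate,            # the proxy-metric function from the question
    max_drop=0.01,                     # illustrative: allowed drop of the negative-MSE metric
    drop_type=nncf.DropType.ABSOLUTE,
)
```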
Hi @korotaS,

As practice has shown, the model's own metric is the more robust way to rank quantized operations. If a model metric is not available, a proxy metric should be used, and we usually use MSE as the default proxy metric as well. Choosing a proxy metric is a bit of magic and I don't have a general recommendation, but based on how the algorithm works I can share some insights: