Hi! I am trying to compress multiple CNN autoencoder models (I can't share them) and I have a question about quantization with accuracy control. The problem is that there is no task metric I can use for validation.
My idea is to use a validate function like the one below, which computes the negative MSE between the original model's output and the quantized model's output (negative so that a larger value means better accuracy):

```python
import numpy as np


def mse_max(true, pred):
    # Negative MSE, capped at -1e-5 so the metric never reaches zero
    return min(-np.mean((true - pred) ** 2), -1e-5)


def validate(model, validation_loader):
    mses = []
    for images in validation_loader:
        pred = model(images)[0]  # prediction from the quantized model
        # orig_model_comp is the compiled original OpenVINO model (defined outside this function)
        infer_request = orig_model_comp.create_infer_request()
        orig = infer_request.infer(images)[0]  # prediction from the original model
        mses.append(mse_max(orig, pred))
    return np.mean(mses)
```

In a couple of experiments this approach yielded some promising results when I set … Can this approach be any good, or are there better approaches (maybe a general approach for the case when no metrics are available)?
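For readers of this thread, here is a minimal sketch of how a validate function like the one above could be passed to NNCF's `quantize_with_accuracy_control`. The model path, the dummy data loaders, the input shape, and the `max_drop` value are illustrative assumptions and are not taken from this discussion.

```python
import nncf
import numpy as np
import openvino as ov

# Load and compile the original model; validate() above uses this compiled model
# to produce the reference outputs. The path is hypothetical.
core = ov.Core()
orig_model = core.read_model("autoencoder.xml")
orig_model_comp = core.compile_model(orig_model, "CPU")

# Dummy data for illustration only; in practice these would be real data loaders.
calibration_loader = [np.random.rand(1, 3, 64, 64).astype(np.float32) for _ in range(8)]
validation_loader = [np.random.rand(1, 3, 64, 64).astype(np.float32) for _ in range(8)]

# Wrap the loaders into NNCF datasets; the transform returns the model input as-is.
calibration_dataset = nncf.Dataset(calibration_loader, lambda images: images)
validation_dataset = nncf.Dataset(validation_loader, lambda images: images)

quantized_model = nncf.quantize_with_accuracy_control(
    orig_model,
    calibration_dataset=calibration_dataset,
    validation_dataset=validation_dataset,
    validation_fn=validate,            # the proxy-metric function from the question
    max_drop=0.01,                     # illustrative: allowed drop of the negative-MSE metric
    drop_type=nncf.DropType.ABSOLUTE,
)
```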
Hi @korotaS,

As practice has shown, the model's own metric is the more robust way to rank quantized operations. If a model metric is not available, a proxy metric should be used, and we usually use MSE as the default proxy metric as well. Choosing a proxy metric is a bit of magic and I don't have a general recommendation, but based on how the algorithm works I can share some insights: