Full Validated Models

The tables below list the models validated with Intel® Neural Compressor, with INT8 and FP32 accuracy and throughput for each model.
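The two ratio columns in each table follow from the column headers: the accuracy ratio is (INT8-FP32)/FP32 and the throughput ratio is INT8/FP32. As a minimal sketch (the helper names `acc_ratio` and `throughput_ratio` are illustrative and not part of Intel® Neural Compressor), the resnet50v1.0 row of the TensorFlow 2.x table can be recomputed like this:

```python
def acc_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def throughput_ratio(int8_tp: float, fp32_tp: float) -> float:
    """Throughput speedup, INT8 / FP32."""
    return int8_tp / fp32_tp


# Values taken from the resnet50v1.0 row (TensorFlow 2.5.0).
print(f"{acc_ratio(74.24, 74.27):.2%}")             # -0.04%
print(f"{throughput_ratio(925.93, 329.57):.2f}x")   # 2.81x
```

A negative accuracy ratio means the INT8 model is less accurate than the FP32 baseline; a throughput ratio above 1.00x means the INT8 model is faster.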

TensorFlow 2.x models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 Throughput (CLX8280 1s 4c per instance bs1) | FP32 Throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| tensorflow | 2.5.0 | resnet50v1.0 | 74.24% | 74.27% | -0.04% | 925.93 | 329.57 | 2.81x |
| tensorflow | 2.5.0 | resnet50v1.5 | 76.94% | 76.46% | 0.63% | 726.14 | 281.58 | 2.58x |
| tensorflow | 2.5.0 | resnet101 | 77.21% | 76.45% | 0.99% | 549.88 | 227.27 | 2.42x |
| tensorflow | 2.5.0 | inception_v1 | 70.30% | 69.74% | 0.80% | 1256.73 | 705.65 | 1.78x |
| tensorflow | 2.5.0 | inception_v2 | 74.27% | 73.97% | 0.41% | 1046.34 | 567.72 | 1.84x |
| tensorflow | 2.5.0 | inception_v3 | 77.29% | 76.75% | 0.70% | 542.64 | 254.92 | 2.13x |
| tensorflow | 2.5.0 | inception_v4 | 80.36% | 80.27% | 0.11% | 335.25 | 129.32 | 2.59x |
| tensorflow | 2.5.0 | inception_resnet_v2 | 80.42% | 80.40% | 0.02% | 157.41 | 79.83 | 1.97x |
| tensorflow | 2.5.0 | mobilenetv1 | 73.93% | 70.96% | 4.19% | 2372.88 | 691.70 | 3.43x |
| tensorflow | 2.5.0 | mobilenetv2 | 71.96% | 71.76% | 0.28% | 1408.45 | 673.72 | 2.09x |
| tensorflow | 2.5.0 | ssd_resnet50_v1 | 37.91% | 38.00% | -0.24% | 49.84 | 17.03 | 2.93x |
| tensorflow | 2.5.0 | ssd_mobilenet_v1 | 23.02% | 23.13% | -0.48% | 571.43 | 260.22 | 2.20x |
| tensorflow | 2.5.0 | ssd_resnet34 | 21.97% | 22.16% | -0.86% | 26.49 | 7.29 | 3.63x |
| tensorflow | 2.5.0 | faster_rcnn_resnet101 | 30.33% | 30.38% | -0.16% | 45.47 | 12.99 | 3.50x |
| tensorflow | 2.5.0 | faster_rcnn_resnet101_saved | 30.37% | 30.38% | -0.03% | 46.02 | 11.36 | 4.05x |
| tensorflow | 2.5.0 | mask_rcnn_inception_v2 | 28.61% | 28.73% | -0.42% | 89.78 | 35.58 | 2.52x |
| tensorflow | 2.5.0 | wide_deep_large_ds | 77.61% | 77.67% | -0.08% | 5645.16 | 3723.40 | 1.52x |
| tensorflow | 2.5.0 | vgg16 | 72.13% | 70.89% | 1.75% | 406.98 | 114.27 | 3.56x |
| tensorflow | 2.5.0 | vgg19 | 72.35% | 71.01% | 1.89% | 344.83 | 94.39 | 3.65x |
| tensorflow | 2.5.0 | resnetv2_50 | 70.36% | 69.64% | 1.03% | 448.72 | 378.58 | 1.19x |
| tensorflow | 2.5.0 | resnetv2_101 | 72.58% | 71.87% | 0.99% | 271.84 | 205.46 | 1.32x |
| tensorflow | 2.5.0 | resnetv2_152 | 72.92% | 72.37% | 0.76% | 188.78 | 138.83 | 1.36x |
| tensorflow | 2.5.0 | densenet121 | 72.31% | 72.89% | -0.80% | 213.54 | 145.14 | 1.47x |
| tensorflow | 2.5.0 | densenet161 | 76.36% | 76.29% | 0.09% | 131.41 | 80.66 | 1.63x |
| tensorflow | 2.5.0 | densenet169 | 74.49% | 74.65% | -0.21% | 178.07 | 123.74 | 1.44x |
| tensorflow | 2.5.0 | ssd_resnet50_v1_ckpt | 37.89% | 38.00% | -0.29% | 49.28 | 14.51 | 3.40x |
| tensorflow | 2.5.0 | ssd_mobilenet_v1_ckpt | 23.02% | 23.13% | -0.48% | 573.30 | 219.37 | 2.61x |
| tensorflow | 2.5.0 | mask_rcnn_inception_v2_ckpt | 28.61% | 28.73% | -0.42% | 85.90 | 34.10 | 2.52x |
| tensorflow | 2.5.0 | efficientnet_b0 | 78.53% | 76.75% | 2.32% | 274.94 | 254.73 | 1.08x |
| tensorflow | 2.5.0 | resnet50_fashion | 78.05% | 78.12% | -0.09% | 2229.30 | 938.34 | 2.37x |

TensorFlow 1.x models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 Throughput (CLX8280 1s 4c per instance bs1) | FP32 Throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| tensorflow | 1.15.0-up2 | bert_large_squad | 92.4835 | 92.9805 | -0.53% | 15.86 | 5.50 | 2.88x |
| tensorflow | 1.15.0-up2 | bert_base_mrpc | 86.03% | 86.52% | -0.57% | 138.31 | 92.08 | 1.50x |
| tensorflow | 1.15.0-up2 | resnet_v1_50_slim | 76.05% | 75.18% | 1.16% | 752.69 | 265.96 | 2.83x |
| tensorflow | 1.15.0-up2 | resnet_v1_101_slim | 77.15% | 76.40% | 0.98% | 465.43 | 139.28 | 3.34x |
| tensorflow | 1.15.0-up2 | resnet_v1_152_slim | 77.56% | 76.81% | 0.98% | 343.14 | 94.31 | 3.64x |
| tensorflow | 1.15.0-up2 | inception_v1_slim | 70.41% | 69.77% | 0.92% | 1202.75 | 573.30 | 2.10x |
| tensorflow | 1.15.0-up2 | inception_v2_slim | 74.38% | 73.98% | 0.54% | 1021.90 | 487.47 | 2.10x |
| tensorflow | 1.15.0-up2 | inception_v3_slim | 78.32% | 77.99% | 0.42% | 591.22 | 222.01 | 2.66x |
| tensorflow | 1.15.0-up2 | inception_v4_slim | 80.35% | 80.19% | 0.20% | 321.69 | 114.21 | 2.82x |
| tensorflow | 1.15.0-up2 | vgg16_slim | 72.16% | 70.89% | 1.79% | 411.04 | 113.45 | 3.62x |
| tensorflow | 1.15.0-up2 | vgg19_slim | 72.22% | 71.01% | 1.70% | 346.19 | 95.08 | 3.64x |
| tensorflow | 1.15.0-up2 | resnetv2_50_slim | 70.39% | 69.72% | 0.96% | 458.72 | 357.14 | 1.28x |
| tensorflow | 1.15.0-up2 | resnetv2_101_slim | 72.51% | 71.91% | 0.83% | 277.12 | 191.94 | 1.44x |
| tensorflow | 1.15.0-up2 | resnetv2_152_slim | 72.98% | 72.40% | 0.80% | 193.91 | 132.53 | 1.46x |

PyTorch models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 Throughput (CLX8280 1s 4c per instance bs1) | FP32 Throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pytorch | 1.9.0+cpu | resnet18 | 69.58% | 69.76% | -0.26% | 492.61 | 263.65 | 1.87x |
| pytorch | 1.9.0+cpu | resnet50 | 75.87% | 76.13% | -0.34% | 281.24 | 130.01 | 2.16x |
| pytorch | 1.9.0+cpu | resnext101_32x8d | 79.09% | 79.31% | -0.28% | 109.32 | 47.45 | 2.30x |
| pytorch | 1.9.0+cpu | bert_base_mrpc | 88.16% | 88.73% | -0.64% | 170.11 | 85.83 | 1.98x |
| pytorch | 1.9.0+cpu | bert_base_cola | 58.29% | 58.84% | -0.93% | 178.71 | 83.91 | 2.13x |
| pytorch | 1.9.0+cpu | bert_base_sts-b | 88.65% | 89.27% | -0.70% | 176.81 | 84.27 | 2.10x |
| pytorch | 1.9.0+cpu | bert_base_sst-2 | 91.63% | 91.86% | -0.25% | 177.71 | 84.16 | 2.11x |
| pytorch | 1.9.0+cpu | bert_base_rte | 69.31% | 69.68% | -0.52% | 177.17 | 85.53 | 2.07x |
| pytorch | 1.9.0+cpu | bert_large_mrpc | 87.48% | 88.33% | -0.95% | 62.06 | 24.83 | 2.50x |
| pytorch | 1.9.0+cpu | bert_large_squad | 92.78988 | 93.04683 | -0.28% | 13.89 | 7.49 | 1.85x |
| pytorch | 1.9.0+cpu | bert_large_qnli | 91.12% | 91.82% | -0.76% | 63.02 | 24.21 | 2.60x |
| pytorch | 1.9.0+cpu | bert_large_rte | 72.92% | 72.56% | 0.50% | 46.07 | 23.45 | 1.96x |
| pytorch | 1.9.0+cpu | bert_large_cola | 62.85% | 62.57% | 0.45% | 61.92 | 24.52 | 2.52x |
| pytorch | 1.9.0+cpu | inception_v3 | 69.39% | 69.54% | -0.21% | 230.34 | 131.21 | 1.76x |
| pytorch | 1.9.0+cpu | peleenet | 71.54% | 72.08% | -0.75% | 271.32 | 203.96 | 1.33x |
| pytorch | 1.9.0+cpu | yolo_v3 | 24.50% | 24.54% | -0.17% | 59.09 | 28.49 | 2.07x |
| pytorch | 1.9.0+cpu | se_resnext50_32x4d | 79.02% | 79.08% | -0.07% | 204.02 | 109.12 | 1.87x |
| pytorch | 1.9.0+cpu | mobilenet_v2 | 70.73% | 71.86% | -1.57% | 445.01 | 329.26 | 1.35x |
| pytorch | 1.9.0+cpu | blendcnn | 68.40% | 68.40% | 0.00% | 2868.85 | 2755.91 | 1.04x |
| pytorch | 1.5.0a0+b58f89b | resnet50_ipex | 75.80% | 76.13% | -0.44% | 353.71 | 213.09 | 1.66x |
| pytorch | 1.9.0+cpu | gpt_wikitext | 60.06256 | 60.19923 | -0.23% | 13.11 | 12.06 | 1.09x |
| pytorch | 1.9.0+cpu | roberta_base_mrpc | 85.37% | 85.51% | -0.17% | 173.78 | 85.54 | 2.03x |
| pytorch | 1.9.0+cpu | camembert_base_mrpc | 84.72% | 84.22% | 0.60% | 158.16 | 84.63 | 1.87x |
| pytorch | 1.9.0+cpu | distilbert_base_mrpc | 81.17% | 80.99% | 0.21% | 279.44 | 158.91 | 1.76x |
| pytorch | 1.9.0+cpu | albert_base_mrpc | 88.77% | 88.50% | 0.31% | 22.88 | 18.28 | 1.25x |
| pytorch | 1.9.0+cpu | funnel_mrpc | 91.72% | 92.26% | -0.58% | 79.44 | 78.01 | 1.02x |
| pytorch | 1.9.0+cpu | bart_wnli | 49.30% | 52.11% | -5.41% | 21.74 | 19.92 | 1.09x |
| pytorch | 1.9.0+cpu | mbart_wnli | 56.34% | 56.34% | 0.00% | 39.87 | 20.34 | 1.96x |
| pytorch | 1.9.0+cpu | t5_wmt_en_ro | 24.3855 | 24.5213 | -0.55% | 2.76 | 2.59 | 1.06x |
| pytorch | 1.9.0+cpu | marianmt_wmt_en_ro | 22.3857 | 22.225 | 0.72% | 1.94 | 1.84 | 1.05x |
| pytorch | 1.9.0+cpu | pegasus_billsum | 50.2328 | 51.2135 | -1.91% | 0.18 | 0.11 | 1.56x |
| pytorch | 1.9.0+cpu | dialogpt_wikitext | 36.18182 | 36.18182 | 0.00% | 4.37 | 4.35 | 1.00x |
| pytorch | 1.9.0+cpu | xlm-roberta-base_mrpc | 87.93% | 88.62% | -0.78% | 79.57 | 77.46 | 1.03x |
| pytorch | 1.9.0+cpu | flaubert_mrpc | 79.81% | 80.19% | -0.48% | 361.20 | 295.11 | 1.22x |
| pytorch | 1.9.0+cpu | barthez_mrpc | 83.25% | 83.81% | -0.66% | 112.72 | 67.00 | 1.68x |
| pytorch | 1.9.0+cpu | longformer_mrpc | 90.97% | 91.46% | -0.53% | 12.97 | 10.97 | 1.18x |
| pytorch | 1.9.0+cpu | layoutlm_mrpc | 81.22% | 78.01% | 4.12% | 145.26 | 78.19 | 1.86x |
| pytorch | 1.9.0+cpu | deberta_mrpc | 90.29% | 90.91% | -0.68% | 78.70 | 50.84 | 1.55x |
| pytorch | 1.9.0+cpu | squeezebert_mrpc | 87.96% | 87.65% | 0.36% | 145.56 | 126.72 | 1.15x |
| pytorch | 1.9.0+cpu | resnet18_fx | 69.61% | 69.76% | -0.22% | 503.96 | 257.73 | 1.96x |
| pytorch | 1.9.0+cpu | xlnet_base_mrpc | 89.43% | 89.47% | -0.04% | 67.93 | 52.56 | 1.29x |
| pytorch | 1.9.0+cpu | transfo_xl_mrpc | 82.09% | 81.20% | 1.09% | 6.64 | 4.94 | 1.34x |
| pytorch | 1.9.0+cpu | ctrl_mrpc | 82.00% | 82.00% | 0.00% | 15.34 | 5.70 | 2.69x |
| pytorch | 1.9.0+cpu | xlm_mrpc | 80.50% | 79.56% | 1.18% | 39.06 | 12.90 | 3.03x |
| pytorch | 1.9.0+cpu | maskrcnn_fx | 37.70% | 37.80% | -0.26% | 59.58 | 38.66 | 1.54x |

Quantization-aware training models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 Throughput (CLX8280 1s 4c per instance bs1) | FP32 Throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| pytorch | 1.9.0+cpu | resnet18_qat | 69.75% | 69.76% | -0.02% | 492.96 | 262.86 | 1.87x |
| pytorch | 1.9.0+cpu | resnet50_qat | 76.05% | 76.13% | -0.11% | 273.97 | 128.53 | 2.13x |
| pytorch | 1.9.0+cpu | resnet18_qat_fx | 69.72% | 69.76% | -0.05% | 498.22 | 257.64 | 1.93x |
| pytorch | 1.9.0+cpu | mobilenet_v2_qat | 71.45% | 71.86% | -0.56% | 450.16 | 316.31 | 1.42x |

MXNet models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 Throughput (CLX8280 1s 4c per instance bs1) | FP32 Throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| mxnet | 1.7.0 | resnet50v1 | 76.08% | 76.33% | -0.32% | 1125.40 | 335.57 | 3.35x |
| mxnet | 1.7.0 | inceptionv3 | 77.73% | 77.64% | 0.11% | 623.33 | 230.49 | 2.71x |
| mxnet | 1.7.0 | mobilenet1.0 | 71.69% | 72.22% | -0.74% | 4375.00 | 1741.29 | 2.51x |
| mxnet | 1.7.0 | mobilenetv2_1.0 | 70.78% | 70.87% | -0.12% | 3500.00 | 1284.40 | 2.73x |
| mxnet | 1.7.0 | resnet18_v1 | 70.02% | 70.14% | -0.17% | 2325.58 | 731.45 | 3.18x |
| mxnet | 1.7.0 | squeezenet1.0 | 56.74% | 56.96% | -0.38% | 2916.67 | 1093.75 | 2.67x |
| mxnet | 1.7.0 | ssd-resnet50_v1 | 80.21% | 80.23% | -0.03% | 187.82 | 40.07 | 4.69x |
| mxnet | 1.7.0 | ssd-mobilenet1.0 | 74.94% | 75.54% | -0.79% | 445.01 | 116.28 | 3.83x |
| mxnet | 1.7.0 | resnet152_v1 | 78.21% | 78.54% | -0.42% | 394.37 | 119.60 | 3.30x |

ONNX models

| Framework | Version | Model | INT8 Tuning Accuracy | FP32 Accuracy Baseline | Acc Ratio [(INT8-FP32)/FP32] | INT8 Throughput (CLX8280 1s 4c per instance bs1) | FP32 Throughput (CLX8280 1s 4c per instance bs1) | Throughput Ratio [INT8/FP32] |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| onnxrt | 1.8.0 | resnet50_v1_5 | 72.11% | 72.28% | -0.24% | 546.02 | 339.97 | 1.61x |
| onnxrt | 1.8.0 | bert_base_mrpc_static | 85.29% | 86.03% | -0.86% | 479.12 | 210.97 | 2.27x |
| onnxrt | 1.8.0 | bert_base_mrpc_dynamic | 85.54% | 86.03% | -0.57% | 244.84 | 100.00 | 2.45x |
| onnxrt | 1.8.0 | vgg16 | 66.58% | 66.68% | -0.15% | 101.35 | 79.25 | 1.28x |
| onnxrt | 1.8.0 | ssd_mobilenet_v1 | 22.41% | 23.10% | -2.99% | 427.87 | 377.16 | 1.13x |
| onnxrt | 1.8.0 | ssd_mobilenet_v2 | 23.80% | 24.68% | -3.57% | 339.48 | 279.89 | 1.21x |
| onnxrt | 1.8.0 | distilbert_base_mrpc | 84.56% | 84.56% | 0.00% | 1081.92 | 386.53 | 2.80x |
| onnxrt | 1.8.0 | mobilebert_mrpc | 85.54% | 86.27% | -0.85% | 437.23 | 400.23 | 1.09x |
| onnxrt | 1.8.0 | roberta_base_mrpc | 88.73% | 89.46% | -0.82% | 494.70 | 203.90 | 2.43x |
| onnxrt | 1.8.0 | resnet50-v1-12 | 74.83% | 74.97% | -0.19% | 642.79 | 348.26 | 1.85x |
| onnxrt | 1.8.0 | resnet_v1_5_mlperf | 76.11% | 76.47% | -0.47% | 599.32 | 343.47 | 1.74x |
| onnxrt | 1.8.0 | mobilenet_v3_mlperf | 75.51% | 75.75% | -0.32% | 1397.21 | 1007.19 | 1.39x |
| onnxrt | 1.8.0 | bert_squad_model_zoo | 80.43519 | 80.67171 | -0.29% | 73.68 | 40.81 | 1.81x |
| onnxrt | 1.8.0 | mobilebert_squad_mlperf | 89.84479 | 90.0265 | -0.20% | 60.52 | 57.30 | 1.06x |
| onnxrt | 1.8.0 | vgg16_model_zoo | 72.37% | 72.38% | -0.01% | 122.85 | 79.57 | 1.54x |