Full Validated Models

The tables below list the models enabled by Intel® Neural Compressor, with INT8 and FP32 accuracy and throughput for each.

TensorFlow 2.x models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 throughput (CLX8280 1s 4c per instance bs1)	FP32 throughput (CLX8280 1s 4c per instance bs1)	Throughput Ratio [INT8/FP32]
tensorflow	2.5.0	resnet50v1.0	74.24%	74.27%	-0.04%	925.93	329.57	2.81x
tensorflow	2.5.0	resnet50v1.5	76.94%	76.46%	0.63%	726.14	281.58	2.58x
tensorflow	2.5.0	resnet101	77.21%	76.45%	0.99%	549.88	227.27	2.42x
tensorflow	2.5.0	inception_v1	70.30%	69.74%	0.80%	1256.73	705.65	1.78x
tensorflow	2.5.0	inception_v2	74.27%	73.97%	0.41%	1046.34	567.72	1.84x
tensorflow	2.5.0	inception_v3	77.29%	76.75%	0.70%	542.64	254.92	2.13x
tensorflow	2.5.0	inception_v4	80.36%	80.27%	0.11%	335.25	129.32	2.59x
tensorflow	2.5.0	inception_resnet_v2	80.42%	80.40%	0.02%	157.41	79.83	1.97x
tensorflow	2.5.0	mobilenetv1	73.93%	70.96%	4.19%	2372.88	691.70	3.43x
tensorflow	2.5.0	mobilenetv2	71.96%	71.76%	0.28%	1408.45	673.72	2.09x
tensorflow	2.5.0	ssd_resnet50_v1	37.91%	38.00%	-0.24%	49.84	17.03	2.93x
tensorflow	2.5.0	ssd_mobilenet_v1	23.02%	23.13%	-0.48%	571.43	260.22	2.20x
tensorflow	2.5.0	ssd_resnet34	21.97%	22.16%	-0.86%	26.49	7.29	3.63x
tensorflow	2.5.0	faster_rcnn_resnet101	30.33%	30.38%	-0.16%	45.47	12.99	3.50x
tensorflow	2.5.0	faster_rcnn_resnet101_saved	30.37%	30.38%	-0.03%	46.02	11.36	4.05x
tensorflow	2.5.0	mask_rcnn_inception_v2	28.61%	28.73%	-0.42%	89.78	35.58	2.52x
tensorflow	2.5.0	wide_deep_large_ds	77.61%	77.67%	-0.08%	5645.16	3723.40	1.52x
tensorflow	2.5.0	vgg16	72.13%	70.89%	1.75%	406.98	114.27	3.56x
tensorflow	2.5.0	vgg19	72.35%	71.01%	1.89%	344.83	94.39	3.65x
tensorflow	2.5.0	resnetv2_50	70.36%	69.64%	1.03%	448.72	378.58	1.19x
tensorflow	2.5.0	resnetv2_101	72.58%	71.87%	0.99%	271.84	205.46	1.32x
tensorflow	2.5.0	resnetv2_152	72.92%	72.37%	0.76%	188.78	138.83	1.36x
tensorflow	2.5.0	densenet121	72.31%	72.89%	-0.80%	213.54	145.14	1.47x
tensorflow	2.5.0	densenet161	76.36%	76.29%	0.09%	131.41	80.66	1.63x
tensorflow	2.5.0	densenet169	74.49%	74.65%	-0.21%	178.07	123.74	1.44x
tensorflow	2.5.0	ssd_resnet50_v1_ckpt	37.89%	38.00%	-0.29%	49.28	14.51	3.40x
tensorflow	2.5.0	ssd_mobilenet_v1_ckpt	23.02%	23.13%	-0.48%	573.30	219.37	2.61x
tensorflow	2.5.0	mask_rcnn_inception_v2_ckpt	28.61%	28.73%	-0.42%	85.90	34.10	2.52x
tensorflow	2.5.0	efficientnet_b0	78.53%	76.75%	2.32%	274.94	254.73	1.08x
tensorflow	2.5.0	resnet50_fashion	78.05%	78.12%	-0.09%	2229.30	938.34	2.37x
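
As a minimal sketch of how the two ratio columns relate to the raw columns, the resnet50v1.0 row can be recomputed as shown here. The helper names `accuracy_ratio` and `throughput_ratio` are illustrative assumptions and are not part of Intel® Neural Compressor.

```python
def accuracy_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def throughput_ratio(int8_tp: float, fp32_tp: float) -> float:
    """Speedup of INT8 over FP32, computed as INT8 / FP32."""
    return int8_tp / fp32_tp


# Values copied from the resnet50v1.0 row of the TensorFlow 2.x table.
acc = accuracy_ratio(74.24, 74.27)
speedup = throughput_ratio(925.93, 329.57)

print(f"{acc:.2%}")       # -0.04%
print(f"{speedup:.2f}x")  # 2.81x
```

The published ratios may have been computed from unrounded measurements, so recomputing them from the rounded table values can differ in the last digit for some rows.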

TensorFlow 1.x models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 throughput (CLX8280 1s 4c per instance bs1)	FP32 throughput (CLX8280 1s 4c per instance bs1)	Throughput Ratio [INT8/FP32]
tensorflow	1.15.0-up2	bert_large_squad	92.4835	92.9805	-0.53%	15.86	5.50	2.88x
tensorflow	1.15.0-up2	bert_base_mrpc	86.03%	86.52%	-0.57%	138.31	92.08	1.50x
tensorflow	1.15.0-up2	resnet_v1_50_slim	76.05%	75.18%	1.16%	752.69	265.96	2.83x
tensorflow	1.15.0-up2	resnet_v1_101_slim	77.15%	76.40%	0.98%	465.43	139.28	3.34x
tensorflow	1.15.0-up2	resnet_v1_152_slim	77.56%	76.81%	0.98%	343.14	94.31	3.64x
tensorflow	1.15.0-up2	inception_v1_slim	70.41%	69.77%	0.92%	1202.75	573.30	2.10x
tensorflow	1.15.0-up2	inception_v2_slim	74.38%	73.98%	0.54%	1021.90	487.47	2.10x
tensorflow	1.15.0-up2	inception_v3_slim	78.32%	77.99%	0.42%	591.22	222.01	2.66x
tensorflow	1.15.0-up2	inception_v4_slim	80.35%	80.19%	0.20%	321.69	114.21	2.82x
tensorflow	1.15.0-up2	vgg16_slim	72.16%	70.89%	1.79%	411.04	113.45	3.62x
tensorflow	1.15.0-up2	vgg19_slim	72.22%	71.01%	1.70%	346.19	95.08	3.64x
tensorflow	1.15.0-up2	resnetv2_50_slim	70.39%	69.72%	0.96%	458.72	357.14	1.28x
tensorflow	1.15.0-up2	resnetv2_101_slim	72.51%	71.91%	0.83%	277.12	191.94	1.44x
tensorflow	1.15.0-up2	resnetv2_152_slim	72.98%	72.40%	0.80%	193.91	132.53	1.46x

PyTorch models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 throughput (CLX8280 1s 4c per instance bs1)	FP32 throughput (CLX8280 1s 4c per instance bs1)	Throughput Ratio [INT8/FP32]
pytorch	1.9.0+cpu	resnet18	69.58%	69.76%	-0.26%	492.61	263.65	1.87x
pytorch	1.9.0+cpu	resnet50	75.87%	76.13%	-0.34%	281.24	130.01	2.16x
pytorch	1.9.0+cpu	resnext101_32x8d	79.09%	79.31%	-0.28%	109.32	47.45	2.30x
pytorch	1.9.0+cpu	bert_base_mrpc	88.16%	88.73%	-0.64%	170.11	85.83	1.98x
pytorch	1.9.0+cpu	bert_base_cola	58.29%	58.84%	-0.93%	178.71	83.91	2.13x
pytorch	1.9.0+cpu	bert_base_sts-b	88.65%	89.27%	-0.70%	176.81	84.27	2.10x
pytorch	1.9.0+cpu	bert_base_sst-2	91.63%	91.86%	-0.25%	177.71	84.16	2.11x
pytorch	1.9.0+cpu	bert_base_rte	69.31%	69.68%	-0.52%	177.17	85.53	2.07x
pytorch	1.9.0+cpu	bert_large_mrpc	87.48%	88.33%	-0.95%	62.06	24.83	2.50x
pytorch	1.9.0+cpu	bert_large_squad	92.78988	93.04683	-0.28%	13.89	7.49	1.85x
pytorch	1.9.0+cpu	bert_large_qnli	91.12%	91.82%	-0.76%	63.02	24.21	2.60x
pytorch	1.9.0+cpu	bert_large_rte	72.92%	72.56%	0.50%	46.07	23.45	1.96x
pytorch	1.9.0+cpu	bert_large_cola	62.85%	62.57%	0.45%	61.92	24.52	2.52x
pytorch	1.9.0+cpu	inception_v3	69.39%	69.54%	-0.21%	230.34	131.21	1.76x
pytorch	1.9.0+cpu	peleenet	71.54%	72.08%	-0.75%	271.32	203.96	1.33x
pytorch	1.9.0+cpu	yolo_v3	24.50%	24.54%	-0.17%	59.09	28.49	2.07x
pytorch	1.9.0+cpu	se_resnext50_32x4d	79.02%	79.08%	-0.07%	204.02	109.12	1.87x
pytorch	1.9.0+cpu	mobilenet_v2	70.73%	71.86%	-1.57%	445.01	329.26	1.35x
pytorch	1.9.0+cpu	blendcnn	68.40%	68.40%	0.00%	2868.85	2755.91	1.04x
pytorch	1.5.0a0+b58f89b	resnet50_ipex	75.80%	76.13%	-0.44%	353.71	213.09	1.66x
pytorch	1.9.0+cpu	gpt_wikitext	60.06256	60.19923	-0.23%	13.11	12.06	1.09x
pytorch	1.9.0+cpu	roberta_base_mrpc	85.37%	85.51%	-0.17%	173.78	85.54	2.03x
pytorch	1.9.0+cpu	camembert_base_mrpc	84.72%	84.22%	0.60%	158.16	84.63	1.87x
pytorch	1.9.0+cpu	distilbert_base_mrpc	81.17%	80.99%	0.21%	279.44	158.91	1.76x
pytorch	1.9.0+cpu	albert_base_mrpc	88.77%	88.50%	0.31%	22.88	18.28	1.25x
pytorch	1.9.0+cpu	funnel_mrpc	91.72%	92.26%	-0.58%	79.44	78.01	1.02x
pytorch	1.9.0+cpu	bart_wnli	49.30%	52.11%	-5.41%	21.74	19.92	1.09x
pytorch	1.9.0+cpu	mbart_wnli	56.34%	56.34%	0.00%	39.87	20.34	1.96x
pytorch	1.9.0+cpu	t5_wmt_en_ro	24.3855	24.5213	-0.55%	2.76	2.59	1.06x
pytorch	1.9.0+cpu	marianmt_wmt_en_ro	22.3857	22.225	0.72%	1.94	1.84	1.05x
pytorch	1.9.0+cpu	pegasus_billsum	50.2328	51.2135	-1.91%	0.18	0.11	1.56x
pytorch	1.9.0+cpu	dialogpt_wikitext	36.18182	36.18182	0.00%	4.37	4.35	1.00x
pytorch	1.9.0+cpu	xlm-roberta-base_mrpc	87.93%	88.62%	-0.78%	79.57	77.46	1.03x
pytorch	1.9.0+cpu	flaubert_mrpc	79.81%	80.19%	-0.48%	361.20	295.11	1.22x
pytorch	1.9.0+cpu	barthez_mrpc	83.25%	83.81%	-0.66%	112.72	67.00	1.68x
pytorch	1.9.0+cpu	longformer_mrpc	90.97%	91.46%	-0.53%	12.97	10.97	1.18x
pytorch	1.9.0+cpu	layoutlm_mrpc	81.22%	78.01%	4.12%	145.26	78.19	1.86x
pytorch	1.9.0+cpu	deberta_mrpc	90.29%	90.91%	-0.68%	78.70	50.84	1.55x
pytorch	1.9.0+cpu	squeezebert_mrpc	87.96%	87.65%	0.36%	145.56	126.72	1.15x
pytorch	1.9.0+cpu	resnet18_fx	69.61%	69.76%	-0.22%	503.96	257.73	1.96x
pytorch	1.9.0+cpu	xlnet_base_mrpc	89.43%	89.47%	-0.04%	67.93	52.56	1.29x
pytorch	1.9.0+cpu	transfo_xl_mrpc	82.09%	81.20%	1.09%	6.64	4.94	1.34x
pytorch	1.9.0+cpu	ctrl_mrpc	82.00%	82.00%	0.00%	15.34	5.70	2.69x
pytorch	1.9.0+cpu	xlm_mrpc	80.50%	79.56%	1.18%	39.06	12.90	3.03x
pytorch	1.9.0+cpu	maskrcnn_fx	37.70%	37.80%	-0.26%	59.58	38.66	1.54x

Quantization-aware training models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 throughput (CLX8280 1s 4c per instance bs1)	FP32 throughput (CLX8280 1s 4c per instance bs1)	Throughput Ratio [INT8/FP32]
pytorch	1.9.0+cpu	resnet18_qat	69.75%	69.76%	-0.02%	492.96	262.86	1.87x
pytorch	1.9.0+cpu	resnet50_qat	76.05%	76.13%	-0.11%	273.97	128.53	2.13x
pytorch	1.9.0+cpu	resnet18_qat_fx	69.72%	69.76%	-0.05%	498.22	257.64	1.93x
pytorch	1.9.0+cpu	mobilenet_v2_qat	71.45%	71.86%	-0.56%	450.16	316.31	1.42x

MXNet models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 throughput (CLX8280 1s 4c per instance bs1)	FP32 throughput (CLX8280 1s 4c per instance bs1)	Throughput Ratio [INT8/FP32]
mxnet	1.7.0	resnet50v1	76.08%	76.33%	-0.32%	1125.40	335.57	3.35x
mxnet	1.7.0	inceptionv3	77.73%	77.64%	0.11%	623.33	230.49	2.71x
mxnet	1.7.0	mobilenet1.0	71.69%	72.22%	-0.74%	4375.00	1741.29	2.51x
mxnet	1.7.0	mobilenetv2_1.0	70.78%	70.87%	-0.12%	3500.00	1284.40	2.73x
mxnet	1.7.0	resnet18_v1	70.02%	70.14%	-0.17%	2325.58	731.45	3.18x
mxnet	1.7.0	squeezenet1.0	56.74%	56.96%	-0.38%	2916.67	1093.75	2.67x
mxnet	1.7.0	ssd-resnet50_v1	80.21%	80.23%	-0.03%	187.82	40.07	4.69x
mxnet	1.7.0	ssd-mobilenet1.0	74.94%	75.54%	-0.79%	445.01	116.28	3.83x
mxnet	1.7.0	resnet152_v1	78.21%	78.54%	-0.42%	394.37	119.60	3.30x

ONNX models

Framework	Version	Model	INT8 Tuning Accuracy	FP32 Accuracy Baseline	Acc Ratio [(INT8-FP32)/FP32]	INT8 throughput (CLX8280 1s 4c per instance bs1)	FP32 throughput (CLX8280 1s 4c per instance bs1)	Throughput Ratio [INT8/FP32]
onnxrt	1.8.0	resnet50_v1_5	72.11%	72.28%	-0.24%	546.02	339.97	1.61x
onnxrt	1.8.0	bert_base_mrpc_static	85.29%	86.03%	-0.86%	479.12	210.97	2.27x
onnxrt	1.8.0	bert_base_mrpc_dynamic	85.54%	86.03%	-0.57%	244.84	100.00	2.45x
onnxrt	1.8.0	vgg16	66.58%	66.68%	-0.15%	101.35	79.25	1.28x
onnxrt	1.8.0	ssd_mobilenet_v1	22.41%	23.10%	-2.99%	427.87	377.16	1.13x
onnxrt	1.8.0	ssd_mobilenet_v2	23.80%	24.68%	-3.57%	339.48	279.89	1.21x
onnxrt	1.8.0	distilbert_base_mrpc	84.56%	84.56%	0.00%	1081.92	386.53	2.80x
onnxrt	1.8.0	mobilebert_mrpc	85.54%	86.27%	-0.85%	437.23	400.23	1.09x
onnxrt	1.8.0	roberta_base_mrpc	88.73%	89.46%	-0.82%	494.70	203.90	2.43x
onnxrt	1.8.0	resnet50-v1-12	74.83%	74.97%	-0.19%	642.79	348.26	1.85x
onnxrt	1.8.0	resnet_v1_5_mlperf	76.11%	76.47%	-0.47%	599.32	343.47	1.74x
onnxrt	1.8.0	mobilenet_v3_mlperf	75.51%	75.75%	-0.32%	1397.21	1007.19	1.39x
onnxrt	1.8.0	bert_squad_model_zoo	80.43519	80.67171	-0.29%	73.68	40.81	1.81x
onnxrt	1.8.0	mobilebert_squad_mlperf	89.84479	90.0265	-0.20%	60.52	57.30	1.06x
onnxrt	1.8.0	vgg16_model_zoo	72.37%	72.38%	-0.01%	122.85	79.57	1.54x
