
Getting Started

  1. Quick Samples

    1.1 Quantization with Python API

    1.2 Quantization with JupyterLab Extension

    1.3 Quantization with GUI

  2. Validated Models

Quick Samples

Quantization with Python API

# Install Intel Neural Compressor and TensorFlow
pip install neural-compressor
pip install tensorflow
# Prepare fp32 model
wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_6/mobilenet_v1_1.0_224_frozen.pb

# Quantize with the Python API
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.data import DataLoader
from neural_compressor.data import Datasets

dataset = Datasets('tensorflow')['dummy'](shape=(1, 224, 224, 3))
dataloader = DataLoader(framework='tensorflow', dataset=dataset)

from neural_compressor.quantization import fit
config = PostTrainingQuantConfig()
q_model = fit(
    model="./mobilenet_v1_1.0_224_frozen.pb",
    conf=config,
    calib_dataloader=dataloader,
    eval_dataloader=dataloader)
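Under the hood, post-training quantization uses the calibration data to derive a scale and zero-point per tensor, which map fp32 values onto an 8-bit range. A minimal, library-independent sketch of that affine (asymmetric uint8) mapping; the helper names here are illustrative, not Neural Compressor APIs:

```python
def affine_qparams(values, qmin=0, qmax=255):
    """Derive scale/zero-point from the calibration min/max (asymmetric uint8)."""
    lo, hi = min(values + [0.0]), max(values + [0.0])  # include 0 so it quantizes exactly
    scale = (hi - lo) / (qmax - qmin) or 1.0           # guard against a constant tensor
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(x, scale, zp, qmin=0, qmax=255):
    """fp32 -> uint8: scale, shift by the zero-point, clamp to the int range."""
    return max(qmin, min(qmax, round(x / scale) + zp))

def dequantize(q, scale, zp):
    """uint8 -> fp32 approximation of the original value."""
    return (q - zp) * scale

calib = [-1.5, -0.2, 0.0, 0.7, 2.1]  # stand-in for calibration activations
s, zp = affine_qparams(calib)
roundtrip = [dequantize(quantize(x, s, zp), s, zp) for x in calib]
# per-element round-trip error is bounded by scale / 2
```

Zero maps to an exact integer (the zero-point), which matters for zero-padded convolutions; the dummy dataset in the snippet above plays the same role as `calib` here, feeding value ranges to the calibration step.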

Quantization with JupyterLab Extension

Search for jupyter-lab-neural-compressor in the Extension Manager in JupyterLab and install with one click:

(Screenshot: jupyter-lab-neural-compressor in the JupyterLab Extension Manager)

Quantization with GUI

# Install Intel Neural Compressor and ONNX
pip install neural-compressor-full
pip install onnx==1.12.0 onnxruntime==1.12.1 onnxruntime-extensions
# Prepare fp32 model
wget https://github.com/onnx/models/raw/main/vision/classification/resnet/model/resnet50-v1-12.onnx
# Start GUI
inc_bench
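The GUI drives the same post-training quantization flow for the downloaded ONNX model. For weights (as opposed to activations), a symmetric int8 scheme with a fixed zero-point of 0 is commonly used; a pure-Python sketch of that idea, again illustrative rather than the Neural Compressor implementation:

```python
def symmetric_scale(weights, qmax=127):
    """Symmetric int8: zero-point is 0, scale comes from the max magnitude."""
    amax = max(abs(w) for w in weights)
    return (amax / qmax) or 1.0  # guard against an all-zero tensor

def quantize_weights(weights, scale, qmax=127):
    """fp32 weights -> int8 values in [-qmax, qmax]."""
    return [max(-qmax, min(qmax, round(w / scale))) for w in weights]

w = [0.42, -1.27, 0.05, 0.89]      # toy fp32 weights
scale = symmetric_scale(w)
q = quantize_weights(w, scale)     # [42, -127, 5, 89]
w_hat = [qi * scale for qi in q]   # dequantized values used at inference time
```

Keeping the zero-point at 0 lets int8 matrix multiplies skip the zero-point correction term, which is one reason symmetric quantization is the usual choice for conv and matmul weights.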


Validated Models

Intel® Neural Compressor has validated quantization on 10,000+ models from popular model hubs (e.g., Hugging Face Transformers, Torchvision, TensorFlow Model Hub, ONNX Model Zoo). Over 30 pruning, knowledge-distillation, and model-export samples are also available. More details on the validated models are available here.