This document provides step-by-step instructions for reproducing the PyTorch BlendCNN distillation results (MRPC dataset) with Intel® Neural Compressor.
Install the dependencies:

```Shell
cd examples/pytorch/eager/blendcnn/distillation
pip install -r requirements.txt
pip install torch==1.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
```
Download the BERT-Base, Uncased checkpoint and the GLUE MRPC benchmark dataset.
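The checkpoint can also be fetched programmatically. Below is a minimal sketch, assuming the BERT-Base, Uncased download link published in the google-research/bert README (verify the URL there before use):

```python
# Sketch: fetch the BERT-Base, Uncased checkpoint. The URL is taken from the
# google-research/bert README and is an assumption here; confirm it before relying on it.
import urllib.request

URL = "https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip"
urllib.request.urlretrieve(URL, "uncased_L-12_H-768_A-12.zip")
```

Then move the archive into `models/` and extract it: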
```Shell
mkdir models/ && mv uncased_L-12_H-768_A-12.zip models/
cd models/ && unzip uncased_L-12_H-768_A-12.zip
```
### Dataset
After downloading the dataset, put it at `./MRPC/`, like this:
```Shell
ls MRPC/
dev_ids.tsv dev.tsv test.tsv train.tsv
```
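As an optional sanity check, the following sketch (not part of this example) verifies that all four TSV files are in place:

```python
# Optional sanity check: confirm the MRPC TSV files listed above exist under ./MRPC/
import os

for name in ("train.tsv", "dev.tsv", "test.tsv", "dev_ids.tsv"):
    path = os.path.join("MRPC", name)
    print(path, "found" if os.path.isfile(path) else "MISSING")
```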
After completing the preparation in step 2, you can fine-tune the pretrained BERT-Base model on the MRPC dataset with the steps below.
```Shell
mkdir -p models/bert/mrpc
# fine-tune the pretrained BERT-Base model
python finetune.py config/finetune/mrpc/train.json
```
When finished, you can find the fine-tuned BERT-Base model weights `model_final.pt` at `./models/bert/mrpc/`.
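Before moving on, you can optionally confirm the checkpoint loads on CPU; this is a sketch that assumes `model_final.pt` is an ordinary `torch.save` artifact:

```python
# Quick check (sketch): load the fine-tuned weights on CPU and report what was saved
import torch

state = torch.load("models/bert/mrpc/model_final.pt", map_location="cpu")
print(type(state), len(state) if hasattr(state, "__len__") else "")
```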
```Shell
mkdir -p models/blendcnn/
# distill the BlendCNN student from the fine-tuned BERT-Base teacher
python distill.py --loss_weights 0.1 0.9
```
After following the above steps, you will find the distilled BlendCNN model weights `best_model_weights.pt` in `./models/blendcnn/`.
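The two values passed to `--loss_weights` balance the student's hard-label loss against the soft-label distillation loss. The snippet below is only a sketch of that weighting, not the code in `distill.py`; the function name, the temperature, and the assignment of 0.1 to the cross-entropy term and 0.9 to the distillation term are assumptions:

```python
# Sketch of a weighted knowledge-distillation objective (names and weight order assumed):
# total = w_ce * CE(student_logits, labels) + w_kd * T^2 * KL(teacher_soft || student_soft)
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      loss_weights=(0.1, 0.9), temperature=1.0):
    w_ce, w_kd = loss_weights
    ce = F.cross_entropy(student_logits, labels)                # hard-label term
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)                                      # soft-label term
    return w_ce * ce + w_kd * kd
```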