TigerJeffX/Transformers-in-pytorch

Transformers in PyTorch

What does this repo provide?

  • Models: several milestone transformer-based models reimplemented from scratch in PyTorch

  • Experiments: experiments on both CV and NLP benchmarks
  • Pipeline: an end-to-end pipeline
    • Convenient to play with: data processing and model training/validation integrated into a one-stop pipeline
    • Efficient training: training and evaluation accelerated via DistributedDataParallel (DDP) and mixed precision (fp16)
    • Neat to read: clean file structure, easy to read but non-trivial
      • ./script → run train/eval
      • ./model → model implementation
      • ./data → data processing
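The "one-stop" idea can be sketched as follows. This is a minimal illustration of how data processing, training, and validation hang together behind a single entry point; every function name here is hypothetical, not the repo's actual API:

```python
# Minimal one-stop pipeline sketch (hypothetical names, not the repo's API):
# data processing, training, and validation wired behind one entry point.

def process_data(dataset_name):
    # stand-in for ./data: download / tokenize / batch the raw dataset
    return [(x, x % 2) for x in range(10)]  # toy (sample, label) pairs

def train_one_epoch(model, batches):
    # stand-in for ./model + ./script: one pass over the data
    for sample, label in batches:
        model["steps"] += 1
    return model

def evaluate(model, batches):
    # stand-in for validation: report a toy accuracy
    correct = sum(1 for sample, label in batches if (sample % 2) == label)
    return correct / len(batches)

def run_pipeline(dataset_name, epochs=2):
    batches = process_data(dataset_name)
    model = {"steps": 0}
    for _ in range(epochs):
        model = train_one_epoch(model, batches)
    return model, evaluate(model, batches)

model, acc = run_pipeline("multi30k")
print(model["steps"], acc)  # 20 steps over 2 epochs; toy accuracy 1.0
```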

Usage

1. Env Requirements

# Conda Env
python 3.6.10
torch 1.4.0+cu100
torchvision 0.5.0+cu100
torchtext 0.5.0
spacy 3.4.1
tqdm 4.63.0

# Apex (for mixed precision training)
## run `gcc --version` 
gcc (GCC) 5.4.0
## apex installation
git clone https://github.com/NVIDIA/apex
cd apex
git checkout f3a960f80244cf9e80558ab30f7f7e8cbf03c0a0
rm -rf ./build
python setup.py install --cuda_ext --cpp_ext

# System Env 
## run `nvcc --version`
Cuda compilation tools, release 10.0, V10.0.130
# run `nvidia-smi`
Check your own gpu device status

2. Data Requirements

  • multi30k and cifar10 are downloaded automatically by the pipeline
  • imagenet1k (ILSVRC2012) must be downloaded manually (see the guide below)
    • Wait until all three files have finished downloading:
      • ILSVRC2012_devkit_t12.tar.gz (2.5M)
      • ILSVRC2012_img_train.tar (138G)
      • ILSVRC2012_img_val.tar (6.3G)
    • When you run the imagenet1k pipeline, ILSVRC2012_img_train.tar and ILSVRC2012_img_val.tar are automatically unpacked and arranged into the directories 'data/ILSVRC2012/train' and 'data/ILSVRC2012/val'.
    • Unpacking takes several hours; you can do it faster with a shell script if you prefer.
# Guide for download imagenet1k
mkdir -p data/ILSVRC2012
cd data/ILSVRC2012
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_devkit_t12.tar.gz --no-check-certificate
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_train.tar --no-check-certificate
wget https://image-net.org/data/ILSVRC/2012/ILSVRC2012_img_val.tar --no-check-certificate
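For reference, the automatic unpacking step amounts to the following: ILSVRC2012_img_train.tar is a tar of per-class tars (one per WordNet id), each of which must land in its own train/<wnid>/ directory. This is an illustrative stdlib-only sketch, not the repo's actual code:

```python
# Sketch of unpacking ILSVRC2012_img_train.tar: the outer tar holds one
# inner tar per class (e.g. n01440764.tar); each inner tar is extracted
# into its own train/<wnid>/ directory. Illustrative, not the repo's code.
import os
import tarfile

def extract_train_tar(train_tar, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    with tarfile.open(train_tar) as outer:
        for member in outer.getmembers():
            if not member.name.endswith(".tar"):
                continue
            wnid = member.name[:-len(".tar")]        # e.g. "n01440764"
            class_dir = os.path.join(out_dir, wnid)
            os.makedirs(class_dir, exist_ok=True)
            inner_fileobj = outer.extractfile(member)
            with tarfile.open(fileobj=inner_fileobj) as inner:
                inner.extractall(class_dir)          # JPEGs for this class
```

A parallel shell loop over the inner tars is typically the faster route mentioned above.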

3. Run Experiments

3.1 Fine-tuning ViT on imagenet1k, cifar10

  • Download the pretrained ViT_B_16 model parameters from the official storage.
cd data
curl -o ViT_B_16.npz https://storage.googleapis.com/vit_models/augreg/B_16-i21k-300ep-lr_0.001-aug_medium1-wd_0.1-do_0.0-sd_0.0--imagenet2012-steps_20k-lr_0.01-res_224.npz
curl -o ViT_B_16_384.npz https://storage.googleapis.com/vit_models/augreg/B_16-i21k-300ep-lr_0.001-aug_medium1-wd_0.1-do_0.0-sd_0.0--imagenet2012-steps_20k-lr_0.01-res_384.npz
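Before wiring a downloaded checkpoint into the model, it can help to confirm what the .npz actually contains: it is a flat mapping from parameter names to numpy arrays, which the fine-tuning code copies into the PyTorch model's state_dict. A small sketch (`summarize_npz` is an illustrative helper, not a function from this repo):

```python
# Peek inside a ViT .npz checkpoint: a flat {parameter name: numpy array}
# mapping. Illustrative sketch only, not the repo's checkpoint loader.
import numpy as np

def summarize_npz(path, limit=None):
    """Return {parameter name: array shape} for the first `limit` entries."""
    with np.load(path) as ckpt:
        names = sorted(ckpt.files)[:limit]
        return {name: ckpt[name].shape for name in names}

# e.g. summarize_npz("data/ViT_B_16.npz", limit=5)
```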
  • Before running the experiments
    • Set the CUDA env in script/run_img_cls_task.py/__main__ according to your GPU device
    • Adjust the train/eval settings in script/run_img_cls_task.py/get_args() and launch the experiment
cd script

# run experiments on cifar10
# (4 mins/epoch, 3.5 hours in total | GPU device: P40×4)
python ./run_image_cls_task.py cifar10

# run experiments on imagenet1k
# (less than 5 hours/epoch, more than 10 hours in total | GPU device: P40×4)
python ./run_image_cls_task.py ILSVRC2012

# Tips:
# 1. Both DDP and fp16 mixed precision training are adopted for acceleration
# 2. The speedup you get depends on your specific GPU device
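The adjustable settings exposed by get_args() are along these lines. This is a hypothetical sketch of such a config: the actual flag names and defaults in script/run_img_cls_task.py may differ:

```python
# Hypothetical get_args()-style config for the image classification task;
# actual argument names and defaults in the repo may differ.
import argparse

def get_args(argv=None):
    parser = argparse.ArgumentParser(description="ViT fine-tuning")
    parser.add_argument("dataset", choices=["cifar10", "ILSVRC2012"])
    parser.add_argument("--batch-size", type=int, default=160)
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--lr", type=float, default=0.01)
    parser.add_argument("--fp16", action="store_true",
                        help="enable mixed precision training")
    return parser.parse_args(argv)

args = get_args(["ILSVRC2012", "--fp16", "--batch-size", "64"])
print(args.dataset, args.batch_size, args.fp16)  # ILSVRC2012 64 True
```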

3.2 Train vanilla transformer from scratch on multi30k (en → de)

  • Before running the experiments
    • Set the CUDA env in script/run_nmt_task.py/__main__ according to your GPU device
    • Adjust the train/eval settings in script/run_nmt_task.py/get_args() and launch the experiment
# run experiments on multi30k (small dataset, 3 mins in total | GPU device: P40×4 | you can also fork the pipeline, adjust it, and run this experiment on a lower-capacity GPU)
cd script

python ./run_nmt_task.py multi30k

# Tips:
# 1. DDP is adopted for acceleration
# 2. For inference, both "greedy search" and "beam search" are included in the NMT task pipeline.
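As a refresher on the two decoding strategies mentioned in the tips, here is a toy beam search, with greedy search as the beam_size=1 special case. The `log_probs` function stands in for a decoder forward pass; this is not the repo's actual inference code:

```python
import math

# Toy beam search for illustration; greedy search is the beam_size=1 case.
# `log_probs(seq)` stands in for a decoder forward pass returning
# log-probabilities over the vocabulary. Not the repo's inference code.
def beam_search(log_probs, vocab, eos, beam_size=3, max_len=5):
    beams = [([], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq and seq[-1] == eos:       # finished beam carries over
                candidates.append((seq, score))
                continue
            lp = log_probs(seq)
            for tok in vocab:
                candidates.append((seq + [tok], score + lp[tok]))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams[0][0]

# Stand-in model that prefers token 1 twice, then eos (= 2).
def toy_log_probs(seq):
    if len(seq) < 2:
        return {0: math.log(0.1), 1: math.log(0.8), 2: math.log(0.1)}
    return {0: math.log(0.1), 1: math.log(0.1), 2: math.log(0.8)}

print(beam_search(toy_log_probs, vocab=[0, 1, 2], eos=2))  # [1, 1, 2]
```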

4. Results

4.1 Fine-tuning ViT on imagenet1k, cifar10

  • This repo
    • Imagenet1k: ACC 84.9% (on the 50,000-image val set | resolution 384 | extra label smoothing, confidence 0.9 | batch size 160, nearly 15,000 training steps)

    • Cifar10 : ACC 99.04% (resolution 224 | batch size 640, nearly 5500 training steps)

  • Comparison with the official results of the ViT implementation by Google
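"Label smoothing confidence 0.9" above means the target distribution puts 0.9 on the gold class and spreads the remaining 0.1 evenly over the other classes. A small numpy sketch of the idea (illustrative, not the repo's loss implementation):

```python
import numpy as np

# Label smoothing with confidence 0.9 and K classes: the target is 0.9 on
# the gold class and 0.1/(K-1) on each other class. Illustrative only,
# not the repo's loss implementation.
def smoothed_target(num_classes, gold, confidence=0.9):
    target = np.full(num_classes, (1.0 - confidence) / (num_classes - 1))
    target[gold] = confidence
    return target

def smoothed_cross_entropy(log_probs, gold, confidence=0.9):
    # cross-entropy against the smoothed target instead of a one-hot vector
    target = smoothed_target(len(log_probs), gold, confidence)
    return -float(np.sum(target * log_probs))

t = smoothed_target(5, 2)  # 0.9 on class 2, 0.025 on each other class
```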

4.2 Train vanilla transformer from scratch on multi30k (en → de)

Reference materials for further study