
Class Adaptive Network Calibration

Bingyuan Liu, Jérôme Rony, Adrian Galdran, Jose Dolz, Ismail Ben Ayed

[CVPR 2023] [arXiv: https://arxiv.org/abs/2211.15088] [BibTeX below]

(Overview figure)

Installation

The PyTorch version we used:

torch==1.12.1
torchvision==0.13.1

Setup:

pip install -e .
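
A quick way to confirm the editable install is to import the package; the package name calibrate is inferred from the calibrate.* targets in the configs shown below:

# sanity check for the editable install; package name inferred from the configs
python -c "import calibrate; print(calibrate.__file__)"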

Install the window_process kernel for the Swin-T model (optional):

cd kernels/window_process
pip install -e .
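
If the build succeeds, the extension should be importable. A quick check; the module name swin_window_process is an assumption based on the upstream Swin Transformer kernels, so adjust it if your build registers a different name:

# assumed module name; only needed when training the SwinV2-T model
python -c "import swin_window_process; print('window_process kernel available')"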

Data preparation

For the datasets (i.e., Tiny-ImageNet, ImageNet, ImageNetLT, and VOC2012), please refer to their official sites for download instructions. We provide the data splits we used under data_splits; please put them in the root directory of your dataset (or in VOC2012/ImageSets/Segmentation for VOC2012).

Important note: before running the code, set the absolute path of your dataset root directory in the corresponding data config under configs/data, or pass it as a Hydra override on the command line (see the example just below).
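
For example, the TinyImageNet CE command from the Examples section with the dataset root passed as an override (the path below is a placeholder):

# dataset root supplied on the command line; replace the placeholder path
OMP_NUM_THREADS=8 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=1 \
     tools/dist_train.py \
     dist=pytorch model=resnet50_tiny train.max_epoch=100 \
     data=tiny_imagenet data.train_batch_size=128 \
     data.data_root=/abs/path/to/tiny-imagenet-200 \
     optim=sgd optim.lr=0.1 lr_scheduler=multi_step \
     train=dist loss=ce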

Usage:

Arguments

python tools/train_net.py --help

The training scripts are powered by Hydra; the help output below lists the available configuration groups and the default configuration.

== Configuration groups ==
Compose your configuration from those groups (group=option)

data: imagenet, imagenet_lt, tiny_imagenet, voc2012
dist: pytorch, slurm
lag: aug_lagrangian, aug_lagrangian_class
loss: ce, cpc, focal, focal_adaptive, logit_margin, logit_margin_plus, ls, mmce, penalty_ent, soft_ce
lr_scheduler: cosine, multi_step, one_cycle, plateau, step
mixup: mixup
model: SwinV2-T, deeplabv3, resnet, resnet101_tiny, resnet50_tiny
optim: adam, adamw, sgd
test: local, segment
train: dist, dist_lag, dist_plus, lag_segment, local, segment
wandb: my


== Config ==
Override anything in the config (foo.bar=value)

data:
  name: imagenet
  data_root: YOUR_DATA_ROOT
  input_size: 224
  train_batch_size: 256
  val_batch_size: 250
  test_batch_size: 250
  return_ind: false
  use_mysplit: true
  num_workers: 8
  pin_memory: true
  object:
    trainval:
      _target_: calibrate.data.imagenet.build_train_val_dataset
      data_root: ${data.data_root}
      input_size: ${data.input_size}
      use_mysplit: ${data.use_mysplit}
    test:
      _target_: calibrate.data.imagenet.build_test_dataset
      data_root: ${data.data_root}
      input_size: ${data.input_size}
      use_mysplit: ${data.use_mysplit}
model:
  name: resnet50
  num_classes: 10
  pretrained: false
  drop_rate: 0.0
  object:
    _target_: timm.create_model
    model_name: ${model.name}
    pretrained: ${model.pretrained}
    num_classes: ${model.num_classes}
    drop_rate: ${model.drop_rate}
loss:
  name: ce
  ignore_index: -100
  object:
    _target_: torch.nn.CrossEntropyLoss
    ignore_index: ${loss.ignore_index}
    reduction: mean
optim:
  name: adamw
  lr: 0.0005
  weight_decay: 0.05
  object:
    _target_: torch.optim.AdamW
    lr: ${optim.lr}
    betas:
    - 0.9
    - 0.999
    weight_decay: ${optim.weight_decay}
    eps: 1.0e-08
lr_scheduler:
  name: cosine
  min_lr: 5.0e-06
  warmup_lr: 5.0e-07
  warmup_epochs: 1
  cycle_decay: 0.1
  object:
    _target_: timm.scheduler.cosine_lr.CosineLRScheduler
    t_initial: ${train.max_epoch}
    lr_min: ${lr_scheduler.min_lr}
    cycle_mul: 1
    cycle_decay: ${lr_scheduler.cycle_decay}
    warmup_lr_init: ${lr_scheduler.warmup_lr}
    warmup_t: ${lr_scheduler.warmup_epochs}
    cycle_limit: 1
    t_in_epochs: true
train:
  name: dist
  max_epoch: 200
  clip_grad: 2.0
  clip_mode: norm
  resume: true
  keep_checkpoint_num: 1
  keep_checkpoint_interval: 0
  use_amp: true
  evaluate_logits: true
  object:
    _target_: calibrate.engine.DistributedTrainer
mixup:
  name: mixup
  enable: false
  mixup_alpha: 0.4
  mode: pair
  label_smoothing: 0
  object:
    _target_: timm.data.mixup.Mixup
    mixup_alpha: ${mixup.mixup_alpha}
    mode: ${mixup.mode}
    label_smoothing: ${mixup.label_smoothing}
    num_classes: ${model.num_classes}
dist:
  launch: python
  backend: nccl
wandb:
  enable: false
  project: NA
  entity: NA
  tags: train
test:
  name: local
  checkpoint: best.pth
  save_logits: false
  object:
    _target_: calibrate.engine.Tester
job_name: ${hydra:job.name}
work_dir: ${hydra:run.dir}
seed: 1
log_period: 50
calibrate:
  num_bins: 15
  visualize: false
post_temperature:
  enable: false
  learn: false
  grid_search_interval: 0.1
  cross_validate: ece


Powered by Hydra (https://hydra.cc)
Use --hydra-help to view Hydra specific help
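
To inspect the configuration Hydra will compose for a given set of overrides without launching training, the standard --cfg flag can be used (a minimal sketch; the override combination here is arbitrary):

# print the composed job config and exit; no training is run
python tools/train_net.py model=resnet50_tiny data=tiny_imagenet loss=ce --cfg job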

Examples

TinyImageNet

Ours

OMP_NUM_THREADS=8 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=1 \
     tools/dist_train.py \
     dist=pytorch model=resnet50_tiny train.max_epoch=100 \
     data=tiny_imagenet data.train_batch_size=128 \
     optim=sgd optim.lr=0.1 lr_scheduler=multi_step \
     train=dist_lag loss=ce +lag=aug_lagrangian_class lag.margin=10

Baselines:

CE (Cross Entropy)

OMP_NUM_THREADS=8 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=1 \
     tools/dist_train.py \
     dist=pytorch model=resnet50_tiny train.max_epoch=100 \
     data=tiny_imagenet data.train_batch_size=128 \
     optim=sgd optim.lr=0.1 lr_scheduler=multi_step \
     train=dist loss=ce

MbLS (Margin-based Label Smoothing)

OMP_NUM_THREADS=8 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=1 \
     tools/dist_train.py \
     dist=pytorch model=resnet50_tiny train.max_epoch=100 \
     data=tiny_imagenet data.train_batch_size=128 \
     optim=sgd optim.lr=0.1 lr_scheduler=multi_step \
     train=dist loss=logit_margin

ImageNetLT:

Ours:

OMP_NUM_THREADS=8 CUDA_VISIBLE_DEVICES=0,1 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=2 \
    tools/dist_train.py \
    model=resnet model.num_classes=1000 \
    data=imagenet_lt data.train_batch_size=256 \
    optim.lr=0.0005 lr_scheduler.min_lr=5e-6 lr_scheduler.warmup_lr=5e-7 \
    train=dist_lag train.max_epoch=200 \
    loss=ce +lag=aug_lagrangian_class lag.margin=10 lag.lambd_step=1

Baselines:

CE (Cross Entropy):

OMP_NUM_THREADS=8 CUDA_VISIBLE_DEVICES=0,1 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=2 \
    tools/dist_train.py \
    model=resnet model.num_classes=1000 \
    data=imagenet_lt data.train_batch_size=256 \
    optim.lr=0.0005 lr_scheduler.min_lr=5e-6 lr_scheduler.warmup_lr=5e-7 \
    train=dist train.max_epoch=200 \
    loss=ce

MbLS (Margin-based Label Smoothing):

OMP_NUM_THREADS=8 CUDA_VISIBLE_DEVICES=0,1 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=2 \
    tools/dist_train.py \
    model=resnet model.num_classes=1000 \
    data=imagenet_lt data.train_batch_size=256 \
    optim.lr=0.0005 lr_scheduler.min_lr=5e-6 lr_scheduler.warmup_lr=5e-7 \
    train=dist train.max_epoch=200 \
    loss=logit_margin
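
Optional features from the default configuration above, e.g. W&B logging or post-hoc temperature scaling, are toggled with the same override syntax. A sketch based on the TinyImageNet CE baseline; the W&B project/entity values are placeholders, and whether post_temperature takes effect depends on the selected trainer:

# TinyImageNet CE baseline with W&B logging and post-hoc temperature scaling enabled
OMP_NUM_THREADS=8 torchrun \
    --master_port 29500 --nnodes=1 --nproc_per_node=1 \
     tools/dist_train.py \
     dist=pytorch model=resnet50_tiny train.max_epoch=100 \
     data=tiny_imagenet data.train_batch_size=128 \
     optim=sgd optim.lr=0.1 lr_scheduler=multi_step \
     train=dist loss=ce \
     wandb=my wandb.enable=true wandb.project=YOUR_PROJECT wandb.entity=YOUR_ENTITY \
     post_temperature.enable=true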

Citing CALS

@inproceedings{liu2023cals,
  title     = {Class Adaptive Network Calibration},
  author    = {Bingyuan Liu and Jérôme Rony and Adrian Galdran and Jose Dolz and Ismail Ben Ayed},
  booktitle = {CVPR},
  year      = {2023},
}
