Skip to content

Latest commit

 

History

History
49 lines (40 loc) · 2.52 KB

README.md

File metadata and controls

49 lines (40 loc) · 2.52 KB

EfficientViT-Pytorch

Precautions

Before you use the code to train your own data set, please first enter the train_gpu.py file and modify the data_root, batch_size and nb_classes parameters. If you want to draw the confusion matrix and ROC curve, you only need to remove the comments of Plot_ROC and Predictor at the end of the code. For the third parameter, you should change it to the path of your own model weights file(.pth).

Use Sophia Optimizer (in util/optimizer.py)

You can use anther optimizer sophia, just need to change the optimizer in train_gpu.py, for this training sample, can achieve better results

# optimizer = create_optimizer(args, model_without_ddp)
optimizer = SophiaG(model.parameters(), lr=2e-4, betas=(0.965, 0.99), rho=0.01, weight_decay=args.weight_decay)

Train this model

train model with single-machine single-card:

python train_gpu.py

train model with single-machine multi-card:

python -m torch.distributed.launch --nproc_per_node=8 train_gpu.py

train model with single-machine multi-card:

(using a specified part of the cards: for example, I want to use the second and fourth cards)

CUDA_VISIBLE_DEVICES=1,3 python -m torch.distributed.launch --nproc_per_node=2 train_gpu.py

train model with multi-machine multi-card:

(For the specific number of GPUs on each machine, modify the value of --nproc_per_node. If you want to specify a certain card, just add CUDA_VISIBLE_DEVICES= to specify the index number of the card before each command. The principle is the same as single-machine multi-card training)

On the first machine: python -m torch.distributed.launch --nproc_per_node=1 --nnodes=2 --node_rank=0 --master_addr=<Master node IP address> --master_port=<Master node port number> train_gpu.py

On the second machine: python -m torch.distributed.launch --nproc_per_node=1 --nnodes=2 --node_rank=1 --master_addr=<Master node IP address> --master_port=<Master node port number> train_gpu.py

Citation

@inproceedings{liu2023efficientvit,
  title={EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention},
  author={Liu, Xinyu and Peng, Houwen and Zheng, Ningxin and Yang, Yuqing and Hu, Han and Yuan, Yixuan},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14420--14430},
  year={2023}
}