
CVPR-2024: Once-For-Both (OFB)

Introduction

This is the official repository of the CVPR 2024 paper "Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression". OFB is a novel one-stage search paradigm that combines a bi-mask weight-sharing scheme, an adaptive one-hot loss function, and progressive masked image modeling to efficiently learn the importance and sparsity score distributions.

Abstract

In this work, for the first time, we investigate how to integrate the evaluations of importance and sparsity scores into a single stage, searching for the optimal subnets in an efficient manner. Specifically, we present Once for Both (OFB), a cost-efficient approach that simultaneously evaluates both importance and sparsity scores for vision transformer compression (VTC). First, a bi-mask scheme is developed by entangling the importance score and the differentiable sparsity score to jointly determine the pruning potential (prunability) of each unit. This bi-mask search strategy is further combined with a proposed adaptive one-hot loss to realize a progressive and efficient search for the most important subnet. Finally, Progressive Masked Image Modeling (PMIM) is proposed to regularize the feature space, which may be degraded by the dimension reduction, so that it remains representative during the search process.
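
The snippet below is a minimal, illustrative sketch of the bi-mask idea described above, written against PyTorch. The class name, parameterization, and tensor shapes are assumptions made for illustration only; it is not the official implementation.

import torch
import torch.nn as nn

class BiMask(nn.Module):
    """Toy bi-mask: entangles an importance score and a differentiable
    sparsity score into a single soft prunability mask per unit."""
    def __init__(self, num_units: int):
        super().__init__()
        # Hypothetical parameterization: one learnable logit per prunable
        # unit (e.g., an attention head or an MLP channel) for each score.
        self.importance = nn.Parameter(torch.zeros(num_units))
        self.sparsity = nn.Parameter(torch.zeros(num_units))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Combine the two scores into a prunability mask in [0, 1];
        # units whose mask approaches 0 become pruning candidates.
        prunability = torch.sigmoid(self.importance) * torch.sigmoid(self.sparsity)
        return features * prunability

# Usage: softly gate 384 MLP channels of a batch of token features.
mask = BiMask(num_units=384)
x = torch.randn(8, 197, 384)   # (batch, tokens, channels)
y = mask(x)                    # same shape, channels scaled by prunability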

Main Results on ImageNet

Model         Input size (pixels)   Top-1 (%)   Top-5 (%)   Params (M)   FLOPs (B)
OFB-DeiT-A    224                   75.0        92.3        4.4          0.9
OFB-DeiT-B    224                   76.1        92.8        5.3          1.1
OFB-DeiT-C    224                   78.0        93.9        8.0          1.7
OFB-DeiT-D    224                   80.3        95.1        17.6         3.6
OFB-DeiT-E    224                   81.7        95.8        43.9         8.7
Install

Python>=3.8.0 is required, with all dependencies listed in requirements.txt installed:

$ git clone https://github.com/HankYe/Once-for-Both
$ cd Once-for-Both
$ conda create -n OFB python=3.8
$ conda activate OFB
$ pip install -r requirements.txt

Data preparation

The expected layout of the ImageNet data is shown below, followed by a minimal loading sketch:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
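
As a quick sanity check of this layout, the snippet below shows how the two splits could be loaded with torchvision's ImageFolder. The transforms are illustrative placeholders, not the exact pipeline used in this repository.

import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Illustrative transforms only; the repository defines its own pipeline.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder('/path/to/imagenet/train', transform=transform)
val_set = datasets.ImageFolder('/path/to/imagenet/val', transform=transform)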

Searching and Finetuning (Optional)

Here is a sample script to run the search on the DeiT-S model with 2 GPUs.

cd exp_sh
sh run_exp.sh

Citation

Please cite our paper in your publications if it helps your research.

@InProceedings{Ye_2024_CVPR,
author    = {Ye, Hancheng and Yu, Chong and Ye, Peng and Xia, Renqiu and Tang, Yansong and Lu, Jiwen and Chen, Tao and Zhang, Bo},
title     = {Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month     = {June},
year      = {2024},
pages     = {5578-5588}
}

License

This project is licensed under the MIT License.

Acknowledgement

We gratefully acknowledge the authors of ViT-Slim and DeiT for their open-source code. Please visit the corresponding repositories to explore more of their contributions.
