- Goal : model compression using Structured Sparsity (2:4 pattern)
- Base Model : ResNet18
- Dataset : ImageNet100
- Pruning Process :
    - Train the base model on the ImageNet100 dataset
    - Prune the FC and convolution layers to a 2:4 sparse pattern with ASP (see the sketch after this list)
    - Retrain (fine-tune) the pruned model
    - Convert the retrained model to a TensorRT int8 model via PTQ
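As a reference for the prune-and-retrain step, below is a minimal sketch using ASP from NVIDIA Apex (`apex.contrib.sparsity`). The model construction, hyperparameters, and the commented training helper are illustrative placeholders, not the actual code in `train.py`.

```python
import torch
import torchvision
from apex.contrib.sparsity import ASP

# Dense baseline: in this repo the ResNet18 trained on ImageNet100 would be
# loaded from a checkpoint; here it is constructed fresh for illustration.
model = torchvision.models.resnet18(num_classes=100).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Compute 2:4 masks for the supported convolution/linear layers and zero out
# the pruned weights in place.
ASP.prune_trained_model(model, optimizer)

# Fine-tune as usual; ASP re-applies the masks around optimizer.step(), so the
# 2:4 pattern is preserved during retraining.
# for epoch in range(num_epochs):
#     train_one_epoch(model, optimizer)   # hypothetical training helper
```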
- Device
    - MSI laptop
    - CPU : i7-11375H
    - GPU : RTX 3060
- Dependency
    - WSL (Ubuntu 22.04)
    - CUDA 12.1
    - cuDNN 8.9.2
    - TensorRT 8.6.1
    - PyTorch 2.1.0+cu121
Quantization_EX/
├── calibrator.py     # calibration class for TensorRT PTQ (see the sketch below)
├── common.py         # TensorRT utility functions
├── onnx_export.py    # export the ASP-pruned model to ONNX
├── train.py          # train the base model and apply ASP pruning
├── trt_infer_2.py    # build the TensorRT engine using Polygraphy
├── trt_infer_acc.py  # check TensorRT model accuracy
├── trt_infer.py      # run TensorRT model inference
├── utils.py          # shared utilities
├── LICENSE
└── README.md
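`calibrator.py` holds the calibration class used for int8 PTQ. Below is a minimal sketch of such a class written against the TensorRT 8.6 Python API with PyCUDA; the class name, cache file, and batch-feeding logic are assumptions for illustration rather than the repo's actual implementation.

```python
import numpy as np
import pycuda.autoinit  # noqa: F401  (creates a CUDA context)
import pycuda.driver as cuda
import tensorrt as trt

class EntropyCalibrator(trt.IInt8EntropyCalibrator2):
    """Feeds preprocessed calibration batches to TensorRT during int8 PTQ."""

    def __init__(self, batches, batch_size=1, cache_file="calib.cache"):
        super().__init__()
        self.batches = iter(batches)      # iterable of float32 arrays [N, 3, 224, 224]
        self.batch_size = batch_size
        self.cache_file = cache_file
        self.device_input = None

    def get_batch_size(self):
        return self.batch_size

    def get_batch(self, names):
        try:
            batch = np.ascontiguousarray(next(self.batches), dtype=np.float32)
        except StopIteration:
            return None                   # no more data: calibration is done
        if self.device_input is None:
            self.device_input = cuda.mem_alloc(batch.nbytes)
        cuda.memcpy_htod(self.device_input, batch)
        return [int(self.device_input)]

    def read_calibration_cache(self):
        try:
            with open(self.cache_file, "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None                   # no cache yet: run calibration

    def write_calibration_cache(self, cache):
        with open(self.cache_file, "wb") as f:
            f.write(cache)
```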
- Benchmark : 10,000 iterations with a single input of shape [1, 3, 224, 224]
|                     | TensorRT PTQ | TensorRT PTQ with ASP |
|---------------------|--------------|-----------------------|
| Precision           | Int8         | Int8                  |
| Avg Latency [ms]    | 0.418        | 0.388                 |
| Avg FPS [frame/sec] | 2388.33      | 2572.17               |
| GPU Memory [MB]     | 123          | 119                   |
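Numbers like the ones above can be reproduced with a simple timing loop over the built engine. The sketch below shows one way to do this with Polygraphy's `TrtRunner`; the engine file name `resnet18_int8.engine` and the input tensor name `input` are assumptions, and since `runner.infer()` includes host/device copies the measured latency is end-to-end per call rather than pure GPU kernel time.

```python
import time
import numpy as np
from polygraphy.backend.common import BytesFromPath
from polygraphy.backend.trt import EngineFromBytes, TrtRunner

# Engine path and input tensor name are assumptions for this example.
load_engine = EngineFromBytes(BytesFromPath("resnet18_int8.engine"))
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)

with TrtRunner(load_engine) as runner:
    for _ in range(100):                  # warm-up
        runner.infer({"input": dummy})

    iters = 10_000
    start = time.perf_counter()
    for _ in range(iters):
        runner.infer({"input": dummy})
    elapsed = time.perf_counter() - start

print(f"Avg latency: {elapsed / iters * 1000:.3f} ms, Avg FPS: {iters / elapsed:.2f}")
```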
- Run order : train -> onnx_export -> trt_infer -> trt_infer_acc (see the export/build sketch below)
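To make the onnx_export -> trt_infer part of the pipeline concrete, here is a minimal sketch of exporting the pruned model to ONNX and building an int8 engine with Polygraphy. The checkpoint/engine file names, opset version, and the `EntropyCalibrator` instance (from the sketch above) are illustrative assumptions, not the repo's exact code; `sparse_weights=True` sets TensorRT's SPARSE_WEIGHTS builder flag so the 2:4 pattern can use the sparse tensor cores.

```python
import torch
import torchvision
from polygraphy.backend.trt import CreateConfig, EngineFromNetwork, NetworkFromOnnxPath, SaveEngine

# 1) Export the pruned + fine-tuned model to ONNX (checkpoint path is a placeholder).
model = torchvision.models.resnet18(num_classes=100)
model.load_state_dict(torch.load("resnet18_asp.pth", map_location="cpu"))
model.eval()
torch.onnx.export(
    model,
    torch.randn(1, 3, 224, 224),
    "resnet18_asp.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
)

# 2) Build an int8 engine with Polygraphy. `calibrator` would be an instance of
#    the EntropyCalibrator sketched above (placeholder here).
calibrator = ...  # e.g. EntropyCalibrator(calibration_batches)
build_engine = EngineFromNetwork(
    NetworkFromOnnxPath("resnet18_asp.onnx"),
    config=CreateConfig(int8=True, calibrator=calibrator, sparse_weights=True),
)
SaveEngine(build_engine, path="resnet18_int8.engine")()
```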
- ASP (Automatic SParsity) : https://github.com/NVIDIA/apex/tree/master/apex/contrib/sparsity
- Polygraphy : https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy
- imagenet100 : https://www.kaggle.com/datasets/ambityga/imagenet100