speed problem #100

Closed
Hiwyl opened this issue Jan 12, 2022 · 10 comments
Comments


Hiwyl commented Jan 12, 2022

With the original yolov5s6 at 640 input, inference takes 12 ms; with the RepVGGBlock config below at 640 input, it takes 13 ms. GPU: V100.

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple
anchors:
  - [19,27, 44,40, 38,94]  # P3/8
  - [96,68, 86,152, 180,137]  # P4/16
  - [140,301, 303,264, 238,542]  # P5/32
  - [436,615, 739,380, 925,792]  # P6/64

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 6, 2, 2]],  # 0-P1/2
   [-1, 1, Conv, [64, 3, 2]],  # 1-P2/4
   [-1, 1, C3, [64]],
   [-1, 1, RepVGGBlock, [128, 3, 2]],  # 3-P3/8
   [-1, 3, C3, [128]],
   [-1, 1, RepVGGBlock, [256, 3, 2]],  # 5-P4/16
   [-1, 3, C3, [256]],
   [-1, 1, RepVGGBlock, [512, 3, 2]],  # 7-P5/32
   [-1, 3, C3, [512]],
   [-1, 1, RepVGGBlock, [768, 3, 2]],  # 9-P6/64
   [-1, 3, C3, [768]],
   [-1, 1, SPPF, [768, 5]],  # 11
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]],  # cat backbone P5
   [-1, 3, C3, [512, False]],  # 15

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [256, False]],  # 19

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [128, False]],  # 23 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 20], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [256, False]],  # 26 (P4/16-medium)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 16], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [512, False]],  # 29 (P5/32-large)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 12], 1, Concat, [1]],  # cat head P6
   [-1, 3, C3, [768, False]],  # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5, P6)
  ]

Hiwyl closed this as completed Jan 12, 2022
Hiwyl reopened this Jan 12, 2022
Owner

ppogg commented Jan 12, 2022

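Hi, friend, and welcome to this repository!
Regarding your question, you can read the following blog post:
https://zhuanlan.zhihu.com/p/410874403
It describes all of my detailed experiments on the g model.
I also have a few questions about your setup:
① Did you re-parameterize the model? The re-parameterization script rep_convert.py is in the script folder.
② Did you use the RepVGG block fuse function? The code is in yolo.py: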

    def fuse(self):
        # fuse RepVGG blocks
        print('Fusing layers... ')
        for m in self.model.modules():
            # print(m)
            if type(m) is RepVGGBlock:
                if hasattr(m, 'rbr_1x1'):
                    # print(m)
                    kernel, bias = m.get_equivalent_kernel_bias()
                    rbr_reparam = nn.Conv2d(in_channels=m.rbr_dense.conv.in_channels,
                                            out_channels=m.rbr_dense.conv.out_channels,
                                            kernel_size=m.rbr_dense.conv.kernel_size,
                                            stride=m.rbr_dense.conv.stride,
                                            padding=m.rbr_dense.conv.padding, dilation=m.rbr_dense.conv.dilation,
                                            groups=m.rbr_dense.conv.groups, bias=True)
                    rbr_reparam.weight.data = kernel
                    rbr_reparam.bias.data = bias
                    for para in self.parameters():
                        para.detach_()
                    m.rbr_dense = rbr_reparam
                    # m.__delattr__('rbr_dense')
                    m.__delattr__('rbr_1x1')
                    if hasattr(m, 'rbr_identity'):
                        m.__delattr__('rbr_identity')
                    if hasattr(m, 'id_tensor'):
                        m.__delattr__('id_tensor')
                    m.deploy = True
                    m.forward = m.fusevggforward  # update forward
                # continue
                # print(m)
            if type(m) is Conv and hasattr(m, 'bn'):
                # print(m)
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward

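③ Roughly how many images did you test on? I recommend more than 5000 images, tested at bs=1, bs=16, and bs=32.
④ To add some findings from my earlier experiments: once a RepVGG block has been re-parameterized and fused, it is equivalent to a 3×3 convolution and runs at exactly the same speed. I have tested this extensively; see the link above for details.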
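The equivalence claimed in ④ follows from the linearity of convolution. The sketch below is a deliberately simplified illustration, not the repository's `get_equivalent_kernel_bias`: it assumes a single channel, stride 1, integer weights, and no BatchNorm, and all function names in it are hypothetical. It shows that the sum of a 3×3 branch, a 1×1 branch, and an identity branch equals one 3×3 convolution with a merged kernel.

```python
def conv3x3_same(x, k):
    """3x3 cross-correlation on a 2-D list, zero padding 1, stride 1."""
    h, w = len(x), len(x[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0
            for di in range(3):
                for dj in range(3):
                    ii, jj = i + di - 1, j + dj - 1
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += k[di][dj] * x[ii][jj]
            out[i][j] = acc
    return out


def pad_1x1_to_3x3(weight):
    """Embed a 1x1 kernel at the centre of an all-zero 3x3 kernel."""
    return [[0, 0, 0], [0, weight, 0], [0, 0, 0]]


def add(a, b):
    """Element-wise sum of two equally sized 2-D lists."""
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical weights and input, chosen only for illustration.
k3 = [[1, -2, 0], [3, 4, -1], [0, 2, 5]]   # 3x3 branch
w1 = 7                                      # 1x1 branch
x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 1, 2, 3], [4, 5, 6, 7]]

# Training-time form: three parallel branches, outputs summed.
y_3x3 = conv3x3_same(x, k3)
y_1x1 = [[w1 * v for v in row] for row in x]
y_branches = add(add(y_3x3, y_1x1), x)      # identity branch is x itself

# Deploy-time form: one 3x3 kernel (identity is a 1x1 kernel of weight 1).
merged = add(add(k3, pad_1x1_to_3x3(w1)), pad_1x1_to_3x3(1))
y_fused = conv3x3_same(x, merged)

print(y_fused == y_branches)
```

Because the fused block is a plain 3×3 convolution, any remaining latency gap has to come from elsewhere, for example blocks that were never fused or measurement noise.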

Author

Hiwyl commented Jan 13, 2022

Thanks!

  1. I used the rep_convert.py script:
# YOLOv5 experimental modules
import torch
from utils.google_utils import attempt_download

if __name__ == "__main__":
    load = "runs/yolov5s6_repvgg/weights/best.pt"
    save = "runs/yolov5s6_repvgg/weights/best_deploy.pt"
    input_size = 640
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    for w in load if isinstance(load, list) else [load]:
        attempt_download(w)
        ckpt = torch.load(w, map_location=None)  # load

    torch.save(ckpt, save)

    print(f"Done. Loaded weights from: ({load})")
    print(f"Done. Saved weights to: ({save})")

This doesn't seem to be the right script, though.

Owner

ppogg commented Jan 13, 2022

Yes, that is the script. It calls the function interface in yolo.py, which catches the RepVGG blocks that need to be re-parameterized.
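One way to confirm that the conversion really touched every block is to look for leftover training-time branches. The sketch below is only an illustration: `find_unfused` is a hypothetical helper, and the `rbr_1x1` attribute name is taken from the `fuse` code quoted earlier in this thread. Stand-in objects replace real modules so the example runs without PyTorch.

```python
from types import SimpleNamespace


def find_unfused(modules):
    """Return the modules that still carry the training-time 1x1 branch.

    The fuse() code quoted in this thread deletes `rbr_1x1` after merging,
    so its presence suggests the block was not re-parameterized.
    """
    return [m for m in modules if hasattr(m, "rbr_1x1")]


# Stand-ins for RepVGGBlock instances (hypothetical, for illustration only).
fused_block = SimpleNamespace(rbr_dense="single 3x3 conv", deploy=True)
unfused_block = SimpleNamespace(rbr_dense="3x3 conv + bn", rbr_1x1="1x1 conv + bn")

leftover = find_unfused([fused_block, unfused_block])
print(f"{len(leftover)} block(s) still unfused")
```

With a loaded PyTorch model, the same check could be run as `find_unfused(model.modules())`, assuming the checkpoint stores the full model object.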

Author

Hiwyl commented Jan 13, 2022

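All my other settings are the same as yours; the only difference is that my model-definition YAML uses four branches. After training, I ran rep_convert first and then detect. Compared with the original yolov5s6, the new model's FLOPs are 1/4 of the original. Running detect 1000 times at bs=1 averaged 12.5 ms, versus 12 ms for the original yolov5s6.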

Owner

ppogg commented Jan 13, 2022

Roughly how many images did you run detection on?

Author

Hiwyl commented Jan 13, 2022

1000

Owner

ppogg commented Jan 13, 2022

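OK. What do you mean by the new model's FLOPs being 1/4 of the original? Also, could you try running inference 5000 times? I'm quite curious about this result.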
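A gap of 0.5 ms to 1 ms is small enough that the measurement procedure matters. The following is a minimal timing sketch, not the repository's detect script: `benchmark` and `dummy_inference` are hypothetical names, and the workload is a stand-in. It discards warm-up calls and reports the spread as well as the mean. When timing a PyTorch model on a GPU, `torch.cuda.synchronize` could be passed as `sync` so that asynchronous kernels finish before the clock stops.

```python
import statistics
import time


def benchmark(fn, warmup=10, iters=100, sync=None):
    """Time `iters` calls of fn() after `warmup` untimed calls.

    Returns (mean_ms, stdev_ms). `sync`, if given, is called before the
    clock stops so queued asynchronous work is included in the timing.
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        if sync is not None:
            sync()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(samples), statistics.stdev(samples)


call_count = 0


def dummy_inference():
    """Stand-in workload; replace with a real model forward pass."""
    global call_count
    call_count += 1
    sum(i * i for i in range(10_000))


mean_ms, std_ms = benchmark(dummy_inference, warmup=5, iters=50)
print(f"{mean_ms:.3f} ms +/- {std_ms:.3f} ms per call over 50 runs")
```

If the standard deviation is comparable to the difference between the two models, more iterations or a larger image set are needed before drawing a conclusion.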

Author

Hiwyl commented Jan 13, 2022

OK, I'll reply to you in the QQ group.

Owner

ppogg commented Jan 13, 2022

OK.

Owner

ppogg commented Jan 29, 2022

Fixed a bug in the earlier re-parameterization script:
#64 (comment)

ppogg closed this as completed May 1, 2022