speed problem #100

Hiwyl · 2022-01-12T13:41:17Z

With the original yolov5s6 at 640 input, inference takes 12 ms; with the repvgg_block config below at 640 input, it takes 13 ms. GPU: V100.

# YOLOv5 🚀 by Ultralytics, GPL-3.0 license

# Parameters
nc: 80 # number of classes
depth_multiple: 0.33 # model depth multiple
width_multiple: 0.50 # layer channel multiple
anchors:
  - [19,27, 44,40, 38,94] # P3/8
  - [96,68, 86,152, 180,137] # P4/16
  - [140,301, 303,264, 238,542] # P5/32
  - [436,615, 739,380, 925,792] # P6/64

# YOLOv5 v6.0 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Conv, [32, 6, 2, 2]], # 0-P1/2
   [-1, 1, Conv, [64, 3, 2]], # 1-P2/4
   [-1, 1, C3, [64]],
   [-1, 1, RepVGGBlock, [128, 3, 2]], # 3-P3/8
   [-1, 3, C3, [128]],
   [-1, 1, RepVGGBlock, [256, 3, 2]], # 5-P4/16
   [-1, 3, C3, [256]],
   [-1, 1, RepVGGBlock, [512, 3, 2]], # 7-P5/32
   [-1, 3, C3, [512]],
   [-1, 1, RepVGGBlock, [768, 3, 2]], # 9-P6/64
   [-1, 3, C3, [768]],
   [-1, 1, SPPF, [768, 5]], # 11
  ]

# YOLOv5 v6.0 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 8], 1, Concat, [1]], # cat backbone P5
   [-1, 3, C3, [512, False]], # 15

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]], # cat backbone P4
   [-1, 3, C3, [256, False]], # 19

   [-1, 1, Conv, [128, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]], # cat backbone P3
   [-1, 3, C3, [128, False]], # 23 (P3/8-small)

   [-1, 1, Conv, [128, 3, 2]],
   [[-1, 20], 1, Concat, [1]], # cat head P4
   [-1, 3, C3, [256, False]], # 26 (P4/16-medium)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 16], 1, Concat, [1]], # cat head P5
   [-1, 3, C3, [512, False]], # 29 (P5/32-large)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 12], 1, Concat, [1]], # cat head P6
   [-1, 3, C3, [768, False]], # 32 (P6/64-xlarge)

   [[23, 26, 29, 32], 1, Detect, [nc, anchors]], # Detect(P3, P4, P5, P6)
  ]
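
For readers comparing this config with the stock one, the following is a rough sketch of how the `depth_multiple` and `width_multiple` values above are usually applied to each layer in YOLOv5-style model parsing. The helper `scale_layer` is hypothetical, and it is an assumption that this fork scales `RepVGGBlock` channels the same way it scales `Conv` and `C3`.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scale_layer(number, channels, depth_multiple=0.33, width_multiple=0.50):
    """Hypothetical helper: (repeats, output channels) after applying the multipliers.

    Assumes the usual YOLOv5 rule: repeats are scaled only when number > 1,
    and channels are rounded up to a multiple of 8.
    """
    repeats = max(round(number * depth_multiple), 1) if number > 1 else number
    return repeats, make_divisible(channels * width_multiple, 8)


# A few rows taken from the backbone above: (module, number, first arg)
for name, number, channels in [("Conv", 1, 32), ("RepVGGBlock", 1, 128), ("C3", 3, 768)]:
    print(name, scale_layer(number, channels))
```

Under these assumptions, `[-1, 3, C3, [768]]` would build a single C3 repeat with 384 output channels, and `[-1, 1, RepVGGBlock, [128, 3, 2]]` would output 64 channels.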

ppogg · 2022-01-12T16:02:22Z

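Hi, and welcome to the repo!
For your question, please take a look at this blog post:
https://zhuanlan.zhihu.com/p/410874403
It covers all of my detailed experiments with the g model.
I also have a few questions about your setup:
① Did you reparameterize the model? The reparameterization script rep_convert.py is in the script folder.
② Did you use the RepVGG block fuse function? The code is in yolo.py: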

    def fuse(self):
        # Fuse RepVGG blocks into single 3x3 convs, then fuse Conv + BatchNorm layers
        print('Fusing layers... ')
        for m in self.model.modules():
            if type(m) is RepVGGBlock and hasattr(m, 'rbr_1x1'):
                # Merge the parallel branches into one equivalent kernel and bias
                kernel, bias = m.get_equivalent_kernel_bias()
                rbr_reparam = nn.Conv2d(in_channels=m.rbr_dense.conv.in_channels,
                                        out_channels=m.rbr_dense.conv.out_channels,
                                        kernel_size=m.rbr_dense.conv.kernel_size,
                                        stride=m.rbr_dense.conv.stride,
                                        padding=m.rbr_dense.conv.padding, dilation=m.rbr_dense.conv.dilation,
                                        groups=m.rbr_dense.conv.groups, bias=True)
                rbr_reparam.weight.data = kernel
                rbr_reparam.bias.data = bias
                for para in self.parameters():
                    para.detach_()
                m.rbr_dense = rbr_reparam
                m.__delattr__('rbr_1x1')
                # Check the block (m), not the model (self), for the optional branches
                if hasattr(m, 'rbr_identity'):
                    m.__delattr__('rbr_identity')
                if hasattr(m, 'id_tensor'):
                    m.__delattr__('id_tensor')
                m.deploy = True
                m.forward = m.fusevggforward  # update forward
            if type(m) is Conv and hasattr(m, 'bn'):
                m.conv = fuse_conv_and_bn(m.conv, m.bn)  # update conv
                delattr(m, 'bn')  # remove batchnorm
                m.forward = m.fuseforward  # update forward

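③ Roughly how many images did you test on? I recommend more than 5000 images, tested separately at bs=1, bs=16, and bs=32.
④ One more observation from my earlier experiments: after reparameterization and fusing, a RepVGG block is equivalent to a plain 3×3 convolution and runs at exactly the same speed. I have tested this extensively; details are in the link above.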
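
To illustrate point ④, here is a minimal pure-Python sketch of why the parallel branches collapse into one 3×3 kernel. It is a single-channel toy, not the repo's `get_equivalent_kernel_bias`: BatchNorm folding and biases are left out, the helper names are made up for the example, and the identity branch is included only for illustration (in the RepVGG design it exists only when stride is 1 and input and output channels match, so the stride-2 blocks in the config above would not have one).

```python
def conv2d_same(x, k):
    """Single-channel 2D cross-correlation with zero padding (k is 1x1 or 3x3)."""
    h, w = len(x), len(x[0])
    r = len(k) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += k[di + r][dj + r] * x[ii][jj]
            out[i][j] = acc
    return out


def merge_branches(k3, k1, use_identity):
    """Fold a 1x1 kernel (and optionally the identity) into the centre of a 3x3 kernel."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1[0][0]      # a 1x1 conv is a 3x3 conv with zeros around the centre
    if use_identity:
        merged[1][1] += 1.0       # identity is a 3x3 conv with a single 1 in the centre
    return merged


x = [[1, 2, 0, -1], [3, -2, 4, 1], [0, 5, -3, 2], [2, 1, 1, 0]]
k3 = [[1, 0, -1], [2, 1, 0], [0, -2, 1]]
k1 = [[3]]

# Training-time form: three branches summed
y3, y1 = conv2d_same(x, k3), conv2d_same(x, k1)
three_branch = [[a + b + c for a, b, c in zip(r3, r1, rx)] for r3, r1, rx in zip(y3, y1, x)]

# Deploy-time form: one 3x3 conv with the merged kernel
fused = conv2d_same(x, merge_branches(k3, k1, use_identity=True))

print(three_branch == fused)
```

Because convolution is linear, the fused layer does the same arithmetic as an ordinary 3×3 convolution, which is why no speed difference is expected once the branches are really merged.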

Hiwyl · 2022-01-13T02:06:12Z

Thanks!

I used the rep_convert.py script:

# YOLOv5 experimental modules
import torch
from utils.google_utils import attempt_download

if __name__ == "__main__":
    load = "runs/yolov5s6_repvgg/weights/best.pt"
    save = "runs/yolov5s6_repvgg/weights/best_deploy.pt"
    input_size = 640
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    for w in load if isinstance(load, list) else [load]:
        attempt_download(w)
        ckpt = torch.load(w, map_location=None)  # load

    torch.save(ckpt, save)

    print(f"Done. Loaded weights: ({load})")
    print(f"Done. Saved weights: ({save})")

This doesn't look like the right script, though.

ppogg · 2022-01-13T03:31:30Z

Yes, this is the right script. It calls the function interfaces in yolo.py, which pick up the RepVGG blocks that need to be reparameterized.

Hiwyl · 2022-01-13T03:34:55Z

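My other settings are the same as yours; the only difference is that the model YAML defines four branches. After training, I ran rep_convert first and then detect. Compared with the original yolov5s6, the new model's FLOPs are 1/4 of the original's. At bs=1, averaged over 1000 detect runs, it takes 12.5 ms; the original yolov5s6 takes 12 ms.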

ppogg · 2022-01-13T03:49:08Z

Roughly how many images did you run detection on?

Hiwyl · 2022-01-13T03:49:21Z

1000

ppogg · 2022-01-13T03:51:04Z

OK. What do you mean by "the new model's FLOPs are 1/4 of the original's"? Also, could you try 5000 inference runs? I'm curious about this result.
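
For a 0.5 ms gap like the one reported here, the measurement method matters. A minimal timing sketch, assuming a generic callable and an optional synchronization hook (the `benchmark` helper and its parameters are illustrative, not from the repo):

```python
import time


def benchmark(fn, n_warmup=50, n_runs=5000, sync=None):
    """Return the mean latency of fn() in milliseconds.

    sync is an optional callable that blocks until queued work has finished;
    on a GPU this would typically be torch.cuda.synchronize, because CUDA
    kernels are launched asynchronously.
    """
    for _ in range(n_warmup):      # warm-up runs are excluded from the timing
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    if sync is not None:
        sync()                     # wait for the last run before stopping the clock
    return (time.perf_counter() - start) * 1000.0 / n_runs


# Toy workload so the sketch runs anywhere
mean_ms = benchmark(lambda: sum(range(1000)), n_warmup=10, n_runs=200)
print(f"{mean_ms:.4f} ms per call")
```

For the models in this thread, `fn` would be something like `lambda: model(img)` with `sync=torch.cuda.synchronize`, repeated at bs=1, 16, and 32 for both checkpoints on the same images.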

Hiwyl · 2022-01-13T03:53:12Z

Sure, I'll reply to you in the QQ group.

ppogg · 2022-01-13T04:22:39Z

OK.

ppogg · 2022-01-29T06:36:58Z

Fixed a bug in the earlier reparameterization script:
#64 (comment)

Hiwyl closed this as completed Jan 12, 2022

Hiwyl reopened this Jan 12, 2022

ppogg closed this as completed May 1, 2022
