YOLOv5-6.x: Integrating BiFPN via Learnable Parameters (yolov5s)

Posted 2022-04-13 嗜睡的篠龙

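Contents

Preface
Modifying common.py
Modifying yolo.py
Modifying train.py

1. Adding the BiFPN weight parameters to the optimizer
2. Modifying distributed training with DDP (not needed for single-GPU training)
3. Checking how the BiFPN_Concat layer parameters are updated

yolov5s-bifpn.yaml
Test results
Replacing every Concat with BiFPN_Concat
References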

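Preface

An earlier post briefly introduced how BiFPN works and how the YOLOv5 author incorporates it: [Modifying YOLOv5-6.x (Part 2)]: Adding the ACON activation function, the CBAM and CA attention mechanisms, and the weighted bidirectional feature pyramid BiFPN

This post tries to integrate BiFPN more fully, based mainly on: Combining YOLOv5 with BiFPN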

Modifying common.py

Copy and paste the following code into common.py:

# Combine with BiFPN: learnable parameters that weight the different input branches
# (common.py already imports torch and torch.nn as nn)
class BiFPN_Concat(nn.Module):
    def __init__(self, c1, c2):
        super(BiFPN_Concat, self).__init__()
        # nn.Parameter turns a plain, non-trainable Tensor into a trainable parameter
        # and registers it with the host module, so model.parameters() includes it
        # and it can be optimized together with the other parameters
        self.w1 = nn.Parameter(torch.ones(2, dtype=torch.float32), requires_grad=True)  # two-branch weights
        self.w2 = nn.Parameter(torch.ones(3, dtype=torch.float32), requires_grad=True)  # three-branch weights
        self.epsilon = 0.0001
        self.conv = nn.Conv2d(c1, c2, kernel_size=1, stride=1, padding=0)
        self.silu = nn.SiLU()

    def forward(self, x):
        if len(x) == 2:  # add two branches
            w = self.w1
            weight = w / (torch.sum(w, dim=0) + self.epsilon)  # normalize the weights
            return self.conv(self.silu(weight[0] * x[0] + weight[1] * x[1]))
        elif len(x) == 3:  # add three branches
            w = self.w2
            weight = w / (torch.sum(w, dim=0) + self.epsilon)  # normalize the weights
            # Fast normalized fusion
            return self.conv(self.silu(weight[0] * x[0] + weight[1] * x[1] + weight[2] * x[2]))
        else:
            # Fail loudly instead of silently returning None
            raise ValueError(f"BiFPN_Concat expects 2 or 3 inputs, got {len(x)}")
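To make the weight normalization concrete, here is the arithmetic on its own in plain Python. This is only a sketch: the helper name fast_normalized_fusion_weights is made up for illustration, and the inputs are the all-ones initial values used for w1 and w2.

```python
def fast_normalized_fusion_weights(raw_weights, epsilon=0.0001):
    """Divide each raw weight by (sum of all raw weights + epsilon)."""
    total = sum(raw_weights) + epsilon
    return [w / total for w in raw_weights]


# Initial values of w1 (two branches) and w2 (three branches) are all ones.
two_branch = fast_normalized_fusion_weights([1.0, 1.0])
three_branch = fast_normalized_fusion_weights([1.0, 1.0, 1.0])

print(two_branch)    # each entry is slightly below 1/2
print(three_branch)  # each entry is slightly below 1/3
```

Because of epsilon, the normalized weights sum to slightly less than 1, so at initialization the fused feature map is very close to a plain average of the input branches.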

Modifying yolo.py

In the parse_model function, find the elif m is Concat: statement and add the BiFPN_Concat branch right after it:

elif m is Concat:
    c2 = sum(ch[x] for x in f)
elif m is BiFPN_Concat:  # added for BiFPN_Concat
    c2 = max([ch[x] for x in f])
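The two branches compute the output channel count differently: Concat stacks channels, so they add up, while BiFPN_Concat adds feature maps element-wise, so the channel count stays the same. A small sketch of that rule follows; the helper fused_output_channels and the channel list are illustrative assumptions, not code from yolo.py.

```python
def fused_output_channels(module_name, ch, f):
    """Mirror the two parse_model branches for a list of 'from' indices f."""
    if module_name == "Concat":
        return sum(ch[x] for x in f)   # channels are stacked
    if module_name == "BiFPN_Concat":
        return max(ch[x] for x in f)   # element-wise add keeps the channel count
    raise ValueError(f"unsupported module: {module_name}")


# Assumed output channels of layers 0..18; only the three inputs matter here.
ch = [0] * 19
ch[6] = ch[13] = ch[18] = 256
f = [-1, 13, 6]  # -1 refers to the previous layer (index 18)

concat_channels = fused_output_channels("Concat", ch, f)
bifpn_channels = fused_output_channels("BiFPN_Concat", ch, f)
print(concat_channels, bifpn_channels)
```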

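Modifying train.py

1. Adding the BiFPN weight parameters to the optimizer

At this point the model's parameter groups look like this:
- The g1 group (which holds the weights) contains 64 parameters

Add the w1 and w2 parameters defined in the BiFPN_Concat class to g1: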

    # BiFPN_Concat must be importable in train.py, e.g. from models.common import BiFPN_Concat
    g0, g1, g2 = [], [], []  # optimizer parameter groups
    for v in model.modules():
        # hasattr: checks whether the object has the given attribute and returns a bool
        if hasattr(v, 'bias') and isinstance(v.bias, nn.Parameter):  # bias
            g2.append(v.bias)  # biases
        if isinstance(v, nn.BatchNorm2d):  # weight (no decay)
            g0.append(v.weight)
        elif hasattr(v, 'weight') and isinstance(v.weight, nn.Parameter):  # weight (with decay)
            g1.append(v.weight)
        # BiFPN_Concat
        elif isinstance(v, BiFPN_Concat) and hasattr(v, 'w1') and isinstance(v.w1, nn.Parameter):
            g1.append(v.w1)
        # note: as an elif, this branch is skipped whenever the w1 branch above matches
        elif isinstance(v, BiFPN_Concat) and hasattr(v, 'w2') and isinstance(v.w2, nn.Parameter):
            g1.append(v.w2)

After adding them, g1 grows to 68 parameters (4 more, from the 4 BiFPN_Concat layers).
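A quick way to confirm which parameters actually ended up in a group is to compare the groups against model.named_parameters(). The sketch below assumes a tiny stand-in module (TinyFusion, hypothetical) in place of BiFPN_Concat and reproduces the same if/elif chain; it is a sanity check, not part of train.py.

```python
import torch
import torch.nn as nn


class TinyFusion(nn.Module):
    """Hypothetical stand-in for BiFPN_Concat: two weight vectors plus a 1x1 conv."""

    def __init__(self, channels=8):
        super().__init__()
        self.w1 = nn.Parameter(torch.ones(2))
        self.w2 = nn.Parameter(torch.ones(3))
        self.conv = nn.Conv2d(channels, channels, kernel_size=1)


def group_parameters(model):
    """Same grouping logic as the loop above, including the elif chain."""
    g0, g1, g2 = [], [], []
    for v in model.modules():
        if hasattr(v, "bias") and isinstance(v.bias, nn.Parameter):
            g2.append(v.bias)
        if isinstance(v, nn.BatchNorm2d):
            g0.append(v.weight)
        elif hasattr(v, "weight") and isinstance(v.weight, nn.Parameter):
            g1.append(v.weight)
        elif isinstance(v, TinyFusion) and isinstance(v.w1, nn.Parameter):
            g1.append(v.w1)
        elif isinstance(v, TinyFusion) and isinstance(v.w2, nn.Parameter):
            g1.append(v.w2)  # not reached while the w1 branch matches
    return g0, g1, g2


def ungrouped_parameter_names(model, groups):
    """Names of parameters that are in the model but in none of the groups."""
    grouped_ids = {id(p) for group in groups for p in group}
    return [name for name, p in model.named_parameters() if id(p) not in grouped_ids]


model = nn.Sequential(TinyFusion(), nn.BatchNorm2d(8))
groups = group_parameters(model)
missing = ungrouped_parameter_names(model, groups)
print(missing)  # the w2 vector of the stand-in module is not in any group
```

With this chain, each fusion layer contributes one entry (w1) to g1. If w2 should be optimized as well, for example when a layer receives three inputs, checking w1 and w2 in separate statements rather than in one elif chain would be one way to do it.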

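2. Modifying distributed training with DDP (not needed for single-GPU training)

Running distributed training with the command python -m torch.distributed.launch --nproc_per_node 2 train.py produces the following error: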

RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel, and by making sure all forward function outputs participate in calculating loss.

Following the hint in the error message, make this change in train.py:

    # DDP mode
    if cuda and RANK != -1:
        model = DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK, find_unused_parameters=True)
        # model = DDP(model, device_ids=[LOCAL_RANK], output_device=LOCAL_RANK)
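The error message refers to parameters that did not contribute to the loss. A likely cause here is that every BiFPN_Concat defines both w1 and w2, while a given forward pass uses only one of them. The sketch below checks this on a simplified stand-in module (TinyFusion is hypothetical and omits the conv and activation): after a backward pass with two inputs, the unused vector has no gradient.

```python
import torch
import torch.nn as nn


class TinyFusion(nn.Module):
    """Hypothetical, simplified version of the weighted fusion in BiFPN_Concat."""

    def __init__(self):
        super().__init__()
        self.w1 = nn.Parameter(torch.ones(2))
        self.w2 = nn.Parameter(torch.ones(3))
        self.epsilon = 0.0001

    def forward(self, x):
        w = self.w1 if len(x) == 2 else self.w2
        weight = w / (torch.sum(w, dim=0) + self.epsilon)
        return sum(wi * xi for wi, xi in zip(weight, x))


layer = TinyFusion()
inputs = [torch.randn(1, 4, 8, 8), torch.randn(1, 4, 8, 8)]  # two branches
layer(inputs).sum().backward()

# Parameters that took no part in the forward pass have no gradient.
unused = [name for name, p in layer.named_parameters() if p.grad is None]
print(unused)
```

Setting find_unused_parameters=True tells DDP to tolerate such parameters, at the cost of an extra traversal of the autograd graph on each iteration.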

3. Checking how the BiFPN_Concat layer parameters are updated
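One way to check this is to print the fusion weights before and after an optimizer step and compare them. The following is a minimal sketch under stated assumptions: TinyFusion is a hypothetical stand-in for BiFPN_Concat, fusion_weights is a helper written for this example, and the loss is arbitrary. With a real model, the same helper could be called on the loaded model at different epochs.

```python
import torch
import torch.nn as nn


class TinyFusion(nn.Module):
    """Hypothetical stand-in for BiFPN_Concat with only the two-branch weights."""

    def __init__(self):
        super().__init__()
        self.w1 = nn.Parameter(torch.ones(2))
        self.epsilon = 0.0001

    def forward(self, x):
        weight = self.w1 / (torch.sum(self.w1, dim=0) + self.epsilon)
        return weight[0] * x[0] + weight[1] * x[1]


def fusion_weights(model):
    """Collect the current w1 values of every submodule that defines w1."""
    return {
        name: module.w1.detach().tolist()
        for name, module in model.named_modules()
        if isinstance(getattr(module, "w1", None), nn.Parameter)
    }


torch.manual_seed(0)
model = nn.Sequential(TinyFusion())
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

before = fusion_weights(model)
inputs = [torch.randn(1, 4, 8, 8), torch.randn(1, 4, 8, 8)]
loss = model(inputs).pow(2).mean()  # arbitrary loss, for illustration only
loss.backward()
optimizer.step()
after = fusion_weights(model)

print("before:", before)
print("after: ", after)
```

If the values stay at their initial ones during real training, the weights are probably missing from the optimizer's parameter groups.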

yolov5s-bifpn.yaml

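Keep the following points in mind when modifying the model configuration file:

This yaml file changes only one place: the Concat at layer 19 is replaced with BiFPN_Concat. Concat layers elsewhere can be changed in the same way.
BiFPN_Concat is essentially an add operation, not a concat operation, so all of its input layers must have exactly the same size (number of channels, feature map size, and so on). The earlier parameters therefore have to be changed (the inputs here are [-1, 13, 6]) to meet this requirement:
- Layer -1 is the output of the previous layer; its output channel count was originally 256 and is changed to 512 here
- Layer 13 is this line: [-1, 3, C3, [512, False]], # 13
- After this change, every input to BiFPN_Concat has size [bs,256,40,40] (channel values in the yaml are scaled by the yolov5s width_multiple of 0.50, so 512 in the yaml becomes 256 actual channels)
- Finally, the argument list after BiFPN_Concat is set to [256, 256], meaning both the input and output channel counts are 256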
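The relation between the yaml values and the actual tensor shapes can be checked with a little arithmetic. This sketch assumes the yolov5s width_multiple of 0.50, rounding to a multiple of 8, a 640x640 input, and a stride of 16 at this level; scaled_channels is a helper written for this example.

```python
import math


def make_divisible(x, divisor):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_channels(yaml_channels, width_multiple=0.50, divisor=8):
    """Actual channel count for a channel value written in the yaml."""
    return make_divisible(yaml_channels * width_multiple, divisor)


input_size = 640   # assumed training image size
stride = 16        # assumed stride of the P4 level
feature_map = input_size // stride

previous_layer = scaled_channels(512)  # layer -1 after changing 256 to 512
layer_13 = scaled_channels(512)        # [-1, 3, C3, [512, False]]
old_previous_layer = scaled_channels(256)

print(previous_layer, layer_13, feature_map)  # channels match at 40x40
print(old_previous_layer)                     # the unmodified value would not match
```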

# YOLOv5