What is grouped convolution in ResNeXt?

Author: Jinglever | Published 2019-12-29 11:50

    Please credit the original when reposting: https://www.jianshu.com/p/328b29d20403

    ResNeXt uses a structure called grouped convolution, shown in the figure below:

    blocks of ResNeXt

    In PyTorch's convolution layers (e.g. Conv2d), there is a parameter called groups, which implements exactly this grouped-convolution logic. Here is the official documentation's explanation of the parameter:

    • :attr:`groups` controls the connections between inputs and outputs.
      :attr:`in_channels` and :attr:`out_channels` must both be divisible by
      :attr:`groups`. For example,
      * At groups=1, all inputs are convolved to all outputs.
      * At groups=2, the operation becomes equivalent to having two conv
        layers side by side, each seeing half the input channels,
        and producing half the output channels, and both subsequently
        concatenated.
      * At groups= :attr:`in_channels`, each input channel is convolved with
        its own set of filters, of size:
        :math:`\left\lfloor\frac{out\_channels}{in\_channels}\right\rfloor`.

    Roughly: the input channels are split into groups, each group is convolved with its own set of filters, and the results are concatenated together as the output. The number of filters in each group is out_channels // groups, and each group's filters see only in_channels // groups input channels.
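This behavior is easy to verify directly in PyTorch; a small sketch (the channel counts here are chosen just for illustration):

```python
import torch
import torch.nn as nn

# Grouped convolution: 64 input channels, 128 output channels, 32 groups.
conv = nn.Conv2d(in_channels=64, out_channels=128,
                 kernel_size=3, padding=1, groups=32)

# Each group's filters see only in_channels // groups = 2 input channels,
# so the weight tensor has shape (out_channels, in_channels // groups, kH, kW).
print(conv.weight.shape)  # torch.Size([128, 2, 3, 3])

# The output still has the full out_channels, formed by concatenating
# the 32 groups' outputs (128 // 32 = 4 channels each).
x = torch.randn(1, 64, 56, 56)
y = conv(x)
print(y.shape)  # torch.Size([1, 128, 56, 56])
```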

    In the ResNeXt setting, the number of groups in the grouped convolution is the cardinality C, the channel count of each group's filters is the width of the bottleneck d, and the output channel count of the grouped convolution is the width of the group conv. See the figure below:

    (out_channels = C * d)
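Putting the pieces together, here is a minimal sketch of the three convolutions inside a ResNeXt bottleneck for C=32, d=4 (the residual shortcut is omitted, and the 256-channel input is assumed for the first stage):

```python
import torch
import torch.nn as nn

C, d = 32, 4        # cardinality and width of bottleneck
width = C * d       # 128: channel count of the grouped conv

block = nn.Sequential(
    nn.Conv2d(256, width, kernel_size=1, bias=False),   # reduce: 256 -> 128
    nn.BatchNorm2d(width),
    nn.ReLU(inplace=True),
    nn.Conv2d(width, width, kernel_size=3, padding=1,
              groups=C, bias=False),                    # grouped conv: 32 groups of width 4
    nn.BatchNorm2d(width),
    nn.ReLU(inplace=True),
    nn.Conv2d(width, 256, kernel_size=1, bias=False),   # expand: 128 -> 256
    nn.BatchNorm2d(256),
)

x = torch.randn(1, 256, 56, 56)
print(block(x).shape)  # torch.Size([1, 256, 56, 56])
```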

    Now look at the ResNeXt-50 (32x4d) architecture presented in the paper:

    As you can see, the filter channel counts of ResNeXt's bottlenecks (128 -> 256 -> 512 -> 1024) grow by the same factors as ResNet's (64 -> 128 -> 256 -> 512). How is this written in code?

    Looking at the source code of torchvision's resnet models, you can find this expression:

     width = int(planes * (base_width / 64.)) * groups
    

    Here width is passed as out_channels to the convolutions inside the bottleneck. At first glance, the expression is not easy to interpret, so I rewrite it as:

    width = int(base_width * groups * (planes/64.))
    

    Is the meaning clearer now? For the first bottleneck of ResNeXt-50 (32x4d), the filter channel count is 128, i.e. width = base_width * groups = 4 * 32 = 128. From the second bottleneck stage onward, width grows by the same factors as in ResNet-50, namely planes/64., which yields the formula above for computing width.
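Evaluating the rearranged expression for the four stages makes this concrete (the function name here is my own; torchvision computes this inline in its Bottleneck class):

```python
def resnext_width(planes, base_width=4, groups=32):
    # Rearranged torchvision formula: the base channel count
    # (base_width * groups) scaled by the same stage multiplier
    # planes / 64 that ResNet uses.
    return int(base_width * groups * (planes / 64.))

for planes in (64, 128, 256, 512):
    print(planes, "->", resnext_width(planes))
# 64 -> 128, 128 -> 256, 256 -> 512, 512 -> 1024
```

For these values the result is identical to torchvision's original form `int(planes * (base_width / 64.)) * groups`.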
