Please credit the source when reposting: https://www.jianshu.com/p/328b29d20403 If you find this useful, a like would be appreciated~
ResNeXt uses a structure called grouped convolution, shown in the figure below:
PyTorch's convolution functions (e.g. `Conv2d`) have a parameter called `groups`, and it is exactly this grouped-convolution logic that it implements. Here is the official documentation for the parameter:
> `groups` controls the connections between inputs and outputs. `in_channels` and `out_channels` must both be divisible by `groups`. For example,
>
> * At groups=1, all inputs are convolved to all outputs.
> * At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
> * At groups=`in_channels`, each input channel is convolved with its own set of filters, of size `out_channels // in_channels`.
In short: the input channels are split into groups, each group is convolved with its own set of kernels, and the results are concatenated together as the output. The channel count of each group's kernels equals `out_channels // groups`.
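This behavior can be checked directly on a small example. The shapes below (8 input channels, 4 output channels, 2 groups) are arbitrary illustrative choices, not values from ResNeXt:

```python
import torch
import torch.nn as nn

# 8 input channels, 4 output channels, split into 2 groups (illustrative values).
conv = nn.Conv2d(in_channels=8, out_channels=4, kernel_size=3,
                 padding=1, groups=2, bias=False)

# Each group sees in_channels // groups = 4 input channels and produces
# out_channels // groups = 2 output channels, so the weight tensor has shape
# (out_channels, in_channels // groups, kH, kW).
print(conv.weight.shape)   # torch.Size([4, 4, 3, 3])

x = torch.randn(1, 8, 32, 32)
print(conv(x).shape)       # torch.Size([1, 4, 32, 32])
```

Note that grouping shrinks the second dimension of the weight tensor, which is why grouped convolution uses fewer parameters than a full convolution with the same `in_channels`/`out_channels`.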
In the context of ResNeXt, the number of groups in the grouped convolution is the cardinality `C`, the channel count of each group's kernels is the width of bottleneck `d`, and the number of output channels of the grouped convolution is the width of the group conv. See the figure below:
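For the first stage of ResNeXt-50 (32x4d), these quantities can be plugged straight into `Conv2d`. This is a sketch of just the grouped 3x3 layer, not the full bottleneck block:

```python
import torch.nn as nn

C, d = 32, 4  # cardinality and width of bottleneck in ResNeXt-50 (32x4d)

# The grouped 3x3 conv of the first bottleneck: C * d = 128 channels in and out,
# split into C = 32 groups.
conv3x3 = nn.Conv2d(C * d, C * d, kernel_size=3, padding=1,
                    groups=C, bias=False)

# Each group convolves its own d = 4 input channels:
print(conv3x3.weight.shape)  # torch.Size([128, 4, 3, 3])
```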
Now look at the ResNeXt-50 (32x4d) architecture presented in the paper:
As you can see, the kernel channel counts of ResNeXt's bottlenecks grow stage by stage (128 -> 256 -> 512 -> 1024) by the same factors as ResNet's (64 -> 128 -> 256 -> 512). How should this be written in code?
Looking at the source code of the resnet models in torchvision, you can find this expression:
width = int(planes * (base_width / 64.)) * groups
`width` is passed as `out_channels` to the convolutions inside the bottleneck. At first glance, the expression is not easy to understand. I would rewrite it as:
width = int(base_width * groups * (planes/64.))
Isn't the meaning clearer now? For the first bottleneck of ResNeXt-50 (32x4d), the kernel channel count equals 128, i.e. `width = base_width * groups = 4 * 32 = 128`. From the second bottleneck onward, `width` grows by the same factor as in ResNet-50, namely `planes/64.`, which gives the formula for `width` above.