PyTorch implementation of InceptionV3: https://github.com/pytorch/vision/blob/master/torchvision/models/inception.py
2a denotes the first block of group 2; blocks in the same group have the same spatial dimensions.
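But why is there no 3a or 5a? Probably because the names follow the TF-slim implementation, where the two max-pool layers are named MaxPool_3a_3x3 and MaxPool_5a_3x3, so 3a and 5a are the pooling layers rather than convolutions.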
(299, 299, 3)
→【1a, Cout=32, f=3, s=2】→(149, 149, 32)
→【2a, Cout=32, f=3】→(147, 147, 32)→【2b, Cout=64, f=3, p=1】→(147, 147, 64)
→【max pool, f=3, s=2】→(73, 73, 64)
→【3b, Cout=80, f=1】→(73, 73, 80)
→【4a, Cout=192, f=3】→(71, 71, 192)→【max pool, f=3, s=2】→(35, 35, 192)
→【Mixed_5b, InceptionA】→(35, 35, 256)→【Mixed_5c, InceptionA】→(35, 35, 288)
→【Mixed_5d, InceptionA】→(35, 35, 288)
→【Mixed_6a, InceptionB】→(17, 17, 768)
→【Mixed_6b, InceptionC】→(17, 17, 768)→【Mixed_6c, InceptionC】→(17, 17, 768)
→【Mixed_6d, InceptionC】→(17, 17, 768)→【Mixed_6e, InceptionC】→(17, 17, 768)
→【Mixed_7a, InceptionD】→(8, 8, 1280)
→【Mixed_7b, InceptionE】→(8, 8, 2048)→【Mixed_7c, InceptionE】→(8, 8, 2048)
→【global avg pool】→(2048,)→【dropout】→(2048,)→【fc】→(1000,)
Auxiliary branch: →【Mixed_6e, InceptionC】→(17, 17, 768)→【InceptionAux】→(1000,)
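The spatial sizes in the stem above follow the usual convolution/pooling rule out = floor((n + 2p - f) / s) + 1. As a sanity check, here is a minimal sketch in plain Python (no PyTorch needed). The helper out_size and the stem table are illustrative names made up for this note, not part of torchvision.

```python
def out_size(n, f, s=1, p=0):
    """Output length along one spatial axis for kernel f, stride s, padding p."""
    return (n + 2 * p - f) // s + 1

# (name, Cout, f, s, p) for the stem, copied from the flow above.
# Cout=None marks a pooling layer, which keeps the channel count.
stem = [
    ("1a", 32, 3, 2, 0),
    ("2a", 32, 3, 1, 0),
    ("2b", 64, 3, 1, 1),
    ("max pool", None, 3, 2, 0),
    ("3b", 80, 1, 1, 0),
    ("4a", 192, 3, 1, 0),
    ("max pool", None, 3, 2, 0),
]

def trace(n=299, c=3):
    """Return the list of (H, W, C) shapes, starting with the input."""
    shapes = [(n, n, c)]
    for _name, cout, f, s, p in stem:
        n = out_size(n, f, s, p)
        c = c if cout is None else cout
        shapes.append((n, n, c))
    return shapes

stem_shapes = trace()
for (name, *_), shape in zip(stem, stem_shapes[1:]):
    print(f"{name:>8} -> {shape}")
```

The last printed shape should be (35, 35, 192), which is the input to Mixed_5b.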
A few things worth noting about the pooling layers:
(147, 147, 64)→【max pool, f=3, s=2】→(73, 73, 64)
(71, 71, 192)→【max pool, f=3, s=2】→(35, 35, 192)
Both use f=3 (f=2 is the more common choice).
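One way to see what f=3 changes relative to f=2 at stride 2 is to list the positions each pooling window covers. The sketch below is only an illustration under that framing (the helpers out_size and windows are made-up names): for these two odd input sizes the output size comes out the same, but with f=3 neighbouring windows share one position and the last row/column is still covered, whereas f=2 would drop it.

```python
def out_size(n, f, s):
    """Output length for an unpadded pooling window of size f and stride s."""
    return (n - f) // s + 1

def windows(n, f, s):
    """(start, end) indices, inclusive, covered by each window along one axis."""
    return [(i * s, i * s + f - 1) for i in range(out_size(n, f, s))]

for n in (147, 71):
    for f in (3, 2):
        w = windows(n, f, 2)
        overlap = w[0][1] - w[1][0] + 1   # positions shared by neighbouring windows
        print(f"n={n} f={f}: out={len(w)} last_window={w[-1]} overlap={overlap}")
```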
InceptionA
Used 3 times, in Mixed_5b, Mixed_5c, and Mixed_5d. It takes a parameter pool_features; the input is (35, 35, in_channels) and the output is always (35, 35, 224+pool_features).
(35, 35, 192)→【Mixed_5b, InceptionA, pool_features=32】→(35, 35, 224+32=256)
(35, 35, 256)→【Mixed_5c, InceptionA, pool_features=64】→(35, 35, 224+64=288)
(35, 35, 288)→【Mixed_5d, InceptionA, pool_features=64】→(35, 35, 224+64=288)
Taking (35, 35, 192)→【Mixed_5b, InceptionA, pool_features=32】→(35, 35, 256) as an example:
Input: (35, 35, 192)
Branch 1: →【BasicConv2d, Cout=64, f=1】→(35, 35, 64)
Branch 2: →【BasicConv2d, Cout=48, f=1】→(35, 35, 48)
→【BasicConv2d, Cout=64, f=5, p=2】→(35, 35, 64)
Branch 3: →【BasicConv2d, Cout=64, f=1】→(35, 35, 64)
→【BasicConv2d, Cout=96, f=3, p=1】→(35, 35, 96)
→【BasicConv2d, Cout=96, f=3, p=1】→(35, 35, 96)
Branch 4: →【avg pool, f=3, s=1, p=1】→(35, 35, 192)
→【BasicConv2d, Cout=pool_features, f=1】→(35, 35, pool_features)
Concat: (35, 35, 64+64+96+pool_features = 224+pool_features)
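The channel arithmetic for the three InceptionA blocks can be checked with a few lines of plain Python. This is a sketch of the bookkeeping only, not the torchvision module; inception_a_out_channels and the dictionary below are hypothetical names, and the branch widths are the ones listed above.

```python
def out_size(n, f, s=1, p=0):
    """Output length along one spatial axis."""
    return (n + 2 * p - f) // s + 1

def inception_a_out_channels(pool_features):
    branch1x1 = 64               # 1x1 (64)
    branch5x5 = 64               # 1x1 (48) -> 5x5 (64)
    branch3x3dbl = 96            # 1x1 (64) -> 3x3 (96) -> 3x3 (96)
    branch_pool = pool_features  # avg pool -> 1x1 (pool_features)
    return branch1x1 + branch5x5 + branch3x3dbl + branch_pool

pool_features_by_block = {"Mixed_5b": 32, "Mixed_5c": 64, "Mixed_5d": 64}
a_channels = {name: inception_a_out_channels(pf)
              for name, pf in pool_features_by_block.items()}
print(a_channels)

# Every (f, p) pair used inside the block keeps the 35x35 spatial size.
same_padding_ok = all(out_size(35, f, p=p) == 35
                      for f, p in [(1, 0), (5, 2), (3, 1)])
print(same_padding_ok)
```

Note that the output width does not depend on in_channels, which is why Mixed_5c and Mixed_5d produce the same 288 channels from different inputs.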
InceptionB
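Used only once, in Mixed_6a:
(35, 35, 288)→【Mixed_6a, InceptionB】→(17, 17, 768)
The spatial dimensions are roughly halved (35→17) and the channel count grows to about 2.7 times the input (288→768).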
Input: (35, 35, 288)
Branch 1: →【BasicConv2d, Cout=384, f=3, s=2】→(17, 17, 384)
Branch 2: →【BasicConv2d, Cout=64, f=1】→(35, 35, 64)
→【BasicConv2d, Cout=96, f=3, p=1】→(35, 35, 96)
→【BasicConv2d, Cout=96, f=3, s=2】→(17, 17, 96)
Branch 3: →【max pool, f=3, s=2】→(17, 17, 288)
Concat: (17, 17, 384+96+288=768)
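The reduction block can be traced branch by branch with the same size rule. The following is a rough sketch in plain Python, with out_size and the variable names chosen here for illustration; the layer parameters are those listed above.

```python
def out_size(n, f, s=1, p=0):
    """Output length along one spatial axis."""
    return (n + 2 * p - f) // s + 1

n_in, c_in = 35, 288

# Branch 1: a single 3x3 conv with stride 2.
b1 = (out_size(n_in, 3, s=2), 384)

# Branch 2: 1x1 -> 3x3 (p=1) -> 3x3 with stride 2.
n = out_size(n_in, 1)
n = out_size(n, 3, p=1)
b2 = (out_size(n, 3, s=2), 96)

# Branch 3: max pool, which keeps the input channel count.
b3 = (out_size(n_in, 3, s=2), c_in)

# All branches must agree spatially before they can be concatenated.
assert b1[0] == b2[0] == b3[0]
b_shape = (b1[0], b1[0], b1[1] + b2[1] + b3[1])
b_ratio = b_shape[2] / c_in
print(b_shape, round(b_ratio, 2))
```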
InceptionC
Used 4 times, in Mixed_6b, Mixed_6c, Mixed_6d, and Mixed_6e. The input and output are both (17, 17, 768); only the parameter channels_7x7 (abbreviated c7 below) differs:
Mixed_6b: c7=128
Mixed_6c: c7=160
Mixed_6d: c7=160
Mixed_6e: c7=192
Input: (17, 17, 768)
Branch 1: →【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
Branch 2: →【BasicConv2d, Cout=c7, f=1】→(17, 17, c7)
→【BasicConv2d, Cout=c7, f=(1, 7), p=(0, 3)】→(17, 17, c7)
→【BasicConv2d, Cout=192, f=(7, 1), p=(3, 0)】→(17, 17, 192)
Branch 3: →【BasicConv2d, Cout=c7, f=1】→(17, 17, c7)
→【BasicConv2d, Cout=c7, f=(7, 1), p=(3, 0)】→(17, 17, c7)
→【BasicConv2d, Cout=c7, f=(1, 7), p=(0, 3)】→(17, 17, c7)
→【BasicConv2d, Cout=c7, f=(7, 1), p=(3, 0)】→(17, 17, c7)
→【BasicConv2d, Cout=192, f=(1, 7), p=(0, 3)】→(17, 17, 192)
Branch 4: →【avg pool, f=3, s=1, p=1】→(17, 17, 768)
→【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
Concat: (17, 17, 192×4=768)
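The asymmetric 1x7 and 7x1 kernels pad only along the axis they span, which is what keeps every intermediate tensor at 17x17. A small sketch in plain Python makes this explicit; conv_hw, branch, and inception_c are hypothetical helpers written for this note, and all layers are assumed to have stride 1 as listed above.

```python
def conv_hw(hw, f, p=(0, 0)):
    """(H, W) after a stride-1 conv/pool with per-axis kernel f and padding p."""
    return tuple(n + 2 * pi - fi + 1 for n, fi, pi in zip(hw, f, p))

ONE = ((1, 1), (0, 0))   # 1x1 conv
ROW = ((1, 7), (0, 3))   # 1x7 conv, padded along width only
COL = ((7, 1), (3, 0))   # 7x1 conv, padded along height only

def branch(hw, layers):
    """Apply (Cout, (f, p)) layers in order and return (H, W, C)."""
    c = None
    for cout, (f, p) in layers:
        hw, c = conv_hw(hw, f, p), cout
    return hw + (c,)

def inception_c(c7, hw=(17, 17)):
    b1 = branch(hw, [(192, ONE)])
    b2 = branch(hw, [(c7, ONE), (c7, ROW), (192, COL)])
    b3 = branch(hw, [(c7, ONE), (c7, COL), (c7, ROW), (c7, COL), (192, ROW)])
    pooled = conv_hw(hw, (3, 3), (1, 1))   # avg pool, f=3, s=1, p=1
    b4 = branch(pooled, [(192, ONE)])
    assert b1[:2] == b2[:2] == b3[:2] == b4[:2]   # needed for concatenation
    return b1[:2] + (b1[2] + b2[2] + b3[2] + b4[2],)

c_shapes = {c7: inception_c(c7) for c7 in (128, 160, 192)}
print(c_shapes)
```

Since every branch ends in 192 channels, c7 only affects the width of the intermediate layers, not the block's output shape.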
InceptionD
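Used only once, in Mixed_7a:
(17, 17, 768)→【Mixed_7a, InceptionD】→(8, 8, 1280)
The spatial dimensions are roughly halved (17→8) and the channel count grows to about 1.7 times the input (768→1280).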
Input: (17, 17, 768)
Branch 1: →【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
→【BasicConv2d, Cout=320, f=3, s=2】→(8, 8, 320)
Branch 2: →【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
→【BasicConv2d, Cout=192, f=(1, 7), p=(0, 3)】→(17, 17, 192)
→【BasicConv2d, Cout=192, f=(7, 1), p=(3, 0)】→(17, 17, 192)
→【BasicConv2d, Cout=192, f=3, s=2】→(8, 8, 192)
Branch 3: →【max pool, f=3, s=2】→(8, 8, 768)
Concat: (8, 8, 320+192+768=1280)
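For this second reduction block, only the last layer of each branch changes the spatial size, so the check is short. A minimal sketch with illustrative names, using the numbers listed above:

```python
def out_size(n, f, s=1, p=0):
    """Output length along one spatial axis."""
    return (n + 2 * p - f) // s + 1

n_in, c_in = 17, 768

# The 1x1, 1x7 and 7x1 convs keep 17x17; each branch then ends with an
# unpadded f=3, s=2 layer (conv or max pool), so all three shrink the same way.
n_out = out_size(n_in, 3, s=2)

d_shape = (n_out, n_out, 320 + 192 + c_in)   # conv branch 1 + conv branch 2 + pool
d_ratio = d_shape[2] / c_in
print(d_shape, round(d_ratio, 2))
```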
InceptionE
Used 2 times, in Mixed_7b and Mixed_7c. The input is (8, 8, in_channels) and the output is always (8, 8, 2048).
Taking (8, 8, 1280)→【Mixed_7b, InceptionE】→(8, 8, 2048) as an example:
Input: (8, 8, 1280)
Branch 1: →【BasicConv2d, Cout=320, f=1】→(8, 8, 320)
Branch 2: →【BasicConv2d, Cout=384, f=1】→(8, 8, 384)
Branch 2-1: →【BasicConv2d, Cout=384, f=(1, 3), p=(0, 1)】→(8, 8, 384)
Branch 2-2: →【BasicConv2d, Cout=384, f=(3, 1), p=(1, 0)】→(8, 8, 384)
Concat of 2-1 and 2-2: (8, 8, 384×2=768)
Branch 3: →【BasicConv2d, Cout=448, f=1】→(8, 8, 448)
→【BasicConv2d, Cout=384, f=3, p=1】→(8, 8, 384)
Branch 3-1: →【BasicConv2d, Cout=384, f=(1, 3), p=(0, 1)】→(8, 8, 384)
Branch 3-2: →【BasicConv2d, Cout=384, f=(3, 1), p=(1, 0)】→(8, 8, 384)
Concat of 3-1 and 3-2: (8, 8, 384×2=768)
Branch 4: →【avg pool, f=3, s=1, p=1】→(8, 8, 1280)
→【BasicConv2d, Cout=192, f=1】→(8, 8, 192)
Concat: (8, 8, 320+768+768+192=2048)
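InceptionE differs from the other blocks in that branches 2 and 3 each fork into two parallel sub-branches whose outputs are concatenated before the final concat. The sketch below only does the channel bookkeeping for that nesting; inception_e_out_channels is a hypothetical helper, and in_channels is accepted purely to show that it does not influence the result.

```python
def inception_e_out_channels(in_channels):
    branch1x1 = 320
    branch3x3 = 384 + 384      # 1x3 and 3x1 outputs, concatenated
    branch3x3dbl = 384 + 384   # same fork, after 1x1 (448) -> 3x3 (384)
    branch_pool = 192          # avg pool keeps in_channels; the 1x1 maps it to 192
    return branch1x1 + branch3x3 + branch3x3dbl + branch_pool

e_channels = {
    "Mixed_7b": inception_e_out_channels(1280),
    "Mixed_7c": inception_e_out_channels(2048),
}
print(e_channels)

# Spatial check for the forked kernels: f=(1, 3), p=(0, 1) and f=(3, 1), p=(1, 0).
keeps_8x8 = all(
    (8 + 2 * ph - fh + 1, 8 + 2 * pw - fw + 1) == (8, 8)
    for (fh, fw), (ph, pw) in [((1, 3), (0, 1)), ((3, 1), (1, 0)), ((3, 3), (1, 1))]
)
print(keeps_8x8)
```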
InceptionAux
Used only once, attached to the output (17, 17, 768) of 【Mixed_6e, InceptionC】:
(17, 17, 768)
→【avg pool, f=5, s=3】→(5, 5, 768)
→【BasicConv2d, Cout=128, f=1】→(5, 5, 128)
→【BasicConv2d, Cout=768, f=5】→(1, 1, 768)
→【reshape】→(768,)→【fc】→(1000,)
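The auxiliary head shrinks 17x17 down to 1x1 in three steps, which is why a plain reshape is enough before the fully connected layer. A quick check in plain Python, using the layer parameters listed above (out_size and aux_shapes are illustrative names):

```python
def out_size(n, f, s=1, p=0):
    """Output length along one spatial axis."""
    return (n + 2 * p - f) // s + 1

aux_shapes = [(17, 17, 768)]

n = out_size(17, 5, s=3)          # avg pool, f=5, s=3
aux_shapes.append((n, n, 768))

n = out_size(n, 1)                # 1x1 conv, Cout=128
aux_shapes.append((n, n, 128))

n = out_size(n, 5)                # 5x5 conv without padding, Cout=768
aux_shapes.append((n, n, 768))

flat_features = n * n * 768       # size of the vector fed to the fc layer
print(aux_shapes, flat_features)
```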