Inception Note

Author: o0Helloworld0o | Published: 2018-11-26 08:44

    PyTorch implementation of InceptionV3: https://github.com/pytorch/vision/blob/master/torchvision/models/inception.py

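    Notation: shapes are (H, W, C); Cout = output channels, f = kernel size, s = stride (default 1), p = padding (default 0).
    2a denotes the first block of group 2; blocks in the same group share the same spatial dimensions.
    But why is there no 3a or 5a?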

    (299, 299, 3)
    
    →【1a, Cout=32, f=3, s=2】→(149, 149, 32)
    
    →【2a, Cout=32, f=3】→(147, 147, 32)→【2b, Cout=64, f=3, p=1】→(147, 147, 64)
    →【max pool, f=3, s=2】→(73, 73, 64)
    
    →【3b, Cout=80, f=1】→(73, 73, 80)
    
    →【4a, Cout=192, f=3】→(71, 71, 192)→【max pool, f=3, s=2】→(35, 35, 192)
    
    →【Mixed_5b, InceptionA】→(35, 35, 256)→【Mixed_5c, InceptionA】→(35, 35, 288)
    →【Mixed_5d, InceptionA】→(35, 35, 288)
    
    →【Mixed_6a, InceptionB】→(17, 17, 768)
    →【Mixed_6b, InceptionC】→(17, 17, 768)→【Mixed_6c, InceptionC】→(17, 17, 768)
    →【Mixed_6d, InceptionC】→(17, 17, 768)→【Mixed_6e, InceptionC】→(17, 17, 768)
    
    →【Mixed_7a, InceptionD】→(8, 8, 1280)
    →【Mixed_7b, InceptionE】→(8, 8, 2048)→【Mixed_7c, InceptionE】→(8, 8, 2048)
    
    →【global avg pool】→(2048,)→【dropout】→(2048,)→【fc】→(1000,)
    
    Auxiliary branch: →【Mixed_6e, InceptionC】→(17, 17, 768)→【InceptionAux】→(1000,)
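The stem shapes listed above follow from the usual output-size rule for convolution and pooling, floor((n + 2p - f) / s) + 1. The following is a minimal sketch that re-derives them in plain Python; the names `out_size`, `STEM`, and `trace_stem` are made up for illustration and are not part of torchvision.

```python
def out_size(n, f, s=1, p=0):
    """Spatial output size of a conv or pool layer: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# (name, out_channels or None for pooling, kernel f, stride s, padding p),
# copied from the shape listing in this note.
STEM = [
    ("1a", 32, 3, 2, 0),
    ("2a", 32, 3, 1, 0),
    ("2b", 64, 3, 1, 1),
    ("max pool", None, 3, 2, 0),
    ("3b", 80, 1, 1, 0),
    ("4a", 192, 3, 1, 0),
    ("max pool", None, 3, 2, 0),
]

def trace_stem(size=299, channels=3):
    """Return the (H, W, C) shape before the stem and after every stem layer."""
    shapes = [(size, size, channels)]
    for name, c_out, f, s, p in STEM:
        size = out_size(size, f, s, p)
        if c_out is not None:  # pooling keeps the channel count
            channels = c_out
        shapes.append((size, size, channels))
    return shapes

for shape in trace_stem():
    print(shape)
```

The last shape printed is the input to Mixed_5b.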
    

    Points worth noting about the pooling layers:
    (147, 147, 64)→【max pool, f=3, s=2】→(73, 73, 64)
    (71, 71, 192)→【max pool, f=3, s=2】→(35, 35, 192); both use f=3 (f=2 is the more common choice)
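One way to compare f=3 with f=2 at stride 2 is to count how many trailing rows and columns the pooling windows never touch. The sketch below is only an observation from the arithmetic, assuming no padding and floor rounding (PyTorch's default `ceil_mode=False`); the helper name `pool_coverage` is hypothetical.

```python
def pool_coverage(n, f, s):
    """Return (output size, number of trailing input positions no window covers)."""
    out = (n - f) // s + 1
    covered = (out - 1) * s + f  # index just past the last covered position
    return out, n - covered

for n in (147, 71):
    for f in (3, 2):
        out, uncovered = pool_coverage(n, f, s=2)
        print(f"n={n}, f={f}, s=2 -> out={out}, uncovered={uncovered}")
```

Under these assumptions both kernel sizes give the same output size on these odd inputs, but f=2 leaves the last row and column unused, while f=3 covers every position with overlapping windows.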

    InceptionA is used 3 times, in Mixed_5b, Mixed_5c, and Mixed_5d. It takes a parameter pool_features; the input is (35, 35, in_channels) and the output is always (35, 35, 224+pool_features).

    (35, 35, 192)→【Mixed_5b, InceptionA, pool_features=32】→(35, 35, 224+32=256)
    (35, 35, 256)→【Mixed_5c, InceptionA, pool_features=64】→(35, 35, 224+64=288)
    (35, 35, 288)→【Mixed_5d, InceptionA, pool_features=64】→(35, 35, 224+64=288)
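The constant 224 is the sum of the three convolutional branch widths (64 + 64 + 96), so the output width depends only on pool_features. A small sketch of that bookkeeping, with a hypothetical helper name `inception_a_out`:

```python
def inception_a_out(pool_features):
    """Output channels of InceptionA: three fixed branches plus the pooling branch."""
    return 64 + 64 + 96 + pool_features

channels = 192  # output of the stem
for name, pool_features in [("Mixed_5b", 32), ("Mixed_5c", 64), ("Mixed_5d", 64)]:
    out = inception_a_out(pool_features)
    print(f"{name}: {channels} -> {out}")
    channels = out  # the next block consumes this block's output
```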
    

    Taking (35, 35, 192)→【Mixed_5b, InceptionA, pool_features=32】→(35, 35, 256) as an example:

    Input: (35, 35, 192)
    
    Branch 1: →【BasicConv2d, Cout=64, f=1】→(35, 35, 64)
    
    Branch 2: →【BasicConv2d, Cout=48, f=1】→(35, 35, 48)
    →【BasicConv2d, Cout=64, f=5, p=2】→(35, 35, 64)
    
    Branch 3: →【BasicConv2d, Cout=64, f=1】→(35, 35, 64)
    →【BasicConv2d, Cout=96, f=3, p=1】→(35, 35, 96)
    →【BasicConv2d, Cout=96, f=3, p=1】→(35, 35, 96)
    
    Branch 4: →【avg pool, f=3, s=1, p=1】→(35, 35, 192)
    →【BasicConv2d, Cout=pool_features, f=1】→(35, 35, pool_features)
    
    Concat: (35, 35, 224+pool_features)
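Channel-wise concatenation only works if every branch keeps the 35×35 spatial size, which is why each kernel of size f is paired with padding p = (f - 1) / 2 at stride 1. The sketch below checks this for the four branches listed above; the function names are illustrative and the layer tuples are transcribed from this note, not read from torchvision.

```python
def out_size(n, f, s=1, p=0):
    """Spatial output size: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

def inception_a_branches(pool_features):
    """Each layer is (out_channels, f, p); out_channels=None marks a pooling layer."""
    return {
        "branch1": [(64, 1, 0)],
        "branch2": [(48, 1, 0), (64, 5, 2)],
        "branch3": [(64, 1, 0), (96, 3, 1), (96, 3, 1)],
        "branch4": [(None, 3, 1), (pool_features, 1, 0)],
    }

def inception_a_shape(size, in_channels, pool_features):
    """Trace every branch at stride 1 and return the concatenated (H, W, C)."""
    total = 0
    for name, layers in inception_a_branches(pool_features).items():
        n, c = size, in_channels
        for c_out, f, p in layers:
            n = out_size(n, f, 1, p)
            c = c if c_out is None else c_out
        assert n == size, f"{name} changed the spatial size"
        total += c
    return (size, size, total)

print(inception_a_shape(35, 192, 32))
```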
    

    InceptionB is used only once, in Mixed_6a.
    (35, 35, 288)→【Mixed_6a, InceptionB】→(17, 17, 768): the spatial dimensions are roughly halved (35→17) and the channel count grows by about 2.7×

    Input: (35, 35, 288)
    
    Branch 1: →【BasicConv2d, Cout=384, f=3, s=2】→(17, 17, 384)
    
    Branch 2: →【BasicConv2d, Cout=64, f=1】→(35, 35, 64)
    →【BasicConv2d, Cout=96, f=3, p=1】→(35, 35, 96)
    →【BasicConv2d, Cout=96, f=3, s=2】→(17, 17, 96)
    
    Branch 3: →【max pool, f=3, s=2】→(17, 17, 288)
    
    Concat: (17, 17, 384+96+288=768)
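As a rough check of the reduction step, the sketch below traces the three branches, confirms that they agree on the spatial size, and computes the channel growth. `BRANCHES` and `inception_b_shape` are hypothetical names, and the layer tuples are copied from the listing above.

```python
def out_size(n, f, s=1, p=0):
    """Spatial output size: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# Each layer is (out_channels, f, s, p); out_channels=None marks pooling.
BRANCHES = {
    "branch1": [(384, 3, 2, 0)],
    "branch2": [(64, 1, 1, 0), (96, 3, 1, 1), (96, 3, 2, 0)],
    "branch3": [(None, 3, 2, 0)],  # max pool keeps the channel count
}

def inception_b_shape(size=35, in_channels=288):
    """Trace each branch and return the concatenated (H, W, C)."""
    outputs = []
    for layers in BRANCHES.values():
        n, c = size, in_channels
        for c_out, f, s, p in layers:
            n = out_size(n, f, s, p)
            c = c if c_out is None else c_out
        outputs.append((n, c))
    sizes = {n for n, _ in outputs}
    assert len(sizes) == 1, "branches disagree on spatial size"
    n = sizes.pop()
    return (n, n, sum(c for _, c in outputs))

shape = inception_b_shape()
print(shape, "channel ratio:", round(shape[2] / 288, 2))
```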
    

    InceptionC is used 4 times, in Mixed_6b, Mixed_6c, Mixed_6d, and Mixed_6e. The input and output are both (17, 17, 768); only the parameter channels_7x7 differs, abbreviated as c7 below.

    Mixed_6b, c7=128
    Mixed_6c, c7=160
    Mixed_6d, c7=160
    Mixed_6e, c7=192
    
    Input: (17, 17, 768)
    
    Branch 1: →【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
    
    Branch 2: →【BasicConv2d, Cout=c7, f=1】→(17, 17, c7)
    →【BasicConv2d, Cout=c7, f=(1, 7), p=(0, 3)】→(17, 17, c7)
    →【BasicConv2d, Cout=192, f=(7, 1), p=(3, 0)】→(17, 17, 192)
    
    Branch 3: →【BasicConv2d, Cout=c7, f=1】→(17, 17, c7)
    →【BasicConv2d, Cout=c7, f=(7, 1), p=(3, 0)】→(17, 17, c7)
    →【BasicConv2d, Cout=c7, f=(1, 7), p=(0, 3)】→(17, 17, c7)
    →【BasicConv2d, Cout=c7, f=(7, 1), p=(3, 0)】→(17, 17, c7)
    →【BasicConv2d, Cout=192, f=(1, 7), p=(0, 3)】→(17, 17, 192)
    
    Branch 4: →【avg pool, f=3, s=1, p=1】→(17, 17, 768)
    →【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
    
    Concat: (17, 17, 192×4=768)
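The (1, 7) and (7, 1) kernels pad only along the axis they span, so the 17×17 size is preserved. The sketch below checks that and also compares the weight count of the factorized pair in branch 2 against a single 7×7 convolution. The 7×7 layer is a hypothetical baseline for comparison only (it does not appear in this block), and the counts ignore bias and batch-norm parameters.

```python
def out_hw(hw, f, p=(0, 0), s=(1, 1)):
    """Output (H, W) for a conv with per-axis kernel f, padding p, and stride s."""
    return tuple((n + 2 * pi - fi) // si + 1 for n, fi, pi, si in zip(hw, f, p, s))

def conv_weights(c_in, c_out, f):
    """Weight count of a conv layer, ignoring bias and batch-norm parameters."""
    return c_in * c_out * f[0] * f[1]

c7 = 128  # Mixed_6b
hw = (17, 17)
hw = out_hw(hw, (1, 7), (0, 3))  # pads width only
hw = out_hw(hw, (7, 1), (3, 0))  # pads height only
print("spatial size after 1x7 then 7x1:", hw)

# Branch 2 after its 1x1 conv: c7 -> c7 with 1x7, then c7 -> 192 with 7x1.
factorized = conv_weights(c7, c7, (1, 7)) + conv_weights(c7, 192, (7, 1))
# Hypothetical baseline: one 7x7 conv mapping c7 -> 192.
full = conv_weights(c7, 192, (7, 7))
print("factorized weights:", factorized, "| single 7x7 weights:", full)
```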
    

    InceptionD is used only once, in Mixed_7a.
    (17, 17, 768)→【Mixed_7a, InceptionD】→(8, 8, 1280): the spatial dimensions are roughly halved (17→8) and the channel count grows by about 1.7×

    Input: (17, 17, 768)
    
    Branch 1: →【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
    →【BasicConv2d, Cout=320, f=3, s=2】→(8, 8, 320)
    
    Branch 2: →【BasicConv2d, Cout=192, f=1】→(17, 17, 192)
    →【BasicConv2d, Cout=192, f=(1, 7), p=(0, 3)】→(17, 17, 192)
    →【BasicConv2d, Cout=192, f=(7, 1), p=(3, 0)】→(17, 17, 192)
    →【BasicConv2d, Cout=192, f=3, s=2】→(8, 8, 192)
    
    Branch 3: →【max pool, f=3, s=2】→(8, 8, 768)
    
    Concat: (8, 8, 320+192+768=1280)
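This block mixes asymmetric kernels with a stride-2 reduction, so a shape trace has to handle both scalar and per-axis kernel sizes. A minimal sketch under that assumption, with illustrative names and layer tuples transcribed from the listing above:

```python
def out_hw(hw, f, s=1, p=(0, 0)):
    """Output (H, W); f may be an int (square kernel) or an (fh, fw) tuple."""
    fh, fw = (f, f) if isinstance(f, int) else f
    return ((hw[0] + 2 * p[0] - fh) // s + 1, (hw[1] + 2 * p[1] - fw) // s + 1)

# Each layer is (out_channels or None for pooling, kernel, stride, padding).
BRANCHES = [
    [(192, 1, 1, (0, 0)), (320, 3, 2, (0, 0))],
    [(192, 1, 1, (0, 0)), (192, (1, 7), 1, (0, 3)),
     (192, (7, 1), 1, (3, 0)), (192, 3, 2, (0, 0))],
    [(None, 3, 2, (0, 0))],  # max pool keeps all 768 input channels
]

def inception_d_shape(hw=(17, 17), in_channels=768):
    """Trace each branch and return the concatenated (H, W, C)."""
    total, out = 0, None
    for layers in BRANCHES:
        cur, c = hw, in_channels
        for c_out, f, s, p in layers:
            cur = out_hw(cur, f, s, p)
            c = c if c_out is None else c_out
        assert out in (None, cur), "branches disagree on spatial size"
        out = cur
        total += c
    return out + (total,)

shape = inception_d_shape()
print(shape, "channel ratio:", round(shape[2] / 768, 2))
```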
    

    InceptionE is used twice, in Mixed_7b and Mixed_7c. The input is (8, 8, in_channels) and the output is always (8, 8, 2048).
    Taking (8, 8, 1280)→【Mixed_7b, InceptionE】→(8, 8, 2048) as an example:

    Input: (8, 8, 1280)
    
    Branch 1: →【BasicConv2d, Cout=320, f=1】→(8, 8, 320)
    
    Branch 2: →【BasicConv2d, Cout=384, f=1】→(8, 8, 384)
      Branch 2-1: →【BasicConv2d, Cout=384, f=(1, 3), p=(0, 1)】→(8, 8, 384)
      Branch 2-2: →【BasicConv2d, Cout=384, f=(3, 1), p=(1, 0)】→(8, 8, 384)
      Concat: (8, 8, 384×2=768)
    
    Branch 3: →【BasicConv2d, Cout=448, f=1】→(8, 8, 448)
          →【BasicConv2d, Cout=384, f=3, p=1】→(8, 8, 384)
      Branch 3-1: →【BasicConv2d, Cout=384, f=(1, 3), p=(0, 1)】→(8, 8, 384)
      Branch 3-2: →【BasicConv2d, Cout=384, f=(3, 1), p=(1, 0)】→(8, 8, 384)
      Concat: (8, 8, 384×2=768)
    
    Branch 4: →【avg pool, f=3, s=1, p=1】→(8, 8, 1280)
    →【BasicConv2d, Cout=192, f=1】→(8, 8, 192)
    
    Concat: (8, 8, 320+768+768+192=2048)
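The output width is fixed because in_channels only reaches the four 1×1 convolutions that read the block input directly; everything after them has fixed widths. The sketch below separates those two quantities. The dictionary and function names are hypothetical, and the weight count ignores bias and batch-norm parameters.

```python
# Widths of the 1x1 convs that read the block input directly (the heads of
# branches 1-3 and the conv after pooling in branch 4).
INPUT_FACING_1X1 = {"branch1": 320, "branch2": 384, "branch3": 448, "branch4": 192}

# Channels each branch contributes to the final concat; branches 2 and 3
# concatenate their own 1x3 and 3x1 sub-branches first.
BRANCH_OUT = {"branch1": 320, "branch2": 384 + 384, "branch3": 384 + 384, "branch4": 192}

def inception_e_summary(in_channels):
    """Return (output channels, weights in the input-facing 1x1 convs)."""
    out_channels = sum(BRANCH_OUT.values())
    input_weights = in_channels * sum(INPUT_FACING_1X1.values())
    return out_channels, input_weights

for name, c_in in [("Mixed_7b", 1280), ("Mixed_7c", 2048)]:
    print(name, inception_e_summary(c_in))
```

Both blocks output 2048 channels; only the input-facing weight count changes with in_channels.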
    

    InceptionAux is used only once, attached to the output (17, 17, 768) of 【Mixed_6e, InceptionC】.

    (17, 17, 768)
    →【avg pool, f=5, s=3】→(5, 5, 768)
    →【BasicConv2d, Cout=128, f=1】→(5, 5, 128)
    →【BasicConv2d, Cout=768, f=5】→(1, 1, 768)
    →【reshape】→(768,)→【fc】→(1000,)
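The auxiliary head can be traced with the same output-size rule. The 5×5 convolution without padding collapses the 5×5 map to 1×1, so the reshape yields exactly 768 features. A minimal sketch following the listing above, with a made-up function name:

```python
def out_size(n, f, s=1, p=0):
    """Spatial output size: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

def inception_aux_shapes(size=17, channels=768, num_classes=1000):
    """Return the shape after each step of the auxiliary classifier."""
    shapes = [(size, size, channels)]
    size = out_size(size, f=5, s=3)             # avg pool, f=5, s=3
    shapes.append((size, size, channels))
    channels = 128                              # 1x1 conv
    shapes.append((size, size, channels))
    size, channels = out_size(size, f=5), 768   # 5x5 conv, no padding
    shapes.append((size, size, channels))
    shapes.append((size * size * channels,))    # reshape to a flat vector
    shapes.append((num_classes,))               # fc
    return shapes

for shape in inception_aux_shapes():
    print(shape)
```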
    
