Preface

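The previous three sections covered PyTorch's basic tensor operations, how to load data, and a summary of the neural network training process. This chapter implements several classic neural networks: AlexNet, VGG, GoogLeNet, ResNet, MobileNet, and Xception. Most of these architectures are already defined in torchvision, PyTorch's companion vision library, so in a real project you can simply load them and adapt them through transfer learning.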

AlexNet

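AlexNet won the 2012 ImageNet image classification challenge (ILSVRC 2012). Although it has since been surpassed by better architectures, many of its ideas are still worth borrowing.
AlexNet has 8 learned layers: the first five are convolutional layers that extract image features, and the last three are fully connected layers that classify the image.
Its main characteristics are:
1. It uses the ReLU activation function, which reduces the risk of vanishing gradients;
2. It uses pooling layers to reduce the dimensionality of the feature maps;
3. It applies local response normalization (LRN).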
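To make the vanishing-gradient point concrete, here is a toy calculation in plain Python. It multiplies the activation derivative across several layers while ignoring the weights entirely, which is a simplifying assumption; the depth of 8 and the chosen pre-activation values are illustrative only.

```python
import math

def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum value is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def chained_gradient(grad_fn, pre_activation, depth):
    """Product of `depth` identical activation derivatives (weights ignored)."""
    return grad_fn(pre_activation) ** depth

depth = 8
sigmoid_chain = chained_gradient(sigmoid_grad, 0.0, depth)  # best case for sigmoid
relu_chain = chained_gradient(relu_grad, 1.0, depth)        # active ReLU units

print(f"sigmoid: {sigmoid_chain:.2e}, relu: {relu_chain:.1f}")
```

Even in the sigmoid's best case the factor shrinks geometrically with depth, while an active ReLU passes the gradient through unchanged.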

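import torch
import torch.nn as nn

class AlexNet(nn.Module):
    # Note: this variant has no LRN layers and expects 3x224x224 inputs.

    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()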
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        return x
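The 256 * 6 * 6 in the first fully connected layer follows from the spatial size of the last feature map. The sketch below traces that size in plain Python using the standard output-size formula for convolution and pooling, assuming a 224x224 input; the helper names are made up for this example.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output width/height of a conv or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding), mirroring the layers in self.features
ALEXNET_LAYERS = [
    ("conv1", 11, 4, 2), ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2), ("pool2", 3, 2, 0),
    ("conv3", 3, 1, 1), ("conv4", 3, 1, 1), ("conv5", 3, 1, 1),
    ("pool3", 3, 2, 0),
]

def trace_sizes(input_size):
    """Return the spatial size after each layer in ALEXNET_LAYERS."""
    sizes = []
    size = input_size
    for _name, kernel, stride, padding in ALEXNET_LAYERS:
        size = conv_out(size, kernel, stride, padding)
        sizes.append(size)
    return sizes

sizes = trace_sizes(224)
flattened = 256 * sizes[-1] * sizes[-1]
print(sizes, flattened)
```

With other input sizes the final feature map changes, and the hard-coded view in forward would no longer match.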

VGG

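VGG was the runner-up in the classification task of the 2014 ImageNet challenge (ILSVRC 2014). It improves on AlexNet; its main drawback is the very large number of parameters.
Characteristics of VGG:

1. Small convolution kernels. The authors use 3x3 kernels throughout (with 1x1 kernels in a few places);
2. Small pooling kernels. Where AlexNet uses 3x3 pooling windows, VGG uses 2x2 windows everywhere;
3. Deeper layers and wider feature maps. Building on the two points above, the convolutions focus on increasing the number of channels while the pooling layers focus on shrinking width and height, so the model becomes deeper and wider while the growth in computation slows down;
4. Fully connected layers converted to convolutions. At test time, the three fully connected layers used in training are replaced by three convolutional layers that reuse the trained parameters. The resulting fully convolutional network is no longer constrained by the fixed input size of the fully connected layers, so it can accept inputs of arbitrary width and height.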
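One way to see why small kernels pay off is to compare a stack of 3x3 convolutions with a single larger kernel that covers the same receptive field. The plain-Python sketch below does that count under simplifying assumptions: stride 1, the same channel count C in and out of every layer, and biases ignored. The value C = 256 is just an example.

```python
def conv_weights(kernel, channels):
    """Weights of one kernel x kernel conv with `channels` in and out (no bias)."""
    return kernel * kernel * channels * channels

def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return num_layers * (kernel - 1) + 1

C = 256  # illustrative channel count

two_3x3 = 2 * conv_weights(3, C)    # covers a 5x5 region
one_5x5 = conv_weights(5, C)
three_3x3 = 3 * conv_weights(3, C)  # covers a 7x7 region
one_7x7 = conv_weights(7, C)

print(stacked_receptive_field(2), two_3x3 / one_5x5)
print(stacked_receptive_field(3), three_3x3 / one_7x7)
```

The stacked version needs 18C² weights instead of 25C², or 27C² instead of 49C², and it also inserts an extra nonlinearity between the layers.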

import torch
import torch.nn as nn

# Layer configurations: a number is the output channel count of a 3x3 conv, 'M' is a 2x2 max pool.
cfg = {
    'A':[64,     'M', 128,      'M', 256, 256,           'M', 512, 512,           'M', 512, 512,           'M'],
    'B':[64, 64, 'M', 128, 128, 'M', 256, 256,           'M', 512, 512,           'M', 512, 512,           'M'],
    'C':[64, 64, 'M', 128, 128, 'M', 256, 256, 256,      'M', 512, 512, 512,      'M', 512, 512, 512,      'M'],
    'D':[64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M']
}
class VGG(nn.Module):
    def __init__(self, features, num_class=100):
        super().__init__()
        self.features = features
        # Linear(512, ...) assumes the feature extractor outputs 512x1x1,
        # which holds for 32x32 inputs after the five 2x2 max pools.
        self.classifier = nn.Sequential(
            nn.Linear(512, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, num_class)
        )

    def forward(self, x):
        output = self.features(x)
        # Flatten to (batch, 512) before the fully connected layers
        output = output.view(output.size(0), -1)
        output = self.classifier(output)
        return output

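def make_layers(layer_cfg, batch_norm=False):
    """Build the convolutional feature extractor from one entry of cfg."""
    layers = []
    input_channel = 3
    for v in layer_cfg:
        if v == 'M':
            # 2x2 max pool halves the width and height
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            continue
        # 3x3 conv with padding=1 keeps the spatial size
        layers += [nn.Conv2d(input_channel, v, kernel_size=3, padding=1)]
        if batch_norm:
            layers += [nn.BatchNorm2d(v)]
        layers += [nn.ReLU(inplace=True)]
        input_channel = v
    return nn.Sequential(*layers)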

def vgg11_bn():
    return VGG(make_layers(cfg['A'], batch_norm=True))

def vgg13_bn():
    return VGG(make_layers(cfg['B'], batch_norm=True))

def vgg16_bn():
    return VGG(make_layers(cfg['C'], batch_norm=True))

def vgg19_bn():
    return VGG(make_layers(cfg['D'], batch_norm=True))
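The classifier above starts with nn.Linear(512, 4096), so the flattened feature vector must have exactly 512 elements. The plain-Python sketch below computes that size from a configuration list, to show which input resolution the code implicitly assumes. The helper name is made up for this example, and the list is a copy of the 'C' entry of cfg.

```python
# Copy of cfg['C'] from the code above
VGG16_CFG = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
             512, 512, 512, 'M', 512, 512, 512, 'M']

def flattened_features(layer_cfg, input_size):
    """Number of elements per sample after the feature extractor."""
    size = input_size
    channels = 3
    for v in layer_cfg:
        if v == 'M':
            size //= 2      # 2x2 max pool with stride 2 halves the size
        else:
            channels = v    # 3x3 conv with padding=1 keeps the size
    return channels * size * size

small = flattened_features(VGG16_CFG, 32)    # e.g. CIFAR-sized images
large = flattened_features(VGG16_CFG, 224)   # e.g. ImageNet-sized images
print(small, large)
```

For 32x32 inputs the result matches the 512 expected by the classifier; for 224x224 inputs the first linear layer would need 512 * 7 * 7 input features instead.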

GoogLeNet

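GoogLeNet won the 2014 ImageNet challenge (ILSVRC 2014). It introduced the Inception structure, which reduces the number of network parameters and helps suppress vanishing gradients as the network gets deeper.
The Inception structure uses 1x1 convolutions to reduce the dimensionality of the input features, which lowers the parameter count, and then fuses (concatenates) the features extracted at different scales. GoogLeNet V1 stacks 9 Inception modules in sequence and adds two auxiliary loss functions at intermediate depths to keep gradients from vanishing during backpropagation. The same research group later released V2, V3, and V4. V2 introduced the well-known batch normalization and replaced larger convolution kernels with stacks of small ones (for example, two 3x3 convolutions in place of one 5x5); V3 proposed that any nxn convolution can be replaced by a 1xn convolution followed by an nx1 convolution; V4 combined the Inception design with the residual connections of ResNet.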
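The Inception idea described above can be sketched as a small PyTorch module with four parallel branches whose outputs are concatenated along the channel dimension. This is a minimal illustration, not a full GoogLeNet: the class name, argument names, and the channel numbers in the usage lines are choices made for this example, and the auxiliary classifiers are left out.

```python
import torch
import torch.nn as nn

class Inception(nn.Module):
    """Sketch of a V1-style Inception module with four parallel branches."""

    def __init__(self, in_ch, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
        super().__init__()
        # Branch 1: plain 1x1 convolution
        self.branch1 = nn.Sequential(
            nn.Conv2d(in_ch, c1, kernel_size=1), nn.ReLU(inplace=True))
        # Branch 2: 1x1 reduction, then 3x3 convolution
        self.branch2 = nn.Sequential(
            nn.Conv2d(in_ch, c3_reduce, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c3_reduce, c3, kernel_size=3, padding=1), nn.ReLU(inplace=True))
        # Branch 3: 1x1 reduction, then 5x5 convolution
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c5_reduce, kernel_size=1), nn.ReLU(inplace=True),
            nn.Conv2d(c5_reduce, c5, kernel_size=5, padding=2), nn.ReLU(inplace=True))
        # Branch 4: 3x3 max pool (size preserved), then 1x1 projection
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, kernel_size=1), nn.ReLU(inplace=True))

    def forward(self, x):
        branches = [self.branch1, self.branch2, self.branch3, self.branch4]
        # Every branch keeps height and width, so outputs can be concatenated
        return torch.cat([branch(x) for branch in branches], dim=1)

# Illustrative channel sizes: output has 64 + 128 + 32 + 32 = 256 channels
block = Inception(192, 64, 96, 128, 16, 32, 32)
out = block(torch.randn(2, 192, 28, 28))
print(out.shape)
```

The 1x1 reductions in branches 2 and 3 shrink the channel count before the more expensive 3x3 and 5x5 convolutions, which is where the parameter savings come from.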