Deep Learning (3): Convolutional Neural Networks (Part 2)

Author: fromeast | Published: 2019-08-22 16:38

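1. Several Typical Convolutional Neural Networks

1.1 LeNet-5

LeNet-5 comes from the paper Gradient-Based Learning Applied to Document Recognition. Proposed by Yann LeCun in 1998, it is a highly efficient convolutional neural network for handwritten character recognition and reaches 99.2% recognition accuracy on the MNIST dataset. The structure of LeNet-5 is shown in the figure below:

LeNet-5 architecture diagram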
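
Since the figure is not reproduced here, the layer layout can be sketched in PyTorch. This is only a simplified approximation and not the exact network from the paper: it assumes 1×32×32 inputs (MNIST digits padded from 28×28), plain average pooling, tanh activations, a fully connected second convolution, and a linear output layer. The class name `LeNet5Sketch` is made up for illustration.

```python
import torch
import torch.nn as nn

class LeNet5Sketch(nn.Module):
    """Simplified LeNet-5-style network for 1x32x32 inputs (illustrative only)."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),    # 1x32x32 -> 6x28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                   # -> 6x14x14
            nn.Conv2d(6, 16, kernel_size=5),   # -> 16x10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                   # -> 16x5x5
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120),
            nn.Tanh(),
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)  # keep the batch dimension, flatten the rest
        return self.classifier(x)

lenet = LeNet5Sketch()
lenet_out = lenet(torch.randn(2, 1, 32, 32))
lenet_params = sum(p.numel() for p in lenet.parameters())
print(lenet_out.shape, lenet_params)
```

With these simplifications the model has roughly sixty thousand trainable parameters, which is tiny compared with the later networks in this section.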

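1.2 AlexNet

AlexNet was the first modern deep convolutional network model. It was the first to use many techniques that are now standard in deep convolutional networks: parallel training on GPUs, ReLU as the nonlinear activation function, Dropout to prevent overfitting, and data augmentation to improve accuracy. AlexNet won the 2012 ImageNet image classification competition. Its structure, shown in the figure below, comprises 5 convolutional layers, 3 fully connected layers, and 1 softmax layer.

AlexNet architecture diagram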

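1.3 VGGNet

VGGNet is a deep convolutional neural network developed jointly by the Visual Geometry Group at the University of Oxford and researchers at Google DeepMind. VGGNet explores the relationship between the depth of a convolutional network and its performance. By repeatedly stacking small 3×3 convolution kernels and 2×2 max-pooling layers, it successfully builds networks 16 to 19 layers deep. Compared with the previous state-of-the-art architectures, VGGNet lowers the error rate substantially. The networks in the VGGNet paper use 3×3 convolution kernels and 2×2 max-pooling kernels throughout and improve performance by progressively deepening the network.

VGG-16 architecture diagram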
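
One commonly cited reason for stacking small kernels, not spelled out in the text above, is that a stack of 3×3 convolutions covers the same receptive field as one larger kernel while using fewer weights. The arithmetic can be checked with a short sketch; the helper names and the channel count of 64 are illustrative assumptions, and biases are ignored.

```python
def conv_weights(kernel_size, channels):
    """Weights (biases ignored) of one conv layer with `channels` inputs and outputs."""
    return kernel_size * kernel_size * channels * channels

def stacked_receptive_field(num_layers, kernel_size=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    return 1 + num_layers * (kernel_size - 1)

channels = 64  # illustrative channel count

two_3x3 = 2 * conv_weights(3, channels)    # two stacked 3x3 layers
one_5x5 = conv_weights(5, channels)        # one 5x5 layer, same receptive field
three_3x3 = 3 * conv_weights(3, channels)  # three stacked 3x3 layers
one_7x7 = conv_weights(7, channels)        # one 7x7 layer, same receptive field

print(stacked_receptive_field(2), two_3x3, one_5x5)
print(stacked_receptive_field(3), three_3x3, one_7x7)
```

The stacked version also inserts an extra nonlinearity between the layers, which a single large kernel does not have.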

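1.4 GoogLeNet

In a convolutional network, choosing the kernel size of each convolutional layer is a critical question. In an Inception network, a single convolutional layer contains several convolution operations of different sizes; this is called an Inception module. An Inception network is built by stacking multiple Inception modules with a small number of pooling layers. An Inception module applies 1×1, 3×3, 5×5 and other kernel sizes in parallel and concatenates (stacks) the resulting feature maps along the depth dimension to form the output feature map.
GoogLeNet consists of 9 Inception v1 modules, 5 pooling layers, and some additional convolutional and fully connected layers, for 22 layers in total, as shown in the figure below. To address the vanishing-gradient problem, GoogLeNet adds two auxiliary classifiers at intermediate layers of the network to strengthen the supervision signal.

Structure of the Inception v1 module

GoogLeNet network structure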
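
The parallel-branch idea can be sketched in PyTorch as follows. This is a minimal illustration, not GoogLeNet's actual code: the 1×1 reduction convolutions in front of the 3×3 and 5×5 branches and the pooling branch follow the usual description of Inception v1 rather than anything stated in the text, and the class name `InceptionSketch` and all channel counts are illustrative assumptions.

```python
import torch
import torch.nn as nn

class InceptionSketch(nn.Module):
    """Four parallel branches whose outputs are concatenated along the channel axis."""

    def __init__(self, in_ch, c1, c3_reduce, c3, c5_reduce, c5, pool_proj):
        super().__init__()
        self.branch1 = nn.Sequential(nn.Conv2d(in_ch, c1, 1), nn.ReLU())
        self.branch3 = nn.Sequential(
            nn.Conv2d(in_ch, c3_reduce, 1), nn.ReLU(),         # 1x1 reduction
            nn.Conv2d(c3_reduce, c3, 3, padding=1), nn.ReLU())
        self.branch5 = nn.Sequential(
            nn.Conv2d(in_ch, c5_reduce, 1), nn.ReLU(),         # 1x1 reduction
            nn.Conv2d(c5_reduce, c5, 5, padding=2), nn.ReLU())
        self.branch_pool = nn.Sequential(
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(in_ch, pool_proj, 1), nn.ReLU())

    def forward(self, x):
        # Padding keeps every branch at the input's spatial size, so only the
        # channel counts differ and the outputs can be stacked in depth.
        branches = [self.branch1(x), self.branch3(x),
                    self.branch5(x), self.branch_pool(x)]
        return torch.cat(branches, dim=1)

inception = InceptionSketch(192, 64, 96, 128, 16, 32, 32)
inception_out = inception(torch.randn(1, 192, 28, 28))
print(inception_out.shape)  # channels: 64 + 128 + 32 + 32 = 256
```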

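1.5 ResNet

A residual network (ResNet) improves the efficiency of information propagation by adding shortcut connections around nonlinear convolutional layers.
Suppose that in a deep network we want a nonlinear unit $f(x,θ)$ (which may be one or more convolutional layers) to approximate a target function $h(x)$. The target function can be split into two parts, the identity function $x$ and the residual function $h(x)−x$, so that $h(x) = x + (h(x)−x)$. The nonlinear unit $f(x,θ)$ is then asked to approximate only the residual $h(x)−x$, and $f(x,θ)+x$ approximates $h(x)$.

The figure below gives a typical example of a residual unit. A residual unit consists of several cascaded (equal-width) convolutional layers and a shortcut connection that skips across them; their sum passes through a ReLU activation to produce the output. A residual network is a very deep network built by chaining many such residual units in series.

Structure of a simple residual unit

Comparison of a 34-layer plain convolutional network and a residual network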
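
One way to see why the residual form helps: when the target is close to the identity, the residual is close to zero, so the unit only has to push its output toward zero. The sketch below is a toy illustration under stated assumptions; `TinyResidualUnit` is a hypothetical class, not the block used later in this article. Zeroing the last convolution makes $f(x,θ)=0$, and the unit then passes a non-negative input through unchanged.

```python
import torch
import torch.nn as nn

class TinyResidualUnit(nn.Module):
    """Computes relu(f(x) + x), where f is two 3x3 convolutions."""

    def __init__(self, channels):
        super().__init__()
        self.f = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1))

    def forward(self, x):
        return torch.relu(self.f(x) + x)  # shortcut adds the input back

unit = TinyResidualUnit(4)
# Force the residual branch to output exactly zero.
nn.init.zeros_(unit.f[2].weight)
nn.init.zeros_(unit.f[2].bias)

x = torch.rand(1, 4, 8, 8)  # values in [0, 1), so the final ReLU changes nothing
y = unit(x)
print(torch.equal(y, x))
```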

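2. A Hand-Written ResNet Implementation

Importing libraries and data. This time we use the CIFAR-10 dataset, a color image dataset that is closer to general everyday objects. It contains RGB color images in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Each image is 32 × 32 pixels and each class has 6000 images; the dataset has 50000 training images and 10000 test images in total.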

import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
from sklearn.manifold import TSNE
import numpy as np
from matplotlib import cm
import matplotlib.pyplot as plt

# Use the first GPU when available, otherwise fall back to the CPU.
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')

# Hyperparameters
num_epochs = 30
batch_size = 100
learning_rate = 0.001

# Training-time augmentation: pad to 40x40, flip at random, crop back to 32x32.
transform = transforms.Compose([
    transforms.Pad(4),
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32),
    transforms.ToTensor()])

# The test set is not augmented; it is only converted to tensors.
train_dataset = torchvision.datasets.CIFAR10(root='Data', train=True, transform=transform, download=True)
test_dataset = torchvision.datasets.CIFAR10(root='Data', train=False, transform=transforms.ToTensor(), download=False)
train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(dataset=test_dataset, batch_size=batch_size, shuffle=False)

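CIFAR-10 examples

Because batch_size is 100, the training set is split into 500 batches and the test set into 100 batches.

print('train size:{}, test size:{}'.format(len(train_loader), len(test_loader)))
# train size:500, test size:100

Some of the images are visualized below: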

from torchvision.transforms import ToPILImage

# Class names in CIFAR-10 label order (0-9).
classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
show = ToPILImage()  # converts a Tensor to a PIL Image for easy visualization
fig = plt.figure()
fig.subplots_adjust(left=0, right=1, bottom=0, top=0.8, hspace=0.2, wspace=0.1)
for i in range(6):
    image, label = test_dataset[i]
    ax = fig.add_subplot(2, 3, i + 1, xticks=[], yticks=[])
    ax.set_title(classes[label])
    ax.imshow(show(image))

Design of the residual network. It has two parts: the residual unit and the residual network itself.

class ResidualBlock(nn.Module):
    def __init__(self, in_channel, out_channel, stride=1, downsample=None):
        super(ResidualBlock, self).__init__()
        # Only the first convolution may change the resolution (via stride).
        self.conv1 = nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=stride, padding=1)
        self.bn1 = nn.BatchNorm2d(out_channel)
        self.relu = nn.ReLU()
        self.conv2 = nn.Conv2d(out_channel, out_channel, kernel_size=3, stride=1, padding=1)
        self.bn2 = nn.BatchNorm2d(out_channel)
        # Optional module that reshapes the shortcut to match the main branch.
        self.downsample = downsample

    def forward(self, x):
        residual = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)

        # Explicit None check avoids relying on the truthiness of an nn.Module.
        if self.downsample is not None:
            residual = self.downsample(x)
        out += residual  # shortcut connection
        out = self.relu(out)
        return out
    
class ResNet(nn.Module):
    def __init__(self, block, num_classes=10):
        super(ResNet, self).__init__()
        self.in_channel = 16  # channel count fed into the next block
        self.conv1 = nn.Conv2d(3, 16, stride=1, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU()

        # Three stages of two blocks each: 16, 32 and 64 channels.
        # A stride of 2 halves the resolution at the start of stages 2 and 3.
        self.block1 = self.make_layer(block, 16, 1)
        self.block2 = self.make_layer(block, 16, 1)
        self.block3 = self.make_layer(block, 32, 2)
        self.block4 = self.make_layer(block, 32, 1)
        self.block5 = self.make_layer(block, 64, 2)
        self.block6 = self.make_layer(block, 64, 1)
        self.avg_pool = nn.AvgPool2d(8)  # 8x8 feature map -> 1x1
        self.fc = nn.Linear(64, num_classes)

    def make_layer(self, block, out_channel, stride=1):
        downsample = None
        # The shortcut needs its own convolution whenever the block changes
        # the resolution or the channel count (a 3x3 convolution is used here).
        if (stride != 1) or (self.in_channel != out_channel):
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channel, out_channel, kernel_size=3, stride=stride, padding=1),
                nn.BatchNorm2d(out_channel))
        out_layer = block(self.in_channel, out_channel, stride, downsample)
        self.in_channel = out_channel
        return out_layer

    def forward(self, x):
        out = self.conv1(x)
        out = self.bn(out)
        out = self.relu(out)
        out = self.block1(out)
        out = self.block2(out)
        out = self.block3(out)
        out = self.block4(out)
        out = self.block5(out)
        out = self.block6(out)
        out = self.avg_pool(out)
        out = out.view(out.size(0), -1)  # flatten to (batch, 64)
        out = self.fc(out)
        return out
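
To check that `nn.AvgPool2d(8)` and `nn.Linear(64, num_classes)` fit together, the spatial size can be traced through the network with the standard convolution output-size formula, floor((n + 2p − k) / s) + 1. The helper `conv_out_size` below is a hypothetical name introduced for this sketch; the strides and kernel sizes are read off the code above.

```python
def conv_out_size(size, kernel_size, stride, padding):
    """Output width/height of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = conv_out_size(32, 3, 1, 1)  # stem convolution on a 32x32 CIFAR-10 image
sizes = [size]
for stride in (1, 1, 2, 1, 2, 1):  # strides of block1 .. block6
    # Only the first convolution of a block uses the stride;
    # the second one (stride 1, padding 1) keeps the size unchanged.
    size = conv_out_size(size, 3, stride, 1)
    sizes.append(size)

pooled = conv_out_size(size, 8, 8, 0)  # AvgPool2d(8): kernel 8, stride 8, no padding
flat_features = 64 * pooled * pooled   # 64 channels after the last block
print(sizes, pooled, flat_features)
```

The feature map shrinks from 32×32 to 16×16 to 8×8, the pooling layer reduces it to 1×1, and the flattened vector has 64 entries, which matches the input size of the fully connected layer.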

Network training and testing.

resnet = ResNet(ResidualBlock).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(resnet.parameters(),lr=learning_rate)

def update_lr(optimizer,lr):
    for para in optimizer.param_groups:
        para['lr'] = lr
        
total_step = len(train_loader)
curr_lr = learning_rate
for epoch in range(num_epochs):
    for idx,(images,labels) in enumerate(train_loader):
        images = images.to(device)
        labels = labels.to(device)
        #print(images.shape)
        outputs = resnet(images)
        loss = criterion(outputs,labels)
        
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        if (idx+1)%100 == 0:
            print ("Epoch [{}/{}], Step [{}/{}] Loss: {:.4f}".format(epoch+1, num_epochs, idx+1, total_step, loss.item()))

    # Decay learning rate
    if (epoch+1) % 20 == 0:
        curr_lr /= 3
        update_lr(optimizer, curr_lr)

resnet.eval()  # use BatchNorm running statistics instead of per-batch statistics
with torch.no_grad():
    correct = 0
    total = 0
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)
        outputs = resnet(images)
        predicted = torch.argmax(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()  # accumulate as a Python int
    print('Test Accuracy of the model on the 10000 test images: {} %'.format(100 * correct // total))
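
The manual decay in the training loop (divide the learning rate by 3 after every 20 epochs) can also be expressed with PyTorch's built-in `torch.optim.lr_scheduler.StepLR`. The sketch below is an alternative, not what the code above does; the small linear model and random data are stand-ins so the example runs on its own.

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in model, only here to give the optimizer parameters
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Multiply the learning rate by 1/3 once every 20 scheduler steps (epochs).
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=1 / 3)

lr_per_epoch = []
for epoch in range(30):
    lr_per_epoch.append(optimizer.param_groups[0]['lr'])
    # Placeholder for one epoch of training.
    loss = model(torch.randn(8, 4)).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()  # advance the schedule once per epoch, after the optimizer

print(lr_per_epoch[0], lr_per_epoch[19], lr_per_epoch[20])
```

Epochs 1 to 20 run at 0.001 and epochs 21 to 30 at one third of that, the same schedule that `update_lr` produces.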

The accuracy on the test set is 84 %, which shows that the network is effective.
Test Accuracy of the model on the 10000 test images: 84 %
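
A single overall number hides which classes are hard. As a possible extension, the same counting can be done per class. The function `per_class_accuracy` is a hypothetical helper, and the logits and labels below are hand-made toy values, not results from the trained network.

```python
import torch

def per_class_accuracy(logits, labels, num_classes):
    """Return (correct, total) counts per class as integer tensors."""
    predicted = torch.argmax(logits, dim=1)
    correct = torch.zeros(num_classes, dtype=torch.long)
    total = torch.zeros(num_classes, dtype=torch.long)
    for c in range(num_classes):
        mask = labels == c                         # samples whose true class is c
        total[c] = mask.sum()
        correct[c] = (predicted[mask] == c).sum()  # of those, how many were predicted as c
    return correct, total

# Toy batch: 4 samples, 3 classes; the third sample is misclassified as class 0.
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.9, 0.2, 0.4],
                       [0.1, 0.2, 3.0]])
labels = torch.tensor([0, 1, 2, 2])

correct, total = per_class_accuracy(logits, labels, 3)
overall = 100.0 * correct.sum().item() / total.sum().item()
print(correct.tolist(), total.tolist(), overall)
```

In the evaluation loop, the per-batch counts would be summed across all test batches before dividing.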

Visualizing the classification results.

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

def plot_with_labels(embeddings, labels):
    # Draw each sample's class name at its 2-D t-SNE position, colored by class.
    plt.cla()
    X, Y = embeddings[:, 0], embeddings[:, 1]
    for x, y, s in zip(X, Y, labels):
        c = cm.rainbow(int(255 * s / 9))
        plt.text(x, y, classes[s], backgroundcolor=c, fontsize=9)
    plt.xlim(X.min(), X.max())
    plt.ylim(Y.min(), Y.max())
    plt.title('Visualize last layer')
    plt.show()
    plt.pause(0.01)

# t-SNE visualization of the network outputs (the scores from the final fc layer).
# `outputs` and `labels` are left over from the last test batch (100 images),
# so plot_only = 500 acts only as an upper bound here.
# Note: newer scikit-learn releases rename n_iter to max_iter.
tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000)
plot_only = 500
low_dim_embs = tsne.fit_transform(outputs.detach().cpu().numpy())[:plot_only, :]
plot_labels = labels.cpu().numpy()[:plot_only]
plot_with_labels(low_dim_embs, plot_labels)

Visualization of the classification results

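Waking from a dream on an autumn night under a cotton quilt, I see ten thousand miles of rivers and mountains before my eyes. (Xin Qiji, "Qing Ping Yue: Staying Alone at the Wang Family Hut on Mount Bo")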
