Implementing AlexNet on Cifar10 with PaddlePaddle

Author: LabVIEW_Python | Published: 2021-03-14 06:35

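    AlexNet won first place in the ILSVRC 2012 image classification task. It has 60 million parameters and 650,000 neurons, and took five to six days to train on two GTX 580 3GB GPUs. This article implements and trains AlexNet on the Cifar10 dataset with a single GTX 1080 Ti GPU, which takes only about 30 minutes and gives a sense of how far the technology has advanced.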

    The structure of AlexNet is shown below (from https://learnopencv.com/understanding-alexnet/):

    Following the structure diagram above, the network can be implemented in PaddlePaddle like this:

    import paddle
    import paddle.nn.functional as F  # functional building blocks such as conv2d, relu, ...
    import numpy as np
    from paddle.vision.transforms import Compose, Resize, Transpose, Normalize, ToTensor
    from paddle.vision.datasets import Cifar10
    
    # Build the AlexNet network
    # Sequential: a sequential container; sub-Layers are added in the order they are passed
    # to the constructor, which accepts either Layers or an iterable of (name, Layer) tuples
    from paddle.nn import Sequential, Conv2D, ReLU, MaxPool2D, Linear, Dropout, Flatten
    
    class AlexNet(paddle.nn.Layer):
        def __init__(self, num_classes=10):
            super().__init__()
    
            self.conv_relu_pool1 = Sequential(
                Conv2D(3,96,11,4,0),
                ReLU(),
                MaxPool2D(3,2))
    
            self.conv_relu_pool2 = Sequential(
                Conv2D(96,256,5,1,2),
                ReLU(),
                MaxPool2D(3,2))
            
            self.conv_relu3 = Sequential(
                Conv2D(256,384,3,1,1),
                ReLU())
            
            self.conv_relu4 = Sequential(
                Conv2D(384,384,3,1,1),
                ReLU())
            
            self.conv_relu_pool5 = Sequential(
                Conv2D(384,256,3,1,1),
                ReLU(),
                MaxPool2D(3,2))
            
            self.fc = Sequential(
                Linear(256*6*6, 4096),
                ReLU(),
                Dropout(0.5),
                Linear(4096,4096),
                ReLU(),
                Dropout(0.5),
                Linear(4096,num_classes))
    
            self.flatten = Flatten()
    
        def forward(self,x):
            x = self.conv_relu_pool1(x)
            x = self.conv_relu_pool2(x)
            x = self.conv_relu3(x)
            x = self.conv_relu4(x)
            x = self.conv_relu_pool5(x)
            x = self.flatten(x)
            x = self.fc(x)
            return x
    
    from paddle.static import InputSpec
    
    alex_net = AlexNet(num_classes=10)
    # Describe the image and label tensors so the high-level Model API knows their shapes and dtypes
    input_spec = InputSpec([None, 3, 227, 227], 'float32', 'image')
    label_spec = InputSpec([None, 1], 'int64', 'label')
    model = paddle.Model(alex_net, input_spec, label_spec)
    model.summary()
    

    Layer (type)    Input Shape           Output Shape        Param #
    ====================================================================
    Conv2D-1        [[1, 3, 227, 227]]    [1, 96, 55, 55]     34,944
    ReLU-1          [[1, 96, 55, 55]]     [1, 96, 55, 55]     0
    MaxPool2D-1     [[1, 96, 55, 55]]     [1, 96, 27, 27]     0
    Conv2D-2        [[1, 96, 27, 27]]     [1, 256, 27, 27]    614,656
    ReLU-2          [[1, 256, 27, 27]]    [1, 256, 27, 27]    0
    MaxPool2D-2     [[1, 256, 27, 27]]    [1, 256, 13, 13]    0
    Conv2D-3        [[1, 256, 13, 13]]    [1, 384, 13, 13]    885,120
    ReLU-3          [[1, 384, 13, 13]]    [1, 384, 13, 13]    0
    Conv2D-4        [[1, 384, 13, 13]]    [1, 384, 13, 13]    1,327,488
    ReLU-4          [[1, 384, 13, 13]]    [1, 384, 13, 13]    0
    Conv2D-5        [[1, 384, 13, 13]]    [1, 256, 13, 13]    884,992
    ReLU-5          [[1, 256, 13, 13]]    [1, 256, 13, 13]    0
    MaxPool2D-3     [[1, 256, 13, 13]]    [1, 256, 6, 6]      0
    Flatten-1       [[1, 256, 6, 6]]      [1, 9216]           0
    Linear-1        [[1, 9216]]           [1, 4096]           37,752,832
    ReLU-6          [[1, 4096]]           [1, 4096]           0
    Dropout-1       [[1, 4096]]           [1, 4096]           0
    Linear-2        [[1, 4096]]           [1, 4096]           16,781,312
    ReLU-7          [[1, 4096]]           [1, 4096]           0
    Dropout-2       [[1, 4096]]           [1, 4096]           0
    Linear-3        [[1, 4096]]           [1, 10]             40,970
    ====================================================================
    Total params: 58,322,314
    Trainable params: 58,322,314
    Non-trainable params: 0
    ====================================================================
    Input size (MB): 0.59
    Forward/backward pass size (MB): 11.11
    Params size (MB): 222.48
    Estimated Total Size (MB): 234.18
    ====================================================================
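
    The shapes and parameter counts in the summary can be checked by hand. The sketch below is not part of the original article: it assumes the standard output-size formula for convolution and pooling without dilation, floor((in + 2*padding - kernel) / stride) + 1, and the helper names conv_out, conv_params and linear_params are hypothetical, introduced only for illustration.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (floor mode, no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_params(c_in, c_out, kernel):
    """Weights plus biases of a Conv2D layer with a square kernel."""
    return c_in * c_out * kernel * kernel + c_out

def linear_params(n_in, n_out):
    """Weights plus biases of a Linear layer."""
    return n_in * n_out + n_out

# Trace the spatial size through the network, starting from a 227x227 input
size = 227
size = conv_out(size, kernel=11, stride=4, padding=0)  # Conv2D-1
size = conv_out(size, kernel=3, stride=2)              # MaxPool2D-1
size = conv_out(size, kernel=5, stride=1, padding=2)   # Conv2D-2
size = conv_out(size, kernel=3, stride=2)              # MaxPool2D-2
size = conv_out(size, kernel=3, stride=1, padding=1)   # Conv2D-3
size = conv_out(size, kernel=3, stride=1, padding=1)   # Conv2D-4
size = conv_out(size, kernel=3, stride=1, padding=1)   # Conv2D-5
size = conv_out(size, kernel=3, stride=2)              # MaxPool2D-3

# Size of the flattened feature vector fed to the first Linear layer
flat = 256 * size * size

# Sum the parameters of the five conv layers and three linear layers
total = (
    conv_params(3, 96, 11)
    + conv_params(96, 256, 5)
    + conv_params(256, 384, 3)
    + conv_params(384, 384, 3)
    + conv_params(384, 256, 3)
    + linear_params(flat, 4096)
    + linear_params(4096, 4096)
    + linear_params(4096, 10)
)

print(size, flat, total)
```

    The final spatial size of 6 is why the first fully connected layer is declared as Linear(256*6*6, 4096); feeding images of a different size would change this number and require a different first Linear layer.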

    The training code is as follows:

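    # Compose: chains the dataset preprocessing steps given as a list
    # Resize: resizes the image
    # Transpose: reorders the axes, e.g. HWC (image) -> CHW (network input)
    # Normalize: normalizes the image data
    # ToTensor: converts a PIL.Image or numpy.ndarray to a paddle.Tensor
    # Cifar10 mean and std, computed by hand on the 0-255 pixel scale:
    # mean = [125.31, 122.95, 113.86], std = [62.99, 62.08, 66.7]; see https://www.jianshu.com/p/a3f3ffc3cac1
    
    t = Compose([Resize(size=227),
                 Normalize(mean=[125.31, 122.95, 113.86], std=[62.99, 62.08, 66.7], data_format='HWC'),
                 Transpose(order=(2,0,1)),
                 # The array is already CHW at this point; data_format='HWC' keeps ToTensor from transposing it again
                 ToTensor(data_format='HWC')])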
    
    train_dataset = Cifar10(mode='train', transform=t, backend='cv2') 
    test_dataset  = Cifar10(mode='test', transform=t, backend='cv2')
    BATCH_SIZE = 256
    train_loader = paddle.io.DataLoader(train_dataset, shuffle=True, batch_size=BATCH_SIZE)
    test_loader = paddle.io.DataLoader(test_dataset, batch_size=BATCH_SIZE)
    
    # Prepare for training: set the optimizer, loss function, and accuracy metric
    learning_rate = 0.0001
    loss_fn = paddle.nn.CrossEntropyLoss()
    opt = paddle.optimizer.Adam(learning_rate=learning_rate, parameters=model.parameters())
    model.prepare(optimizer=opt, loss=loss_fn, metrics=paddle.metric.Accuracy())
    
    # Start training: pass the training DataLoader, the number of epochs, and the log verbosity.
    # The batch size is already fixed by the DataLoader, so it is not passed to fit() again.
    model.fit(train_loader, epochs=20, verbose=1)
    model.evaluate(test_loader, verbose=1)
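
    The mean and std passed to Normalize are per-channel statistics of the training images. The article does not show how they were obtained, so the following is only a minimal sketch of one common way to compute such values with NumPy. The function name channel_stats is hypothetical, and the input is random stand-in data rather than real Cifar10 images, so the numbers it produces will not match the ones used in the training code.

```python
import numpy as np

def channel_stats(images):
    """Per-channel mean and std of a uint8 image array shaped (N, H, W, C), on the 0-255 scale."""
    # Collapse the batch and spatial axes so each row is one pixel with C channel values
    pixels = images.reshape(-1, images.shape[-1]).astype(np.float64)
    return pixels.mean(axis=0), pixels.std(axis=0)

# Stand-in data: 100 random 32x32 RGB images (an assumption, NOT the real dataset)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(100, 32, 32, 3), dtype=np.uint8)

mean, std = channel_stats(fake_images)

# Normalizing with these statistics gives each channel zero mean and unit variance
normalized = (fake_images.astype(np.float64) - mean) / std
print(mean, std)
```

    With the real training images stacked into an (N, 32, 32, 3) array, the same function would return three means and three standard deviations in the format that Normalize expects for data_format='HWC'.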
    

    Training result: accuracy on the test set reaches 78.79%.

    Epoch 20/20
    step 196/196 [==============================] - loss: 0.0318 - acc: 0.9822 - 748ms/step
    Eval begin...
    The loss value printed in the log is the current batch, and the metric is the average value of previous step.
    step 40/40 [==============================] - loss: 0.9298 - acc: 0.7879 - 641ms/step
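
    The step counts in the log (196 for training, 40 for evaluation) follow from the dataset sizes and the batch size. A small sketch of the arithmetic, assuming the standard Cifar10 split of 50,000 training and 10,000 test images and a DataLoader that keeps the final partial batch:

```python
import math

BATCH_SIZE = 256
TRAIN_SAMPLES = 50_000  # standard Cifar10 training set size
TEST_SAMPLES = 10_000   # standard Cifar10 test set size

# The last, smaller batch still counts as a step, hence the ceiling
train_steps = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
test_steps = math.ceil(TEST_SAMPLES / BATCH_SIZE)

# Number of images left over for the final training batch
last_train_batch = TRAIN_SAMPLES - (train_steps - 1) * BATCH_SIZE

print(train_steps, test_steps, last_train_batch)
```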

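    Takeaway: deep learning image classification is now very mature and can be implemented directly with the high-level API of the PaddlePaddle framework. Object detection networks need to combine several loss functions, so their training loop has to be written by hand with the ordinary lower-level API functions.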
