
Reproducing the AlexNet Paper

Author: DLabc | Published 2018-05-08 21:01

    AlexNet_v1: ImageNet Classification with Deep Convolutional Neural Networks

    AlexNet won first place in the ILSVRC 2012 competition; its accuracy was more than 10 percentage points higher than the runner-up's.
    Key innovations of AlexNet
    1. Data augmentation

    The training images are randomly cropped and horizontally flipped. This augmentation multiplies the amount of data by (256-224) x (256-224) x 2 = 2048. In addition, the intensities of the RGB channels are perturbed. A minimal sketch of this pipeline is given below.
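
    A rough sketch of the augmentation described above, assuming 256x256 source images; the brightness jitter is only a stand-in for the paper's PCA-based RGB perturbation, and the function name is illustrative, not from the original post:

    import tensorflow as tf

    def augment(image):
      # image: a [256, 256, 3] float32 tensor
      image = tf.random_crop(image, [224, 224, 3])    # random 224x224 patch
      image = tf.image.random_flip_left_right(image)  # random horizontal flip
      image = tf.image.random_brightness(image, 0.2)  # crude RGB intensity jitter
      return image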

    2. ReLU activation

    ReLU is used as the activation function, which avoids the saturation that tanh and sigmoid exhibit at both ends and thus helps alleviate the vanishing-gradient problem during backpropagation. The small sketch below illustrates the difference.
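
    A minimal sketch (not from the original post) comparing gradients at a large pre-activation value: the sigmoid gradient has nearly vanished, while ReLU still passes a gradient of exactly 1:

    import tensorflow as tf

    x = tf.constant(8.0)
    g_sigmoid = tf.gradients(tf.sigmoid(x), x)[0]  # about 0.0003, nearly zero
    g_relu = tf.gradients(tf.nn.relu(x), x)[0]     # exactly 1.0

    with tf.Session() as sess:
      print(sess.run([g_sigmoid, g_relu]))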

    3. Overlapping pooling

    Replacing standard pooling (2x2, stride 2) with overlapping pooling (3x3, stride 2) reduces the top-1 and top-5 error rates by 0.4% and 0.3%, respectively; see the shape comparison below.
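
    A quick sketch (assumed, not from the post) showing that on the paper's 55x55 conv1 feature map both settings give a 27x27 output, but the 3x3/stride-2 windows overlap by one pixel:

    import tensorflow as tf
    from tensorflow.keras.layers import MaxPool2D

    x = tf.zeros([1, 55, 55, 96])
    print(MaxPool2D([2, 2], 2)(x).shape)  # (1, 27, 27, 96), non-overlapping windows
    print(MaxPool2D([3, 3], 2)(x).shape)  # (1, 27, 27, 96), overlapping windows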

    4. Local Response Normalization (LRN)

    LRN is meant to help generalization, although its actual benefit has since been disputed.

    LRN is somewhat similar in spirit to Batch Normalization.

    The idea behind LRN mainly comes from "lateral inhibition" in biology. A call with the paper's hyperparameters is sketched below.
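
    A hedged sketch of LRN in TensorFlow; the arguments are one common mapping of the paper's hyperparameters (k=2, n=5, alpha=1e-4, beta=0.75), spelled out explicitly because tf.nn.local_response_normalization uses different defaults:

    import tensorflow as tf

    x = tf.zeros([1, 55, 55, 96])  # e.g. a conv1 feature map
    y = tf.nn.local_response_normalization(
        x, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75)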

    5. Dropout

    Applying dropout in the fully connected layers prevents overfitting; it is an efficient way of combining many different models. A minimal usage sketch follows.
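
    A minimal sketch (assumed) of dropout on a fully connected activation; note that the Keras Dropout layer takes the fraction of units to drop, so a keep probability of 0.5 corresponds to rate=0.5:

    import tensorflow as tf
    from tensorflow.keras.layers import Dense, Dropout

    x = tf.ones([32, 4096])                # a batch of fc6 activations
    x = Dropout(0.5)(x, training=True)     # randomly zero half the units
    x = Dense(4096, activation='relu')(x)  # next fully connected layer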

    6. GPU implementation

    The GPU implementation of AlexNet greatly sped up model training.

    The figure below shows the AlexNet architecture:

    [Figure: AlexNet architecture]

    AlexNet has 8 layers in total: 5 conv layers and 3 fc layers.
    Below is AlexNet written with a mix of TensorFlow and Keras:

    import tensorflow as tf
    keras = tf.keras
    from tensorflow.keras.layers import Conv2D, MaxPool2D, Dropout, Flatten, Dense
    
    
    def inference(inputs,
                  num_classes=1000,
                  is_training=True,
                  dropout_keep_prob=0.5):
      '''
      Inference.
      
      inputs: a tensor of images
      num_classes: the number of output categories
      is_training: set True when building the graph for training
      dropout_keep_prob: the probability of keeping a unit during training
      '''
      
      x = inputs
      # conv1
      x = Conv2D(96, [11,11], 4, activation='relu', name='conv1')(x)
      # lrn1
      x = tf.nn.local_response_normalization(x, name='lrn1')
      # pool1
      x = MaxPool2D([3,3], 2, name='pool1')(x)
      # conv2
      x = Conv2D(256, [5,5], activation='relu', padding='same', name='conv2')(x)
      # lrn2
      x = tf.nn.local_response_normalization(x, name='lrn2')
      # pool2
      x = MaxPool2D([3,3], 2, name='pool2')(x)
      # conv3
      x = Conv2D(384, [3,3], activation='relu', padding='same', name='conv3')(x)
      # conv4
      x = Conv2D(384, [3,3], activation='relu', padding='same', name='conv4')(x)
      # conv5
      x = Conv2D(256, [3,3], activation='relu', padding='same', name='conv5')(x)
      # pool5
      x = MaxPool2D([3,3], 2, name='pool5')(x)
      # flatten
      x = Flatten(name='flatten')(x)
      # dropout
      if is_training:
        x = Dropout(1.0 - dropout_keep_prob, name='dropout5')(x)  # Keras Dropout takes the drop rate, not the keep prob
      # fc6
      x = Dense(4096, activation='relu', name='fc6')(x)
      # dropout
      if is_training:
        x = Dropout(1.0 - dropout_keep_prob, name='dropout6')(x)  # Keras Dropout takes the drop rate, not the keep prob
      # fc7
      x = Dense(4096, activation='relu', name='fc7')(x)
      # fc8
      logits = Dense(num_classes, name='logit')(x)
      return logits
    
    
    def build_cost(logits, labels, weight_decay_rate):
      '''
      cost
      
      logits: predictions
      labels: true labels
      weight_decay_rate: the L2 weight decay coefficient
      '''
      with tf.variable_scope('costs'):
        with tf.variable_scope('xent'):
          xent = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits_v2(
              logits=logits, labels=labels))
        with tf.variable_scope('decay'):
          costs = []
          for var in tf.trainable_variables():
            costs.append(tf.nn.l2_loss(var))
            tf.summary.histogram(var.op.name, var) # summary
          cost_decay = tf.multiply(weight_decay_rate, tf.add_n(costs))
        cost = tf.add(xent, cost_decay)
        tf.summary.scalar('cost', cost) # summary
      return cost
    
    
    def build_train_op(cost, lrn_rate, global_step):
      '''
      train_op
      
      cost: cost
      lrn_rate: learning rate
      global_step: global step
      '''
      with tf.variable_scope('train'):
        lrn_rate = tf.constant(lrn_rate, tf.float32)
        tf.summary.scalar('learning_rate', lrn_rate) # summary
        
        trainable_variables = tf.trainable_variables()
        grads = tf.gradients(cost, trainable_variables)
        
        optimizer = tf.train.AdamOptimizer(lrn_rate)
        
        apply_op = optimizer.apply_gradients(
            zip(grads, trainable_variables),
            global_step=global_step, name='train_step')
        
        train_op = apply_op
      return train_op
    
    
    if __name__ == '__main__':
      images = tf.placeholder(tf.float32, [None, 224, 224, 3])
      labels = tf.placeholder(tf.float32, [None, 1000])
      logits = inference(inputs=images,
                         num_classes=1000)
      print('inference: good job')
      cost = build_cost(logits=logits,
                        labels=labels,
                        weight_decay_rate=0.0002)
      print('build_cost: good job')
      global_step = tf.train.get_or_create_global_step()
      train_op = build_train_op(cost=cost,
                                lrn_rate=0.001,
                                global_step=global_step)
      print('build_train_op: good job')
    

    Here we provide three functions, inference, build_cost, and build_train_op; together they implement most of AlexNet. A sketch of how they could be driven in a training loop follows.
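
    A hedged sketch (not in the original post) of wiring the pieces from the __main__ block above into a session-based training loop; the random arrays are only stand-ins for a real ImageNet input pipeline:

    import numpy as np

    with tf.Session() as sess:
      sess.run(tf.global_variables_initializer())
      for step in range(10):
        batch_images = np.random.rand(8, 224, 224, 3).astype(np.float32)
        batch_labels = np.eye(1000)[np.random.randint(0, 1000, 8)].astype(np.float32)
        _, loss = sess.run([train_op, cost],
                           feed_dict={images: batch_images, labels: batch_labels})
        print('step %d, loss %.4f' % (step, loss))
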
    The detailed AlexNet configuration:

    Layer     Configuration
    conv1     11x11 conv, 96 filters, stride 4, ReLU
    lrn1      local response normalization
    pool1     3x3 max pool, stride 2
    conv2     5x5 conv, 256 filters, stride 1, ReLU
    lrn2      local response normalization
    pool2     3x3 max pool, stride 2
    conv3     3x3 conv, 384 filters, stride 1, ReLU
    conv4     3x3 conv, 384 filters, stride 1, ReLU
    conv5     3x3 conv, 256 filters, stride 1, ReLU
    pool5     3x3 max pool, stride 2
    flatten
    dropout   rate 0.5
    fc6       4096 units, ReLU
    dropout   rate 0.5
    fc7       4096 units, ReLU
    fc8       1000 units (logits)
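
    As a sanity check on the configuration above, a tiny sketch (assumed) traces the spatial size through the network for a 224x224 input with 'valid' padding on conv1 and the pools (the paper's figures correspond to an effective 227x227 input, which would give 55 instead of 54 after conv1):

    def out_size(n, k, s):
      return (n - k) // s + 1      # output size under 'valid' padding

    n = out_size(224, 11, 4)       # conv1 -> 54
    n = out_size(n, 3, 2)          # pool1 -> 26 ('same' conv2 keeps 26)
    n = out_size(n, 3, 2)          # pool2 -> 12 ('same' conv3-5 keep 12)
    n = out_size(n, 3, 2)          # pool5 -> 5
    print(n * n * 256)             # flatten size: 6400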
