
2018-08-10 CNN-convolutional

Author: 镜中无我 | Published 2019-02-18 17:15

    compared with fully-connected neural networks, convolutional ones perform better in image recognition due to the different connection structure between adjacent layers: each node connects only to a small block of the previous layer, and the filter weights are shared across positions
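
    to make this concrete, a back-of-the-envelope comparison (the 500-node hidden layer and the 5×5×16 filter are illustrative assumptions, not from the original):

    # hypothetical 32x32x3 input image
    in_nodes = 32 * 32 * 3              # 3072 input values when flattened
    fc_params = in_nodes * 500 + 500    # fully-connected layer with 500 hidden nodes
    conv_params = 5 * 5 * 3 * 16 + 16   # one 5x5 conv layer with 16 filters, weights shared
    print(fc_params)    # 1536500
    print(conv_params)  # 1216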

    structure:

    input: the raw pixels of the image, stored as a 3-D matrix
    output: confidence scores for every class

    • input layer: the pixel matrix, whose depth is given by the RGB channels
    • convolutional layer: each node takes a small block of the previous layer as input, extracting deeper features
    • pooling layer: does not change the depth of the previous layer but shrinks its spatial scale
    • fully-connected layer: classifies based on the extracted features
    • softmax layer: turns those scores into a probability for every class

    convolutional layer (filter or kernel)

    the processed block has the same spatial scale as the filter
    the filter depth (fil_depth) is the number of filters, which becomes the depth of the output
    output matrix scale (with padding='VALID'):
    out_length = ceil((in_length - fil_length + 1) / stride_length)
    out_width = ceil((in_width - fil_width + 1) / stride_width)
    filter_parameter_amount = fil_length × fil_width × in_depth × fil_depth weights, plus fil_depth biases
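
    a minimal sketch of the size formula in plain Python (the function name and example numbers are mine; 'SAME' padding, not shown above, pads with zeros so that out = ceil(in / stride)):

    import math

    def conv_output_size(in_size, fil_size, stride, padding):
        # 'SAME' pads with zeros, so only the stride shrinks the map;
        # 'VALID' uses no padding, so the filter must fit inside the input
        if padding == 'SAME':
            return math.ceil(in_size / stride)
        return math.ceil((in_size - fil_size + 1) / stride)

    print(conv_output_size(32, 5, 1, 'VALID'))  # 28, as in LeNet-5's first layer
    print(conv_output_size(28, 2, 2, 'SAME'))   # 14, as in its first pooling layer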

    fil_weights=tf.get_variable('weights',[fil_length,fil_width,in_depth,fil_depth],initializer=tf...)
    biases=tf.get_variable('biases',[fil_depth],initializer=tf...)
    # conv2d implements the forward propagation of the convolutional layer
    conv=tf.nn.conv2d(input,fil_weights,strides=[1,len_stride,wid_stride,1],padding='SAME') # 'VALID' means no zero-padding
    # bias_add adds the bias to every position of the feature map; do not add the tensors directly
    bias=tf.nn.bias_add(conv,biases)
    activation=tf.nn.relu(bias)
    

    pooling layer

    usage: shrink the spatial scale of the feature map to speed up computation and reduce over-fitting

    max pooling

    takes the maximum value within each block

    # as with the convolutional op you set strides and padding; unlike a conv
    # filter, which spans the whole input depth, the pooling filter acts on one
    # depth slice at a time, so it also moves along the depth dimension.
    # ksize is the scale of the filter
    pool=tf.nn.max_pool(activation,ksize=[1,fil_len,fil_wid,1],strides=[1,len_stride,wid_stride,1],padding='SAME')
    

    average pooling
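
    averages the values within each block instead of taking the maximum; a minimal sketch, reusing the names from the max-pooling example above:

    pool=tf.nn.avg_pool(activation,ksize=[1,fil_len,fil_wid,1],strides=[1,len_stride,wid_stride,1],padding='SAME')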

    classical models

    LeNet-5
    • first layer: convolutional layer
      input: 32×32×1
      filter: 5×5, depth 6, padding='VALID'
      stride: [1,1,1,1]
      output: 28×28×6
    • second layer: pooling layer
      input: output of the convolutional layer
      filter: [1,2,2,1]
      stride: [1,2,2,1]
      output: 14×14×6
    • third layer: convolutional layer
      input: output of the last layer
      filter: 5×5, depth 16, padding='VALID'
      stride: [1,1,1,1]
      output: 10×10×16
    • fourth layer: pooling layer
      input: output of the last layer
      filter: 2×2
      stride: [1,2,2,1]
      output: 5×5×16
    • fifth layer: fully-connected layer (implemented like a convolutional layer)
      input: output of the last layer
      filter: 5×5
      output: 120
      para.: 5×5×16×120+120
    • sixth layer: fully-connected layer
      input: output of the last layer
      output: 84
      para.: 120×84+84
    • seventh layer: fully-connected layer
      input: output of the last layer
      output: 10
      para.: 84×10+10
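
    a quick sanity check of the sizes and parameter counts above, using the output-size formula from the convolutional-layer section (plain arithmetic):

    # layer 1: 32x32 input, 5x5 filter, stride 1, 'VALID' -> (32-5+1)/1 = 28
    assert (32 - 5 + 1) // 1 == 28
    # layer 2: 2x2 pooling with stride 2 halves the scale -> 28/2 = 14
    assert 28 // 2 == 14
    # layer 3: (14-5+1)/1 = 10; layer 4: 10/2 = 5
    assert (14 - 5 + 1) // 1 == 10 and 10 // 2 == 5
    # fully-connected parameter counts
    assert 5 * 5 * 16 * 120 + 120 == 48120
    assert 120 * 84 + 84 == 10164
    assert 84 * 10 + 10 == 850
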
    xs=tf.placeholder(tf.float32,[Batch_size,mnist_inference.IMAGE_SIZE,mnist_inference.IMAGE_SIZE,mnist_inference.NUM_CHANNELS],name='x-input')
    reshaped_xs=np.reshape(xs,(Batch_size,mnist_inference.IMAGE_SIZE,mnist_inference.IMAGE_SIZE,mnist_inference.NUM_CHANNELS))
    def inference(tensor,train,regularizer):
         with tf.variable_scope('layer1-conv1'):
                conv1_weights=tf.get_variable('weights',[CONV1_SIZE,CONV1_SIZE,NUM_CHANNELS,CONV1_DEEP],initializer=...)
                conv1_biases=tf...
                conv1=tf.nn.conv2d(...)
                relu1=tf.nn.relu(tf.nn.bias_add(...))
         with tf.name_scope('layer2-pool1'):
                pool1=tf.nn.max_pool(relu1,ksize=...,strides=...,padding=...)
         with tf.variable_scope('layer3-conv2'):
                ...
         with tf.name_scope(...):
                pool2=...
         # flatten the pooled feature map so the fully-connected layer can consume it
         pool_shape=pool2.get_shape().as_list()
         # pool_shape[0] is the batch size; the rest is length, width and depth
         nodes=pool_shape[1]*pool_shape[2]*pool_shape[3]
         reshaped=tf.reshape(pool2,[pool_shape[0],nodes])
         with tf.variable_scope('layer5-fc1'):
                fc1_weights=tf.get_variable('weights',[nodes,FC_SIZE],initializer=...)
                # only fully-connected weights are regularized
                if regularizer!=None:
                    tf.add_to_collection('losses',regularizer(fc1_weights))
                fc1_biases=tf.get_variable('bias',[FC_SIZE],initializer=...)
                fc1=tf.nn.relu(...)
                # dropout is only applied during training
                if train:fc1=tf.nn.dropout(fc1,0.5)
         with ...
                ...
                logit=tf.matmul(fc1,fc2_weights)+fc2_biases
         return logit
    

    note: input -> (convolutional+ -> pooling?) -> fully-connected -> softmax -> output, where '+' means one or more layers and '?' means the pooling layer is optional

    Inception-v3

    core method
    in a convolutional layer, three kernels of different sizes process the same input in parallel, and their outputs are concatenated along the depth dimension
    for that, we set the stride to 1 and the padding to 'SAME', so every path keeps the same spatial scale and the outputs can be concatenated

    # preset default parameters for these slim methods
    with slim.arg_scope([slim.conv2d,slim.max_pool2d,slim.avg_pool2d],stride=1,padding='SAME'):
           # namespace of one inception module
           with tf.variable_scope('...'):
                  # one variable scope per parallel path
                  with tf.variable_scope('...1'):
                  with tf.variable_scope('...2'):
                  with tf.variable_scope('...3'):
           # concatenate the paths along the depth dimension (axis 3)
           net=tf.concat(3,[...1,...2,...3])
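
    to make the skeleton concrete, a minimal sketch of one such module, assuming TensorFlow 1.x with tf.contrib.slim; the scope names and filter depths are illustrative, not the actual Inception-v3 configuration:

    import tensorflow as tf
    import tensorflow.contrib.slim as slim

    def inception_module(net):
        # net: a 4-D feature map [batch, height, width, depth]
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                            stride=1, padding='SAME'):
            with tf.variable_scope('Mixed_example'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_2 = slim.conv2d(branch_2, 32, [1, 1], scope='Conv2d_0b_1x1')
                # every path kept the input's spatial scale, so the outputs can be
                # concatenated along depth (TF >= 1.0 argument order; the skeleton
                # above uses the older tf.concat(3, [...]) form)
                return tf.concat([branch_0, branch_1, branch_2], 3)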
    
