Getting Started with Caffe

Author: 宇宙独一无二的我 | Published 2016-05-21 21:49

    1. First of all, installing Caffe is fairly involved; I will write a detailed installation tutorial later when I have time.
    For now, here is the official installation guide: http://caffe.berkeleyvision.org/installation.html

    2. Once Caffe is installed, carefully read and run the official examples end to end. Links:
    1).http://caffe.berkeleyvision.org/gathered/examples/mnist.html
    2).http://caffe.berkeleyvision.org/gathered/examples/cifar10.html
    ……

    3. After that, study the following examples of driving Caffe from Python to get familiar with the pycaffe workflow:
    1). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/01-learning-lenet.ipynb
    2). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
    3).http://www.cnblogs.com/empty16/p/4878164.html
    ……

    4. Below, we use Caffe to train and test a face recognition model on the following dataset:
    http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

    It contains 40 subjects with 10 face images per subject, as shown below:

    (Image: the att_faces database)

    The official site links to the Model Zoo, and a quick search there shows that a pre-trained face recognition model is already available and can be used directly.
    Download:
    http://www.robots.ox.ac.uk/%7Evgg/software/vgg_face/src/vgg_face_caffe.tar.gz

    The VGG Face Descriptor page provides both the model and example source code; refer to its documentation for details. The basic workflow is straightforward:

    • In the example source, set the path to your Caffe installation, specify the .caffemodel, and specify the input data; then call the network's test/forward routine and read the output (see the pycaffe sketch below).
    • Run the script.

    If the usage notes alone are not clear enough, the Jupyter Notebook Viewer examples linked above are a good reference; the overall flow is the same as the ImageNet classification task. The dataset the model was trained on is described in Section 3 of the VGG Face Descriptor paper.
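
    As a concrete illustration of the two steps above, here is a minimal pycaffe sketch that loads the downloaded model and runs one image through it. It is only a sketch under my own assumptions: the paths and the input file name are placeholders, and the actual input size and output blob are defined by the VGG_FACE_deploy.prototxt shipped in the archive.

    import caffe

    # Assumed paths into the extracted vgg_face_caffe.tar.gz archive.
    model_def = 'vgg_face_caffe/VGG_FACE_deploy.prototxt'
    model_weights = 'vgg_face_caffe/VGG_FACE.caffemodel'

    caffe.set_mode_cpu()                        # or caffe.set_mode_gpu()
    net = caffe.Net(model_def, model_weights, caffe.TEST)

    # Standard pycaffe preprocessing: HxWxC float image in [0,1] -> CxHxW,
    # RGB -> BGR, scaled back to the 0-255 range Caffe models expect.
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))
    transformer.set_channel_swap('data', (2, 1, 0))
    transformer.set_raw_scale('data', 255)

    img = caffe.io.load_image('some_face.jpg')          # placeholder input image
    net.blobs['data'].data[...] = transformer.preprocess('data', img)
    out = net.forward()
    prob = out[net.outputs[0]]                           # class scores of the final layer
    print('top prediction:', prob.argmax())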

    Because the face images are grayscale, they first have to be converted to RGB with OpenCV before they can be fed into VGG. The Python code is as follows:

    import os

    import cv2


    def convert_gray_img_to_rgb(base_dir, dir_pre_str, dir_range_list,
                                dir_post_str, file_format, partion_list):
        # Walk the train/tst subdirectories (s1 .. s40) and convert every
        # grayscale .pgm image into a 3-channel .jpg alongside it.
        for i in dir_range_list:
            for index, partion_list_part in enumerate(partion_list):
                for k in partion_list_part:
                    if base_dir == "":
                        base_dir_str = ""
                    else:
                        base_dir_str = base_dir + os.sep
                    split = ""
                    if index == 0:
                        split = "train"
                    elif index == 1:
                        split = "tst"
                    file_input_path = base_dir_str + split + os.sep + dir_pre_str + \
                        str(i) + dir_post_str + os.sep + str(k) + file_format
                    # Read as grayscale, then replicate the channel so VGG gets 3-channel input.
                    img = cv2.imread(file_input_path, 0)
                    img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
                    out_file = base_dir_str + split + os.sep + dir_pre_str + \
                        str(i) + dir_post_str + os.sep + str(k) + ".jpg"
                    cv2.imwrite(out_file, img)


    if __name__ == '__main__':
        source_dir = "/Users/Ren/Downloads/att_faces_back"
        dir_pre_str = "s"
        dir_range_list = range(1, 41)            # subjects s1 .. s40
        test_partion_list = [7, 8, 9, 10]        # images 7-10 of each subject: test set
        train_partion_list = [1, 2, 3, 4, 5, 6]  # images 1-6: training set
        dir_post_str = ""
        file_format = ".pgm"
        convert_gray_img_to_rgb(source_dir, dir_pre_str, dir_range_list,
                                dir_post_str, file_format,
                                [train_partion_list, test_partion_list])

    For this database, the face images first need to be split into training and test sets and converted to LMDB format. For the general procedure, see: http://www.cnblogs.com/dupuleng/articles/4370236.html. My script is below, saved as examples/att_faces/create_att_faces.sh:

    #!/usr/bin/env sh
    # Create the att_faces lmdb inputs
    # N.B. set the path to the att_faces train + tst data dirs
    
    EXAMPLE=examples/att_faces
    DATA=data/att_faces
    TOOLS=build/tools
    DBTYPE=lmdb
    TRAIN_DATA_ROOT=$DATA/train/
    TEST_DATA_ROOT=$DATA/tst/
    ROOT=./
    # Set RESIZE=true to resize the images to 224x224. Leave as false if images have
    # already been resized using another tool.
    RESIZE=true
    if $RESIZE; then
      RESIZE_HEIGHT=224
      RESIZE_WIDTH=224
    else
      RESIZE_HEIGHT=0
      RESIZE_WIDTH=0
    fi
    
    if [ ! -d "$TRAIN_DATA_ROOT" ]; then
      echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
      echo "Set the TRAIN_DATA_ROOT variable in create_att_faces.sh to the path" \
           "where the ImageNet training data is stored."
      exit 1
    fi
    
    if [ ! -d "$TEST_DATA_ROOT" ]; then
      echo "Error: TEST_DATA_ROOT is not a path to a directory: $TEST_DATA_ROOT"
      echo "Set the TEST_DATA_ROOT variable in create_att_faces.sh to the path" \
           "where the ImageNet test data is stored."
      exit 1
    fi
    
    echo "Creating train lmdb..."
    rm -rf $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/att_faces_tst_$DBTYPE
    
    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $ROOT \
        $DATA/train.txt \
        $EXAMPLE/att_faces_train_$DBTYPE
    
    echo "Creating tst lmdb..."
    rm -f $EXAMPLE/mean.binaryproto
    GLOG_logtostderr=1 $TOOLS/convert_imageset \
        --resize_height=$RESIZE_HEIGHT \
        --resize_width=$RESIZE_WIDTH \
        --shuffle \
        $ROOT \
        $DATA/tst.txt \
        $EXAMPLE/att_faces_tst_$DBTYPE
    echo "Computing image mean..."
    ./build/tools/compute_image_mean -backend=$DBTYPE \
      $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/mean.binaryproto
    echo "Done."
    

    Next, build the network definition for this dataset by taking models/finetune_flickr_style/train_val.prototxt as the template and filling in the layer structure from vgg_face_caffe/VGG_FACE_deploy.prototxt. Concretely: add the two data input layers, change the output count of the last fully connected layer to 40 (one per subject), and update the old Caffe layer syntax. The resulting file is as follows:

    name: "VGG_FACE_16_Net"
    layer {
      name: "data"
      type: "ImageData"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        mirror: true
        crop_size: 224
        mean_file: "examples/att_faces/mean.binaryproto"
      }
      image_data_param {
        source: "data/att_faces/train.txt"
        batch_size: 1
        new_height: 224
        new_width: 224
      }
    }
    layer {
      name: "data"
      type: "ImageData"
      top: "data"
      top: "label"
      include {
        phase: TEST
      }
      transform_param {
        mirror: false
        crop_size: 224
        mean_file: "examples/att_faces/mean.binaryproto"
      }
      image_data_param {
        source: "data/att_faces/tst.txt"
        batch_size: 1
        new_height: 224
        new_width: 224
      }
    }
    layer {
      name: "conv1_1"
      type: "Convolution"
      bottom: "data"
      top: "conv1_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu1_1"
      type: "ReLU"
      bottom: "conv1_1"
      top: "conv1_1"
    }
    layer {
      name: "conv1_2"
      type: "Convolution"
      bottom: "conv1_1"
      top: "conv1_2"
      param {
        lr_mult: 1 
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 64
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      } 
    }
    layer {
      name: "relu1_2"
      type: "ReLU"
      bottom: "conv1_2"
      top: "conv1_2"
    }
    layer {
      name: "pool1"
      type: "Pooling"
      bottom: "conv1_2"
      top: "pool1"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv2_1"
      type: "Convolution"
      bottom: "pool1"
      top: "conv2_1"
      param {
        lr_mult: 1
        decay_mult: 1
      } 
      param {
        lr_mult: 2
        decay_mult: 0
      } 
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        } 
        bias_filler {
          type: "constant"
          value: 0
        } 
      } 
    }
    layer {
      name: "relu2_1"
      type: "ReLU"
      bottom: "conv2_1"
      top: "conv2_1"
    }
    layer { 
      name: "conv2_2"
      type: "Convolution"
      bottom: "conv2_1"
      top: "conv2_2"
      param {
        lr_mult: 1
        decay_mult: 1
      } 
      param {
        lr_mult: 2
        decay_mult: 0
      } 
      convolution_param {
        num_output: 128
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01 
        } 
        bias_filler {
          type: "constant"
          value: 0
        }
      } 
    }
    layer {
      name: "relu2_2"
      type: "ReLU"
      bottom: "conv2_2"
      top: "conv2_2"
    }
    layer {
      name: "pool2"
      type: "Pooling"
      bottom: "conv2_2"
      top: "pool2"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv3_1"
      type: "Convolution"
      bottom: "pool2"
      top: "conv3_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu3_1"
      type: "ReLU"
      bottom: "conv3_1"
      top: "conv3_1"
    }
    layer {
      name: "conv3_2"
      type: "Convolution"
      bottom: "conv3_1"
      top: "conv3_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu3_2"
      type: "ReLU"
      bottom: "conv3_2"
      top: "conv3_2"
    }
    layer {
      name: "conv3_3"
      type: "Convolution"
      bottom: "conv3_2"
      top: "conv3_3"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 256
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu3_3"
      type: "ReLU"
      bottom: "conv3_3"
      top: "conv3_3"
    }
    layer {
      name: "pool3"
      type: "Pooling"
      bottom: "conv3_3"
      top: "pool3"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv4_1"
      type: "Convolution"
      bottom: "pool3"
      top: "conv4_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu4_1"
      type: "ReLU"
      bottom: "conv4_1"
      top: "conv4_1"
    }
    layer {
      name: "conv4_2"
      type: "Convolution"
      bottom: "conv4_1"
      top: "conv4_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu4_2"
      type: "ReLU"
      bottom: "conv4_2"
      top: "conv4_2"
    }
    layer {
      name: "conv4_3"
      type: "Convolution"
      bottom: "conv4_2"
      top: "conv4_3"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu4_3"
      type: "ReLU"
      bottom: "conv4_3"
      top: "conv4_3"
    }
    layer {
      name: "pool4"
      type: "Pooling"
      bottom: "conv4_3"
      top: "pool4"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    layer {
      name: "conv5_1"
      type: "Convolution"
      bottom: "pool4"
      top: "conv5_1"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu5_1"
      type: "ReLU"
      bottom: "conv5_1"
      top: "conv5_1"
    }
    layer {
      name: "conv5_2"
      type: "Convolution"
      bottom: "conv5_1"
      top: "conv5_2"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu5_2"
      type: "ReLU"
      bottom: "conv5_2"
      top: "conv5_2"
    }
    layer {
      name: "conv5_3"
      type: "Convolution"
      bottom: "conv5_2"
      top: "conv5_3"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      convolution_param {
        num_output: 512
        kernel_size: 3
        pad: 1
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "relu5_3"
      type: "ReLU"
      bottom: "conv5_3"
      top: "conv5_3"
    }
    layer {
      name: "pool5"
      type: "Pooling"
      bottom: "conv5_3"
      top: "pool5"
      pooling_param {
        pool: MAX
        kernel_size: 2
        stride: 2
      }
    }
    
    layer {
      name: "fc6"
      type: "InnerProduct"
      bottom: "pool5"
      top: "fc6"
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
        weight_filler {
          type: "gaussian"
          std: 0.005
        }
        bias_filler {
          type: "constant"
          value: 1
        }
      }
    }
    layer {
      name: "relu6"
      type: "ReLU"
      bottom: "fc6"
      top: "fc6"
    }
    layer {
      name: "drop6"
      type: "Dropout"
      bottom: "fc6"
      top: "fc6"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc7"
      type: "InnerProduct"
      bottom: "fc6"
      top: "fc7"
      # Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
      param {
        lr_mult: 1
        decay_mult: 1
      }
      param {
        lr_mult: 2
        decay_mult: 0
      }
      inner_product_param {
        num_output: 4096
        weight_filler {
          type: "gaussian"
          std: 0.005
        }
        bias_filler {
          type: "constant"
          value: 1
        }
      }
    }
    layer {
      name: "relu7"
      type: "ReLU"
      bottom: "fc7"
      top: "fc7"
    }
    layer {
      name: "drop7"
      type: "Dropout"
      bottom: "fc7"
      top: "fc7"
      dropout_param {
        dropout_ratio: 0.5
      }
    }
    layer {
      name: "fc8_flickr"
      type: "InnerProduct"
      bottom: "fc7"
      top: "fc8_flickr"
      # this layer is learned from scratch: its name differs from the pretrained fc8, so its weights are
      # reinitialized, and propagate_down: false keeps gradients from flowing back into the pretrained layers
      propagate_down: false
      inner_product_param {
        num_output: 40
        weight_filler {
          type: "gaussian"
          std: 0.01
        }
        bias_filler {
          type: "constant"
          value: 0
        }
      }
    }
    layer {
      name: "accuracy"
      type: "Accuracy"
      bottom: "fc8_flickr"
      bottom: "label"
      top: "accuracy"
      include {
        phase: TEST
      }
    }
    layer {
      name: "loss"
      type: "SoftmaxWithLoss"
      bottom: "fc8_flickr"
      bottom: "label"
      top: "loss"
    }
    

    Copy models/finetune_flickr_style/solver.prototxt and adapt it to this problem. The main changes are shown below. (Note that with batch_size: 1 in the TEST data layer, test_iter: 100 only evaluates 100 of the 160 test images per test pass; setting it to 160 would cover the full test set.)

    net: "models/finetune/train_val.prototxt"
    test_iter: 100
    test_interval: 100
    # lr for fine-tuning should be lower than when starting from scratch
    base_lr: 0.001
    lr_policy: "step"
    gamma: 0.1
    # stepsize should also be lower, as we're closer to being done
    stepsize: 2000
    display: 20
    max_iter: 10000
    momentum: 0.9
    weight_decay: 0.0005
    snapshot: 1000
    snapshot_prefix: "models/finetune/finetune"
    # uncomment the following to default to CPU mode solving
    #solver_mode: CPU
    

    Finally, fine-tune the model on our own data with the following command:

    ./build/tools/caffe train -solver models/finetune/solver.prototxt -weights models/vgg_face_caffe/VGG_FACE.caffemodel -gpu 0
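
    Once training finishes (or a snapshot such as models/finetune/finetune_iter_10000.caffemodel has been written, following snapshot_prefix), the result can be checked from Python in the same way as before. The file names below are assumptions: a deploy.prototxt would have to be derived from the train_val.prototxt above by replacing the data layers with an input blob and the loss with a Softmax layer.

    import caffe

    model_def = 'models/finetune/deploy.prototxt'                     # assumed deploy file
    model_weights = 'models/finetune/finetune_iter_10000.caffemodel'  # assumed snapshot name

    caffe.set_mode_gpu()
    net = caffe.Net(model_def, model_weights, caffe.TEST)

    # Same standard preprocessing as for the original VGG Face model.
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_transpose('data', (2, 0, 1))
    transformer.set_channel_swap('data', (2, 1, 0))
    transformer.set_raw_scale('data', 255)

    img = caffe.io.load_image('data/att_faces/tst/s1/7.jpg')    # one held-out test image
    net.blobs['data'].data[...] = transformer.preprocess('data', img)
    prob = net.forward()[net.outputs[0]]
    print('predicted subject: s%d' % (prob.argmax() + 1))       # labels 0..39 map to s1..s40

    Alternatively, the caffe binary's built-in test command can evaluate accuracy directly against the TEST phase of train_val.prototxt using the same weights file.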
    
