Training Your Own Faster RCNN


Author: 尼诺阳 | Published: 2018-12-13 22:00

    Dependencies installation

    pip install scipy pillow matplotlib pyyaml easydict opencv-python
    

    This repo

    https://github.com/smallcorgi/Faster-RCNN_TF
    

    allows us to train our own Faster-RCNN. To train the network, follow the instructions in the README file of the repo above until you are able to train on the VOC dataset.

    During that process, you may encounter the following problems.

    1. When you execute

       python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
      

      you might get the following error message

       tensorflow.python.framework.errors_impl.NotFoundError: /home/neno/workspace/OCR/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
      

      To solve this problem, replace $REPO/lib/make.sh with the following content and run

       python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
      

      again. (P.S. There is no need to execute make again after you have modified the make.sh)

       #!/usr/bin/env bash
       TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
       TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
       echo $TF_INC
      
       CUDA_PATH=/usr/local/cuda/
      
       cd roi_pooling_layer
      
       nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
           -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52
      
       ## if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
       #g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
       #   roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
      
       # for gcc5-built tf
       g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc -D_GLIBCXX_USE_CXX11_ABI=0 \
           roi_pooling_op.cu.o -I $TF_INC -L $TF_LIB -ltensorflow_framework -D GOOGLE_CUDA=1 \
           -fPIC $CXXFLAGS -lcudart -L $CUDA_PATH/lib64
      
       cd ..
      
      
       # add building psroi_pooling layer
       cd psroi_pooling_layer
       nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
           -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52
      
       g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
           psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
      
       ## if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
       #g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o psroi_pooling.so psroi_pooling_op.cc \
       #   psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
      
       cd ..
      
    2. When you execute

       python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
      

      you might get the following error message

       tensorflow/stream_executor/cuda/cuda_dnn.cc:378] Loaded runtime CuDNN library: 7104 
       (compatibility version 7100) but source was compiled with 7004 (compatibility version 7000).  If 
       using a binary install, upgrade your CuDNN library to match.  If building from sources, make sure 
       the library loaded at runtime matches a compatible version specified during compile configuration.
      

      To solve this problem, uninstall cuDNN 7.1.4 and install cuDNN 7.0.5 instead.
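
    The four-digit numbers in the message above pack a version into one integer. Here is a minimal sketch for reading them, assuming the cuDNN 7 encoding major * 1000 + minor * 100 + patch; the helper name decode_cudnn_version is made up for this example.

```python
def decode_cudnn_version(code):
    """Split an integer such as 7104 into (major, minor, patch).

    Assumes the cuDNN 7 encoding: major * 1000 + minor * 100 + patch.
    """
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


loaded = decode_cudnn_version(7104)    # library found at runtime
compiled = decode_cudnn_version(7004)  # library TensorFlow was built against
print(loaded, compiled)
```

    Read this way, the runtime library is 7.1.x while TensorFlow was built against 7.0.x, which is why switching back to a 7.0 release removes the mismatch.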


    Now, we are going to build our own dataset. First of all, you've got to prepare the images you want to train on, and then we make them look "nice" so that our network can be trained efficiently.

    The author of this blog provides three scripts to resize the images, change the filenames and generate the index files. Here is a merged and improved version I wrote on the basis of the provided scripts. The script below first resizes the images and saves them to the output directory, then converts them all to jpg, and finally renames them in VOC style.

    import os
    import sys
    
    import cv2
    
    path = sys.argv[1]
    output_dir = sys.argv[2]
    
    ticket_width = 300  # target width in pixels; the height keeps the aspect ratio
    
    output_dir_image = os.path.join(output_dir, 'images')
    output_dir_label = os.path.join(output_dir, 'labels')
    output_dir_index = os.path.join(output_dir, 'indexs')
    for d in (output_dir_image, output_dir_label, output_dir_index):
        if not os.path.exists(d):
            os.makedirs(d)
    
    print('resizing raw images...')
    
    for pic in sorted(os.listdir(path)):
        im = cv2.imread(os.path.join(path, pic))
        if im is None:  # not a readable image, skip it
            continue
        h, w = im.shape[:2]
        ratio = float(ticket_width) / w
        h_new = int(ratio * h)
        im = cv2.resize(im, (ticket_width, h_new))
        # saving with a .jpg extension converts every image to JPEG
        new_path = os.path.join(output_dir_image, os.path.splitext(pic)[0] + '.jpg')
        cv2.imwrite(new_path, im)
    
    print('renaming...')
    
    i = 10000
    for item in sorted(os.listdir(output_dir_image)):
        if item.endswith('.jpg'):
            src = os.path.join(output_dir_image, item)
            # VOC style: six digits, zero-padded, e.g. 010000.jpg
            dst = os.path.join(output_dir_image, '%06d.jpg' % i)
            try:
                os.rename(src, dst)
                i = i + 1
            except OSError:
                continue
    
    print('finished')
    

    It takes two arguments to run.

    python script.py $IMAGES_DIR $OUTPUT_DIR
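
    To make the resizing rule concrete, here is the same arithmetic as a stand-alone function. This is only a sketch: the function name resized_shape and the sample sizes are made up for illustration, and the default width of 300 mirrors ticket_width in the script.

```python
def resized_shape(w, h, target_width=300):
    """Return (new_w, new_h) for a fixed target width, keeping the aspect ratio."""
    ratio = float(target_width) / w
    return target_width, int(ratio * h)


# A 1200x800 image is scaled by 0.25 in both directions.
print(resized_shape(1200, 800))
```

    Note that cv2.resize expects the size as (width, height), while im.shape is ordered (height, width, channels).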
    

    Well done. Now you have completed all the operations that you need to do with the raw images. Here comes a heavier task: labeling the images. This tool provides utilities, but labeling still takes a lot of time. WHEN LABELING, PLEASE USE LOWER-CASE LETTERS FOR ALL LABELS. UPPER-CASE LABELS WOULD LEAD TO PROGRAM ERRORS.
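
    Since upper-case labels are easy to miss, a quick check over the annotation files can help. The sketch below assumes that the labeling tool writes Pascal VOC style XML, with each label stored in an object/name element; the function name and the sample annotation are made up for this example.

```python
import xml.etree.ElementTree as ET


def uppercase_labels(xml_text):
    """Return the object names in one annotation that are not all lower-case."""
    root = ET.fromstring(xml_text)
    names = [obj.findtext('name', default='') for obj in root.iter('object')]
    return [n for n in names if n != n.lower()]


# Hypothetical annotation with one good and one bad label.
sample = """<annotation>
  <size><width>300</width><height>200</height></size>
  <object><name>ticket</name></object>
  <object><name>Stamp</name></object>
</annotation>"""
print(uppercase_labels(sample))
```

    Running it over every file in the label directory before training should list the labels that still need to be fixed.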

    The last step is to produce the txt index files. This script generates them automatically.

    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import os
    import random
    import sys
    
    trainval_percent = 0.8  # tunable: share of all samples that goes to trainval
    train_percent = 0.7  # tunable: share of trainval that goes to train
    xmlfilepath = sys.argv[1]
    txtsavepath = sys.argv[2]
    total_xml = os.listdir(xmlfilepath)
    
    num = len(total_xml)
    indices = list(range(num))
    tv = int(num * trainval_percent)
    tr = int(tv * train_percent)
    trainval = random.sample(indices, tv)
    train = set(random.sample(trainval, tr))
    trainval = set(trainval)  # sets keep the membership tests below fast
    
    # 'w' instead of 'a': re-running the script must not append duplicate entries
    ftrainval = open(os.path.join(txtsavepath, 'trainval.txt'), 'w')
    ftest = open(os.path.join(txtsavepath, 'test.txt'), 'w')
    ftrain = open(os.path.join(txtsavepath, 'train.txt'), 'w')
    fval = open(os.path.join(txtsavepath, 'val.txt'), 'w')
    
    for i in indices:
        name = total_xml[i][:-4] + '\n'  # strip the '.xml' extension
        if i in trainval:
            ftrainval.write(name)
            if i in train:
                ftrain.write(name)
            else:
                fval.write(name)
        else:
            ftest.write(name)
    
    ftrainval.close()
    ftrain.close()
    fval.close()
    ftest.close()
    

    To use it, pass the directory where the labels are stored and the directory where the index files should be placed.

    python script.py $LABEL_DIR $OUTPUT_DIR
    

    And you would get the following four text files

    $OUTPUT_DIR/trainval.txt
    $OUTPUT_DIR/test.txt
    $OUTPUT_DIR/train.txt
    $OUTPUT_DIR/val.txt
    

    These files tell the training code which samples of the dataset to load.
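
    For a feeling of how many samples end up in each file, the split arithmetic can be reproduced on its own. A small sketch, with a made-up function name and the two percentages from the script as defaults:

```python
def split_sizes(num, trainval_percent=0.8, train_percent=0.7):
    """Return how many samples land in each index file."""
    tv = int(num * trainval_percent)  # trainval = train + val
    tr = int(tv * train_percent)      # train is taken out of trainval
    return {'trainval': tv, 'train': tr, 'val': tv - tr, 'test': num - tv}


print(split_sizes(1000))
```

    With 1000 labeled images this gives 800 for trainval (560 train plus 240 val) and 200 for test. Because both products are truncated with int(), small datasets can come out one sample off from the nominal percentages.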


    If you have reached this line, you are already very close to training your own FRCNN. Now we are going to replace the VOC dataset with our prepared data.

    Put VOC images and labels into trash bin

    rm -rf $VOC_DATA_DIR/VOCdevkit/VOC2007/JPEGImages/*
    rm -rf $VOC_DATA_DIR/VOCdevkit/VOC2007/Annotations/*
    

    and bring our data under the spotlight.

    cp $IMAGE_DIR/* $VOC_DATA_DIR/VOCdevkit/VOC2007/JPEGImages/
    cp $LABEL_DIR/* $VOC_DATA_DIR/VOCdevkit/VOC2007/Annotations/
    cp $INDEX_DIR/* $VOC_DATA_DIR/VOCdevkit/VOC2007/ImageSets/Main/
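
    After copying, it can be worth checking that every name in an index file really has both an image and a label. The following is a sketch under two assumptions: images end in .jpg and labels end in .xml, and the directory layout is the VOC2007 one used in the commands above. The function name missing_files is made up.

```python
import os


def missing_files(voc_dir, split='trainval'):
    """List index entries that lack an image or an annotation file."""
    index_file = os.path.join(voc_dir, 'ImageSets', 'Main', split + '.txt')
    with open(index_file) as f:
        names = [line.strip() for line in f if line.strip()]
    missing = []
    for name in names:
        jpg = os.path.join(voc_dir, 'JPEGImages', name + '.jpg')
        xml = os.path.join(voc_dir, 'Annotations', name + '.xml')
        if not (os.path.isfile(jpg) and os.path.isfile(xml)):
            missing.append(name)
    return missing
```

    Call it with the path to $VOC_DATA_DIR/VOCdevkit/VOC2007; an empty list means the split is complete.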
    

    Now the very last step is to modify the source code. Four changes have to be made.

    1. $FRCNN_DIR/lib/datasets/pascal_voc.py

      Find variable _classes

       self._classes = ('__background__', # always index 0
                        'aeroplane', 'bicycle', 'bird', 'boat',
                        'bottle', 'bus', 'car', 'cat', 'chair',
                        'cow', 'diningtable', 'dog', 'horse',
                        'motorbike', 'person', 'pottedplant',
                        'sheep', 'sofa', 'train', 'tvmonitor')
      

      and append your own classes at the tail.

       self._classes = ('__background__', # always index 0
                        'aeroplane', 'bicycle', 'bird', 'boat',
                        'bottle', 'bus', 'car', 'cat', 'chair',
                        'cow', 'diningtable', 'dog', 'horse',
                        'motorbike', 'person', 'pottedplant',
                        'sheep', 'sofa', 'train', 'tvmonitor', 'class1', 'class2')
      
    2. $FRCNN_DIR/lib/networks/VGGnet_train.py

      This line sets the total number of classes

       n_classes = 21
      

      If your own data has n classes to recognize, increase its value by n.

       # Two classes to recognize for example
       n_classes = 23
      
    3. $FRCNN_DIR/lib/networks/VGGnet_test.py

      Same as 2

    4. $FRCNN_DIR/tools/demo.py

      Find the variable CLASSES and append your own classes at the tail, the same way as for _classes in pascal_voc.py.
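
    The four edits have to agree with each other: n_classes must equal the length of the class tuple, background included. A small sketch of that bookkeeping, where class1 and class2 are the placeholder names from the example above and the variable names are only for illustration:

```python
VOC_CLASSES = ('__background__',  # always index 0
               'aeroplane', 'bicycle', 'bird', 'boat',
               'bottle', 'bus', 'car', 'cat', 'chair',
               'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant',
               'sheep', 'sofa', 'train', 'tvmonitor')
MY_CLASSES = ('class1', 'class2')  # placeholder names, replace with your labels

CLASSES = VOC_CLASSES + MY_CLASSES
n_classes = len(CLASSES)  # the value to put into VGGnet_train.py and VGGnet_test.py
print(n_classes)
```

    If the tuple and n_classes drift apart, the output layers no longer match the labels, so it is worth recounting after every change.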


    Now, we are ready to train. Use this command to start training

    ./experiments/scripts/faster_rcnn_end2end.sh gpu 0 VGG16 pascal_voc
    

    Here are the two errors I encountered:

    1.  Traceback (most recent call last):
         File "./tools/train_net.py", line 83, in <module>
           roidb = get_training_roidb(imdb)
         File "/home/yinqsh/Ningyuan/Faster-RCNN_TF/tools/../lib/fast_rcnn/train.py", line 204, in get_training_roidb
           imdb.append_flipped_images()
         File "/home/yinqsh/Ningyuan/Faster-RCNN_TF/tools/../lib/datasets/imdb.py", line 113, in append_flipped_images
           assert (boxes[:, 2] >= boxes[:, 0]).all()
      

    I googled it, and it turns out that this error is probably caused by some illegal bounding boxes. These boxes extend past the image boundaries and therefore make the assertion fail. One solution is to delete all cache files, so that stale data from an earlier dataset or model does not get mixed in.

    rm -rf $FRCNN/output
    rm -rf $FRCNN/data/cache
    rm -rf $FRCNN/VOCdevkit2007/annotations_cache  # if this directory exists
    

    Another solution is to modify append_flipped_images() method in $FRCNN/lib/datasets/imdb.py. Find this line of code

    boxes[:, 2] = widths[i] - oldx1 - 1
    

    and add the following lines of code right below

    boxes[:, 2] = widths[i] - oldx1 - 1
    # ------------------TO-ADD-PART------------------
    for b in range(len(boxes)):
        if boxes[b][2]< boxes[b][0]:
            boxes[b][0] = 0
    # ------------------TO-ADD-PART------------------
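
    To see why a box outside the image breaks the flip, the arithmetic can be tried on plain numbers. This is a sketch with two assumptions: the new x1 is computed as width - x2 - 1, mirroring the x2 line quoted above, and the boxes are stored in an unsigned 16-bit array, so negative values wrap around. The function names are made up.

```python
def flip_box_x(x1, x2, width):
    """Mirror a box horizontally; returns the new (x1, x2)."""
    return width - x2 - 1, width - x1 - 1


def as_uint16(value):
    """Emulate storing an integer in an unsigned 16-bit array (assumption)."""
    return value % 65536


good = flip_box_x(10, 50, 300)   # box inside a 300 px wide image
bad = flip_box_x(10, 305, 300)   # x2 lies beyond the right edge
print(good, bad, as_uint16(bad[0]))
```

    The valid box stays ordered after the flip, while the out-of-range box gets a negative x1; once that wraps to a large positive number, x2 >= x1 no longer holds and the assertion fires.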
    
    2. KeyError: 'max_overlaps'
      

    Solution: delete the caches

    rm -rf $FRCNN/output
    rm -rf $FRCNN/data/cache
    rm -rf $FRCNN/VOCdevkit2007/annotations_cache  # if this directory exists
    

    After training, you will find the trained model in

    $FRCNN_DIR/output/faster_rcnn_end2end/voc_2007_trainval/
    

    By default, the model is saved every 5000 iterations. We are going to use the model that was trained for the most iterations. There are three files for that model

    model.ckpt.meta
    model.ckpt.data-00000-xx-00000
    model.ckpt.index
    

    Make a copy of the model.ckpt.data file in the same folder and remove the suffix .data-00000-xx-00000 from the name of that copy.
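
    Picking the newest checkpoint and making that copy can also be scripted. The sketch below assumes checkpoint names of the form ..._iter_N.ckpt, as in the demo command in this post; both function names are made up for illustration.

```python
import glob
import os
import re
import shutil


def latest_checkpoint_prefix(model_dir):
    """Return the .ckpt prefix with the highest iteration number, or None."""
    best = None
    for meta in glob.glob(os.path.join(model_dir, '*.ckpt.meta')):
        m = re.search(r'_iter_(\d+)\.ckpt\.meta$', meta)
        if m and (best is None or int(m.group(1)) > best[0]):
            best = (int(m.group(1)), meta[:-len('.meta')])
    return best[1] if best else None


def copy_data_file(prefix):
    """Copy '<prefix>.data-*' to '<prefix>', as described in the text."""
    data_files = sorted(glob.glob(prefix + '.data-*'))
    if not data_files:
        raise IOError('no data file found for ' + prefix)
    shutil.copyfile(data_files[0], prefix)
    return prefix
```

    Calling copy_data_file(latest_checkpoint_prefix(model_dir)) on the output folder leaves a plain .ckpt file next to the three original files.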

    Finally, we are ready to test the power of FRCNN. Put some test images into

    $FRCNN/data/demo
    

    and run

    python $FRCNN/tools/demo.py --model $FRCNN/output/faster_rcnn_end2end/voc_2007_trainval/VGGnet_fast_rcnn_iter_10000.ckpt
    

    BANG! BANG! BANG!


    Bugs encountered when moving the code from Python 2 to Python 3

    1. When you execute

       python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
      

      you might get the following error message

      ImportError: /media/neno/44B0AB27B0AB1F04/Faster RCNN/Faster-RCNN_TF/tools/../lib/utils/cython_bbox.so: undefined symbol: _Py_ZeroStruct
      

      To solve this problem, go to

       $FRCNN_DIR/lib
      

      and run

       python3 setup.py build_ext --inplace
      
    2. When you execute

       python ./tools/demo.py --model ./VGGnet_fast_rcnn_iter_70000.ckpt
      

      you might get the following error message

      ImportError: No module named 'cPickle'
      

      To solve this problem, change cPickle to pickle.
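
    If the same code still has to run under both Python versions, a guarded import is a common alternative to editing every import by hand. A minimal sketch (the sample data is made up):

```python
try:
    import cPickle as pickle  # Python 2: the fast C implementation
except ImportError:
    import pickle  # Python 3: there is no separate cPickle module

# The rest of the code can keep calling pickle.dump / pickle.load as before.
data = {'boxes': [1, 2, 3]}
restored = pickle.loads(pickle.dumps(data))
print(restored == data)
```

    Cache files written by Python 2 may still fail to load under Python 3, so deleting the old caches after switching is the safer route.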

    3. After upgrading pip3, pip3 crashes.

       Traceback (most recent call last):
         File "/usr/bin/pip3", line 9, in <module>
           from pip import main
       ImportError: cannot import name 'main'
      

    To solve this problem, open /usr/bin/pip3 and change the following code

        from pip import main
        if __name__ == '__main__':
            sys.exit(main())
    

    to

        from pip import __main__
        if __name__ == '__main__':
            sys.exit(__main__._main())
    
    4. When you try to run $FRCNN/lib/make.sh, you get the following error

       fatal error: nsync_cv.h: No such file or directory
       #include "nsync_cv.h"
      

    To solve this problem, open the file that causes this error and change its two nsync #include lines to

      #include "external/nsync/public/nsync_cv.h"
      #include "external/nsync/public/nsync_mu.h"
    
    5. When you try to run demo.py, you get the following error

       cudaCheckError() : no kernel image is available for execution on the device
      

    To solve this problem, go to the $FRCNN/lib/make.sh file and check the extra options for the nvcc compiler. There should be an option called arch which specifies the compute architecture of the Nvidia card. Check the compute capability of your card and change the arch option accordingly. In my case, I am using a Tesla K80, which has compute capability 3.7. I used to compile with -arch=sm_52, which caused this error. Then I changed it to -arch=sm_35 and things go really well now.
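
    The rule behind this fix is that GPU binary code built for sm_XY only runs on cards with the same major version X and a minor version of at least Y. Here is a simplified sketch of that check; it ignores PTX just-in-time compilation, and the function name is made up.

```python
def cubin_runs_on(compiled_arch, device_capability):
    """Check whether binary code built with -arch=sm_XY can run on a device.

    compiled_arch is a string such as 'sm_52', device_capability one such as '3.7'.
    """
    digits = compiled_arch.split('_')[1]
    c_major, c_minor = int(digits[:-1]), int(digits[-1])
    d_major, d_minor = (int(p) for p in device_capability.split('.'))
    return c_major == d_major and c_minor <= d_minor


print(cubin_runs_on('sm_52', '3.7'))  # the build that failed on the K80
print(cubin_runs_on('sm_35', '3.7'))  # the build that works
```

    By the same rule, -arch=sm_37 should also fit a 3.7 card.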


    P.S.

    By default, the network loads its parameters from a pre-trained VGG16 model. However, it might not perform well on the test set in practice. Instead, training from scratch gives a relatively better prediction.

    References:

    https://blog.csdn.net/zcy0xy/article/details/79614862
