Installing and Configuring faster-rcnn on Ubuntu 18.04

Author: 乘瓠散人 | Published: 2019-03-04 22:42

My setup: Ubuntu 18.04 + NVIDIA driver 410.78 + CUDA 10.0 + cuDNN 7.4.2

  1. Download py-faster-rcnn
    git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
  2. py-faster-rcnn uses the Caffe framework, so first install the packages Caffe depends on:
sudo apt-get install python-pip
sudo pip install cython  
sudo pip install easydict 
sudo apt-get install python-opencv

The following also need to be installed:

  • boost
    sudo apt-get install libboost-all-dev
  • protobuf
    sudo apt-get install libprotobuf-dev protobuf-c-compiler protobuf-compiler
  • glog
    sudo apt-get install libgoogle-glog-dev
  • gflags
    sudo apt-get install libgflags-dev
  • lmdb
    sudo apt-get install liblmdb-dev
  • leveldb
    sudo apt-get install libleveldb-dev
  • snappy
    sudo apt-get install libsnappy-dev
  • opencv
    sudo apt-get install libopencv-dev
  • BLAS
    sudo apt-get install libatlas-base-dev
  • hdf5.h header files
    sudo apt-get install libhdf5-\*
  3. Build py-faster-rcnn and Caffe
  • Build the Cython modules
    cd py-faster-rcnn/lib
    make
  • Build Caffe and pycaffe
    First enter the caffe-fast-rcnn directory:
    cd py-faster-rcnn/caffe-fast-rcnn
    Copy Makefile.config.example to Makefile.config:
    cp Makefile.config.example Makefile.config
    Edit Makefile.config and change the relevant settings to the following:
USE_CUDNN := 1
WITH_PYTHON_LAYER := 1
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial 
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

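Building at this point still fails. The Caffe bundled with faster-rcnn targets cuDNN v4 by default, so compiling it against a newer cuDNN produces errors about mismatched function arguments. Following the blog post https://blog.csdn.net/flygeda/article/details/78638824, download the latest Caffe source from https://github.com/BVLC/caffe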

Replace the corresponding files in caffe-fast-rcnn with the following files from the latest Caffe source:
include/caffe/layers/cudnn_relu_layer.hpp
src/caffe/layers/cudnn_relu_layer.cpp
src/caffe/layers/cudnn_relu_layer.cu
include/caffe/layers/cudnn_sigmoid_layer.hpp
src/caffe/layers/cudnn_sigmoid_layer.cpp
src/caffe/layers/cudnn_sigmoid_layer.cu
include/caffe/layers/cudnn_tanh_layer.hpp
src/caffe/layers/cudnn_tanh_layer.cpp
src/caffe/layers/cudnn_tanh_layer.cu

include/caffe/util/cudnn.hpp

In src/caffe/layers/cudnn_conv_layer.cu of caffe-fast-rcnn, replace every occurrence of these function names:
cudnnConvolutionBackwardData_v3 becomes cudnnConvolutionBackwardData
cudnnConvolutionBackwardFilter_v3 becomes cudnnConvolutionBackwardFilter
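
The two renames above can also be scripted. The following is a minimal Python sketch, not part of the original setup: the helper names `strip_v3_suffix` and `patch_file` are hypothetical, and it assumes the file contains the `_v3` names exactly as spelled above.

```python
from pathlib import Path

# Old cuDNN v3-era names mapped to the names used by newer cuDNN,
# exactly the two renames listed above.
RENAMES = {
    "cudnnConvolutionBackwardData_v3": "cudnnConvolutionBackwardData",
    "cudnnConvolutionBackwardFilter_v3": "cudnnConvolutionBackwardFilter",
}


def strip_v3_suffix(source: str) -> str:
    """Return the source text with both _v3 function names renamed."""
    for old, new in RENAMES.items():
        source = source.replace(old, new)
    return source


def patch_file(path: str) -> None:
    """Apply the renames to one file in place (hypothetical helper)."""
    file_path = Path(path)
    file_path.write_text(strip_v3_suffix(file_path.read_text()))
```

For example, `patch_file("py-faster-rcnn/caffe-fast-rcnn/src/caffe/layers/cudnn_conv_layer.cu")` would apply both renames in place; keeping a backup copy of the file first is a sensible precaution.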

Then build:

cd py-faster-rcnn/caffe-fast-rcnn
make -j8 && make pycaffe

The build then hits another error: nvcc fatal : Unsupported gpu architecture 'compute_20'. To fix it, remove these entries from the CUDA_ARCH setting in Makefile.config:
-gencode arch=compute_20,code=sm_20 \
-gencode arch=compute_20,code=sm_21 \
After that, the build completes.
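
Removing the two entries by hand is quick, but the edit can also be sketched in Python. This is an illustration under assumptions: the helper name `drop_compute_20` is hypothetical, and it assumes the entries appear in Makefile.config as `-gencode` options joined by backslash line continuations, with the first one sitting on the same line as `CUDA_ARCH :=` (which is why whole lines are not simply deleted).

```python
import re

# One compute_20 entry, plus its trailing line continuation and the
# indentation of the next line, so the remaining entries join up cleanly.
COMPUTE_20_ENTRY = re.compile(
    r"-gencode arch=compute_20,code=sm_2[01]\s*\\\n\s*"
)


def drop_compute_20(makefile_text: str) -> str:
    """Return Makefile.config text without the compute_20 gencode entries."""
    return COMPUTE_20_ENTRY.sub("", makefile_text)
```

Applied to the text of Makefile.config, this leaves `CUDA_ARCH :=` in place and followed directly by the first remaining `-gencode` entry.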

  4. Fetch the faster-rcnn models
    cd py-faster-rcnn
    ./data/scripts/fetch_faster_rcnn_models.sh
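    My server cannot get past the firewall, so I downloaded the archive on my local machine and uploaded it to the py-faster-rcnn/data directory on the server. The download URL is in fetch_faster_rcnn_models.sh.
    Then extract it: tar -xvf faster_rcnn_models.tgz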
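
For the manual route, the extraction can also be done from Python. This is a minimal sketch using the standard-library `tarfile` module; the function name `extract_models` is hypothetical, and it should only be used on an archive from a source you trust.

```python
import tarfile
from pathlib import Path


def extract_models(archive_path: str, dest_dir: str) -> list:
    """Extract a tar archive into dest_dir and return its member names."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect the compression (a .tgz is gzip-compressed).
    with tarfile.open(archive_path, "r:*") as archive:
        names = archive.getnames()
        archive.extractall(dest_dir)
    return names
```

Run from the py-faster-rcnn directory, `extract_models("data/faster_rcnn_models.tgz", "data")` would correspond to the tar command above.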

  5. Run the demo
    cd py-faster-rcnn
    sudo ./tools/demo.py

  • Error: ImportError: No module named skimage.io
    Fix: sudo apt-get install python-skimage
  • Error: ImportError: No module named google.protobuf.internal
    Fix: pip install protobuf
  • Error:
Cannot create Cublas handle. Cublas won't be available
... (several dozen lines omitted)
Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED

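I had previously installed CUDA 10.1 on this machine, which is not compatible with my 410.78 driver, and had never removed it. I uninstalled it and kept only CUDA 10.0: go to /usr/local/cuda-10.1/bin and run ./cuda_uninstaller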

  • Error:
...
in <module>
    from nms.gpu_nms import gpu_nms
ImportError: libcudart.so.10.1: cannot open shared object file: No such file or directory

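This happens because I had previously built with CUDA 10.1. After switching to CUDA 10.0, some files were not rebuilt and still depend on CUDA 10.1. Delete all *.so files in the subfolders of py-faster-rcnn/lib/, then run make again.
With that, the demo runs successfully :)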
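
The cleanup of stale extension modules can be sketched in Python as well. This is a minimal illustration, not the procedure used in the original setup: the function name `remove_stale_shared_objects` is hypothetical, and it assumes every `*.so` file below the lib directory is a build product that make will regenerate.

```python
from pathlib import Path


def remove_stale_shared_objects(lib_dir: str) -> list:
    """Delete every *.so file below lib_dir and return the deleted paths."""
    removed = []
    # Collect the matches first, then delete, so the walk is not disturbed.
    for so_file in sorted(Path(lib_dir).rglob("*.so")):
        so_file.unlink()
        removed.append(str(so_file))
    return removed
```

Calling `remove_stale_shared_objects("py-faster-rcnn/lib")` and then running make in that directory rebuilds the modules against the CUDA version currently installed.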

  6. Download the model parameters pre-trained on ImageNet (used to initialize the network parameters)
    cd py-faster-rcnn
    ./data/scripts/fetch_imagenet_models.sh
    If the download fails, use the same workaround as in step 4.

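  7. Create a symbolic link to the PASCAL VOC dataset so that the dataset can be shared across projects. $VOCdevkit is the directory where you downloaded the dataset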
    cd py-faster-rcnn/data
    ln -s $VOCdevkit VOCdevkit2007

  8. Train on the VOC dataset
    cd py-faster-rcnn
    ./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set...]
    ./experiments/scripts/faster_rcnn_alt_opt.sh 1 ZF pascal_voc
    This raises an error:

File "/home/zd/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 51, in __init__
     pb2.text_format.Merge(f.read(), self.solver_param)
AttributeError: 'module' object has no attribute 'text_format'

The fix is to add one line to py-faster-rcnn/lib/fast_rcnn/train.py:
import google.protobuf.text_format
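
A likely explanation for why this one import is enough, offered as an assumption rather than something verified against the protobuf package: the traceback shows `pb2` is a module, and in Python a submodule only becomes an attribute of its package after something has imported it. The sketch below demonstrates that behavior with a throwaway package; the name `demo_pkg_for_text_format` and the function `submodule_visibility` are hypothetical and unrelated to protobuf itself.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "demo_pkg_for_text_format"  # hypothetical throwaway package name


def submodule_visibility():
    """Return whether PKG.text_format is reachable before and after importing it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / PKG
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "text_format.py").write_text("MERGED = True\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            pkg = importlib.import_module(PKG)
            # The package is imported, but its submodule has not been yet.
            before = hasattr(pkg, "text_format")
            importlib.import_module(PKG + ".text_format")
            # Importing the submodule binds it as an attribute of the package.
            after = hasattr(pkg, "text_format")
        finally:
            sys.path.remove(tmp)
            for name in (PKG + ".text_format", PKG):
                sys.modules.pop(name, None)
    return before, after
```

In the same way, once `google.protobuf.text_format` has been imported anywhere in the process, attribute access to `text_format` on the package should resolve.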
Then training starts...
But after running for a while, another error appears:

  File "/home/zd/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 110, in _sample_rois
fg_inds, size=fg_rois_per_this_image, replace=False
  File "mtrand.pyx", line 1176, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:18822)
TypeError: 'numpy.float64' object cannot be interpreted as an index

So I reinstalled numpy at version 1.11.0: sudo pip install -U numpy==1.11.0
But that produces a new error: ImportError: numpy.core.multiarray failed to import
So, following https://github.com/rbgirshick/py-faster-rcnn/issues/626, I modified lines 55, 98, 110, 124, and 175 of py-faster-rcnn/lib/roi_data_layer/minibatch.py, and upgraded numpy to 1.13.1: sudo pip install -U numpy==1.13.1
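
The TypeError above comes from passing a numpy.float64 where NumPy expects an integer size. Here is a minimal sketch of the kind of change involved, assuming the fix is an explicit integer cast. The function `sample_foreground` is a simplified, hypothetical stand-in and not the actual code in minibatch.py; the exact edits are in the linked issue.

```python
import numpy as np


def sample_foreground(fg_inds, fg_fraction, rois_per_image, seed=0):
    """Sample foreground indices without replacement (simplified stand-in)."""
    # np.round returns a numpy.float64; newer NumPy versions refuse to use
    # a float as a size, so cast to a plain Python int first.
    fg_rois_per_image = int(np.round(fg_fraction * rois_per_image))
    fg_rois_per_this_image = min(fg_rois_per_image, int(fg_inds.size))
    rng = np.random.RandomState(seed)
    return rng.choice(fg_inds, size=fg_rois_per_this_image, replace=False)
```

With ten candidate indices, a foreground fraction of 0.25, and 16 ROIs per image, the call draws 4 distinct indices.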

