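Problems encountered:
1. Not enough GPU memory
If you run with the original poster's settings, you will quite likely hit the following error: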
Check failed: error == cudaSuccess (2 vs. 0) out of memory
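This error means the GPU ran out of memory. Change batch_size to 1 in both the train layer and the test layer of train_val.prototxt, set test_iter in the corresponding solver.prototxt to 100, and run the test again. A batch_size of 1 is the minimum; you can raise it step by step to find the limit of your card. In my case 50 failed at first, so I dropped it to 1, but when I repeated the experiment while writing this post, 50 worked again, which is odd.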
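The link between the two settings can be sketched in shell. This is only an illustration, not the original poster's procedure: it edits a throwaway stand-in for train_val.prototxt, and the test-set size of 100 images is an assumption inferred from the batch_size 1 / test_iter 100 pairing above.

```shell
# Sketch: lower batch_size in a prototxt and derive a matching test_iter.
# The file below is a throwaway stand-in, not the real train_val.prototxt.
workdir=$(mktemp -d)
proto="$workdir/train_val.prototxt"
printf '%s\n' \
  '# train layer' \
  '    batch_size: 50' \
  '# test layer' \
  '    batch_size: 50' > "$proto"

new_batch=1
num_test_images=100   # assumption: number of images in the test set

# Rewrite every batch_size entry; write to a new file, then swap it in.
sed -E "s/batch_size:[[:space:]]*[0-9]+/batch_size: ${new_batch}/" "$proto" > "$proto.new"
mv "$proto.new" "$proto"

# test_iter * batch_size should cover the whole test set (rounded up).
test_iter=$(( (num_test_images + new_batch - 1) / new_batch ))

cat "$proto"
echo "test_iter: ${test_iter}"
```

While training, running nvidia-smi in a second terminal shows how much GPU memory is actually in use, which helps when searching for the largest batch_size that still fits.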
2. No .caffemodel file is produced
The original blog post removed the following two lines from the end of solver.prototxt:
snapshot: 200
snapshot_prefix: "examples/myfile/minemodel"
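Without these two lines, no trained model is saved. With them in place, files such as minemodel_iter_200.caffemodel and minemodel_iter_200.solverstate are written under examples/myfile/. In other words, snapshot sets how often (in iterations) a snapshot is written, and snapshot_prefix sets the path and file-name prefix of the saved files.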
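To check whether a solver file still has its snapshot settings, and to restore them if it does not, something like the following may help. It is a sketch run against a temporary stand-in file: add_if_missing is a hypothetical helper and the max_iter line is invented placeholder content. Only the two snapshot lines come from this post.

```shell
# Sketch: append the snapshot settings to a solver file only if they are missing.
# The solver file here is a temporary stand-in, not the real solver.prototxt.
solver="$(mktemp -d)/solver.prototxt"
echo 'max_iter: 500' > "$solver"   # placeholder content for the demo

add_if_missing() {
  # $1 = key expected at the start of a line, $2 = full line to append
  grep -q "^$1:" "$solver" || echo "$2" >> "$solver"
}

add_if_missing snapshot 'snapshot: 200'
add_if_missing snapshot_prefix 'snapshot_prefix: "examples/myfile/minemodel"'

cat "$solver"
```

Because the helper checks before appending, running it a second time leaves the file unchanged.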
Troubleshooting
1) HDF5 problem
src/caffe/net.cpp:8:18: fatal error: hdf5.h: No such file or directory
compilation terminated.
Makefile:581: recipe for target '.build_release/src/caffe/net.o' failed
make: *** [.build_release/src/caffe/net.o] Error 1
Solution:
Open Makefile.config, find the line INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include, and append /usr/include/hdf5/serial to it:
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
Then re-run make all.
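Before editing Makefile.config it can be worth confirming where hdf5.h actually lives, since the path differs between distributions. The sketch below is an illustration only: find_header is a hypothetical helper, and the directory tree it searches is a temporary mock of /usr/include so that the example runs anywhere.

```shell
# Sketch: print every directory under a root that contains a given header.
find_header() {
  # $1 = header file name, $2 = root directory to search
  find "$2" -name "$1" -exec dirname {} \; 2>/dev/null | sort -u
}

# Mock of the layout described above, so the demo does not depend on the host.
root=$(mktemp -d)
mkdir -p "$root/hdf5/serial"
touch "$root/hdf5/serial/hdf5.h"

hdf5_inc=$(find_header hdf5.h "$root")
echo "add to INCLUDE_DIRS: $hdf5_inc"
```

On a real machine the call would be find_header hdf5.h /usr/include, and whatever directory it prints is the one to append to INCLUDE_DIRS.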
2) gflags problem
In file included from src/caffe/net.cpp:10:0:
./include/caffe/common.hpp:5:27: fatal error: gflags/gflags.h: No such file or directory
compilation terminated.
Makefile:581: recipe for target '.build_release/src/caffe/net.o' failed
make: *** [.build_release/src/caffe/net.o] Error 1
Solution:
sudo apt-get install libgflags-dev
Then re-run make all.
3) glog problem
In file included from src/caffe/net.cpp:10:0:
./include/caffe/common.hpp:6:26: fatal error: glog/logging.h: No such file or directory
compilation terminated.
Makefile:581: recipe for target '.build_release/src/caffe/net.o' failed
make: *** [.build_release/src/caffe/net.o] Error 1
Solution:
sudo apt-get install libgoogle-glog-dev
Then re-run make all.
4) lmdb problem
In file included from src/caffe/util/db.cpp:3:0:
./include/caffe/util/db_lmdb.hpp:8:18: fatal error: lmdb.h: No such file or directory
compilation terminated.
Makefile:581: recipe for target '.build_release/src/caffe/util/db.o' failed
make: *** [.build_release/src/caffe/util/db.o] Error 1
Solution:
sudo apt-get install liblmdb-dev
Then re-run make all.
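The gflags, glog, and lmdb errors above all follow the same pattern, so they can be checked in one pass instead of one failed build at a time. A rough sketch: missing_dev_packages is a hypothetical helper, the header-to-package pairs are the ones listed in this post, and only the include roots passed as arguments are searched.

```shell
# Sketch: report which -dev packages from this post appear to be missing,
# judged by whether their headers exist under the given include roots.
missing_dev_packages() {
  # arguments: one or more include roots, e.g. /usr/include /usr/local/include
  local pair header pkg root found
  for pair in \
    "gflags/gflags.h:libgflags-dev" \
    "glog/logging.h:libgoogle-glog-dev" \
    "lmdb.h:liblmdb-dev"
  do
    header=${pair%%:*}   # part before the colon
    pkg=${pair##*:}      # part after the colon
    found=no
    for root in "$@"; do
      [ -f "$root/$header" ] && found=yes
    done
    [ "$found" = yes ] || echo "$pkg"
  done
}

# Check the usual include roots and print the install command, if any.
pkgs=$(missing_dev_packages /usr/include /usr/local/include | tr '\n' ' ')
if [ -n "$pkgs" ]; then
  echo "sudo apt-get install $pkgs"
fi
```

The script only prints the apt-get command rather than running it, so nothing is installed without you seeing it first.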
5) "cannot find -lxxx" problem
AR -o .build_release/lib/libcaffe.a
LD -o .build_release/lib/libcaffe.so.1.0.0
/usr/bin/ld: cannot find -lhdf5_hl
/usr/bin/ld: cannot find -lhdf5
collect2: error: ld returned 1 exit status
Makefile:572: recipe for target '.build_release/lib/libcaffe.so.1.0.0' failed
make: *** [.build_release/lib/libcaffe.so.1.0.0] Error 1
Solution:
Edit Makefile.config and append /usr/lib/x86_64-linux-gnu/hdf5/serial to LIBRARY_DIRS:
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial
Then re-run make all.
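The -lxxx in the error is the library name with the lib prefix and the file extension stripped, so "cannot find -lhdf5" means the linker found no libhdf5.so or libhdf5.a in its search path. A small sketch for locating such a library follows. find_lib_dirs is a hypothetical helper and the searched tree is a temporary mock, so on a real machine you would pass a root such as /usr/lib instead.

```shell
# Sketch: list directories under a root that hold lib<name>.so or lib<name>.a.
find_lib_dirs() {
  # $1 = library name as given to -l (e.g. hdf5), $2 = root to search
  find "$2" \( -name "lib$1.so" -o -name "lib$1.a" \) -exec dirname {} \; 2>/dev/null | sort -u
}

# Mock of the layout described above, so the demo runs on any machine.
root=$(mktemp -d)
mkdir -p "$root/x86_64-linux-gnu/hdf5/serial"
touch "$root/x86_64-linux-gnu/hdf5/serial/libhdf5.so" \
      "$root/x86_64-linux-gnu/hdf5/serial/libhdf5_hl.so"

for lib in hdf5_hl hdf5; do
  echo "-l$lib -> $(find_lib_dirs "$lib" "$root")"
done
```

Any directory printed here that is not already listed in LIBRARY_DIRS is a candidate to add there.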
---------------------
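Author: Artprog
Source: CSDN
Original post: https://blog.csdn.net/artprog/article/details/79271388
Copyright notice: This is an original article by the blogger. Please include a link to the post when reposting.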