1. Environment
python 3.7
2. Installation guide (the official guide below does not cover installing NCCL)
https://github.com/open-mmlab/mmdetection/blob/master/docs/INSTALL.md
3. First installation attempt, run as follows (failed)
export CUDA_HOME=/usr/local/cuda
export PATH=/usr/local/cuda-9.0/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-9.0/lib64:$LD_LIBRARY_PATH
export LIBRARY_PATH=/usr/local/cuda/lib64:$LIBRARY_PATH
export PYTHON=$PYTHON:/home/v-zhiwwa/HOI/mmlab/mmdetection/build/lib.linux-x86_64-3.7
cd $mmlab/mmdetection/demo && python wzw.py
4. Basics
Check the PyTorch version:
import torch
torch.__version__
Check the CUDA version that PyTorch was built against:
import torch
torch.version.cuda
Check the nvcc version and which nvcc is on the PATH:
nvcc -V
which nvcc
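The checks above can also be collected into one small script. This is only a minimal sketch: the helper name `describe_toolchain` is made up for illustration, and it assumes nothing beyond the standard library plus an optional PyTorch install.

```python
import importlib.util
import shutil


def describe_toolchain():
    """Collect the PyTorch version, its CUDA build version, and the nvcc path."""
    info = {"torch": None, "torch_cuda": None, "nvcc": shutil.which("nvcc")}
    # Only import torch when it is installed, so the script also runs without it.
    if importlib.util.find_spec("torch") is not None:
        import torch
        info["torch"] = torch.__version__
        info["torch_cuda"] = torch.version.cuda  # None on CPU-only builds
    return info


if __name__ == "__main__":
    for name, value in describe_toolchain().items():
        print(f"{name}: {value}")
```

If `nvcc` prints as `None`, the CUDA `bin` directory is probably not on the PATH yet.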
5. Second installation attempt: what was set up first
cd /usr/local && sudo ln -s cuda-10.0 cuda
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
python 3.7
pytorch 1.2.0, built for CUDA 10.0.130
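Since the point of this step is that the system CUDA and the CUDA version PyTorch was built for should agree, a rough comparison could look like the sketch below. The function names are hypothetical, and the sample string is an assumed example of the release line that `nvcc -V` prints for CUDA 10.0.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the (major, minor) CUDA release from `nvcc -V` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cuda_versions_match(torch_cuda, nvcc_output):
    """Compare major.minor of torch.version.cuda with the nvcc release."""
    release = parse_nvcc_release(nvcc_output)
    if torch_cuda is None or release is None:
        return False
    major, minor = torch_cuda.split(".")[:2]
    return (int(major), int(minor)) == release


# Assumed sample of the release line printed by `nvcc -V`.
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(parse_nvcc_release(sample))
print(cuda_versions_match("10.0.130", sample))
print(cuda_versions_match("9.0.176", sample))
```

In practice the first argument would be `torch.version.cuda` and the second the captured output of `nvcc -V`.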
Build the NCCL install packages
git clone https://github.com/NVIDIA/nccl.git
cd nccl
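# Default build (expects CUDA in /usr/local/cuda)
make -j src.build
# Alternative: point the build at a specific CUDA install
make src.build CUDA_HOME=/usr/local/cuda
# Alternative: compile only for one GPU architecture (here sm_70) to shorten the build
make -j src.build NVCC_GENCODE="-gencode=arch=compute_70,code=sm_70"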
sudo apt install build-essential devscripts debhelper fakeroot
make pkg.debian.build
ls build/pkg/deb/
Install NCCL
sudo dpkg -i build/pkg/deb/libnccl-dev_2.5.6-2+cuda10.0_amd64.deb
sudo dpkg -i build/pkg/deb/libnccl2_2.5.6-2+cuda10.0_amd64.deb
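One way to confirm that the packages were installed is to load the shared library and ask it for its version. The sketch below is only that, a sketch: it assumes the library is reachable as `libnccl.so.2` and exports `ncclGetVersion`, and the decoding of the integer version code reflects my reading of `nccl.h` (older releases such as 2.5.6 use major*1000 + minor*100 + patch). The helper names are made up.

```python
import ctypes


def decode_nccl_version(code):
    """Split NCCL's integer version code into (major, minor, patch)."""
    if code < 10000:
        # Older encoding (assumed for releases up to 2.8.x): major*1000 + minor*100 + patch
        return code // 1000, (code % 1000) // 100, code % 100
    # Newer encoding (assumed): major*10000 + minor*100 + patch
    return code // 10000, (code % 10000) // 100, code % 100


def installed_nccl_version(libname="libnccl.so.2"):
    """Return the installed NCCL version tuple, or None if it cannot be queried."""
    try:
        lib = ctypes.CDLL(libname)
    except OSError:
        return None  # library not found on the loader path
    version = ctypes.c_int(0)
    # ncclGetVersion(int* version) returns 0 (ncclSuccess) on success.
    lib.ncclGetVersion.argtypes = [ctypes.POINTER(ctypes.c_int)]
    lib.ncclGetVersion.restype = ctypes.c_int
    if lib.ncclGetVersion(ctypes.byref(version)) != 0:
        return None
    return decode_nccl_version(version.value)


if __name__ == "__main__":
    print(installed_nccl_version())
```

A result of `None` most likely means the loader cannot find the library, in which case `LD_LIBRARY_PATH` or the `dpkg` step is worth rechecking.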
Install nccl-tests
git clone https://github.com/NVIDIA/nccl-tests.git
cd nccl-tests
make CUDA_HOME=/usr/local/cuda
Versions at this point:
nccl 2.5.6
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)
Install mmdetection from the repository root (either command):
python setup.py develop
pip install -v -e .
Test
python
from mmdet.apis import init_detector, inference_detector, show_result
If the import raises an error, delete the build directory and start the installation again.
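The "delete build and retry" step can be scripted so that it is safe to repeat. This is a minimal sketch with a hypothetical helper name; the repository path passed in is a placeholder for wherever mmdetection was cloned.

```python
import shutil
from pathlib import Path


def remove_build_dir(repo_dir):
    """Delete <repo_dir>/build if it exists; return True if something was removed."""
    build_dir = Path(repo_dir) / "build"
    if build_dir.is_dir():
        shutil.rmtree(build_dir)
        return True
    return False
```

After calling `remove_build_dir` on the mmdetection checkout, rerun one of the install commands and try the import again.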
6. Entering the environment
First time: there was a problem, resolved as follows
tmux new -s mmlab
conda activate open-mmlab
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
Second time: the environment came up normally
tmux new -s mmlab
conda activate open-mmlab
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_HOME=/usr/local/cuda
export PYTHONPATH=$mmlab/mmdetection:$PYTHONPATH
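Because these exports have to be repeated in every new tmux session, a quick check can catch a forgotten one before anything is run. The sketch below uses a hypothetical helper, `missing_env_entries`, and assumes the `/usr/local/cuda` layout used in the exports above.

```python
import os


def missing_env_entries(environ, cuda_home="/usr/local/cuda"):
    """Return the names of variables that lack the expected CUDA settings."""
    problems = []
    if environ.get("CUDA_HOME") != cuda_home:
        problems.append("CUDA_HOME")
    expected = {
        "PATH": cuda_home + "/bin",
        "LD_LIBRARY_PATH": cuda_home + "/lib64",
    }
    for var, entry in expected.items():
        # Each variable is a colon-separated list of directories.
        if entry not in environ.get(var, "").split(":"):
            problems.append(var)
    return problems


if __name__ == "__main__":
    print(missing_env_entries(os.environ) or "environment looks complete")
```

An empty list means all three variables contain the expected entries; `PYTHONPATH` is not checked here because its value depends on where mmdetection was cloned.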