Errors when running multi-GPU PyTorch inside a Docker container: Runtime

Author: 1nvad3r | Published: 2020-10-31 16:07

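Error 1. Running multi-GPU PyTorch inside a Docker container fails with RuntimeError: NCCL Error 2: unhandled system error
Fix: pass -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 when starting the container, so that all four GPUs are exposed to it: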

docker run --runtime=nvidia --net="host" -e NVIDIA_VISIBLE_DEVICES=0,1,2,3 --shm-size 8g -it huangzc/reid:v1 /bin/bash
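
Before launching a multi-GPU job, it can help to confirm from inside the container that the GPUs were actually passed through. The following is a minimal sketch, not part of the original fix: the helper names parse_visible_devices and report_gpus are hypothetical, and the script assumes PyTorch may or may not be importable in the image.

```python
import os


def parse_visible_devices(value):
    """Split an NVIDIA_VISIBLE_DEVICES-style string such as "0,1,2,3" into entries.

    Hypothetical helper for illustration. Entries are kept as strings because
    the variable may also hold values such as "all" or GPU UUIDs.
    """
    if value is None:
        return []
    return [item.strip() for item in value.split(",") if item.strip()]


def report_gpus():
    """Print what the container was asked to expose and what PyTorch can see."""
    requested = parse_visible_devices(os.environ.get("NVIDIA_VISIBLE_DEVICES"))
    print("NVIDIA_VISIBLE_DEVICES:", requested)
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        print("PyTorch is not installed in this environment")
        return requested, None
    count = torch.cuda.device_count()
    print("torch.cuda.device_count():", count)
    return requested, count
```

Calling report_gpus() inside the container started with the command above should list four entries for NVIDIA_VISIBLE_DEVICES; if torch.cuda.device_count() reports fewer devices than requested, the container was probably started without the intended GPU settings.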

Error 2. RuntimeError: DataLoader worker (pid 53617) is killed by signal: Bus error. It is possible that dataloader's workers are out of shared memory. Please try to raise your shared memory limit.

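Fix: increase the container's shared memory (/dev/shm, not swap) when starting it, with --shm-size 8g. Docker's default of 64 MB is too small for DataLoader workers, which pass batches between processes through shared memory.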
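
To check whether the --shm-size setting took effect, the size of /dev/shm can be read from inside the container. This is a small sketch under the assumption that shared memory is mounted at /dev/shm (the usual location on Linux); the function name shared_memory_gib is hypothetical.

```python
import shutil


def shared_memory_gib(path="/dev/shm"):
    """Return (total, free) space in GiB for the filesystem mounted at `path`.

    Hypothetical helper for illustration; the default path assumes the
    container mounts shared memory at /dev/shm.
    """
    usage = shutil.disk_usage(path)
    gib = 1024 ** 3
    return usage.total / gib, usage.free / gib
```

Inside a container started with --shm-size 8g, shared_memory_gib() should report a total of about 8 GiB; a total near 0.06 GiB suggests the container is still using Docker's 64 MB default.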

Article link: https://www.haomeiwen.com/subject/piisvktx.html