System: CentOS 7.6
GPU: RTX 2080 Ti
1. NVIDIA
Install using the network rpm:
sudo wget http://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-repo-rhel7-10.1.168-1.x86_64.rpm
sudo rpm -i cuda-repo-rhel7-10.1.168-1.x86_64.rpm
warning: cuda-repo-rhel7-10.1.168-1.x86_64.rpm: Header V3 RSA/SHA512 Signature, key ID 7fa2af80: NOKEY
I took this warning as a failure and switched to another method, the runfile installer:
sudo wget https://developer.nvidia.com/compute/cuda/10.1/Prod/local_installers/cuda_10.1.168_418.67_linux.run
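Before running a download of this size, it can be worth comparing its MD5 digest with the value NVIDIA publishes for the installer. The following is a minimal sketch; verify_md5 is a hypothetical helper name and EXPECTED_MD5 is a placeholder, not a value taken from this article.

```shell
# Hypothetical helper: succeed only if the MD5 digest of file $1 equals $2.
verify_md5() {
    _actual="$(md5sum "$1" | awk '{print $1}')"
    [ -n "$_actual" ] && [ "$_actual" = "$2" ]
}

# Usage (EXPECTED_MD5 is a placeholder; copy the real value from NVIDIA's checksum list):
#   verify_md5 cuda_10.1.168_418.67_linux.run "$EXPECTED_MD5" && echo "checksum OK"
```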
After the download finishes, follow the instructions from the official site:
Run `sudo sh cuda_10.1.168_418.67_linux.run`
Follow the command-line prompts
The following screens then appear:
+------------------------------------------------------------------------------+
| End User License Agreement |
| -------------------------- |
| |
| NVIDIA Software License Agreement and CUDA Supplement to |
| Software License Agreement. |
| |
| |
| Preface |
| ------- |
| |
| The Software License Agreement in Chapter 1 and the Supplement |
| in Chapter 2 contain license terms and conditions that govern |
| the use of NVIDIA software. By accepting this agreement, you |
| agree to comply with all the terms and conditions applicable |
| to the product(s) included herein. |
| |
| |
| NVIDIA Driver |
| |
| |
|------------------------------------------------------------------------------|
| Do you accept the above EULA? (accept/decline/quit): |
| accept |
+------------------------------------------------------------------------------+
+------------------------------------------------------------------------------+
| Options |
| Library install path (Blank for system default) |
| Done |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options |
+------------------------------------------------------------------------------+
+------------------------------------------------------------------------------+
| CUDA Installer |
| - [X] Driver |
| [X] 418.67 |
| - [X] CUDA Toolkit 10.1 |
| + [X] CUDA Tools 10.1 |
| + [X] CUDA Libraries 10.1 |
| + [X] CUDA Compiler 10.1 |
| [X] CUDA Misc Headers 10.1 |
| [X] CUDA Samples 10.1 |
| [X] CUDA Demo Suite 10.1 |
| [X] CUDA Documentation 10.1 |
| Options |
| Install |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| Up/Down: Move | Left/Right: Expand | 'Enter': Select | 'A': Advanced options |
+------------------------------------------------------------------------------+
At this point the installation failed again:
Installation failed. See log at /var/log/cuda-installer.log for details.
Check the details in the log:
[INFO]: ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
[INFO]:
[INFO]:
[INFO]: ERROR: Installation has failed. Please see the file '/var/log/nvidia-installer.log' for details. You may find suggestions on fixing installation problems in the README available on the Linux driver download page at www.nvidia.com.
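The installer logs are long, so filtering for the ERROR lines gets to messages like the ones above faster. A minimal sketch, assuming the log paths named in the failure messages; installer_errors is a hypothetical helper name.

```shell
# Print only the lines containing "ERROR:" from an installer log.
# $1: path to the log file (defaults to the path named in the failure message).
installer_errors() {
    grep 'ERROR:' "${1:-/var/log/cuda-installer.log}"
}

# Usage (reading files under /var/log may require root):
#   installer_errors
#   installer_errors /var/log/nvidia-installer.log
```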
Attempted fix:
sudo vim /etc/modprobe.d/blacklist-nouveau.conf
File contents:
blacklist nouveau
options nouveau modeset=0
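The same two lines can be written without opening an editor, which is handy over SSH or in a provisioning script. A minimal sketch; nouveau_blacklist is a hypothetical helper name, and the target path is the one used above.

```shell
# Emit the two blacklist lines on stdout so they can be inspected
# before being written anywhere.
nouveau_blacklist() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}

# Usage (writing under /etc/modprobe.d requires root, hence sudo tee):
#   nouveau_blacklist | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
```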
Then run:
sudo update-initramfs -u
sudo: update-initramfs: command not found
That command does not exist on CentOS (it is a Debian/Ubuntu tool), so try another one:
sudo grub2-mkconfig -o /boot/grub2/grub.cfg
Then reboot:
sudo reboot
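After the server has rebooted, run the installer command above again.
It reported the same error, so the Nouveau kernel driver apparently was not disabled. I went to look up how to disable it.
I found another method to try.
Check the state of the nouveau driver with: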
lsmod | grep nouveau
nouveau 1871872 0
video 45056 1 nouveau
mxm_wmi 16384 1 nouveau
i2c_algo_bit 16384 3 igb,mgag200,nouveau
drm_kms_helper 172032 2 mgag200,nouveau
ttm 102400 2 mgag200,nouveau
drm 475136 5 drm_kms_helper,mgag200,ttm,nouveau
wmi 28672 2 mxm_wmi,nouveau
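Note that `lsmod | grep nouveau` also matches lines where nouveau only appears in the "Used by" column, such as the video and wmi lines above. A stricter check looks at the first column only. This is a minimal sketch; module_loaded is a hypothetical helper name.

```shell
# Succeed if module $1 is listed as a loaded module (first column)
# in lsmod-style text read from stdin.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }'
}

# Usage on a live system:
#   lsmod | module_loaded nouveau && echo "nouveau is still loaded"
```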
Rename the earlier blacklist file:
sudo mv /etc/modprobe.d/blacklist-nouveau.conf /etc/modprobe.d/blacklist.conf
Change its contents to:
blacklist nouveau
Next, back up the current initramfs:
sudo mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
Rebuild it, so that the blacklist is also picked up during early boot:
sudo dracut -v /boot/initramfs-$(uname -r).img $(uname -r)
Reboot the server.
Then check again whether nouveau has been disabled:
lsmod | grep nouveau
There is no output, so it appears to have worked.
Run the installer again:
sudo sh cuda_10.1.168_418.67_linux.run
It finally succeeded:
===========
= Summary =
===========
Driver: Installed
Toolkit: Installed in /usr/local/cuda-10.1/
Samples: Installed in /home/hexialong/, but missing recommended libraries
Please make sure that
- PATH includes /usr/local/cuda-10.1/bin
- LD_LIBRARY_PATH includes /usr/local/cuda-10.1/lib64, or, add /usr/local/cuda-10.1/lib64 to /etc/ld.so.conf and run ldconfig as root
To uninstall the CUDA Toolkit, run cuda-uninstaller in /usr/local/cuda-10.1/bin
To uninstall the NVIDIA Driver, run nvidia-uninstall
Please see CUDA_Installation_Guide_Linux.pdf in /usr/local/cuda-10.1/doc/pdf for detailed information on setting up CUDA.
Logfile is /var/log/cuda-installer.log
Next, set the environment variables:
vim ~/.bash_profile
Add:
export PATH=/usr/local/cuda-10.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.1/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
export CUDA_HOME=/usr/local/cuda-10.1
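The `${VAR:+:${VAR}}` form in these exports appends ":" plus the old value only when the variable is already set and non-empty. That avoids a trailing colon, which the dynamic linker and the shell treat as an empty entry meaning the current directory. A minimal sketch of the behavior; prepend_dir is a hypothetical helper used only for illustration.

```shell
# Prepend directory $1 to the colon-separated list $2 using the same
# ${VAR:+:${VAR}} expansion: the ":" is emitted only when $2 is non-empty.
prepend_dir() {
    _list="$2"
    printf '%s\n' "$1${_list:+:${_list}}"
}

prepend_dir /usr/local/cuda-10.1/lib64 ""             # prints /usr/local/cuda-10.1/lib64
prepend_dir /usr/local/cuda-10.1/bin "/usr/bin:/bin"  # prints /usr/local/cuda-10.1/bin:/usr/bin:/bin
```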
Run the following to apply the changes:
source ~/.bash_profile
Check the result:
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Apr_24_19:10:27_PDT_2019
Cuda compilation tools, release 10.1, V10.1.168
nvcc now reports the CUDA compiler version, so the CUDA installation is complete.
Next comes installing Anaconda, which is covered in another article.
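`nvcc -V` reports on the toolkit side only. For the driver side, `nvidia-smi` (installed with the driver) prints the driver version and the GPUs it can see. If a script needs the toolkit release as a plain string, it can be extracted from the nvcc output. A minimal sketch; nvcc_release is a hypothetical helper name.

```shell
# Extract the release number (e.g. 10.1) from `nvcc -V` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

# Usage on the installed system:
#   nvcc -V | nvcc_release
#   nvidia-smi
```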