Setting Up a Deep Learning Environment from Scratch: Ubuntu 16.04 + TensorFlo

Author: TensorData | Published 2016-10-04 21:51

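    Hardware:
    CPU: E3 1230 V5+
    GPU: EVGA GTX1080 8G SC ACX 3.0
    RAM: DDR4 2133 8G x2
    Motherboard: Gigabyte X150M-PRO-ECC

    I chose Ubuntu 16.04 LTS because it is a long-term support release, and since my hardware is fairly new, I expected its driver support and compatibility to be better.
    One reason I picked this motherboard was its M.2 SSD slot, but it turned out to have serious compatibility problems. Many people online complain about the double-boot problem, and I ran into it too. In the end I gave up on M.2 and settled for plain SATA 3.0.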

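    Overview

    • Install Ubuntu 16.04
    • Set up the build environment
    • Build and install TensorFlow

    Install Ubuntu 16.04

    I have two disks and wanted a dual-boot setup. I first installed Windows 10 on the first disk, then wrote the downloaded Ubuntu ISO to a USB stick, set the USB stick as the first boot device in the BIOS, and rebooted. From there, just follow the installer step by step. When asked for the system language, choose English; it saves trouble later.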

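    Problems you may run into:

    • Black screen on the first reboot after installation
    • Cannot reach the desktop after logging in

    Black screen on the first reboot after installation

    Because I dual-boot, a boot menu appears at startup. At that menu, press e to edit the GRUB entry, find quiet splash, and change it to quiet splash nomodeset (that is, append nomodeset), then press F10 to boot. Once you reach the login screen, press Ctrl+Alt+F1 to switch to a text console, log in, and make the change permanent by editing /etc/default/grub (sudo vi /etc/default/grub). Find this line: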

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash" 
    

    Change it to:

    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash nomodeset"
    

    Save the file, regenerate the GRUB configuration so the change takes effect, and reboot:

    sudo update-grub
    sudo reboot
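
The same edit can be scripted instead of made by hand. The sketch below assumes GNU sed, and add_nomodeset is a hypothetical helper name, not something Ubuntu ships. It appends nomodeset to the GRUB_CMDLINE_LINUX_DEFAULT line only when the option is not already there, so running it twice is harmless.

```shell
# add_nomodeset FILE
# Append "nomodeset" to the GRUB_CMDLINE_LINUX_DEFAULT line in FILE
# unless that line already contains it. (Hypothetical helper.)
add_nomodeset() {
    local file="$1"
    # Nothing to do if the option is already present on that line.
    if grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=.*nomodeset' "$file"; then
        return 0
    fi
    # Insert " nomodeset" just before the closing quote of the value.
    sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 nomodeset"/' "$file"
}
```

It is safest to try it on a copy first (cp /etc/default/grub /tmp/grub, then add_nomodeset /tmp/grub). Writing to /etc/default/grub itself needs root, and sudo update-grub still has to be run afterwards.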
    

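    Cannot reach the desktop after installing Ubuntu 16.04

    Don't panic. Press Ctrl+Alt+F1 to switch to a text console and log in. The first thing to do is point apt at a mirror inside China (we live on one big LAN, you know why :)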

    sudo vi /etc/apt/sources.list
    

    If you are not comfortable with vi, or are new to Linux, use the nano editor instead:

    sudo nano /etc/apt/sources.list
    

    Add the mirrors.163.com sources. The codename for Ubuntu 16.04 is xenial:

    deb http://mirrors.163.com/ubuntu/ xenial main restricted universe multiverse
    deb http://mirrors.163.com/ubuntu/ xenial-updates main restricted universe multiverse
    deb http://mirrors.163.com/ubuntu/ xenial-security main restricted universe multiverse
    deb http://mirrors.163.com/ubuntu/ xenial-proposed main restricted universe multiverse
    deb http://mirrors.163.com/ubuntu/ xenial-backports main restricted universe multiverse
    deb-src http://mirrors.163.com/ubuntu/ xenial main restricted universe multiverse
    deb-src http://mirrors.163.com/ubuntu/ xenial-updates main restricted universe multiverse
    deb-src http://mirrors.163.com/ubuntu/ xenial-security main restricted universe multiverse
    deb-src http://mirrors.163.com/ubuntu/ xenial-proposed main restricted universe multiverse
    deb-src http://mirrors.163.com/ubuntu/ xenial-backports main restricted universe multiverse
    

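    In my experience the 163 mirror is faster on China Telecom broadband. If you are on the education network, the USTC mirror is a good alternative.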
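
Since the ten lines differ only in the mirror URL and the suite suffix, they can also be generated. This is a minimal sketch: gen_sources is a hypothetical helper name, and the USTC URL in the usage note is my assumption about that mirror's address, so check it before relying on it.

```shell
# gen_sources MIRROR_URL CODENAME
# Print deb and deb-src lines for the release, updates, security,
# proposed and backports suites of one mirror. (Hypothetical helper.)
gen_sources() {
    local mirror="$1" codename="$2"
    local kind suite
    for kind in deb deb-src; do
        # The empty string stands for the plain release suite (e.g. "xenial").
        for suite in "" -updates -security -proposed -backports; do
            printf '%s %s %s%s main restricted universe multiverse\n' \
                "$kind" "$mirror" "$codename" "$suite"
        done
    done
}
```

Running gen_sources http://mirrors.163.com/ubuntu/ xenial reproduces the list above. For the USTC mirror one would pass something like http://mirrors.ustc.edu.cn/ubuntu/ instead, and the output can be redirected to a file and reviewed before it replaces /etc/apt/sources.list.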

    Save the file, then run apt update followed by upgrade:

    sudo apt-get update
    sudo apt-get upgrade
    

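    Next, optionally upgrade the kernel (a fresh install ships 4.4; I suggest upgrading to 4.6.7). This step can be skipped.
    First check the current kernel version: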

    uname -r
    
    wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6.7/linux-headers-4.6.7-040607_4.6.7-040607.201608160432_all.deb
    wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6.7/linux-headers-4.6.7-040607-generic_4.6.7-040607.201608160432_amd64.deb
    wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.6.7/linux-image-4.6.7-040607-generic_4.6.7-040607.201608160432_amd64.deb
    sudo dpkg -i linux-*.deb
    sudo update-grub
    sudo reboot
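
To decide whether the upgrade is needed at all, the output of uname -r can be compared against the target version. The sketch below assumes GNU sort (for its -V version ordering), and kernel_older_than is a hypothetical helper name.

```shell
# kernel_older_than CURRENT TARGET
# Succeed (exit 0) if CURRENT is strictly older than TARGET.
# CURRENT may carry a suffix as printed by uname -r,
# e.g. "4.4.0-38-generic"; everything from the first "-" is ignored.
kernel_older_than() {
    local current="${1%%-*}" target="$2"
    # Equal versions are not "older".
    [ "$current" != "$target" ] || return 1
    local lowest
    # sort -V orders version strings numerically, so 4.10 sorts after 4.6.
    lowest=$(printf '%s\n%s\n' "$current" "$target" | sort -V | head -n 1)
    [ "$lowest" = "$current" ]
}
```

A typical call would be: if kernel_older_than "$(uname -r)" 4.6.7; then echo "kernel upgrade suggested"; fi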
    

    After the reboot, install the graphics driver. (My card is a GTX 1080, so I chose the nvidia-367 driver.)

    sudo add-apt-repository ppa:graphics-drivers/ppa
    sudo apt-get update
    sudo apt-get install nvidia-367
    sudo apt-get install mesa-common-dev
    sudo apt-get install freeglut3-dev
    sudo reboot
    

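    After this reboot you should be able to reach the desktop, and the screen resolution should be correct as well (my ultrawide monitor runs at 2560x1080).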
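
If the desktop still does not come up, it helps to confirm which driver version the kernel actually loaded. When the NVIDIA kernel module is loaded it exposes a version line in /proc/driver/nvidia/version. The sketch below parses that file; nvidia_driver_version is a hypothetical helper name, and the line format it expects ("NVRM version: ... Kernel Module  367.44 ...") is what drivers of this generation print, so treat it as an assumption for other releases.

```shell
# nvidia_driver_version [FILE]
# Print the loaded NVIDIA kernel module version, e.g. "367.44".
# FILE defaults to /proc/driver/nvidia/version; prints nothing if the
# expected "NVRM version" line is absent. (Hypothetical helper.)
nvidia_driver_version() {
    local file="${1:-/proc/driver/nvidia/version}"
    sed -n 's/^NVRM version:.*Kernel Module  *\([0-9][0-9.]*\).*/\1/p' "$file"
}
```

If the file does not exist, the module is not loaded, which usually means the driver installation did not take effect.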

    Set up the build environment

    Download and install CUDA 8.0.44 (from the NVIDIA download page or from Baidu Netdisk):

    sudo sh cuda_8.0.44_linux.run
    

    The installer asks a series of questions about what to install. Pay close attention to this one:

    Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 367.**?

    (y)es/(n)o/(q)uit: n

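    You must answer no here, otherwise the newer graphics driver installed earlier is overwritten. (If you do fall into this trap, it is not a disaster: uninstall the driver, then repeat the driver installation steps above.)

    When the installer finishes, read its messages carefully. If something went wrong, this blog post may help (I did not run into any problems).

    Configure the environment variables by opening ~/.bashrc and appending the two export lines below: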

    nano ~/.bashrc
    export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}}
    export LD_LIBRARY_PATH=/usr/local/cuda-8.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    

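    If you did this in a desktop Terminal, exit and reopen it so the variables take effect. If you are on the text console, you can simply run the two export lines above by hand.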
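
The ${PATH:+:${PATH}} part of those export lines expands to a colon followed by the old value only when the variable is already non-empty, which avoids a stray trailing colon. The sketch below shows the same expansion in a small function that also skips directories that are already listed; prepend_dir is a hypothetical helper name used for illustration.

```shell
# prepend_dir CURRENT_VALUE DIR
# Print CURRENT_VALUE with DIR prepended, unless DIR is already one of
# its colon-separated entries. (Hypothetical helper.)
prepend_dir() {
    local current="$1" dir="$2"
    case ":$current:" in
        # DIR is already an entry: leave the value unchanged.
        *":$dir:"*) printf '%s\n' "$current" ;;
        # ${current:+:$current} adds ":old" only if old is non-empty.
        *) printf '%s\n' "$dir${current:+:$current}" ;;
    esac
}
```

For example, export PATH="$(prepend_dir "$PATH" /usr/local/cuda-8.0/bin)" has the same effect as the first export line, but can be run repeatedly without growing PATH.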

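    Next, install cuDNN 5.1 (from the official download page or from Baidu Netdisk).

    Once downloaded, extract it and copy the files into place (if CUDA 8.0 was installed to the default path, the paths below need no changes):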

    tar xvf  cudnn-8.0-linux-x64-v5.1.tgz
    sudo cp cuda/include/cudnn.h /usr/local/cuda/include
    sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
    sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
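
To confirm which cuDNN version ended up in place, the version macros in cudnn.h can be read back. This sketch assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL on separate #define lines, as the cuDNN 5.x header does; cudnn_version is a hypothetical helper name.

```shell
# cudnn_version [HEADER]
# Print "major.minor.patch" read from the #define lines of cudnn.h.
# HEADER defaults to the copy installed above. (Hypothetical helper.)
cudnn_version() {
    local header="${1:-/usr/local/cuda/include/cudnn.h}"
    awk '
        $1 == "#define" && $2 == "CUDNN_MAJOR"      { major = $3 }
        $1 == "#define" && $2 == "CUDNN_MINOR"      { minor = $3 }
        $1 == "#define" && $2 == "CUDNN_PATCHLEVEL" { patch = $3 }
        END { print major "." minor "." patch }
    ' "$header"
}
```

The major number printed here is the value to give the TensorFlow configure script when it asks for the cuDNN version.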
    

    Now check that the graphics card is reported correctly:

    nvidia-smi
    

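    Next, go to the samples directory created by the CUDA installer (by default under ~/) and run make. When the build finishes, run the following from your home directory: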

    ./NVIDIA_CUDA-8.0_Samples/bin/x86_64/linux/release/deviceQuery
    

    It should print the detailed device information:

      CUDA Device Query (Runtime API) version (CUDART static linking)
    
      Detected 1 CUDA Capable device(s)
    
      Device 0: "GeForce GTX 1080"
        CUDA Driver Version / Runtime Version 8.0 / 8.0
        CUDA Capability Major/Minor version number: 6.1
        Total amount of global memory: 8110 MBytes (8504279040 bytes)
        (20) Multiprocessors, (128) CUDA Cores/MP: 2560 CUDA Cores
        GPU Max Clock rate: 1848 MHz (1.85 GHz)
        Memory Clock rate: 5005 Mhz
        Memory Bus Width: 256-bit
        L2 Cache Size: 2097152 bytes
        Maximum Texture Dimension Size (x,y,z) 1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
        Maximum Layered 1D Texture Size, (num) layers 1D=(32768), 2048 layers
        Maximum Layered 2D Texture Size, (num) layers 2D=(32768, 32768), 2048 layers
        Total amount of constant memory: 65536 bytes
        Total amount of shared memory per block: 49152 bytes
        Total number of registers available per block: 65536
        Warp size: 32
        Maximum number of threads per multiprocessor: 2048
        Maximum number of threads per block: 1024
        Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
        Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
        Maximum memory pitch: 2147483647 bytes
        Texture alignment: 512 bytes
        Concurrent copy and kernel execution: Yes with 2 copy engine(s)
        Run time limit on kernels: Yes
        Integrated GPU sharing Host Memory: No
        Support host page-locked memory mapping: Yes
        Alignment requirement for Surfaces: Yes
        Device has ECC support: Disabled
        Device supports Unified Addressing (UVA): Yes
        Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0
        Compute Mode:
        < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    
      deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = GeForce GTX 1080
      Result = PASS
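
Two lines of that output matter for the next section: Result = PASS, and the compute capability (6.1 here), which the TensorFlow configure script asks for. A minimal sketch for pulling both out of a saved copy of the output follows; the function names are hypothetical, and the output is assumed to have been saved first with something like ./deviceQuery > /tmp/devicequery.txt.

```shell
# device_query_passed FILE
# Succeed if the saved deviceQuery output in FILE reports "Result = PASS".
device_query_passed() {
    grep -q '^ *Result = PASS' "$1"
}

# compute_capabilities FILE
# Print the distinct compute capabilities found in FILE as a
# comma-separated list, e.g. "6.1".
compute_capabilities() {
    sed -n 's/^ *CUDA Capability Major\/Minor version number: *\([0-9][0-9.]*\).*/\1/p' "$1" |
        sort -u |
        paste -sd, -
}
```

With several different GPUs installed, the list would contain one entry per distinct capability, in the comma-separated form the configure prompt expects.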
    

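    Build and install TensorFlow

    Now set up the build environment for TensorFlow. If you would rather not build it yourself, you can download my prebuilt whl here.

    This post is already long, so for installing and configuring Bazel please follow the official manual, linked here.

    The official download is rather slow; you can get Bazel 0.3.1 here instead.

    Continue with the dependencies.
    If you use Python 2.7: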

    sudo apt-get install python-numpy swig python-dev python-wheel python-pip
    

    Or, for Python 3.x:

    sudo apt-get install python3-numpy swig python3-dev python3-wheel python3-pip
    

    Clone the TensorFlow source:

    git clone https://github.com/tensorflow/tensorflow
    

    Enter the repository and switch to r0.11, the newest release branch at the time of writing:

    cd tensorflow
    git checkout r0.11
    

    Start the configuration:

    $ ./configure
    Please specify the location of python. [Default is /usr/bin/python]: 
    Do you wish to build TensorFlow with Google Cloud Platform support? [y/N] N
    No Google Cloud Platform support will be enabled for TensorFlow
    Do you wish to build TensorFlow with Hadoop File System support? [y/N] N
    No Hadoop File System support will be enabled for TensorFlow
    Found possible Python library paths:
      /usr/local/lib/python2.7/dist-packages
      /usr/lib/python2.7/dist-packages
    Please input the desired Python library path to use.  Default is [/usr/local/lib/python2.7/dist-packages]
    
    /usr/local/lib/python2.7/dist-packages
    Do you wish to build TensorFlow with GPU support? [y/N] y
    GPU support will be enabled for TensorFlow
    Please specify which gcc should be used by nvcc as the host compiler. [Default is /usr/bin/gcc]: 
    Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: 8.0
    Please specify the location where CUDA 8.0 toolkit is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
    Please specify the Cudnn version you want to use. [Leave empty to use system default]: 5
    Please specify the location where cuDNN 5 library is installed. Refer to README.md for more details. [Default is /usr/local/cuda]: 
    Please specify a list of comma-separated Cuda compute capabilities you want to build with.
    You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus.
    Please note that each additional compute capability significantly increases your build time and binary size.
    [Default is: "3.5,5.2"]: 6.1
    
    ...
    ...
    ...
    

    Once configuration is done, build the GPU-enabled whl. If you would rather not build it, you can download my prebuilt whl here.

    bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package
    
    bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg
    
    sudo pip install /tmp/tensorflow_pkg/tensorflow-0.11.0rc0-py2-none-any.whl
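
The wheel file name depends on the TensorFlow version and on whether it was built for Python 2 or 3, so the name in the pip command above will not match every build. A minimal sketch for locating whatever wheel the build produced follows; find_wheel is a hypothetical helper name.

```shell
# find_wheel DIR
# Print the path of the first tensorflow-*.whl in DIR and succeed,
# or print nothing and fail if there is none. (Hypothetical helper.)
find_wheel() {
    local dir="$1" whl
    for whl in "$dir"/tensorflow-*.whl; do
        # If nothing matched, the glob stays literal and -e is false.
        [ -e "$whl" ] || continue
        printf '%s\n' "$whl"
        return 0
    done
    return 1
}
```

It could then be used as: sudo pip install "$(find_wheel /tmp/tensorflow_pkg)"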
    

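    That completes the installation. To verify it, you can run Google's test examples, linked here.

    If you have any questions about the installation, feel free to leave me a comment :)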
