TensorFlow 2.1 GPU on Windows 10: Setup and Testing


Author: 火卫控 | Published 2020-05-29 15:10
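    This article covers installing and configuring the GPU build of TensorFlow on Windows 10. The CPU build is much simpler to install, so it is covered first:

    Installing the CPU build of TensorFlow

    Installing the CPU build is straightforward. Enter this command in a terminal:
    pip install tensorflow
    or install a specific version:
    pip install tensorflow==2.1
    Note: the plain tensorflow package used to be CPU-only, but recent releases (2.1 onward, according to the TensorFlow release notes) ship CPU and GPU support in the same package.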


    If you have a GPU, read on.

    Installing and configuring the GPU build of TensorFlow takes more work, as described below.

    Installing and configuring the GPU build of TensorFlow

    Required components, in installation order


    - Python 3.7.6(Anaconda)
    - tensorflow-gpu==2.1
    - CUDA 10.1 (update 2); CUDA 10.2 does not match TensorFlow 2.1
    - cuDNN 7.6 (for CUDA 10.1)
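
    Because the versions in this list have to line up with each other, it can help to write the pairing down as data and compare it against what is installed. The following is a minimal sketch, not part of the original article: the REQUIRED table simply restates the list above, and the helper names matches and report are made up for illustration.

```python
import sys

# Versions listed in the article, treated here as required version prefixes.
REQUIRED = {
    "python": "3.7",
    "tensorflow-gpu": "2.1",
    "cuda": "10.1",
    "cudnn": "7.6",
}


def matches(installed: str, required: str) -> bool:
    """Return True if `installed` begins with the dotted components of `required`."""
    have = installed.split(".")
    want = required.split(".")
    return have[:len(want)] == want


def report(installed: dict) -> list:
    """List the components whose installed version does not match REQUIRED."""
    return [name for name, want in REQUIRED.items()
            if not matches(installed.get(name, ""), want)]


python_version = "{}.{}.{}".format(*sys.version_info[:3])
print("This interpreter:", python_version,
      "matches 3.7:", matches(python_version, "3.7"))
```

    Comparing dotted components rather than raw strings keeps a version such as 10.10 from being mistaken for 10.1.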

    Everything is already installed on the author's machine, so here is a test run first to show the result.

    The code:
    #!/usr/bin/env python
    # -*- encoding: utf-8 -*-
    '''
    @File         :   pt2.py
    @Time         :   2020/05/21 23:58:57
    @Author       :   艾强云
    @Contact      :   aqy0716@163.com
    @Department   :   SCAU 
    @Desc         :   None
    '''
    # Neural network example (machine learning)
    # here put the import lib
    import tensorflow as tf 
    from tensorflow import keras
    
    import numpy as np 
    import pandas as pd 
    import matplotlib.pyplot as plt
    mnist = keras.datasets.fashion_mnist
    (X_train, y_train),(X_test,y_test) = mnist.load_data()
    
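    # The printed labels are left in Chinese so they match the log output shown later
    print("训练数据形状," , X_train.shape)   # label: "training data shape"
    print("数据最大值 " , np.max(X_train))   # label: "maximum data value"
    print("查看标签数值 " , y_train)         # label: "label values"

    class_names =['top','trouser','pullover','dress','coat','sandal','shirt','sneaker','bag','ankle boot']  # names of the 10 classes

    plt.figure()  # visualize one sample
    plt.imshow(X_train[1])  # change the index to display a different image
    plt.colorbar()  # add a colorbar
    plt.show()

    # Normalize the data: scale pixel values from 0-255 down to 0-1
    X_train = X_train/255.0
    X_test = X_test/255.0
    plt.figure()  # visualize the same sample again
    plt.imshow(X_train[1])  # change the index to display a different image
    plt.colorbar()  # add a colorbar
    plt.show()

    # The colorbar now shows values scaled to between 0 and 1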
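    from tensorflow.keras.models import Sequential  # the model container
    from tensorflow.keras.layers import Flatten, Dense  # the two layer types used below


    model = Sequential()
    model.add(Flatten(input_shape = (28,28)))  # flatten each 28x28 image into a 784-element vector
    model.add(Dense(128,activation = 'relu'))  # hidden layer: 128 units with ReLU activation (other activations such as sigmoid also work)
    # If this is unfamiliar, see lecture 3.6 of Andrew Ng's deep learning course
    model.add(Dense(10,activation = 'softmax'))  # output layer: 10 classes, so 10 units with softmax activation

    # model.summary() prints the table itself and returns None, so this line prints "None" after the table
    print("查看自己写的代码的总体参数 " , model.summary())  # label: "overall model parameters"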
    
    
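    # Configure training: optimizer, loss function and metric
    model.compile(optimizer='adam',loss='sparse_categorical_crossentropy',metrics=['accuracy'])

    # The optimizer is Adam and the loss is sparse categorical cross-entropy
    model.fit(X_train,y_train,epochs = 10)  # train; epochs is the number of passes over the training set

    test_loss, test_acc = model.evaluate(X_test,y_test)  # measure the trained model's accuracy on the test set
    print(test_acc)

    # Cross-check the accuracy with scikit-learn
    from sklearn.metrics import accuracy_score
    # predict_classes works in TensorFlow 2.1 but was removed in later releases;
    # np.argmax(model.predict(X_test), axis=-1) gives the same class indices
    y_pred = model.predict_classes(X_test)

    print(accuracy_score(y_test, y_pred))

    # Prints True if TensorFlow can use a GPU (deprecated in favor of tf.config.list_physical_devices('GPU'))
    print(tf.test.is_gpu_available())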
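
    The parameter counts that model.summary() reports for this network can be checked by hand: a Dense layer has one weight per input-unit pair plus one bias per unit. The short sketch below works through that arithmetic; the helper name dense_params is made up for illustration.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


flatten_out = 28 * 28                     # Flatten turns a 28x28 image into 784 values
hidden = dense_params(flatten_out, 128)   # first Dense layer
output = dense_params(128, 10)            # softmax output layer
print(flatten_out, hidden, output, hidden + output)  # 784 100480 1290 101770
```

    These values agree with the Param # column and the total of 101,770 in the summary table of the run log.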
    
    
    The run on the GPU succeeded. The output:
    
    PS F:\vscode-python-kiton> & D:/ruanjian/anaconda202002/python.exe f:/vscode-python-kiton/数学/TensorFlow/pt2.py
    2020-05-29 00:33:44.308803: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
    训练数据形状, (60000, 28, 28)
    数据最大值  255
    查看标签数值  [9 0 0 ... 3 0 5]
    2020-05-29 00:33:50.532487: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
    2020-05-29 00:33:50.557648: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
    coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
    2020-05-29 00:33:50.561284: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
    2020-05-29 00:33:50.568473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
    2020-05-29 00:33:50.573147: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
    2020-05-29 00:33:50.575965: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
    2020-05-29 00:33:50.581990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
    2020-05-29 00:33:50.585757: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
    2020-05-29 00:33:50.593746: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2020-05-29 00:33:50.596544: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
    2020-05-29 00:33:50.598143: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
    2020-05-29 00:33:50.601011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
    coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
    2020-05-29 00:33:50.605562: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
    2020-05-29 00:33:50.608013: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
    2020-05-29 00:33:50.609933: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
    2020-05-29 00:33:50.612253: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
    2020-05-29 00:33:50.614185: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
    2020-05-29 00:33:50.616119: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
    2020-05-29 00:33:50.617990: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2020-05-29 00:33:50.619995: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
    2020-05-29 00:33:51.071264: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
    2020-05-29 00:33:51.073211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
    2020-05-29 00:33:51.074427: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
    2020-05-29 00:33:51.075876: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
    Model: "sequential"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    flatten (Flatten)            (None, 784)               0
    _________________________________________________________________
    dense (Dense)                (None, 128)               100480
    _________________________________________________________________
    dense_1 (Dense)              (None, 10)                1290
    =================================================================
    Total params: 101,770
    Trainable params: 101,770
    Non-trainable params: 0
    _________________________________________________________________
    查看自己写的代码的总体参数  None
    Train on 60000 samples
    Epoch 1/10
    2020-05-29 00:33:51.467932: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
    60000/60000 [==============================] - 2s 41us/sample - loss: 0.5001 - accuracy: 0.8269
    Epoch 2/10
    60000/60000 [==============================] - 2s 33us/sample - loss: 0.3769 - accuracy: 0.8647
    Epoch 3/10
    60000/60000 [==============================] - 2s 34us/sample - loss: 0.3376 - accuracy: 0.8768
    Epoch 4/10
    60000/60000 [==============================] - 2s 34us/sample - loss: 0.3126 - accuracy: 0.8848
    Epoch 5/10
    60000/60000 [==============================] - 2s 33us/sample - loss: 0.2953 - accuracy: 0.8902
    Epoch 6/10
    60000/60000 [==============================] - 2s 33us/sample - loss: 0.2818 - accuracy: 0.8956
    Epoch 7/10
    60000/60000 [==============================] - 2s 33us/sample - loss: 0.2693 - accuracy: 0.9008
    Epoch 8/10
    60000/60000 [==============================] - 2s 33us/sample - loss: 0.2591 - accuracy: 0.9031
    Epoch 9/10
    60000/60000 [==============================] - 2s 33us/sample - loss: 0.2496 - accuracy: 0.9071
    Epoch 10/10
    60000/60000 [==============================] - 2s 33us/sample - loss: 0.2408 - accuracy: 0.9107
    10000/10000 [==============================] - 0s 33us/sample - loss: 0.3349 - accuracy: 0.8823
    0.8823
    0.8823
    WARNING:tensorflow:From f:/vscode-python-kiton/数学/TensorFlow/pt2.py:70: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
    Instructions for updating:
    Use `tf.config.list_physical_devices('GPU')` instead.
    2020-05-29 00:34:12.313519: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties:
    pciBusID: 0000:01:00.0 name: GeForce RTX 2060 computeCapability: 7.5
    coreClock: 1.755GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 312.97GiB/s
    2020-05-29 00:34:12.318078: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
    2020-05-29 00:34:12.319828: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
    2020-05-29 00:34:12.322225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
    2020-05-29 00:34:12.324215: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
    2020-05-29 00:34:12.325950: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
    2020-05-29 00:34:12.327717: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
    2020-05-29 00:34:12.329492: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
    2020-05-29 00:34:12.332276: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
    2020-05-29 00:34:12.333684: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
    2020-05-29 00:34:12.335506: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0
    2020-05-29 00:34:12.336623: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N
    2020-05-29 00:34:12.337972: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/device:GPU:0 with 4604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
    True
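
    The deprecation warning in the log above recommends tf.config.list_physical_devices('GPU') in place of tf.test.is_gpu_available(). A minimal sketch of that check follows; the wrapper function gpu_names is an illustrative name, and the import is guarded so the snippet also runs where TensorFlow is missing.

```python
def gpu_names():
    """Return the names of GPUs visible to TensorFlow, or None if TensorFlow cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return [device.name for device in tf.config.list_physical_devices("GPU")]


names = gpu_names()
if names is None:
    print("TensorFlow is not installed in this environment")
elif names:
    print("GPUs visible to TensorFlow:", names)
else:
    print("TensorFlow is installed but sees no GPU")
```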
    
    

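    The computation ran on the GPU. The installation steps are described next.

    Installation and configuration steps

    1. Download and install Anaconda (an open-source Python distribution; the current release ships Python 3.7 and the installer is about 466 MB)

    If the add-to-PATH option is selected during setup, the installer adds the relevant directories to the PATH environment variable, so you can type python in a terminal (cmd or PowerShell) to check the installation. For the installation procedure, see the Anaconda installation tutorial in the references.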

    C:\Users\Administrator>python
    Python 3.7.6 (default, Jan  8 2020, 20:23:39) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
    

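    2. Upgrade pip to the latest version (the author recommends a version newer than 20.0; TensorFlow's install guide asks for pip newer than 19.0)

    In a terminal (cmd or PowerShell), enter:
    python -m pip install --upgrade pip
    Then check the pip version: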
    pip --version

    C:\Users\Administrator>pip --version
    pip 20.2b1 from D:\ruanjian\anaconda202002\lib\site-packages\pip-20.2b1-py3.7.egg\pip (python 3.7)
    

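    3. Install the GPU build of TensorFlow. Version 2.1 is used here (the author ran into compatibility problems with the 2.2 GPU build)

    pip install tensorflow-gpu==2.1
    Wait for the download and installation to finish (roughly 300+ MB). Then start Python in a terminal and check that TensorFlow is installed, along with its version number and install path (the output shown is from the author's fully configured machine). Enter the following commands one at a time: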

    python
    import tensorflow as tf
    tf.__version__
    tf.__path__
    

    The output shows that the installation is complete:

    
    >>> import tensorflow as tf
    2020-05-29 13:49:55.894087: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
    >>> tf.__version__
    '2.1.0'
    >>> tf.__path__
    ['C:\\Users\\Administrator\\AppData\\Roaming\\Python\\Python37\\site-packages\\tensorflow']
    >>>
    

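    4. Download and install the matching CUDA version. CUDA 10.1 is used here

    On the download page select Windows, x86_64, 10 and the local installer, then click Download (2.5 GB). Once the download finishes, run the installer.
    [Figure: CUDA 10.1]
    To check the installation, enter the following command in a terminal. deviceQuery is one of the CUDA sample programs, so the command works as shown only if the folder containing deviceQuery.exe is on PATH: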

    deviceQuery

    C:\Users\Administrator>deviceQuery
    deviceQuery Starting...
    
     CUDA Device Query (Runtime API) version (CUDART static linking)
    
    Detected 1 CUDA Capable device(s)
    
    Device 0: "GeForce RTX 2060"
      CUDA Driver Version / Runtime Version          10.2 / 10.1
      CUDA Capability Major/Minor version number:    7.5
      Total amount of global memory:                 6144 MBytes (6442450944 bytes)
      (30) Multiprocessors, ( 64) CUDA Cores/MP:     1920 CUDA Cores
      GPU Max Clock rate:                            1755 MHz (1.75 GHz)
      Memory Clock rate:                             7001 Mhz
      Memory Bus Width:                              192-bit
      L2 Cache Size:                                 3145728 bytes
      Maximum Texture Dimension Size (x,y,z)         1D=(131072), 2D=(131072, 65536), 3D=(16384, 16384, 16384)
      Maximum Layered 1D Texture Size, (num) layers  1D=(32768), 2048 layers
      Maximum Layered 2D Texture Size, (num) layers  2D=(32768, 32768), 2048 layers
      Total amount of constant memory:               zu bytes
      Total amount of shared memory per block:       zu bytes
      Total number of registers available per block: 65536
      Warp size:                                     32
      Maximum number of threads per multiprocessor:  1024
      Maximum number of threads per block:           1024
      Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
      Max dimension size of a grid size    (x,y,z): (2147483647, 65535, 65535)
      Maximum memory pitch:                          zu bytes
      Texture alignment:                             zu bytes
      Concurrent copy and kernel execution:          Yes with 3 copy engine(s)
      Run time limit on kernels:                     Yes
      Integrated GPU sharing Host Memory:            No
      Support host page-locked memory mapping:       Yes
      Alignment requirement for Surfaces:            Yes
      Device has ECC support:                        Disabled
      CUDA Device Driver Mode (TCC or WDDM):         WDDM (Windows Display Driver Model)
      Device supports Unified Addressing (UVA):      Yes
      Device supports Compute Preemption:            Yes
      Supports Cooperative Kernel Launch:            No
      Supports MultiDevice Co-op Kernel Launch:      No
      Device PCI Domain ID / Bus ID / location ID:   0 / 1 / 0
      Compute Mode:
         < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
    
    deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 10.2, CUDA Runtime Version = 10.1, NumDevs = 1, Device0 = GeForce RTX 2060
    Result = PASS
    
    
    You can also check with nvcc -V:

    nvcc -V

    C:\Users\Administrator>nvcc -V
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2019 NVIDIA Corporation
    Built on Sun_Jul_28_19:12:52_Pacific_Daylight_Time_2019
    Cuda compilation tools, release 10.1, V10.1.243
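
    The release number on the last line of this output is the value that has to match the CUDA version TensorFlow expects. As a rough sketch, it could be extracted from a script like this; the function names are made up for illustration, and the regular expression assumes the output format shown above.

```python
import re
import subprocess


def parse_nvcc_release(text: str):
    """Extract (release, full_version) from `nvcc -V` output, or None if absent."""
    match = re.search(r"release\s+(\d+\.\d+),\s+V(\d+\.\d+\.\d+)", text)
    return (match.group(1), match.group(2)) if match else None


def installed_cuda_release():
    """Run `nvcc -V` if it is on PATH; return None when it is missing."""
    try:
        result = subprocess.run(["nvcc", "-V"], stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, universal_newlines=True)
    except OSError:
        return None
    return parse_nvcc_release(result.stdout)


sample = "Cuda compilation tools, release 10.1, V10.1.243"
print(parse_nvcc_release(sample))  # ('10.1', '10.1.243')
```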
    

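    5. Download cuDNN. cuDNN 7.6 is used here

    cuDNN is an add-on library for CUDA and is much simpler to install: extract the downloaded archive and copy its contents into the CUDA directory. The steps are:
    Download cuDNN 7.6.4 for CUDA 10.1 (marked with a red box in the original screenshot)
    [Figure: cuDNN 7.6.4 for CUDA 10.1]
    Then select the Windows 10 package
    [Figure: cuDNN Library for Windows 10]
    After downloading, extract the archive and copy the bin, include and lib folders into C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1. Windows merges them with the existing folders of the same name.
    [Figure: extracted cuDNN archive]
    [Figure: copying the cuDNN files into CUDA 10.1]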
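
    The copy-and-merge step can also be scripted. The sketch below is one possible way to do it and is not from the original article: merge_cudnn is a made-up helper that copies every file under the three folders into the matching folder of the CUDA directory, creating subfolders as needed. Writing into C:\Program Files normally requires an elevated (administrator) prompt.

```python
import os
import shutil


def merge_cudnn(cudnn_root: str, cuda_root: str, subdirs=("bin", "include", "lib")) -> list:
    """Copy every file under cudnn_root/<subdir> into cuda_root/<subdir>, keeping the layout."""
    copied = []
    for subdir in subdirs:
        source_dir = os.path.join(cudnn_root, subdir)
        # os.walk yields nothing if source_dir does not exist, so missing folders are skipped
        for folder, _, files in os.walk(source_dir):
            relative = os.path.relpath(folder, cudnn_root)
            target_dir = os.path.join(cuda_root, relative)
            os.makedirs(target_dir, exist_ok=True)
            for name in files:
                shutil.copy2(os.path.join(folder, name), os.path.join(target_dir, name))
                copied.append(os.path.join(relative, name))
    return copied
```

    A call would pass the extracted cuDNN folder (wherever it was unpacked) as the first argument and the v10.1 directory named above as the second; the function returns the relative paths of the files it copied.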

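    6. Set environment variables: add the CUDA directories to PATH

    Add the CUDA directories to PATH, otherwise problems may occur. The entries to add to the system PATH variable are:
    [Figure: entries to add to the system PATH variable]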
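
    One way to confirm that PATH is set up correctly is to look for the DLLs that TensorFlow reported loading in the run log earlier. The sketch below does that search; the helper name missing_from_path is made up, and the DLL list is taken from the "Successfully opened dynamic library" log lines, so it applies to this CUDA 10.1 / cuDNN 7 setup only.

```python
import os

# DLL names taken from the "Successfully opened dynamic library" log lines.
EXPECTED_DLLS = [
    "cudart64_101.dll", "cublas64_10.dll", "cufft64_10.dll", "curand64_10.dll",
    "cusolver64_10.dll", "cusparse64_10.dll", "cudnn64_7.dll",
]


def missing_from_path(names, path_value=None):
    """Return the names not found in any directory of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    directories = [d for d in path_value.split(os.pathsep) if d]
    return [name for name in names
            if not any(os.path.isfile(os.path.join(d, name)) for d in directories)]


print("Not found on PATH:", missing_from_path(EXPECTED_DLLS))
```

    An empty list means every library is reachable. Any name that is printed points to a directory still missing from PATH or a file that was not copied.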

    This completes the TensorFlow GPU environment setup on Windows 10.


    Testing

    1. Check the GPU status

    Use the nvidia-smi (NVSMI) command to view the driver version, CUDA version and other information:

    nvidia-smi

    C:\Users\Administrator>nvidia-smi
    Fri May 29 15:04:54 2020
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 441.22       Driver Version: 441.22       CUDA Version: 10.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |===============================+======================+======================|
    |   0  GeForce RTX 2060   WDDM  | 00000000:01:00.0  On |                  N/A |
    |  0%   43C    P8     7W / 175W |    880MiB /  6144MiB |      2%      Default |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                       GPU Memory |
    |  GPU       PID   Type   Process name                             Usage      |
    |=============================================================================|
    |    0      1164    C+G   Insufficient Permissions                   N/A      |
    |    0      4296    C+G   C:\Windows\explorer.exe                    N/A      |
    |    0      4504    C+G   ...al\Google\Chrome\Application\chrome.exe N/A      |
    |    0      5052    C+G   ...t_cw5n1h2txyewy\ShellExperienceHost.exe N/A      |
    |    0      5204    C+G   ...dows.Cortana_cw5n1h2txyewy\SearchUI.exe N/A      |
    |    0      6440    C+G   ...hell.Experiences.TextInput.InputApp.exe N/A      |
    |    0     12332    C+G   ...rogram Files\Microsoft VS Code\Code.exe N/A      |
    |    0     14236    C+G   ...oftEdge_8wekyb3d8bbwe\MicrosoftEdge.exe N/A      |
    |    0     15860    C+G   ...rosoft Office\root\Office16\WINWORD.EXE N/A      |
    +-----------------------------------------------------------------------------+
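
    The "CUDA Version: 10.2" in the banner is the highest CUDA version the installed driver supports, not the toolkit version; the toolkit here is 10.1, which is consistent with the "10.2 / 10.1" driver/runtime pair that deviceQuery printed. A small sketch for pulling both numbers out of the banner line follows; parse_smi_header is a made-up helper name and the pattern assumes the banner format shown above.

```python
import re


def parse_smi_header(line: str):
    """Return (driver_version, cuda_version) from the nvidia-smi banner line, or None."""
    match = re.search(r"Driver Version:\s*(\S+)\s+CUDA Version:\s*(\S+)", line)
    return match.groups() if match else None


banner = "| NVIDIA-SMI 441.22       Driver Version: 441.22       CUDA Version: 10.2     |"
driver, cuda = parse_smi_header(banner)
print(driver, cuda)  # 441.22 10.2
```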
    

    2. Compare GPU and CPU speed: measuring the CPU/GPU gap in TensorFlow

    The benchmark code is omitted here; the results are:
    ******************************************************
    Comparison at size 1500
    ******************************************************
    ----------------------
    GPU
    -----------------------
    Shape: (1500, 1500) Device: /gpu:0
    Time taken: 0:00:00.958767
    ------------------------------
    CPU
    ---------------------------
    Shape: (1500, 1500) Device: /cpu:0
    Time taken: 0:00:00.601363
    
    ******************************************************
    Comparison at size 15000
    ******************************************************
    ----------------------
    GPU
    -----------------------
    Shape: (15000, 15000) Device: /gpu:0
    Time taken: 0:00:02.584088
    ------------------------------
    CPU
    ---------------------------
    Shape: (15000, 15000) Device: /cpu:0
    Time taken: 0:00:13.458996
    
    
    
    ******************************************************
    Comparison at size 20000
    ******************************************************
    ------------------------------
    GPU
    ---------------------------
    1999980200000.0
    
    Shape: (20000, 20000) Device: /gpu:0
    Time taken: 0:00:05.113321
    ----------------------
    CPU
    -----------------------
    
    2000095700000.0
    
    Shape: (20000, 20000) Device: /cpu:0
    Time taken: 0:00:32.852118
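
    The article does not include the benchmark script. The sketch below is an assumption about how numbers like these could be produced in TensorFlow 2.x, not the author's code: it multiplies a random n x n matrix by its transpose on a chosen device, sums the result and times the call. The function names are made up for illustration, and TensorFlow is imported lazily so the timing helper can be used on its own.

```python
import time


def time_call(fn):
    """Run fn() once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def matmul_sum(n: int, device: str) -> float:
    """Multiply a random n x n matrix by its transpose on `device` and return the sum."""
    import tensorflow as tf  # imported lazily so time_call works without TensorFlow
    with tf.device(device):
        matrix = tf.random.uniform((n, n), minval=0, maxval=1)
        product = tf.matmul(matrix, tf.transpose(matrix))
        # .numpy() forces the computation to finish before the timer stops
        return float(tf.reduce_sum(product).numpy())


def run_benchmark(sizes=(1500, 15000), devices=("/gpu:0", "/cpu:0")):
    """Print one timing line per (size, device) pair; needs TensorFlow and a GPU."""
    for n in sizes:
        for device in devices:
            total, seconds = time_call(lambda: matmul_sum(n, device))
            print("Shape:", (n, n), "Device:", device, "sum:", total,
                  "Time taken: %.3f s" % seconds)
```

    Calling run_benchmark() on a configured machine prints lines similar in shape to the results above. The first GPU call usually includes one-time start-up cost such as loading the CUDA libraries, which is a likely reason the smallest case looks slower on the GPU.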
    
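    Judging by the run times, the CPU can be faster at small problem sizes (0.60 s against 0.96 s at size 1500), while the GPU has a clear advantage at large ones (about 5 times faster at size 15000 and about 6 times faster at size 20000). For a small training set you can therefore skip the GPU and run on the CPU only, by adding the following Python code before importing TensorFlow:
    import os
    os.environ['CUDA_VISIBLE_DEVICES'] = '-1'  # hide the GPU so TensorFlow falls back to the CPU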
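
    The order matters here: the variable has to be set before TensorFlow is imported for the first time in the process. The sketch below wraps the setting in a function and then checks its effect; hide_gpus is a made-up name, and the TensorFlow import is guarded so the snippet also runs where TensorFlow is not installed.

```python
import os


def hide_gpus():
    """Hide all CUDA devices from libraries imported after this call."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


hide_gpus()  # must run before TensorFlow is imported for the first time

try:
    import tensorflow as tf
    # With the GPUs hidden, this list is expected to be empty
    print("GPUs visible:", tf.config.list_physical_devices("GPU"))
except ImportError:
    print("TensorFlow is not installed; only the environment variable was set")
```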
    

    Thanks for reading, and enjoy getting started with deep learning!

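    References:

    Anaconda installation tutorial (Anaconda的安装教程)

    Installing and configuring CUDA and cuDNN on Windows (windows下安装配置cudn和cudnn)

    CUDA and cuDNN (CUDA与cuDNN)

    Getting into TensorFlow, step 12: measuring the speed gap between CPU and GPU (走进tensorflow第十二步——测试cpu和gpu的速度差距)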

        Article link: https://www.haomeiwen.com/subject/lkpaahtx.html