tensorflow-3


Author: 我吃我样 | Published: 2018-12-27 13:49

checkpoint

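By now you can write the code hands-on and understand how building a network, training it, and evaluating and testing it are implemented. Common models so far: the linear regression model, and softmax applied to a multi-class classification model.

Next, implement a convolutional neural network (commonly used in image processing), using the GPU version of tensorflow.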

outline

GPU version of tensorflow

Convolutional neural network

GPU version of tensorflow

(In fact, the tensorflow Chinese community version lags behind the English one.) tensorflow can now be installed directly with pip, and it is fast.

pip install --upgrade pip

CPU version

pip install tensorflow # Python 2.7; CPU support (no GPU support)

GPU version

pip install tensorflow-gpu # Python 2.7; GPU support

You can of course also use the Tsinghua mirror

pip install -i https://pypi.tuna.tsinghua.edu.cn/simple tensorflow

# Check that the CPU version is installed correctly

>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
2018-04-11 18:40:20.509133: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
>>> sess.run(hello)
'Hello, TensorFlow!'
>>> a = tf.constant(10)
>>> b = tf.constant(20)
>>> sess.run(a+b)
30

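# Before installing the GPU version you need to install cuda and download the cudnn library. If you are not sure whether they are installed, you can install the GPU version first and try it; if they are missing, import fails.

tensorflow then cannot find the corresponding library files and reports an error like the one below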

ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
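Before importing tensorflow, one way to see which CUDA libraries the dynamic loader can actually find is to try opening them with ctypes. This is only a minimal sketch: the helper name can_load is made up for illustration, and the library names in the loop are assumptions for a cuda 9.0 / cudnn 7 setup, so replace them with whatever name your own ImportError mentions.

```python
import ctypes

def can_load(lib_name):
    """Return True if the dynamic loader can open lib_name (hypothetical helper)."""
    try:
        ctypes.CDLL(lib_name)
        return True
    except OSError:
        return False

# Assumed names for cuda 9.0 / cudnn 7; use the name from your own error message.
for name in ["libcublas.so.9.0", "libcudart.so.9.0", "libcudnn.so.7"]:
    print("%s: %s" % (name, "found" if can_load(name) else "NOT found"))
```

A library reported as NOT found here is a likely candidate for the name in the ImportError.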

Process

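My environment: tensorflow 1.7, cuda 9, cudnn 7.0

1. Install cuda. tensorflow 1.7 needs cuda 9.0 (the version named in the import error is the version you are missing). Download the matching cuda run file from the NVIDIA website. The file name in the commands below comes from a 9.1 download; substitute the run file for the version you need.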

chmod 777 cuda_9.1.85_387.26_linux.run

sudo sh cuda_9.1.85_387.26_linux.run

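Do not install the graphics driver that the installer offers. Its compatibility is poor and it can easily break the existing driver, which leads to a login loop (when that happened to me I gave up and reinstalled the system).

The default install directory is /usr/local/cuda-9.0, and a symbolic link /usr/local/cuda is created automatically; you can choose not to create it.

2. Download the matching version of cudnn. On the cudnn downloads page, pick an entry of the form cuDNN v7.1.2 Library for Linux.

After unpacking, copy the files into the corresponding folders of the cuda install directory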

tar xvzf xxxx.tgz

cp Downloads/cuda/include/cudnn.h cuda-9.0/include/

cp Downloads/cuda/lib64/libcudnn* cuda-9.0/lib64/
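After copying, it can be worth confirming which cudnn version the header actually declares. The sketch below assumes the cuDNN 7 header layout, where cudnn.h defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL; the function name and the sample text are illustrative, not part of any library.

```python
import re

def cudnn_version_from_header(header_text):
    """Extract (major, minor, patch) from the text of cudnn.h, or None if absent."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(r"#define\s+%s\s+(\d+)" % key, header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)

# Illustrative header fragment; a real run would read the copied cudnn.h instead.
sample = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 0\n#define CUDNN_PATCHLEVEL 5\n"
print(cudnn_version_from_header(sample))  # (7, 0, 5)
```

To check a real install, pass the contents of the copied file, for example open("cuda-9.0/include/cudnn.h").read().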

3. Set the environment variables

Use vim to add the export statements below to ~/.bashrc, save, then run source on it

export PYTHONPATH=$PYTHONPATH:/home/ceo1207/cuda-9.0/lib64

export PATH="$PATH:/home/ceo1207/cuda-9.0/bin"

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/home/ceo1207/cuda-9.0/lib64:/home/ceo1207/cuda-9.0/extras/CUPTI/lib64"

export LIBRARY_PATH=$LIBRARY_PATH:/home/ceo1207/cuda-9.0/lib64

export CUDA_HOME=/home/ceo1207/cuda-9.0

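4. Normally, at this point import tensorflow works. This part, however, cost me most of a day. I use the PyCharm IDE, and when it is started from the desktop shortcut it does not inherit the bash variables, so the environment variables added to .bashrc are not applied and import cannot find the cuda libs. The environment is inherited correctly only when PyCharm is started from bash with pycharmdir/bin/pycharm.sh.

Only then does an error like this one go away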

ImportError: libcublas.so.8.0: cannot open shared object file: No such file or directory
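To tell whether the IDE really inherited the variables, a small check can be run from inside it. This is a sketch only: the helper name is made up, and the directory is the one from the export lines above, so adjust it to your install location.

```python
import os

def missing_from_path_var(var_name, required_dirs, environ=None):
    """Return the entries of required_dirs absent from a ':'-separated variable."""
    environ = os.environ if environ is None else environ
    present = [p for p in environ.get(var_name, "").split(":") if p]
    return [d for d in required_dirs if d not in present]

# Directory taken from the export lines above; adjust to your install location.
cuda_lib = "/home/ceo1207/cuda-9.0/lib64"
print(missing_from_path_var("LD_LIBRARY_PATH", [cuda_lib]))
```

An empty list means the directory is on LD_LIBRARY_PATH in the current process; a non-empty list printed from inside the IDE but not from a bash-started python points to the inheritance problem described above.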

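Also, if running tf reports the error below, the cudnn version is wrong: the cudnn library loaded at runtime must match the version tensorflow was compiled against (the "source" in the message)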

Loaded runtime CuDNN library: 7102 (compatibility version 7100) but source was compiled with 7005 (compatibility version 7000)
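The numbers in this message can be decoded by hand. The sketch below assumes the cuDNN 7 encoding major*1000 + minor*100 + patch, and assumes the "compatibility version" is the number rounded down to the hundreds, which is what the values in the log suggest; both function names are illustrative.

```python
def decode_cudnn_version(number):
    """Split a cuDNN version integer such as 7102 into (major, minor, patch)."""
    return (number // 1000, (number % 1000) // 100, number % 100)

def compatibility_version(number):
    """Round down to the hundreds, matching the 'compatibility version' in the log."""
    return number // 100 * 100

loaded, compiled = 7102, 7005  # values from the error message above
print(decode_cudnn_version(loaded))    # (7, 1, 2)
print(decode_cudnn_version(compiled))  # (7, 0, 5)
print(compatibility_version(loaded) == compatibility_version(compiled))  # False
```

Read this way, the message says a 7.1.2 library was loaded while the tensorflow binary was built against 7.0.5, so a 7.0.x cudnn download is the one to install.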

some notes

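# How can I tell whether tensorflow is using GPU acceleration?

Check the messages printed to the console at run time

successfully opened CUDA library libcublas.so locally (the GPU version is in use)

Alternatively, turn on device placement logging when creating the session:

tf.Session(config=tf.ConfigProto(log_device_placement=True))

The log then shows whether each individual op runs on the cpu or the gpu,

with device labels such as cpu:0 or gpu:0, gpu:1, gpu:2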

2018-04-11 09:50:12.907953: I tensorflow/core/common_runtime/placer.cc:884] mul: (Mul)/job:localhost/replica:0/task:0/device:CPU:0
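When there are many ops, the placement lines can be summarized with a small parser. The sketch below assumes every placement line has the same shape as the one above ("op name: (op type)device"); the pattern and function name are illustrative and not part of tensorflow.

```python
import re

# Pattern for "<op name>: (<op type>)<device>", as in the log line above.
PLACEMENT = re.compile(r"(\S+): \((\w+)\)\s*(/job:\S+)")

def parse_placement(line):
    """Return (op_name, op_type, device) from a placement log line, or None."""
    match = PLACEMENT.search(line)
    return match.groups() if match else None

line = ("2018-04-11 09:50:12.907953: I tensorflow/core/common_runtime/placer.cc:884] "
        "mul: (Mul)/job:localhost/replica:0/task:0/device:CPU:0")
print(parse_placement(line))
```

Counting how many parsed devices end in CPU:0 versus GPU:0 gives a quick picture of where the graph runs.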

# Check the tf install path and version number

tf.__path__

tf.__version__

# How to uninstall cuda

cd /usr/local/cuda-8.0/bin

Run the uninstall script in that directory

# How to uninstall tf

pip uninstall tensorflow

pip uninstall tensorflow-gpu

Install a specific version of tf

pip install tensorflow==1.4

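pip options:

-U (upgrade)

--user installs into the user directory, so pip install works without root privileges

# How to install a .deb

dpkg -i <deb file name>

# How to install a .whl

pip install xx.whl (add -U if a lower version is already installed)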

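Convolutional neural network

With the GPU version finally done, it is time for the long-promised convolutional neural network.

This should be familiar ground by now. With a GPU, the number of training iterations can go up to the order of 100,000. Compared with the earlier networks, the new parts are building the convolutional layers and the pooling layers.

1. Decide the input and the ground truth

2. Decide the network structure

3. Decide the loss and the optimization method

4. Evaluate and test

5. Before running, remember that Variables need to be initialized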

# first version: no dropout, learning rate 0.01
import tensorflow as tf
import input_data

# use conv layers to recognize hand-written digits

def weightVariable(shape):
    init = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init)

def biasVariable(shape):
    init = tf.constant(0.1, shape=shape)
    return tf.Variable(init)

input = tf.placeholder(tf.float32, shape=[None, 784])
truth = tf.placeholder(tf.float32, shape=[None, 10])

# set up the network
# conv1 variable
filter1 = weightVariable([5,5,1,32])
# batchsize height width channels
inputImage = tf.reshape(input, [-1, 28, 28, 1])
conv1 = tf.nn.conv2d(inputImage, filter1, strides=[1,1,1,1], padding="SAME")
conv1 = tf.nn.relu(conv1+biasVariable([32]))
pool1 = tf.nn.max_pool(conv1, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# conv2 variable
filter2 = weightVariable([5,5,32,64])
conv2 = tf.nn.conv2d(pool1, filter2, strides=[1,1,1,1], padding="SAME")
conv2 = tf.nn.relu(conv2+biasVariable([64]))
pool2 = tf.nn.max_pool(conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# fully connected
pool2Flat = tf.reshape(pool2, [-1, 7*7*64])
w1 = weightVariable([7*7*64,1024])
b1 = biasVariable([1024])
fc1 = tf.nn.relu(tf.matmul(pool2Flat,w1)+b1)
w2 = weightVariable([1024,10])
b2 = biasVariable([10])
fc2 = tf.nn.relu(tf.matmul(fc1,w2)+b2)
output = tf.nn.softmax(fc2)

# train
loss = -tf.reduce_sum(truth*tf.log(output))
train = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# test
result = tf.equal(tf.argmax(truth,1),tf.argmax(output,1))
accuracy = tf.reduce_mean(tf.cast(result,tf.float32))

sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

mnist = input_data.read_data_sets('data/', one_hot=True)
for i in range(100000):
    batch = mnist.train.next_batch(50)
    sess.run(train, feed_dict={input:batch[0],truth:batch[1]})
    if i%100 == 0:
        # accuracy on the current training batch
        print(sess.run(accuracy, feed_dict={input:batch[0],truth:batch[1]}))
sess.close()
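The 7*7*64 used when flattening pool2 follows from the layer settings: with padding="SAME" the spatial output size is the input size divided by the stride, rounded up. A small sketch of that arithmetic (the helper name is illustrative):

```python
import math

def same_padding_output(size, stride):
    """Spatial output size of a conv or pool layer with padding="SAME"."""
    return int(math.ceil(float(size) / stride))

size = 28                              # MNIST images are 28x28
size = same_padding_output(size, 1)    # conv1, stride 1 -> 28
size = same_padding_output(size, 2)    # pool1, stride 2 -> 14
size = same_padding_output(size, 1)    # conv2, stride 1 -> 14
size = same_padding_output(size, 2)    # pool2, stride 2 -> 7
print(size * size * 64)                # 3136, i.e. 7*7*64 inputs to the first fully connected layer
```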

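# note

· Picking a suitable step size (learning rate) matters: too large and it easily overshoots a local optimum; too small and convergence is slow and it easily gets stuck in a local optimum

I started with 0.01 and ran 100,000 iterations, and accuracy stayed at 10-20% the whole time; only with 1e-4 did it behave normally

· If the network is poor, more iterations will not help

· Before adding Dropout, the evaluated accuracy was only about 60-80%; after adding it, accuracy jumped straight to 99%. Dropout helps a lot in preventing overfitting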
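The step-size point in the first note can be seen on a toy problem. The sketch below minimizes f(x) = x**2 with plain gradient descent; it is not the MNIST network and says nothing about which rate that network needs, it only illustrates overshooting versus slow convergence.

```python
def gradient_descent(learning_rate, steps=50, start=1.0):
    """Minimize f(x) = x**2 with plain gradient descent and return the final x."""
    x = start
    for _ in range(steps):
        x -= learning_rate * 2.0 * x  # derivative of x**2 is 2x
    return x

for lr in (1.1, 0.4, 1e-4):
    print("lr=%g -> x=%g" % (lr, gradient_descent(lr)))
```

With 1.1 every step overshoots the minimum and x grows in magnitude, with 0.4 x shrinks quickly toward 0, and with 1e-4 x has barely moved from 1.0 after 50 steps.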

Appendix: final version

import tensorflow as tf
import input_data

# use conv layers to recognize hand-written digits

def weightVariable(shape):
    init = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(init)

def biasVariable(shape):
    init = tf.constant(0.1, shape=shape)
    return tf.Variable(init)

input = tf.placeholder(tf.float32, shape=[None, 784])
truth = tf.placeholder(tf.float32, shape=[None, 10])

# set up the network
# conv1 variable
filter1 = weightVariable([5,5,1,32])
# batchsize height width channels
inputImage = tf.reshape(input, [-1, 28, 28, 1])
conv1 = tf.nn.conv2d(inputImage, filter1, strides=[1,1,1,1], padding="SAME")
conv1 = tf.nn.relu(conv1+biasVariable([32]))
pool1 = tf.nn.max_pool(conv1, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# conv2 variable
filter2 = weightVariable([5,5,32,64])
conv2 = tf.nn.conv2d(pool1, filter2, strides=[1,1,1,1], padding="SAME")
conv2 = tf.nn.relu(conv2+biasVariable([64]))
pool2 = tf.nn.max_pool(conv2, ksize=[1,2,2,1], strides=[1,2,2,1], padding="SAME")

# fully connected
pool2Flat = tf.reshape(pool2, [-1, 7*7*64])
w1 = weightVariable([7*7*64,1024])
b1 = biasVariable([1024])
fc1 = tf.nn.relu(tf.matmul(pool2Flat,w1)+b1)
# dropout: dropPlace is the keep probability
dropPlace = tf.placeholder(tf.float32)
fc1Drop = tf.nn.dropout(fc1, dropPlace)
w2 = weightVariable([1024,10])
b2 = biasVariable([10])
fc2 = tf.nn.relu(tf.matmul(fc1Drop,w2)+b2)
output = tf.nn.softmax(fc2)

# train
loss = -tf.reduce_sum(truth*tf.log(output))
train = tf.train.GradientDescentOptimizer(1e-4).minimize(loss)

# test
result = tf.equal(tf.argmax(truth,1),tf.argmax(output,1))
accuracy = tf.reduce_mean(tf.cast(result,tf.float32))

sess = tf.InteractiveSession()
init = tf.global_variables_initializer()
sess.run(init)

mnist = input_data.read_data_sets('data/', one_hot=True)
for i in range(100000):
    batch = mnist.train.next_batch(50)
    # keep half of the units while training
    sess.run(train, feed_dict={input:batch[0],truth:batch[1],dropPlace:0.5})
    if i%100 == 0:
        # keep all units when measuring accuracy on the current batch
        print(sess.run(accuracy, feed_dict={input:batch[0],truth:batch[1],dropPlace:1.0}))
sess.close()
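The dropPlace value fed in the loop is the keep probability passed to tf.nn.dropout: 0.5 while training and 1.0 while measuring accuracy. As a rough pure-Python sketch of the behaviour (each element is kept with probability keep_prob and scaled by 1/keep_prob, otherwise set to 0); this is an illustration, not tensorflow's implementation:

```python
import random

def dropout(values, keep_prob, rng=random):
    """Inverted dropout: keep each value with probability keep_prob and rescale it."""
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

activations = [1.0, 2.0, 3.0, 4.0]
print(dropout(activations, 0.5))  # training: each unit is either zeroed or doubled
print(dropout(activations, 1.0))  # evaluation: unchanged
```

The rescaling keeps the expected value of each activation the same in both modes, which is why no extra correction is needed at evaluation time.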
