tensorflow CNN实现手写数字识别

作者: 随时学丫 | 来源:发表于2019-04-29 14:06 被阅读0次

SkySeraph AI 实践到理论系列
基于TensorFlow Slim库实现手写数字识别
Tensorflow MNIST for Android
tensorflow CNN实现手写数字识别
Ocr手写识别学习
【转】Tensorflow 实现 MNIST 手写数字识别
【TensorFlow实战——笔记】第3章：TensorFlow
学习笔记TF024:TensorFlow实现Softmax Re
基于tensorflow搭建一个复杂卷积神经网络模型（cifar
TensorFlow读取MNIST数据集问题解决方法

卷积神经网络

tf.nn.conv2d

tf.nn.conv2d(input, # [batch, in_height, in_width, in_channels]
            filter, # [filter_height, filter_width, in_channels, out_channels]
            strides, # [1,1,1,1]
            padding, # SAME VALID
            use_cudnn_on_gpu=True,
            data_format='NHWC',
            dilations=[1, 1, 1, 1],
            name=None)

input: [batch, in_height, in_width, in_channels] 4-D的 Tensor (float32/float64)
```
 batch_size,高度，宽度，通道数
```

filter：[filter_height, filter_width, in_channels, out_channels]

  卷积核高度，   卷积核宽度      输入通道数    输出通道数(卷积核个数)

strides：步长 [1,1,1,1] 1-D向量，长度为4
padding：填充 SAME VALID
use_cudnn_on_gpu：是否GPU加速

结果返回一个Tensor，这个输出，就是我们常说的 feature map，shape仍然是 [batch, height, width, channels] 这种形式。

图像卷积

3x3图像，1通道，1x1卷积核

1.考虑一种最简单的情况，现在有一张3×3单通道的图像（对应的shape：[1，3，3，1]），用一个1×1的卷积核（对应的shape：[1，1，1，1]）去做卷积，最后会得到一张3×3的feature map

In [4]:

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([1,3,3,1]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([1,1,1,1]))
op1 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op1),'\n')
    print(sess.run(op1).shape)

运行结果

[[[[-0.06386475]
   [ 0.30251193]
   [-0.36254457]]

  [[-0.1863834 ]
   [-0.11046342]
   [ 0.12128225]]

  [[-0.16598591]
   [-0.06247617]
   [ 0.10344568]]]] 

(1, 3, 3, 1)

3x3图像，5通道，1x1卷积核

2.增加图片的通道数，使用一张3×3五通道的图像（对应的shape：[1，3，3，5]），用一个1×1的卷积核（对应的shape：[1，1，1，1]）去做卷积，仍然是一张3×3的feature map，这就相当于每一个像素点，卷积核都与该像素点的每一个通道做卷积。

In [5]:

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([1,3,3,5]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([1,1,5,1]))
op2 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op2),'\n')
    print(sess.run(op2).shape)

运行结果

[[[[ 2.611188 ]
   [ 1.1356436]
   [ 0.1728566]]

  [[ 1.7238054]
   [-2.65044  ]
   [-0.6026321]]

  [[-0.9376406]
   [-2.1341398]
   [ 0.5801886]]]] 

(1, 3, 3, 1)

3x3图像，5通道，3x3卷积核，文本卷积维度和图像一样大，所以卷积之后只有1列

3.把卷积核扩大，现在用3×3的卷积核做卷积，最后的输出是一个值，相当于情况2的feature map所有像素点的值求和

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([1,3,3,5]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([3,3,5,1]))
op2 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op2),'\n')
    print(sess.run(op2).shape)

运行结果

[[[[-3.8528938]]]] 

(1, 1, 1, 1)

5x5图像，5通道，3x3卷积核

4.使用更大的图片将情况2的图片扩大到5×5，仍然是3×3的卷积核，令步长为1，输出3×3的feature map

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([1,5,5,5]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([3,3,5,1]))
op2 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op2),'\n')
    print(sess.run(op2).shape)

运行结果

[[[[-3.3537188]
   [ 2.6631894]
   [10.7735815]]

  [[ 6.7866626]
   [-5.753437 ]
   [ 6.8379397]]

  [[-7.2338777]
   [-3.8412943]
   [11.663807 ]]]] 

(1, 3, 3, 1)

5x5图像，5通道，3x3卷积核

5.上面我们一直令参数padding的值为‘VALID’，当其为‘SAME’时，表示卷积核可以停留在图像边缘，如下，输出5×5的feature map

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([1,5,5,5]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([3,3,5,1]))
op2 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op2),'\n')
    print(sess.run(op2).shape)

运行结果

[[[[ -4.1736193 ]
   [  7.2922435 ]
   [  2.9188058 ]
   [  0.49713266]
   [ -2.956664  ]]

  [[ -3.305164  ]
   [ -7.311406  ]
   [ -5.045771  ]
   [ -2.5354984 ]
   [ -7.40237   ]]

  [[  9.273168  ]
   [  1.2130424 ]
   [ -8.63011   ]
   [  8.675023  ]
   [  4.0911283 ]]

  [[ -2.295607  ]
   [  5.2230077 ]
   [-10.142306  ]
   [ -6.135029  ]
   [  1.3315554 ]]

  [[-11.159186  ]
   [ -3.5029335 ]
   [ -1.638276  ]
   [ -4.381499  ]
   [ -1.0199151 ]]]] 

(1, 5, 5, 1)

5x5图像，5通道，3x3卷积核，3个卷积核

6.如果卷积核有多个

此时输出3张5×5的feature map

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([1,5,5,5]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([3,3,5,3]))
op2 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op2),'\n')
    print(sess.run(op2).shape)

运行结果

[[[[  2.7570508   -3.232238    -2.6215773 ]
   [  2.532285     1.9889098    3.87929   ]
   [ -3.187311     8.91769      3.224719  ]
   [  0.44387245  -9.403946     2.468867  ]
   [  0.21311586  -1.590601     5.749056  ]]

  [[ -2.6277003    6.488189     9.992645  ]
   [  1.5766766  -11.48576      0.6145782 ]
   [ -5.0482545   -0.96584886   4.0381684 ]
   [ -0.797274     2.4302173   -3.8855307 ]
   [ -2.6238062    2.05465      2.9259453 ]]

  [[  4.714437    -0.9536078   -2.9879472 ]
   [  1.5400691    1.5240853   -6.90153   ]
   [ -3.6736727    3.85059     -0.5918405 ]
   [  7.023252     2.9593654  -13.595696  ]
   [ -5.041815    -2.7133517    0.6385279 ]]

  [[  1.477376     0.47209492   5.653083  ]
   [ -0.39575818  14.780628    -1.5949147 ]
   [  2.378466   -11.533363    -0.4041656 ]
   [ -0.4129743    6.5807753   -2.7889323 ]
   [  6.1631317   -0.49479347   1.52246   ]]

  [[ -6.97586      1.1432166   -5.064254  ]
   [ -9.823753    -3.1042528   -1.6604922 ]
   [  9.015108   -10.42481     -4.7503257 ]
   [  2.3552632    0.43692362   2.7325256 ]
   [ -2.1840062    2.729301    -4.588225  ]]]] 

(1, 5, 5, 3)

5x5图像，5通道，3x3卷积核，3个卷积核，步长为2

7.步长不为1的情况，文档里说了对于图片，因为只有两维，通常strides取[1，stride，stride，1]

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([1,5,5,5]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([3,3,5,3]))
op2 = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op2),'\n')
    print(sess.run(op2).shape)

运行结果

[[[[ 2.783487  -2.09441    4.733526 ]
   [-7.3059773 -3.108855   4.022243 ]
   [-4.050215   3.0158758 -4.1893964]]

  [[-6.3690815 -5.2265515  1.1703218]
   [ 9.0784235 -3.5745146 14.855592 ]
   [-0.4078823  4.0576644  4.617129 ]]

  [[ 3.339266  -5.4302483 -3.154387 ]
   [ 1.3765206  2.6518223  5.6584387]
   [-3.9308991  1.4282804 -3.4455342]]]] 

(1, 3, 3, 3)

5x5图像，5通道，3x3卷积核，3个卷积核，步长为2，10张图像

8.如果batch值不为1，同时输入10张图

每张图，都有3张3×3的feature map，输出的shape就是[10，3，3，3]

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([10,5,5,5]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([3,3,5,3]))
op2 = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op2),'\n')
    print(sess.run(op2).shape)

运行结果

[[[[  3.7294517   -3.107039    -3.3261778 ]
   [  5.441018    -4.0686336    5.619485  ]
   [  2.1390946    3.5674727   -1.5291997 ]]

  [[ -5.6689725   16.974398     2.381929  ]
   [  2.4008992   -0.6694184    1.378117  ]
   [  1.2934582    7.2192235    0.48349503]]

  [[  2.9121394   -0.4573097   -9.765212  ]
   [  1.0088806   -0.7046843    6.591536  ]
   [ -0.72504395   0.43721557  -4.999654  ]]]


 [[[  1.967625    -1.6568589   -6.145099  ]
   [ -4.151078   -10.529405    -2.047928  ]
   [  3.4548922    2.7491624    2.9001775 ]]

  [[ -1.6896939    3.1873543    6.188783  ]
   [ 11.703161     1.6971766   -4.8438787 ]
   [  0.9549799   -1.1131762    4.593415  ]]

  [[ -1.1170579    0.4810401   -1.3526723 ]
   [  2.529728    -1.1482326    3.7958796 ]
   [ -0.24976896   3.3091352   -6.729189  ]]]


 [[[ -3.33531     -1.9344107    2.8165019 ]
   [ -1.6785766   -2.8081656    7.2197647 ]
   [  1.7976431    2.8334517   -0.08083367]]

  [[  2.5362453   -0.68693405   2.2952533 ]
   [  1.5236933    2.129165     0.194734  ]
   [  0.5964938   -8.6989565    5.084363  ]]

  [[  5.066878    -1.3026551   -5.7902007 ]
   [ -2.9802423    4.8924155    5.9025197 ]
   [ -3.933334     5.099715    -0.8536027 ]]]


 [[[ -1.0145748  -10.15126     -7.0179715 ]
   [ -8.451802    -0.17334843  -2.7171214 ]
   [  5.668031    -5.15528     -4.2402534 ]]

  [[ -2.0954626   -1.0145442    3.2066696 ]
   [  2.289553     0.9271075    0.8146973 ]
   [ -0.7423492    4.2153864   -4.70488   ]]

  [[  3.8675358   -0.35446188  -1.1588985 ]
   [ -4.8492827   -5.2945166    3.944246  ]
   [ -0.43092388  -0.8130417    2.3813803 ]]]


 [[[  0.66720986   3.8808417    2.1328838 ]
   [  7.446735    -4.522188    -5.990181  ]
   [ -2.5916054    5.0853543    2.6371577 ]]

  [[ -7.136207     0.6306949   -8.853178  ]
   [  3.7415988   -7.89348     -8.487032  ]
   [ -0.69531     -3.222552     1.5073893 ]]

  [[ -3.106504    -0.01809834  -9.029028  ]
   [ -4.416857     0.13292897   7.7073345 ]
   [  0.9844466    4.2795186   -0.76342046]]]


 [[[ -0.11672409  -7.369146    -4.8543487 ]
   [ -4.230579    -1.1736143    0.74828875]
   [ -0.6568188    6.765464    -4.9761944 ]]

  [[  3.8933635   -7.902747    -0.63001007]
   [  6.8245344    3.9199047   11.168122  ]
   [  3.7043867    0.31197003   0.04769279]]

  [[ -4.1409       8.580945     8.486864  ]
   [ -3.1867335    7.059393     6.296857  ]
   [ -0.9835455   -3.6718185    0.97860974]]]


 [[[  5.193179    -1.2967495    8.170371  ]
   [ -1.3087153   -5.5033283   -1.6919953 ]
   [  0.36510244  -6.296658    -3.7380807 ]]

  [[  3.5434737   13.3447695    1.7701437 ]
   [-10.250333    -9.407058    -3.2337494 ]
   [  8.435421     4.078936     3.4657378 ]]

  [[  3.4681797    2.228949     0.45596147]
   [ -0.57005715   4.670751     2.034872  ]
   [ -1.915133     7.9970365   -3.8922138 ]]]


 [[[  3.950432    -4.4767623   -6.447672  ]
   [  2.595737    -8.553671    -0.45686972]
   [  3.391854    -2.003466    -2.2928245 ]]

  [[ -3.6888146    3.5153918    1.2406276 ]
   [ -0.25753272   2.6999128   -2.8501456 ]
   [ -0.9058769    6.502099    -0.5419939 ]]

  [[ -0.68687534  -6.5038085    2.8593688 ]
   [ -3.683316     2.1430447    5.490655  ]
   [  5.7413816    3.3227494   -7.533464  ]]]


 [[[ -3.484571     5.3650527    2.9336984 ]
   [  0.6027174    3.7776787    1.0154141 ]
   [ -4.7919264    7.149525     1.9800262 ]]

  [[ -2.1547816   -3.2360375   -5.2381744 ]
   [  7.6362724    8.085188     9.068025  ]
   [ -4.549206    -3.8285804    5.8914824 ]]

  [[ -1.9079026    2.9233663    0.9151974 ]
   [  6.70253    -10.376949    -2.2334673 ]
   [  2.7263498    3.202616     3.6564238 ]]]


 [[[  6.6487746   -0.2954742    3.0371974 ]
   [  3.576       -7.2807136    4.2893467 ]
   [ -0.96813136  -5.533345    -6.83936   ]]

  [[ -0.10136782  -1.4625425   -7.0081096 ]
   [  3.8160882   -2.4150543   -3.9401052 ]
   [  2.7480733   -1.4603323   10.289123  ]]

  [[  2.629776    -3.5297518    0.4979372 ]
   [  0.9985927   -8.139794    -0.5185237 ]
   [ -4.5744176  -10.06965      5.6358476 ]]]] 

(10, 3, 3, 3)

文本卷积

5x5文本，1通道，2-gram卷积核，3个卷积核，10句话

import tensorflow as tf
#                 [batch, in_height, in_width, in_channels]
input = tf.Variable(tf.random_normal([10,5,5,1]))
#                 [filter_height, filter_width, in_channels, out_channels]
filter = tf.Variable(tf.random_normal([3,5,1,3]))
op1 = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(op1),'\n')
    print(sess.run(op1).shape)

运行结果

[[[[ -0.28560075   3.4164069    0.64287925]]

  [[  4.922906     6.569628    -4.377281  ]]

  [[  2.0096815   -0.9498653   -5.302     ]]]


 [[[ -0.30866227   0.65657634   0.08617933]]

  [[  2.1648352    2.0540233   -6.2501183 ]]

  [[  1.8437229   -3.3579445    0.648278  ]]]


 [[[ -6.248692    -8.374758     3.1102016 ]]

  [[ -4.6158714    1.0821313    2.8032086 ]]

  [[  1.912092     0.933113    -3.0924444 ]]]


 [[[  7.5399313    9.936766    -1.6083889 ]]

  [[  2.4991071   -0.3938322    5.2363515 ]]

  [[ -3.8184917    1.5327872   -0.9156568 ]]]


 [[[ -1.0705962    1.3645588   -2.2302496 ]]

  [[  0.9711383   -2.6879628   -2.1285567 ]]

  [[ -5.15031     -1.7857913   -0.64766765]]]


 [[[ -4.300858    -0.74519587   4.707138  ]]

  [[  1.1525508   -1.9355469    1.1351813 ]]

  [[ -1.930467     5.30831     -0.11006889]]]


 [[[ -0.30049637  -3.3917482   -0.98812234]]

  [[ -0.78466344  -3.508609     1.8363969 ]]

  [[  1.6145957    0.15216915  -0.27968606]]]


 [[[  1.201814     2.2275777    3.4975147 ]]

  [[  1.7633957    3.9830918   10.16128   ]]

  [[  1.9025049    4.217062     3.2219505 ]]]


 [[[  2.7462778   -0.87272054 -10.7139845 ]]

  [[  0.5596598   -7.9665465   -3.5733411 ]]

  [[  0.02203573   1.8229557    1.1090751 ]]]


 [[[ -2.8659163   -6.198704     1.1388084 ]]

  [[ -5.6100855    2.2285914    1.380748  ]]

  [[ -0.3560774    7.3229613    1.1240004 ]]]] 

(10, 3, 1, 3)

tf.nn.max_pool

TF-池化函数 tf.nn.max_pool 的介绍

tf.nn.max_pool(value, # [batch, in_height, in_width, in_channels]
                ksize, #  [1,in_height,in_width,1]
                strides, # [1,height,width,1]
                padding, # SAME VALID
                data_format='NHWC',
                name=None,)

value: 池化层的输入，一般池化层在卷积层后面，输入通常是feature_map
```
  依然是 [batch, in_height, in_width, in_channels]
```

ksize：池化窗口大小，4-D向量，一般是 [1,in_height,in_width,1]

   因为我们不想在 batch，in_channels上做池化，所以维度为1

strides：步长 [1,height,width,1] 1-D向量，长度为4
padding：填充 SAME VALID

返回一个Tensor，类型不变，shape仍然是 [batch, in_height, in_width, in_channels] 这种形式

4x4图像，2通道，池化核2x2，步长1

卷积由 [batch, in_height, in_width, in_channels]

[1,4,4,2] -> [1,3,3,2]

# 2x4x4
a=tf.constant([  
        [[1.0,2.0,3.0,4.0],  
        [5.0,6.0,7.0,8.0],  
        [8.0,7.0,6.0,5.0],  
        [4.0,3.0,2.0,1.0]], # 通道1图像
    
        [[4.0,3.0,2.0,1.0],  # 通道2图像
         [8.0,7.0,6.0,5.0],  
         [1.0,2.0,3.0,4.0],  
         [5.0,6.0,7.0,8.0]]  
    ])  

b=tf.reshape(a,[1,4,4,2])

max_pool_2x2 = tf.nn.max_pool(b,[1,2,2,1],[1,1,1,1],padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print('池化前原始，image：\n')
    print(sess.run(a),'\n')
    print(sess.run(a).shape,'\n')
    print('池化前，reshape，image：\n')
    print(sess.run(b),'\n')
    print(sess.run(b).shape,'\n')
    print('池化后image：\n')
    print(sess.run(max_pool_2x2),'\n')
    print(sess.run(max_pool_2x2).shape,'\n')

运行结果

池化前原始，image：

[[[1. 2. 3. 4.]
  [5. 6. 7. 8.]
  [8. 7. 6. 5.]
  [4. 3. 2. 1.]]

 [[4. 3. 2. 1.]
  [8. 7. 6. 5.]
  [1. 2. 3. 4.]
  [5. 6. 7. 8.]]] 

(2, 4, 4) 

池化前，reshape，image：

[[[[1. 2.]
   [3. 4.]
   [5. 6.]
   [7. 8.]]

  [[8. 7.]
   [6. 5.]
   [4. 3.]
   [2. 1.]]

  [[4. 3.]
   [2. 1.]
   [8. 7.]
   [6. 5.]]

  [[1. 2.]
   [3. 4.]
   [5. 6.]
   [7. 8.]]]] 

(1, 4, 4, 2) 

池化后image：

[[[[8. 7.]
   [6. 6.]
   [7. 8.]]

  [[8. 7.]
   [8. 7.]
   [8. 7.]]

  [[4. 4.]
   [8. 7.]
   [8. 8.]]]] 

(1, 3, 3, 2)

4x4图像，2通道，池化核2x2，步长2

卷积由 [batch, in_height, in_width, in_channels]

[1,4,4,2] -> [1,2,2,2]

# 2x4x4
a=tf.constant([  
        [[1.0,2.0,3.0,4.0],  
        [5.0,6.0,7.0,8.0],  
        [8.0,7.0,6.0,5.0],  
        [4.0,3.0,2.0,1.0]], # 通道1图像
    
        [[4.0,3.0,2.0,1.0],  # 通道2图像
         [8.0,7.0,6.0,5.0],  
         [1.0,2.0,3.0,4.0],  
         [5.0,6.0,7.0,8.0]]  
    ])  

b=tf.reshape(a,[1,4,4,2])

max_pool_2x2 = tf.nn.max_pool(b,[1,2,2,1],[1,2,2,1],padding='VALID')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print('池化前原始，image：\n')
    print(sess.run(a),'\n')
    print(sess.run(a).shape,'\n')
    print('池化前，reshape，image：\n')
    print(sess.run(b),'\n')
    print(sess.run(b).shape,'\n')
    print('池化后image：\n')
    print(sess.run(max_pool_2x2),'\n')
    print(sess.run(max_pool_2x2).shape,'\n')

运行结果

池化前原始，image：

[[[1. 2. 3. 4.]
  [5. 6. 7. 8.]
  [8. 7. 6. 5.]
  [4. 3. 2. 1.]]

 [[4. 3. 2. 1.]
  [8. 7. 6. 5.]
  [1. 2. 3. 4.]
  [5. 6. 7. 8.]]] 

(2, 4, 4) 

池化前，reshape，image：

[[[[1. 2.]
   [3. 4.]
   [5. 6.]
   [7. 8.]]

  [[8. 7.]
   [6. 5.]
   [4. 3.]
   [2. 1.]]

  [[4. 3.]
   [2. 1.]
   [8. 7.]
   [6. 5.]]

  [[1. 2.]
   [3. 4.]
   [5. 6.]
   [7. 8.]]]] 

(1, 4, 4, 2) 

池化后image：

[[[[8. 7.]
   [7. 8.]]

  [[4. 4.]
   [8. 8.]]]] 

(1, 2, 2, 2)

CNN手写数字识别

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

batch_size = 100
n_batch = mnist.train.num_examples // batch_size

def weight_variable(shape):
    return tf.Variable(tf.truncated_normal(shape,stddev=0.1))

def bias_vairable(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))

def conv2d(x,W):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

x = tf.placeholder(tf.float32,[None,784])
y = tf.placeholder(tf.float32,[None,10])
keep_prob = tf.placeholder(tf.float32)

# x input  [batch,in_height,in_width,in_channels]
# [?,784] reshape -> [-1,28,28,1]
x_image = tf.reshape(x,[-1,28,28,1])

#w        [filter_height,filter_width,in_channels, out_channels]
W_conv1 = weight_variable([5,5,1,32]) # 5*5的采样窗口，32个卷积核从1个平面抽取特征
b_conv1 = bias_vairable([32]) #每个卷积核一个偏置值

# 28*28*1 的图片卷积之后变为 -1*28*28*32
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
# 池化之后变为 -1*14*14*32
h_pool1 = max_pool_2x2(h_conv1)

# 第二次卷积之后变为 14*14*64
W_conv2 = weight_variable([5,5,32,64])
b_conv2 = bias_vairable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool1,W_conv2) + b_conv2)

# 第二次池化之后变为 7*7*64
h_pool2 = max_pool_2x2(h_conv2)


# 第一个全连接层
W_fc1 = weight_variable([7*7*64,1024]) #上一层有7*7*64个神经元，全连接层有1024个神经元
b_fc1 = bias_vairable([1024])#1024个节点

# 把池化层2的输出扁平化为1维向量
h_pool2_flat = tf.reshape(h_pool2,[-1,7*7*64])
#求第一个全连接的输出
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# 初始化第二个全连接层
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_vairable([10])
logits = tf.matmul(h_fc1_drop,W_fc2) + b_fc2

#计算输出
prediction = tf.nn.sigmoid(logits)

#交叉熵代价函数
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
#AdamOptimizer
train_step = tf.train.AdamOptimizer(0.001).minimize(loss)

prediction_2 = tf.nn.softmax(prediction)
correct_prediction = (tf.equal(tf.argmax(prediction_2,1), tf.argmax(y,1)))
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(21):
        for batch in range(n_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            sess.run(train_step, feed_dict={x:batch_xs,y:batch_ys,keep_prob:0.7})
        acc = sess.run(accuracy, feed_dict={x:mnist.test.images, y:mnist.test.labels, keep_prob:1.0})
        print("Iter: " + str(epoch) + ", acc: " + str(acc))

运行结果

Iter: 0, acc: 0.9796
Iter: 1, acc: 0.9863
Iter: 2, acc: 0.9882
Iter: 3, acc: 0.9908
Iter: 4, acc: 0.99
Iter: 5, acc: 0.991
Iter: 6, acc: 0.9926
Iter: 7, acc: 0.989
Iter: 8, acc: 0.9913
Iter: 9, acc: 0.9896
Iter: 10, acc: 0.992
Iter: 11, acc: 0.9913
Iter: 12, acc: 0.9917
Iter: 13, acc: 0.9911
Iter: 14, acc: 0.9912
Iter: 15, acc: 0.9915
Iter: 16, acc: 0.9907
Iter: 17, acc: 0.9905
Iter: 18, acc: 0.9923
Iter: 19, acc: 0.9905
Iter: 20, acc: 0.9909

SkySeraph AI 实践到理论系列
TensorFlow基础 Android+TensorFlow+CNN+MNIST 手写数字识别实现
基于TensorFlow Slim库实现手写数字识别
本文介绍如何基于Tensorflow的Slim库，利用CNN（卷积神经网络）实现手写数字识别。本文GitHub源...
Tensorflow MNIST for Android
本篇博客主要介绍如何使用 tensorflow 通过 CNN 实现 MNIST 手写数字识别问题，并将模型持久化在...
tensorflow CNN实现手写数字识别
卷积神经网络 tf.nn.conv2d TF-卷积函数 tf.nn.conv2d 介绍 input: [batch...
Ocr手写识别学习
Ocr手写识别学习（三）本文将实现基于CNN的手写汉字识别 1.目的本篇文章将使用tensorflow搭建一个...
【转】Tensorflow 实现 MNIST 手写数字识别
原文 Tensorflow 实现 MNIST 手写数字识别本节笔记作为 Tensorflow 的 Hello W...
【TensorFlow实战——笔记】第3章：TensorFlow
3.2 TensorFlow实现Softmax Regression识别手写数字 MNIST(Mixed Nati...
学习笔记TF024:TensorFlow实现Softmax Re
TensorFlow实现Softmax Regression(回归)识别手写数字。MNIST(Mixed Nati...
基于tensorflow搭建一个复杂卷积神经网络模型（cifar
上一篇搭建了一个简单的cnn网络用来识别手写数字。基于tensorflow搭建一个简单的CNN模型(code) ...
TensorFlow读取MNIST数据集问题解决方法
TensorFlow实现Softmax Regression识别手写数字是学习TensorFlow的第一个小实验，...

tensorflow CNN实现手写数字识别

卷积神经网络

tf.nn.conv2d

图像卷积

3x3图像，1通道，1x1卷积核

3x3图像，5通道，1x1卷积核

3x3图像，5通道，3x3卷积核，文本卷积维度和图像一样大，所以卷积之后只有1列

5x5图像，5通道，3x3卷积核

5x5图像，5通道，3x3卷积核

5x5图像，5通道，3x3卷积核，3个卷积核

5x5图像，5通道，3x3卷积核，3个卷积核，步长为2

5x5图像，5通道，3x3卷积核，3个卷积核，步长为2，10张图像

文本卷积

5x5文本，1通道，2-gram卷积核，3个卷积核，10句话

tf.nn.max_pool

4x4图像，2通道，池化核2x2，步长1

4x4图像，2通道，池化核2x2，步长2

CNN手写数字识别

相关文章

SkySeraph AI 实践到理论系列

基于TensorFlow Slim库实现手写数字识别

Tensorflow MNIST for Android

tensorflow CNN实现手写数字识别

Ocr手写识别学习

【转】Tensorflow 实现 MNIST 手写数字识别

【TensorFlow实战——笔记】第3章：TensorFlow

学习笔记TF024:TensorFlow实现Softmax Re

基于tensorflow搭建一个复杂卷积神经网络模型（cifar

TensorFlow读取MNIST数据集问题解决方法

网友评论

延伸阅读

深度阅读

栏目导航

热点阅读