1.4 神经网络入门-数据处理与模型图构建

作者: 9c0ddf06559c | 来源:发表于2018-09-24 18:22 被阅读73次

    1.4 数据处理与模型图构建

    • transflow 搭建 : https://www.tensorflow.org/install/install_mac?hl=zh_cn

    • 使用的是cifar-10 的数据集 https://www.cs.toronto.edu/~kriz/cifar.html

      The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.

      CIFAR-10 下载下来的python版本文件目录结构如下

      (deeplearning-0jfGJJsa) ~/project/pycharm/deeplearning/01_nn ᐅ tree cifar-10-batches-py
      cifar-10-batches-py
      ├── batches.meta
      ├── data_batch_1 # 一共5个batch的训练数据集,每个batch中有10000张图片和对应的数据
      ├── data_batch_2
      ├── data_batch_3
      ├── data_batch_4
      ├── data_batch_5
      ├── readme.html
      └── test_batch # 测试数据集
      
      0 directories, 8 files
      
    • 关于jupyternotebook修改默认环境的文章:https://www.jianshu.com/p/f70ea020e6f9

    • 直观的显示一张图片

      • 导入数据集
    import pickle
    import numpy as np
    import os
    
    CIFAR_DIR = './cifar-10-batches-py'
    print(os.listdir(CIFAR_DIR))
    
      ['data_batch_1', 'readme.html', 'batches.meta', 'data_batch_2', 'data_batch_5', 'test_batch', 'data_batch_4', 'data_batch_3']
    
    with open(os.path.join(CIFAR_DIR, "data_batch_1"), 'rb') as f:
        data = pickle.load(f, encoding='bytes')
        print(type(data))
        print(data.keys())
        print(type(data[b'data']))
        print(type(data[b'labels']))
        print(type(data[b'batch_label']))
        print(type(data[b'filenames']))
        print(data[b'data'].shape) # 32 * 32 (像素点)* 3(rbg三通道) # RR-GG-BB
        print(data[b'data'][0:2])
        print(data[b'labels'][0:2])
        print(data[b'batch_label'])
        print(data[b'filenames'][0:2])
    
      <class 'dict'>
      dict_keys([b'batch_label', b'labels', b'data', b'filenames'])
      <class 'numpy.ndarray'>
      <class 'list'>
      <class 'bytes'>
      <class 'list'>
      (10000, 3072)
      [[ 59  43  50 ... 140  84  72]
       [154 126 105 ... 139 142 144]]
      [6, 9]
      b'training batch 1 of 5'
      [b'leptodactylus_pentadactylus_s_000004.png', b'camion_s_000148.png']
    
    • 显示一张图片
    image_arr = data[b'data'][100]
    image_arr = image_arr.reshape((3,32,32)) # 需要32,32,3
    image_arr = image_arr.transpose((1,2,0))
    
    from matplotlib.pyplot import imshow
    
    imshow(image_arr)
    
      <matplotlib.image.AxesImage at 0x12311e668>
    
    image.png
    • 模型图构建

      import tensorflow as tf
      import pickle
      import numpy as np
      import os
      
      CIFAR_DIR = './cifar-10-batches-py'
      print(os.listdir(CIFAR_DIR))
      
      def load_data(filename):
          """read data from data file."""
          with open(filename, 'rb') as f:
              data = pickle.load(f, 'bytes')
              return data[b'data'], data[b'labels']
      
      # placeholder 占位符,可以代表一个变量,可以用来构建计算图,有数据来的时候就把数据传入
      # None 代表输入的样本数目是不确定的,3072 是变量的维度
      x = tf.placeholder(tf.float32, [None, 3072]) 
      y = tf.placeholder(tf.int64, [None]) # 只有一个维度就省略了
      
      # get_variable 如果已经定义了则使用,否则使用指定方式定义
      # 'w':变量名
      # [x.get_shape()[-1],1]: 指定变量的维度
      # initializer: 初始化变量的方式
      w = tf.get_variable('w', [x.get_shape()[-1],1], 
                         initializer = tf.random_normal_initializer(0, 1)) 
      b = tf.get_variable('b', [1], 
                         initializer = tf.constant_initializer(0.0))
      
      # matmul 矩阵乘法 
      # x (None,3072) w (3072,1) b(3072,1)
      # y_ = [None,3072]*[3072,1] + [3072,1] = [None,1]
      y_ = tf.matmul(x, w) + b
      # 激活函数 (None,1)
      p_y_1 = tf.nn.sigmoid(y_)
      
      # (None,1)
      y_reshaped = tf.reshape(y, (-1,1))
      y_reshaped_float = tf.cast(y_reshaped, tf.float32)
      loss = tf.reduce_mean(tf.square(y_reshaped_float - p_y_1))
      
      # bool
      predict = p_y_1 > 0.5
      # [1,0,1,1,0,0,0]
      correct_prediction = tf.equal(tf.cast(predict,tf.int64), y_reshaped)
      # 3/7
      accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float64))
      
      with tf.name_scope('train_op'):
          # AdamOptimizer 是梯度下降的一个变种,minimize指定在哪个变量上做
          train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
      
      

    相关文章

      网友评论

        本文标题:1.4 神经网络入门-数据处理与模型图构建

        本文链接:https://www.haomeiwen.com/subject/mjlhoftx.html