深度学习——CNN（4）

作者: 飘涯 | 来源:发表于2018-06-03 09:58 被阅读153次

深度学习——CNN（4）
BAT机器学习面试1000题系列（二）
图像识别模型学习
2019-08-14 深度学习框架之一 - R-CNN 行人检测
【深度学习】目标检测算法总结（R-CNN、Fast R-CNN、
深度学习（五）下 Pytorch实现简单CNN
深度学习 RNN基础
深度学习之目标检测-RCNN系列
Tensorflow实现Neural Style
深度学习——CNN(2)

前言：自己构建CNN网络结构训练一个验证码识别的模型

分析

假定验证码中只有：数字、大小写字母，验证码的数目是4个，eg: kx3S
步骤如下：
1.收集数据，验证码的数据集合可以自己生成
生成的验证码如下：

image.png

代码如下：

def random_code_text(code_size=4):
    """
    随机产生验证码的字符
    :param code_size:
    :return:
    """
    code_text = []
    for i in range(code_size):
        c = random.choice(code_char_set)
        code_text.append(c)
    return code_text


def generate_code_image(code_size=4):
    """
    产生一个验证码的Image对象
    :param code_size:
    :return:
    """
    image = ImageCaptcha()
    code_text = random_code_text(code_size)
    code_text = ''.join(code_text)
    # 将字符串转换为验证码(流)
    captcha = image.generate(code_text)
    # 如果要保存验证码图片
    # image.write(code_text, 'captcha/' + code_text + '.jpg')

    # 将验证码转换为图片的形式
    code_image = Image.open(captcha)
    code_image = np.array(code_image)

    return code_text, code_image

2.数据处理
将生成的数据转化为可以网络模型想要的格式，也可以用亚编码的格式

def text_2_vec(text):
    vec = np.zeros((code_size, code_char_set_size))
    k = 0
    for ch in text:
        index = code_char_2_number_dict[ch]
        vec[k][index] = 1
        k += 1
    return np.array(vec.flat)

3.网络构建
可以采用三成网络结构进行模型构建

def code_cnn(x, y):
    """
    构建一个验证码识别的CNN网络
    :param x:  Tensor对象，输入的特征矩阵信息，是一个4维的数据:[number_sample, height, weight, channels]
    :param y:  Tensor对象，输入的预测值信息，是一个2维的数据，其实就是验证码的值[number_sample, code_size]
    :return: 返回一个网络
    """
    """
    网络结构：构建一个简单的CNN网络，因为起始此时验证码图片是一个比较简单的数据，所以不需要使用那么复杂的网络结构，当然了：这种简单的网络结构，80%+的正确率是比较容易的，但是超过80%比较难
    conv -> relu6 -> max_pool -> conv -> relu6 -> max_pool -> dropout -> conv -> relu6 -> max_pool -> full connection -> full connection
    """
    # 获取输入数据的格式，[number_sample, height, weight, channels]
    x_shape = x.get_shape()
    # kernel_size_k: 其实就是卷积核的数目
    kernel_size_1 = 32
    kernel_size_2 = 64
    kernel_size_3 = 64
    unit_number_1 = 124
    unit_number_2 = code_size * code_char_set_size


    with tf.variable_scope('net', initializer=tf.random_normal_initializer(0, 0.1), dtype=tf.float32):
        with tf.variable_scope('conv1'):
            w = tf.get_variable('w', shape=[5, 5, x_shape[3], kernel_size_1])
            b = tf.get_variable('b', shape=[kernel_size_1])
            net = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')
            net = tf.nn.bias_add(net, b)
        with tf.variable_scope('relu1'):
            # relu6和relu的区别：relu6当输入的值大于6的时候，返回6，relu对于大于0的值不进行处理，relu6相对来讲具有一个边界
            # relu: max(0, net)
            # relu6: min(6, max(0, net))
            net = tf.nn.relu6(net)
        with tf.variable_scope('max_pool1'):
            net = tf.nn.max_pool(net, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
        with tf.variable_scope('conv2'):
            w = tf.get_variable('w', shape=[3, 3, kernel_size_1, kernel_size_2])
            b = tf.get_variable('b', shape=[kernel_size_2])
            net = tf.nn.conv2d(net, w, strides=[1, 1, 1, 1], padding='SAME')
            net = tf.nn.bias_add(net, b)
        with tf.variable_scope('relu2'):
            net = tf.nn.relu6(net)
        with tf.variable_scope('max_pool2'):
            net = tf.nn.max_pool(net, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
        with tf.variable_scope('dropout1'):
            tf.nn.dropout(net, keep_prob=keep_prob)
        with tf.variable_scope('conv3'):
            w = tf.get_variable('w', shape=[3, 3, kernel_size_2, kernel_size_3])
            b = tf.get_variable('b', shape=[kernel_size_3])
            net = tf.nn.conv2d(net, w, strides=[1, 1, 1, 1], padding='SAME')
            net = tf.nn.bias_add(net, b)
        with tf.variable_scope('relu3'):
            net = tf.nn.relu6(net)
        with tf.variable_scope('max_pool3'):
            net = tf.nn.max_pool(net, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
        with tf.variable_scope('fc1'):
            net_shape = net.get_shape()
            net_sample_feature_number = net_shape[1] * net_shape[2] * net_shape[3]
            net = tf.reshape(net, shape=[-1, net_sample_feature_number])
            w = tf.get_variable('w', shape=[net_sample_feature_number, unit_number_1])
            b = tf.get_variable('b', shape=[unit_number_1])
            net = tf.add(tf.matmul(net, w), b)
        with tf.variable_scope('softmax'):
            w = tf.get_variable('w', shape=[unit_number_1, unit_number_2])
            b = tf.get_variable('b', shape=[unit_number_2])
            net = tf.add(tf.matmul(net, w), b)
    return net

4.训练模型

def train_code_cnn(model_path):
    """
    模型训练
    :param model_path:
    :return:
    """
    # 1. 构建相关变量：占位符
    in_image_height = 60
    in_image_weight = 160
    x = tf.placeholder(tf.float32, shape=[None, in_image_height, in_image_weight, 1], name='x')
    y = tf.placeholder(tf.float32, shape=[None, code_size * code_char_set_size], name='y')
    # 1. 获取网络结构
    network = code_cnn(x, y)
    # 2. 构建损失函数（如果四个位置的值，只要有任意一个预测失败，那么我们损失就比较大）
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=network, labels=y))
    # 3. 定义优化函数
    train = tf.train.AdamOptimizer(learning_rate=0.0001).minimize(cost)
    # 4. 计算准确率
    predict = tf.reshape(network, [-1, code_size, code_char_set_size])
    max_idx_p = tf.argmax(predict, 2)
    max_idx_y = tf.argmax(tf.reshape(y, [-1, code_size, code_char_set_size]), 2)
    correct = tf.equal(max_idx_p, max_idx_y)
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))

    # 5. 开始训练
    saver = tf.train.Saver()
    with tf.Session() as sess:
        # a. 变量的初始化
        sess.run(tf.global_variables_initializer())

        # b. 开始训练
        step = 1
        while True:
            # 1. 获取批次的训练数据
            batch_x, batch_y = random_next_batch(batch_size=64, code_size=code_size)
            # 2. 对数据进行一下处理
            batch_x = tf.image.rgb_to_grayscale(batch_x)
            # print(batch_x.shape)
            batch_x = tf.image.resize_images(batch_x, size=(in_image_height, in_image_weight),
                                             method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
            # 3. 训练
            _, cost_, accuracy_ = sess.run([train, cost, accuracy], feed_dict={x: batch_x.eval(), y: batch_y})
            print("Step:{}, Cost:{}, Accuracy:{}".format(step, cost_, accuracy_))

            # 4. 每10次输出一次信息
            if step % 10 == 0:
                test_batch_x, test_batch_y = random_next_batch(batch_size=64, code_size=code_size)
                # 2. 对数据进行一下处理
                test_batch_x = tf.image.rgb_to_grayscale(test_batch_x)
                test_batch_x = tf.image.resize_images(test_batch_x, size=(in_image_height, in_image_weight),
                                                      method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
                acc = sess.run(accuracy, feed_dict={x: test_batch_x.eval(), y: test_batch_y})
                print("测试集准确率:{}".format(acc))

                # 如果模型准确率0.7，模型保存，然后退出
                if acc > 0.7 and accuracy_ > 0.7:
                    saver.save(sess, model_path, global_step=step)
                    break

            step += 1

5.测试模型
6.保存并且做预测
随机输出一个结果如下：

详细代码可见：https://github.com/dctongsheng/Lenet

深度学习——CNN（4）
前言：自己构建CNN网络结构训练一个验证码识别的模型分析假定验证码中只有：数字、大小写字母，验证码的数目是4个...
BAT机器学习面试1000题系列（二）
101.深度学习（CNN RNN Attention）解决大规模文本分类问题。用深度学习（CNN RNN Att...
图像识别模型学习
大话CNN经典模型：AlexNet 大话CNN经典模型：VGGNet 大话CNN经典模型：LeNet 基于深度学习...
2019-08-14 深度学习框架之一 - R-CNN 行人检测
本文章内容：一. 深度学习框架之一 - R-CNN 行人检测二.R-CNN算法概述一.深度学习框架之一 - ...
【深度学习】目标检测算法总结（R-CNN、Fast R-CNN、
申明，转载自【深度学习】目标检测算法总结（R-CNN、Fast R-CNN、Faster R-CNN、FPN、YO...
深度学习（五）下 Pytorch实现简单CNN
上一篇里尝试自己实现CNN 深度学习（五）Python徒手实现CNN，这篇用pytorch写一个结构相同的CNN作...
深度学习 RNN基础
深度学习 RNN基础目录 1、定义 2、有了CNN，为什么需要RNN？ 3、RNN的主要应用领域有哪些呢? 4、...
深度学习之目标检测-RCNN系列
参考文章 #Deep Learning回顾#之基于深度学习的目标检测基于深度学习的目标检测技术演进：R-CNN、...
Tensorflow实现Neural Style
最近深度学习里面最cool的一个模型CNN卷积神经网络，搞明白了cnn的基本模型之后，跑了几个CNN的模型，算是C...
深度学习——CNN(2)
前言：CNN的优化方法依旧可以是梯度下降的方法，类似于BP算法中的反向传播，一般采用小批量梯度下降的方法，来更新参...