Recognizing Discuz CAPTCHAs with a Neural Network

Author: 加油啊斯蒂昏 | Published 2019-06-24 21:56
    (Photo: a corner of Laomendong, Nanjing)

    I recently tried out a CAPTCHA recognition project I found online. The project is described at the following link:

    https://cuijiahua.com/blog/2018/01/dl_5.html

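    I used the same data as well: the 60,000 Discuz CAPTCHA images uploaded by the author. The author wrapped all the variables and functions in a class; after reading the code, I tried to implement the same network without one.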

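    The author reports reaching over 90% accuracy. Reading the code, however, I found that the model is tested on the data it was trained on, i.e. the training set and the test set are the same.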

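    As I understand it, the test set must not take part in training. In MNIST handwritten digit recognition, for example, the training set and the test set are always separate: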

    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets("/tmp/data/", one_hot=False)
    
    Extracting /tmp/data/train-images-idx3-ubyte.gz
    Extracting /tmp/data/train-labels-idx1-ubyte.gz
    Extracting /tmp/data/t10k-images-idx3-ubyte.gz
    Extracting /tmp/data/t10k-labels-idx1-ubyte.gz
    

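    So in my own implementation I shuffle the dataset, hold out 10,000 images as a test set that never takes part in training, and use the remaining 50,000 CAPTCHAs as the training set.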
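
    A minimal sketch of such a split, assuming the file names are held in a plain list. The helper `split_dataset`, the fixed seed, and the generated file names are illustrative and not part of either project. Seeding the shuffle would let a separate evaluation script rebuild the same held-out set.

```python
import random

def split_dataset(filenames, test_size, seed=0):
    """Shuffle a copy of `filenames` and return (train, test) lists."""
    rng = random.Random(seed)
    # Sort first so the result does not depend on directory listing order.
    shuffled = sorted(filenames)
    rng.shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]

# Hypothetical file names standing in for os.listdir('Discuz/').
names = ['img%05d.jpg' % i for i in range(60000)]
train, test = split_dataset(names, test_size=10000)
print(len(train), len(test))
```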

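    During training I found that the loss only goes down with a learning rate of 0.001; with anything higher it gets stuck at 0.07. Beyond that, the accuracy reported during training never got past roughly 50%. When I evaluated the saved model on the training data, though, the accuracy did reach 90%, which is the figure the author saw. The model does not generalize: on a test set it has never seen, it is only about 50% accurate.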

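    The code has another problem: accuracy is computed per character. If two of the four characters in an image are wrong, that image counts as 50% correct rather than 0. But a CAPTCHA with even one wrong character out of four is a failed recognition, so the reported accuracy is considerably inflated.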
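
    To make the difference concrete, here is a small NumPy sketch that computes both numbers from predicted and true character indices. The function name `captcha_accuracies` and the toy arrays are made up for illustration.

```python
import numpy as np

def captcha_accuracies(pred, true):
    """pred, true: integer arrays of shape [batch, 4] holding class indices."""
    char_hits = (np.asarray(pred) == np.asarray(true))
    per_char = char_hits.mean()                  # fraction of correct characters
    per_captcha = char_hits.all(axis=1).mean()   # fraction of fully correct images
    return per_char, per_captcha

true = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
pred = np.array([[1, 2, 9, 9], [5, 6, 7, 8]])  # first image has two wrong characters
per_char, per_captcha = captcha_accuracies(pred, true)
print(per_char, per_captcha)
```

    In this toy batch the per-character accuracy is 0.75 while only half of the images are read correctly. In the TensorFlow graph, the same idea could presumably be expressed by applying `tf.reduce_all` along axis 1 of `correct_pred` before averaging.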

    So I looked for a fix. First I tried lowering the learning rate, but the accuracy still stalled at 50%.

    Next I added a fourth convolutional layer on top of the original three. This barely improved the accuracy.
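
    Adding a conv+pool stage also changes the size of the feature map fed to the fully connected layer. The sketch below assumes 30x100 input images, stride-1 SAME convolutions (which keep the spatial size), and 2x2 max-pooling with stride 2 and SAME padding (which rounds up when halving); `pooled_size` is a hypothetical helper.

```python
import math

def pooled_size(height, width, num_pools, k=2):
    """Spatial size after `num_pools` k-by-k max-pools with stride k and SAME padding."""
    for _ in range(num_pools):
        height = int(math.ceil(height / float(k)))
        width = int(math.ceil(width / float(k)))
    return height, width

three = pooled_size(30, 100, 3)  # three conv+pool stages
four = pooled_size(30, 100, 4)   # four conv+pool stages
print(three, four)
```

    With four stages and 64 output channels the flattened length is 2 * 7 * 64, which is the first dimension of `wf1` in the training script further down.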

    Then I added another fully connected layer, hoping for a better fit, but that got me into trouble.

    The loss got stuck at 0.07 whether the learning rate was 0.1 or 0.00001, even after a million iterations. Test accuracy at that point was only... 3%.
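
    One possible reference point for this plateau, offered as an assumption rather than a diagnosis: each label vector has 4 ones among 4 * 63 = 252 entries, so a network that ignores its input and predicts the same probability 4/252 for every entry already reaches a mean sigmoid cross-entropy of roughly 0.08. A loss stuck near that level together with accuracy of a few percent would be consistent with the network having collapsed to a constant output. The sketch only does the arithmetic.

```python
import math

num_classes = 63
num_chars = 4
# Fraction of ones in a full 4-character label vector.
p = num_chars / float(num_chars * num_classes)

# Mean binary cross-entropy when every output predicts probability p.
constant_loss = -(p * math.log(p) + (1 - p) * math.log(1 - p))
print(round(constant_loss, 4))
```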

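    I don't know what the problem is, let alone how to fix it.

    It drove home how hard it is to have nobody to guide me, and how important theory is (not that I ever doubted it).

    My code is attached below for discussion. The data can be downloaded from the link at the top of this post, already compressed by the author.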

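    Here is the training script. (In theory it should run under both Python 3 and Python 2; I wrote it with Python 2.)
    The training uses learning rate decay. I also meant to use dropout, but this training run gives the model essentially no chance to overfit, so adding it would be pointless.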

    from __future__ import print_function, division, absolute_import
    import tensorflow as tf
    import os
    import cv2
    import random
    import numpy as np
    
    path = 'Discuz/' # directory holding the data
    imgs = os.listdir(path) # list of all image file names
    random.shuffle(imgs) # shuffle
    max_steps = 1000000 # maximum number of training steps
    save_path = 'model4cnn-1fcn' # where checkpoints are saved; created automatically
    dropout = 1 # unused
    
    trainnum = 50000 # sizes of the training and test sets
    testnum = 10000
    
    traindatas = imgs[:trainnum] # training and test file names and their labels
    trainlabels = list(map(lambda x: x.split('.')[0],traindatas))
    
    testdatas = imgs[trainnum:]
    testlabels = list(map(lambda x: x.split('.')[0],testdatas))
    
    # read positions into the training and test sets
    train_ptr = 0
    test_ptr = 0
    
    def next_batch(batch=100, train_flag=True):
        global train_ptr
        global test_ptr
        batch_x = np.zeros([batch,30*100])
        batch_y = np.zeros([batch, 4*63])
    
        if train_flag == True:
            if batch + train_ptr < trainnum:
                trains = traindatas[train_ptr:(train_ptr+batch)]
                labels = trainlabels[train_ptr:(train_ptr+batch)]
                train_ptr += batch
            else:
                new_ptr = (train_ptr + batch) % trainnum 
                trains = traindatas[train_ptr:] + traindatas[:new_ptr]
                # wrap around using the labels, not the file names
                labels = trainlabels[train_ptr:] + trainlabels[:new_ptr]
                train_ptr = new_ptr
    
            for index, train in enumerate(trains):
                img = np.mean(cv2.imread(path + train), -1)
                batch_x[index,:] = img.flatten() /255
            for index, label in enumerate(labels):
                batch_y[index,:] = text2vec(label)
    
        else:
            if batch + test_ptr < testnum:
                tests = testdatas[test_ptr:(test_ptr+batch)]
                labels = testlabels[test_ptr:(test_ptr+batch)]
                test_ptr += batch
            else:
                new_ptr = (test_ptr + batch) % testnum 
                tests = testdatas[test_ptr:] + testdatas[:new_ptr]
                labels = testlabels[test_ptr:] + testlabels[:new_ptr]
                test_ptr = new_ptr
    
            for index, test in enumerate(tests):
                img = np.mean(cv2.imread(path + test), -1)
                batch_x[index, :] = img.flatten() /255
            for index, label in enumerate(labels):
                batch_y[index,:] = text2vec(label)
    
        return batch_x, batch_y
    
    def text2vec(text):
        if len(text) > 4:
            raise ValueError('too long captcha')
    
        vector = np.zeros(4*63)
        def char2pos(c):
            if c == '_':
                k = 62
                return k
            k = ord(c)-48
            if k > 9:
                k = ord(c)-55
                if k > 35:
                    k = ord(c) - 61
                    if k > 61:
                        raise ValueError('No Map')
    
            return k
    
        for i, c in enumerate(text):
            idx = i*63 + char2pos(c)
            vector[idx] = 1
    
        return vector
    
    X = tf.placeholder(tf.float32, [None, 30*100])
    Y = tf.placeholder(tf.float32, [None,4*63])
    _lr = tf.placeholder(tf.float32)
    keep_prob = tf.placeholder(tf.float32)
    
    def conv2d(x, W, b, strides=1):
        x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
        x = tf.nn.bias_add(x, b)
        return tf.nn.relu(x)
    
    def max_pool2d(x, k=2):
        x = tf.nn.max_pool(
            x, ksize=[
                1, k, k, 1], strides=[
                1, k, k, 1], padding='SAME')
        return x
    
    weights = {
            'wc1': tf.Variable(0.01*tf.random_normal([3, 3, 1, 32])),
            'wc2': tf.Variable(0.01*tf.random_normal([3, 3, 32, 64])),
            'wc3': tf.Variable(0.01*tf.random_normal([3, 3, 64, 64])),
            'wc4': tf.Variable(0.01*tf.random_normal([3, 3, 64, 64])),
            'wf1': tf.Variable(0.01*tf.random_normal([2 * 7 * 64, 1024])),
            'wf2': tf.Variable(0.01*tf.random_normal([1024, 1024])),
            'wout': tf.Variable(0.01*tf.random_normal([1024, 4*63]))
            }
    
    biases = {
            'bc1': tf.Variable(0.1*tf.random_normal([32])),
            'bc2': tf.Variable(0.1*tf.random_normal([64])),
            'bc3': tf.Variable(0.1*tf.random_normal([64])),
            'bc4': tf.Variable(0.1*tf.random_normal([64])),
            'bf1': tf.Variable(0.1*tf.random_normal([1024])),
            'bf2': tf.Variable(0.1*tf.random_normal([1024])),
            'bout': tf.Variable(0.1*tf.random_normal([4*63]))
        }
    
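    def conv_net(x, weights, biases, dropout):
        # Images are 30 rows x 100 columns and were flattened row by row,
        # so restore them as [batch, height=30, width=100, channels=1].
        # (The original [-1,100,30,1] scrambled the pixel layout; keep this
        # line identical in the training and evaluation scripts.)
        x = tf.reshape(x, [-1,30,100,1])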
    
        conv1 = conv2d(x, weights['wc1'], biases['bc1'], 1)
        conv1 = max_pool2d(conv1, 2)
    
        conv2 = conv2d(conv1, weights['wc2'], biases['bc2'], 1)
        conv2 = max_pool2d(conv2, 2)
    
        conv3 = conv2d(conv2, weights['wc3'], biases['bc3'], 1)
        conv3 = max_pool2d(conv3, 2)
        
        conv4 = conv2d(conv3, weights['wc4'], biases['bc4'], 1)
        conv4 = max_pool2d(conv4, 2)
    
        fc1 = tf.reshape(
            conv4, shape=[-1, weights['wf1'].get_shape().as_list()[0]])
        fc1 = tf.matmul(fc1, weights['wf1'])
        fc1 = tf.add(fc1, biases['bf1'])
        fc1 = tf.nn.relu(fc1)
    
    
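        # Note: 'wf2' and 'bf2' are defined above but never applied here,
        # so this network has a single hidden fully connected layer.
        out = tf.add(tf.matmul(fc1, weights['wout']), biases['bout'])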
    
        return out
    
    
    output = conv_net(X, weights, biases, keep_prob)
    
    loss_op = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
                logits=output, labels=Y))
    optimizer = tf.train.AdamOptimizer(learning_rate=_lr).minimize(loss_op)
    
    y = tf.reshape(output, [-1,4,63])
    y_ = tf.reshape(Y, [-1,4,63])
    
    correct_pred = tf.equal(tf.argmax(y, 2), tf.argmax(y_,2))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
    init = tf.global_variables_initializer()
    lr = 0.001
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(init)
        for step in range(1,1+max_steps):
            batch_x, batch_y = next_batch(100,True)
            loss_value,_ = sess.run([loss_op, optimizer],
                feed_dict = {X:batch_x, Y:batch_y, keep_prob:dropout,_lr:lr})
            if step % 10 == 0:
                batch_x_test, batch_y_test = next_batch(100, False)
                acc = sess.run(accuracy, 
                    feed_dict={X:batch_x_test, Y:batch_y_test,keep_prob:1})
                print('step{}, loss={}, accuracy={}'.format(step,loss_value, acc))
    
            if step % 500 == 0:
                random.shuffle(traindatas)
                trainlabels = list(map(lambda x: x.split('.')[0],traindatas))
    
            if step % 3000 == 0:
                lr *= 0.9
    
            if step % 10000 == 0:
                saver.save(sess, save_path + "/model.ckpt-%d" % step)
                print('model saved!')
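
    The decay rule in the script above (multiply the rate by 0.9 every 3000 steps, starting from 0.001) can be written in closed form. The sketch below, with the hypothetical helper `learning_rate_at`, evaluates it at a few step counts.

```python
def learning_rate_at(step, base_lr=0.001, decay=0.9, every=3000):
    """Learning rate in effect after `step` training steps of the schedule above."""
    return base_lr * decay ** (step // every)

for step in (3000, 100000, 500000, 1000000):
    print(step, learning_rate_at(step))
```

    Under this schedule the rate is about 3e-5 after 100,000 steps and below 1e-18 after 1,000,000, so the later part of a million-step run changes the weights very little.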
    

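    Next is a script I wrote for inspecting the training result visually. Create a new script with the code below and run it: it displays 4 randomly chosen CAPTCHAs with the model's predictions, and prints the accuracy of those predictions to the terminal.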

    from __future__ import print_function, division, absolute_import
    import tensorflow as tf
    import os
    import cv2
    import matplotlib.pyplot as plt
    import random
    import numpy as np
    
    
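    testnumber = 4 # changing this requires changing the plotting code below, otherwise it will fail
    path = 'Discuz/'
    imgs = os.listdir(path)
    model_path = 'model4cnn-1fcn/model.ckpt-500000' # the checkpoint of your trained model
    # Note: samples are drawn from all images, so some may come from the training set.
    testdatas = random.sample(imgs,testnumber)
    testlabels = list(map(lambda x: x.split('.')[0],testdatas))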
    #testnum = len(testdatas)
    #test_ptr = 0
    
    X = tf.placeholder(tf.float32, [None, 30*100])
    Y = tf.placeholder(tf.float32, [None,4*63])
    keep_prob = tf.placeholder(tf.float32)
    
    def text2vec(text):
        if len(text) > 4:
            raise ValueError('too long captcha')
    
        vector = np.zeros(4*63)
        def char2pos(c):
            if c == '_':
                k = 62
                return k
            k = ord(c)-48
            if k > 9:
                k = ord(c)-55
                if k > 35:
                    k = ord(c) - 61
                    if k > 61:
                        raise ValueError('No Map')
    
            return k
    
        for i, c in enumerate(text):
            idx = i*63 + char2pos(c)
            vector[idx] = 1
    
        return vector
    
    def vec2text(vec):
    
        char_pos = vec.nonzero()[0]
        text = []
        for i, c in enumerate(char_pos):
            char_at_pos = i #c/63
            char_idx = c % 63
            if char_idx < 10:
                char_code = char_idx + ord('0')
            elif char_idx < 36:
                char_code = char_idx - 10 + ord('A')
            elif char_idx < 62:
                char_code = char_idx - 36 + ord('a')
            elif char_idx == 62:
                char_code = ord('_')
            else:
                raise ValueError('error')
            text.append(chr(char_code))
        return "".join(text)
    
    batch_x = np.zeros([testnumber,30*100])
    batch_y = np.zeros([testnumber, 4*63])
    
    for index, test in enumerate(testdatas):
        img = np.mean(cv2.imread(path + test), -1)
        batch_x[index, :] = img.flatten() /255
    for index, label in enumerate(testlabels):
        batch_y[index, :] = text2vec(label)
    
    def conv2d(x, W, b, strides=1):
        x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
        x = tf.nn.bias_add(x, b)
        return tf.nn.relu(x)
    
    def max_pool2d(x, k=2):
        x = tf.nn.max_pool(
            x, ksize=[
                1, k, k, 1], strides=[
                1, k, k, 1], padding='SAME')
        return x
    
    weights = {
            'wc1': tf.Variable(0.01*tf.random_normal([3, 3, 1, 32])),
            'wc2': tf.Variable(0.01*tf.random_normal([3, 3, 32, 64])),
            'wc3': tf.Variable(0.01*tf.random_normal([3, 3, 64, 64])),
            'wc4': tf.Variable(0.01*tf.random_normal([3, 3, 64, 64])),
            'wf1': tf.Variable(0.01*tf.random_normal([2 * 7 * 64, 1024])),
            'wf2': tf.Variable(0.01*tf.random_normal([1024, 1024])),
            'wout': tf.Variable(0.01*tf.random_normal([1024, 4*63]))
            }
    
    biases = {
            'bc1': tf.Variable(0.1*tf.random_normal([32])),
            'bc2': tf.Variable(0.1*tf.random_normal([64])),
            'bc3': tf.Variable(0.1*tf.random_normal([64])),
            'bc4': tf.Variable(0.1*tf.random_normal([64])),
            'bf1': tf.Variable(0.1*tf.random_normal([1024])),
            'bf2': tf.Variable(0.1*tf.random_normal([1024])),
            'bout': tf.Variable(0.1*tf.random_normal([4*63]))
        }
    
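    def conv_net(x, weights, biases, dropout):
        # Images are 30 rows x 100 columns and were flattened row by row,
        # so restore them as [batch, height=30, width=100, channels=1].
        # (The original [-1,100,30,1] scrambled the pixel layout; keep this
        # line identical in the training and evaluation scripts.)
        x = tf.reshape(x, [-1,30,100,1])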
    
        conv1 = conv2d(x, weights['wc1'], biases['bc1'], 1)
        conv1 = max_pool2d(conv1, 2)
    
        conv2 = conv2d(conv1, weights['wc2'], biases['bc2'], 1)
        conv2 = max_pool2d(conv2, 2)
    
        conv3 = conv2d(conv2, weights['wc3'], biases['bc3'], 1)
        conv3 = max_pool2d(conv3, 2)
        
        conv4 = conv2d(conv3, weights['wc4'], biases['bc4'], 1)
        conv4 = max_pool2d(conv4, 2)
        
        fc1 = tf.reshape(
            conv4, shape=[-1, weights['wf1'].get_shape().as_list()[0]])
        fc1 = tf.matmul(fc1, weights['wf1'])
        fc1 = tf.add(fc1, biases['bf1'])
        fc1 = tf.nn.relu(fc1)
    
        out = tf.add(tf.matmul(fc1, weights['wout']), biases['bout'])
    
        return out
    
    output = conv_net(X, weights, biases, keep_prob)
    
    y = tf.reshape(output, [-1,4,63])
    y_ = tf.reshape(Y, [-1,4,63])
    
    predict = tf.argmax(y,2)
    correct_pred = tf.equal(predict, tf.argmax(y_,2))
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
    saver = tf.train.Saver()
    
    with tf.Session() as sess:
        saver.restore(sess, model_path)
    
        pred, acc = sess.run([predict,accuracy], feed_dict ={ X:batch_x, Y:batch_y,keep_prob:1})
        print('accuracy={}'.format(acc))
        for i in range(1,testnumber+1):
    
            plt.subplot(2,2,i)
            img = cv2.imread(path+testdatas[i-1])
            plt.imshow(img)
            plt.title('number%d' %i)
            plt.xticks([])
            plt.yticks([])
            vect = np.zeros([4*63])
    
            #print(pred[i-1])
            for ind,j in enumerate(pred[i-1]):
                vect[ind*63+j] = 1
    
            xlabel = 'True label:{};Pred label:{}'.format(testlabels[i-1], vec2text(vect))
            plt.xlabel(xlabel)
    
        plt.show()
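
    The label encoding used by `text2vec` and `vec2text` can also be expressed with a lookup string, which makes the character order explicit. This is an alternative sketch, not the code used above; `CHARSET`, `encode` and `decode` are illustrative names.

```python
import string
import numpy as np

# Same ordering as char2pos: digits, uppercase, lowercase, then '_'.
CHARSET = string.digits + string.ascii_uppercase + string.ascii_lowercase + '_'

def encode(text):
    """One-hot encode up to 4 characters into a vector of length 4*63."""
    vec = np.zeros(4 * len(CHARSET))
    for i, c in enumerate(text):
        vec[i * len(CHARSET) + CHARSET.index(c)] = 1
    return vec

def decode(vec):
    """Invert encode() by taking the argmax of each 63-wide group."""
    groups = np.asarray(vec).reshape(4, len(CHARSET))
    return ''.join(CHARSET[j] for j in groups.argmax(axis=1))

roundtrip = decode(encode('aB3_'))
print(roundtrip)
```

    Note that `decode` assumes all four positions are filled; a group of all zeros would be read as '0'.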
    

    Questions and discussion are welcome.
