OCR Handwriting Recognition Study

Author: IT技术官 | Published: 2021-04-19 08:50

OCR Handwriting Recognition Study (Part 3)

This article implements CNN-based recognition of handwritten Chinese characters.

1. Purpose

This article uses TensorFlow to build a convolutional neural network (CNN) that recognizes handwritten Chinese characters.

2. Data source

The data is HWDB1.1 from the official CASIA-HWDB website. The dataset comes from the National Laboratory of Pattern Recognition.

3. Data preprocessing

First, download the data and extract it to a folder of your choice, then visualize it to see what the samples actually look like.

import os
import struct

import numpy as np

# train_data_dir is assumed to point at the folder holding the extracted .gnt files.
def read_gnt_dir(gnt_dir=train_data_dir):
    """Yield (image, tagcode) pairs from every .gnt file in gnt_dir."""
    def one_file(f):
        header_size = 10
        while True:
            header = np.fromfile(f, dtype='uint8', count=header_size)
            if header.size < header_size:
                break
            # Use Python ints so the shifts below cannot wrap around in uint8.
            header = header.tolist()
            # print(*header)  # debug: raw header bytes
            # Bytes 0-3: sample size (little-endian).
            sample_size = header[0] + (header[1] << 8) + (header[2] << 16) + (header[3] << 24)
            # Bytes 4-5: GB2312 tag code; bytes 6-7: width; bytes 8-9: height.
            tagcode = header[5] + (header[4] << 8)
            width = header[6] + (header[7] << 8)
            height = header[8] + (header[9] << 8)
            if header_size + width * height != sample_size:
                break
            try:
                image = np.fromfile(f, dtype='uint8', count=width * height).reshape((height, width))
            except ValueError:
                # Truncated bitmap: report the character and stop reading this file.
                print(struct.pack('>H', tagcode).decode('gb2312'))
                break
            yield image, tagcode

    for file_name in os.listdir(gnt_dir):
        if file_name.endswith('.gnt'):
            file_path = os.path.join(gnt_dir, file_name)
            with open(file_path, 'rb') as f:
                for image, tagcode in one_file(f):
                    yield image, tagcode
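The 10-byte record header read above can also be unpacked with the standard struct module. The following is only a minimal sketch: make_record, read_records and the two-record in-memory stream are hypothetical helpers made up for illustration, and the layout (4-byte little-endian sample size, 2-byte GB2312 tag code, 2-byte width, 2-byte height, then width*height pixel bytes) is taken from the byte arithmetic in the listing above rather than from the dataset documentation.

```python
import io
import struct

import numpy as np

# Assumed header layout: sample size, GB2312 tag code, width, height.
HEADER_FMT = "<I2sHH"
HEADER_SIZE = struct.calcsize(HEADER_FMT)  # 4 + 2 + 2 + 2 bytes


def make_record(char, bitmap):
    """Build one synthetic record (hypothetical helper, for illustration only)."""
    height, width = bitmap.shape
    sample_size = HEADER_SIZE + width * height
    header = struct.pack(HEADER_FMT, sample_size, char.encode("gb2312"), width, height)
    return header + bitmap.astype(np.uint8).tobytes()


def read_records(stream):
    """Yield (bitmap, character) pairs from a binary stream of records."""
    while True:
        header = stream.read(HEADER_SIZE)
        if len(header) < HEADER_SIZE:
            break  # end of stream
        sample_size, tag, width, height = struct.unpack(HEADER_FMT, header)
        if sample_size != HEADER_SIZE + width * height:
            break  # inconsistent header, stop reading
        pixels = stream.read(width * height)
        if len(pixels) < width * height:
            break  # truncated bitmap
        bitmap = np.frombuffer(pixels, dtype=np.uint8).reshape(height, width)
        yield bitmap, tag.decode("gb2312")


# Two tiny all-white "images" stand in for real samples.
demo_bitmap = np.full((3, 2), 255, dtype=np.uint8)
demo_stream = io.BytesIO(make_record("啊", demo_bitmap) + make_record("阿", demo_bitmap))
demo_records = list(read_records(demo_stream))
```

Reading through struct keeps every field a plain Python int, so no uint8 overflow can occur while the header is being decoded.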

When the run finishes, the train directory contains 3,755 folders in total. Open any one of them at random.

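Every image in the folder is written differently, and it is also easy to see that the images differ in resolution. The next step is to process the data: pad and resize each image to 64×64, scale the pixel values, and convert the labels to one-hot encoding.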

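import numpy as np
from PIL import Image

def setdata_image(i):
    # Pad the shorter side with white (255) so the image is roughly square.
    psize = abs(i.shape[0] - i.shape[1]) // 2
    if i.shape[0] < i.shape[1]:
        pdim = ((psize, psize), (0, 0))
    else:
        pdim = ((0, 0), (psize, psize))
    i = np.pad(i, pdim, mode='constant', constant_values=255)
    # scipy.misc.imresize was removed in SciPy 1.3; Pillow is used here as a stand-in.
    i = np.asarray(Image.fromarray(i).resize((64 - 4 * 2, 64 - 4 * 2), Image.BILINEAR))
    # Add a 4-pixel white border to reach 64x64.
    i = np.pad(i, ((4, 4), (4, 4)), mode='constant', constant_values=255)
    assert i.shape == (64, 64)
    # Convert to float before centering; uint8 arithmetic would wrap around.
    i = i.flatten().astype(np.float32)
    i = (i - 128) / 128
    return i

# char_set is assumed to be a list of every character class in the dataset.
def convert_to_one_hot(char):
    vector = np.zeros(len(char_set))
    vector[char_set.index(char)] = 1
    return vector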
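To make the padding, scaling and one-hot steps easier to check by hand, here is a small NumPy-only sketch. The names pad_to_square, scale_pixels, one_hot and the three-character demo_chars list are hypothetical and exist only for this example. Unlike the listing above, this variant splits an odd size difference unevenly so the padded image is exactly square, and it skips the resize step.

```python
import numpy as np


def pad_to_square(img, fill=255):
    """Pad the shorter side of a 2-D uint8 image with `fill` until it is square."""
    height, width = img.shape
    diff = abs(height - width)
    before, after = diff // 2, diff - diff // 2
    if height < width:
        pad = ((before, after), (0, 0))  # add rows
    else:
        pad = ((0, 0), (before, after))  # add columns
    return np.pad(img, pad, mode="constant", constant_values=fill)


def scale_pixels(img):
    """Map uint8 pixels from [0, 255] to [-1, 127/128], casting to float first."""
    return (img.astype(np.float32) - 128.0) / 128.0


# Hypothetical three-character label set; the real one has thousands of entries.
demo_chars = ["一", "二", "三"]
char_to_index = {c: k for k, c in enumerate(demo_chars)}


def one_hot(char):
    """Return a vector with a single 1.0 at the index of `char`."""
    vec = np.zeros(len(demo_chars), dtype=np.float32)
    vec[char_to_index[char]] = 1.0
    return vec


demo_img = np.zeros((3, 6), dtype=np.uint8)  # a 3x6 all-black image
squared = pad_to_square(demo_img)
scaled = scale_pixels(squared)
label = one_hot("二")
decoded = demo_chars[int(np.argmax(label))]  # argmax inverts the encoding
```

The dictionary lookup is a design choice for speed: list.index scans the whole list on every call, which adds up when there are thousands of classes.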

With the data processed, the next step is of course to build the model!

4. Building the model with TensorFlow

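The model was built with reference to published papers; its Graph can be inspected in TensorBoard.

It uses three convolutional layers and three max-pooling layers.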

import tensorflow as tf  # written against the TensorFlow 1.x API

# X, keep_prob and label_size are assumed to be defined at module level, for example:
# X = tf.placeholder(tf.float32, [None, 64 * 64])
# keep_prob = tf.placeholder(tf.float32)
# label_size = len(char_set)

def handwriting_cnn():
    x = tf.reshape(X, shape=[-1, 64, 64, 1])

    # Block 1: 3x3 conv with 32 filters, then 2x2 max pooling (64x64 -> 32x32).
    weight_c1 = tf.Variable(tf.random_normal([3, 3, 1, 32], stddev=0.01))
    bias_c1 = tf.Variable(tf.zeros([32]))
    conv1 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(x, weight_c1, strides=[1, 1, 1, 1], padding='SAME'), bias_c1))
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    # Block 2: 3x3 conv with 64 filters, then 2x2 max pooling (32x32 -> 16x16).
    weight_c2 = tf.Variable(tf.random_normal([3, 3, 32, 64], stddev=0.01))
    bias_c2 = tf.Variable(tf.zeros([64]))
    conv2 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(conv1, weight_c2, strides=[1, 1, 1, 1], padding='SAME'), bias_c2))
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

    # Block 3: 3x3 conv with 128 filters, then 2x2 max pooling (16x16 -> 8x8).
    weight_c3 = tf.Variable(tf.random_normal([3, 3, 64, 128], stddev=0.01))
    bias_c3 = tf.Variable(tf.zeros([128]))
    conv3 = tf.nn.relu(tf.nn.bias_add(tf.nn.conv2d(conv2, weight_c3, strides=[1, 1, 1, 1], padding='SAME'), bias_c3))
    conv3 = tf.nn.max_pool(conv3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv3 = tf.nn.dropout(conv3, keep_prob)

    # Fully connected layer on the flattened 8x8x128 output of block 3.
    weight_d = tf.Variable(tf.random_normal([8 * 8 * 128, 1024], stddev=0.01))
    bias_d = tf.Variable(tf.zeros([1024]))
    dense = tf.reshape(conv3, [-1, weight_d.get_shape().as_list()[0]])
    dense = tf.nn.relu(tf.add(tf.matmul(dense, weight_d), bias_d))
    dense = tf.nn.dropout(dense, keep_prob)

    # Output layer: one logit per character class.
    w_output = tf.Variable(tf.random_normal([1024, label_size], stddev=0.01))
    b_output = tf.Variable(tf.zeros([label_size]))
    output = tf.add(tf.matmul(dense, w_output), b_output)
    return output
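The size of the first fully connected weight matrix has to match the flattened output of whichever convolutional block feeds it, so it is worth working the shapes out explicitly. The sketch below is plain Python with hypothetical helper names (same_padding_out, block_shapes). It assumes, as in the listing above, a 64x64 input, stride-1 'SAME' convolutions that keep the spatial size, and 2x2 'SAME' max pooling with stride 2.

```python
import math


def same_padding_out(size, stride):
    """Output length along one axis for a 'SAME'-padded op: ceil(size / stride)."""
    return math.ceil(size / stride)


def block_shapes(input_size, filters_per_block):
    """Return the (height, width, channels) shape after each conv + pool block."""
    size = input_size
    shapes = []
    for filters in filters_per_block:
        size = same_padding_out(size, stride=1)  # stride-1 conv keeps the size
        size = same_padding_out(size, stride=2)  # 2x2 pooling halves it
        shapes.append((size, size, filters))
    return shapes


shapes = block_shapes(64, [32, 64, 128])
flat_sizes = [h * w * c for h, w, c in shapes]
```

Under these assumptions the three blocks produce 32x32x32, 16x16x64 and 8x8x128 feature maps, so a dense layer fed from the third block needs 8 * 8 * 128 = 8192 input rows, while one fed from the second block would need 16 * 16 * 64 = 16384.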

5. Results

The results are reasonably good: the highest accuracy reached 98.889%.

Article link: https://www.haomeiwen.com/subject/sdablltx.html