Idiom Crossword Game with opencv3 + python3.5 (Part 1): Printed Chinese Characters

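Author: mler801 | Published: 2018-05-22 13:42
    • This project is an idiom crossword game. Given an image of an idiom crossword puzzle, OpenCV recognizes the characters and converts the grid into a matrix, a solving algorithm then works out the answer, and the answer is finally drawn back onto the image.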

    Source code on GitHub

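    This article uses projection segmentation to split printed Chinese characters.

    Projection segmentation projects the image along one direction and then along the other: horizontal first and then vertical, or vertical first and then horizontal. This article projects vertically first, then horizontally.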
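    To make the two-pass idea concrete, here is a minimal NumPy-only sketch. It is not the code from this article: the helper names `find_runs` and `segment` are made up for illustration, and it assumes the image is already binarized with text pixels non-zero. It counts text pixels per row to find the lines, then counts per column inside each line to find the characters.

```python
import numpy as np

def find_runs(profile):
    """Return (start, end) pairs, end exclusive, for each run of non-zero values."""
    mask = np.asarray(profile) > 0
    # Pad with False on both sides so runs touching an edge are still closed.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))

def segment(binary):
    """binary: 2-D array, text pixels non-zero. Returns (top, bottom, left, right) boxes."""
    boxes = []
    row_profile = np.count_nonzero(binary, axis=1)      # first pass: one count per row
    for top, bottom in find_runs(row_profile):
        line = binary[top:bottom]
        col_profile = np.count_nonzero(line, axis=0)    # second pass: one count per column
        for left, right in find_runs(col_profile):
            boxes.append((top, bottom, left, right))
    return boxes

if __name__ == "__main__":
    demo = np.zeros((7, 9), dtype=np.uint8)
    demo[1:3, 1:3] = 255   # first character on line 1
    demo[1:3, 5:8] = 255   # second character on line 1
    demo[4:6, 2:4] = 255   # single character on line 2
    print(segment(demo))
```

    Padding the mask before looking for edges means a character that touches the image border still produces a closed run, which is easy to miss in a hand-written state machine.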

    • Vertical projection (splits the page into lines of text).

    Code:

    <pre name="code" class="python" style="box-sizing: border-box; margin: 0px 0px 24px; padding: 0px 16px; overflow-x: auto; background-color: rgb(240, 240, 240); font-family: Consolas, Inconsolata, Courier, monospace; font-size: 12px; line-height: 20px; color: rgb(0, 0, 0); font-style: normal; font-variant-ligatures: normal; font-variant-caps: normal; font-weight: normal; letter-spacing: normal; orphans: 2; text-align: justify; text-indent: 0px; text-transform: none; widows: 2; word-spacing: 0px; -webkit-text-stroke-width: 0px;"># The input is printed Chinese text, so projection-based segmentation is used.
    import cv2
    import numpy as np

    # Row segmentation: each result is one line of text.
    def YShadow(path):
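        img  = cv2.imread(path)   # original image
        gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY) # grayscale image
        height,width = img.shape[:2]
    
        #blur = cv2.GaussianBlur(gray,(5,5),0) # Gaussian blur
    
        blur = cv2.blur(gray,(8,8)) # mean (box) blur
        # Adaptive threshold; THRESH_BINARY_INV turns dark text into white (255) on a black background
        thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV,11,2)
        temp = thresh
    
        if(width > 500 and height > 400): # the text is relatively small in large images, so dilate
            kernel = np.ones((5,5),np.uint8) # structuring element
            dilation = cv2.dilate(thresh,kernel,iterations = 1) # dilate so each character becomes a solid blob of text pixels
            temp = dilation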
    
        '''
        cv2.imshow('image',temp)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
        '''
    
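        perPixelValue = 1 # value of the current pixel
        # One counter per row. The array needs `height` entries because it is indexed by row below,
        # and int32 because int8 overflows once a row holds more than 127 text pixels.
        projectValArry = np.zeros(height, np.int32)
    
        for i in range(0,height):
            for j in range(0,width):
                perPixelValue = temp[i,j]
                if (perPixelValue == 255): # text pixel (white after the inverted threshold): count it for row i
                    projectValArry[i] += 1
           # print(projectValArry[i])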
    
        canvas = np.full((height,width), 255, dtype="uint8") # white background
    
        for i in range(0,height):
            for j in range(0,projectValArry[i]):
                perPixelValue = 0 # black bar of the projection histogram
                canvas[i, width-j-1] = perPixelValue
        '''
        cv2.imshow('canvas',canvas)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
        '''
    
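        lines = []
        startIndex = 0 # row index where the current text block starts
        endIndex = 0 # row index where the following blank area starts
        inBlock = 0 # whether we are currently inside a text block
    
        for i in range(height):
            if (inBlock == 0 and projectValArry[i] != 0): # entering a text block
                inBlock = 1  
                startIndex = i
            elif (inBlock == 1 and projectValArry[i] == 0): # entering a blank area
                endIndex = i
                inBlock = 0
                subImg = gray[startIndex:endIndex+1,0:width] # crop this line of text
                #print(startIndex,endIndex+1)
                lines.append(subImg) # add the cropped line image to the result
        if inBlock == 1: # the text reaches the bottom edge: close the last block
            lines.append(gray[startIndex:height,0:width])
        #print(len(lines))
        return lines</pre>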
    
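    • Horizontal projection (splits one line of text into single characters).
    <pre name="code" class="python"># Split one line of text into individual characters.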
    def XShadow(path):
        img  = cv2.imread(path)       
        gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
        height,width = img.shape[:2]
       # print(height,width)
        #blur = cv2.GaussianBlur(gray,(5,5),0)
    
        blur = cv2.blur(gray,(8,8))
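        # Adaptive threshold; THRESH_BINARY_INV turns dark text into white (255) on a black background
        thresh = cv2.adaptiveThreshold(blur,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C,cv2.THRESH_BINARY_INV,11,2)
    
        if(width > 500):
            kernel = np.ones((4, 4),np.uint8) # structuring element
        else:
            kernel = np.ones((2, 2),np.uint8) # structuring element
        dilation = cv2.dilate(thresh,kernel,iterations = 1) # dilate so each character becomes a solid blob of text pixels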
    
        '''
        cv2.imshow('image',thresh)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
        '''
    
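        perPixelValue = 1 # value of the current pixel
        # One counter per column; int32 because int8 overflows once a column holds more than 127 text pixels.
        projectValArry = np.zeros(width, np.int32)
    
        for i in range(0,width):
            for j in range(0,height):
                perPixelValue = dilation[j,i]
                if (perPixelValue == 255): # text pixel (white after the inverted threshold): count it for column i
                    projectValArry[i] += 1
           # print(projectValArry[i])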
    
        canvas = np.full((height,width), 255, dtype="uint8") # white background
    
        for i in range(0,width):
            for j in range(0,projectValArry[i]):
                perPixelValue = 0 # black bar of the projection histogram
                canvas[height-j-1, i] = perPixelValue
        '''
        cv2.imshow('canvas',canvas)
        cv2.waitKey(0)
        cv2.destroyAllWindows()
        '''
    
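        boxes = []
        startIndex = 0 # column index where the current character starts
        endIndex = 0 # column index where the following blank gap starts
        inBlock = 0 # whether we are currently inside a character
    
        for i in range(width):
            if (inBlock == 0 and projectValArry[i] != 0): # entering a character
                inBlock = 1  
                startIndex = i
            elif (inBlock == 1 and projectValArry[i] == 0): # entering a blank gap
                endIndex = i
                inBlock = 0
                #subImg = gray[0:height, startIndex:endIndex+1] #endIndex+1
                #print(startIndex,endIndex+1)
                boxes.append([startIndex, 0, endIndex-startIndex-1, height]) # [x, y, w, h]
        if inBlock == 1: # the character reaches the right edge: close the last block
            boxes.append([startIndex, 0, width-startIndex-1, height])
        #print(len(boxes))
        return boxes</pre>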
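    Unlike the row function, the column function returns boxes rather than cropped images. A small sketch of turning those boxes into character images follows. The helper `crop_boxes` is hypothetical, and it assumes each box is `[x, y, w, h]` measured on the same grayscale line image the boxes were computed from.

```python
import numpy as np

def crop_boxes(gray, boxes):
    """Cut each [x, y, w, h] box out of a 2-D grayscale array."""
    return [gray[y:y + h, x:x + w] for x, y, w, h in boxes]

if __name__ == "__main__":
    line = np.arange(20).reshape(4, 5)      # stand-in for one line image
    chars = crop_boxes(line, [[1, 0, 2, 4], [3, 0, 1, 4]])
    print([c.shape for c in chars])
```

    The crops are NumPy views into the line image, so copy them (`c.copy()`) before modifying them in place.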
    

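    After segmentation, the sample images are saved into one folder per character, with 10 samples for each character.

    These samples and their labels are then loaded into samples.npy and label.npy respectively, for the machine learning algorithm to train on later.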
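    The packing step is not shown in this post, so the following is only a rough sketch of one way to do it. The function `build_dataset`, the flattening to one row per sample, and the float32/int32 dtypes are all assumptions, as is the requirement that every sample image has already been resized to the same shape.

```python
import os
import numpy as np

def build_dataset(images, labels, out_dir="."):
    """images: equally sized 2-D grayscale arrays; labels: one integer id per image."""
    # Flatten every image into one row so the result is a (n_samples, n_pixels) matrix.
    samples = np.stack([np.asarray(img, dtype=np.float32).ravel() for img in images])
    label_arr = np.asarray(labels, dtype=np.int32).reshape(-1, 1)
    np.save(os.path.join(out_dir, "samples.npy"), samples)
    np.save(os.path.join(out_dir, "label.npy"), label_arr)
    return samples, label_arr

if __name__ == "__main__":
    imgs = [np.zeros((2, 3)), np.ones((2, 3))]   # two tiny stand-in samples
    s, l = build_dataset(imgs, [0, 1])
    print(s.shape, l.shape)
```

    Loading the files back is then just `np.load("samples.npy")` and `np.load("label.npy")`. The labels here are integer ids, so a separate table mapping each id to its character would still be needed.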

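    The next post covers extracting the characters from the crossword image, training a machine learning algorithm on these samples, and recognizing the characters.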


          Article link: https://www.haomeiwen.com/subject/cmmhjftx.html