I just discovered that you can insert code with markdown ╮(╯▽╰)╭
[Image: Figure 1, the original captcha image]
If the captcha image is not preprocessed, the recognition result is as shown in Figure 2, which is far from ideal.
[Image: Figure 2, recognition result without preprocessing]
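The image can be processed with the ImageEnhance module. Start with the contrast: create an enhancer with ImageEnhance.Contrast(image), then call enhance(factor) with the desired contrast value: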
enhancer = ImageEnhance.Contrast(image)
# Try contrast factors 0.0, 0.5, ..., 4.0 and save one image per factor
for i in range(9):
    enhancer.enhance(i*0.5).save("E:\\code_contrast_"+str(i*0.5)+".png")
The images produced at each contrast value are saved:
[Image: the captcha at contrast values 0.0 to 4.0]
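To make the effect of the contrast value more concrete, here is a minimal pure-Python sketch of the idea, as I understand it: each pixel is blended with the mean gray level, so a value of 0 gives flat gray, 1 leaves the image unchanged, and larger values push pixels away from the mean. The function name `adjust_contrast` and the sample row are made up for illustration, and Pillow's own rounding may differ slightly.

```python
def adjust_contrast(pixels, factor):
    """Blend each grayscale value with the mean gray level.

    factor 0.0 gives a flat gray row, 1.0 returns the original values,
    and larger factors push values further away from the mean.
    """
    mean = int(sum(pixels) / len(pixels) + 0.5)
    result = []
    for p in pixels:
        value = mean + factor * (p - mean)
        # Clamp to the valid 8-bit range
        result.append(max(0, min(255, int(round(value)))))
    return result


row = [100, 120, 140, 200]  # hypothetical grayscale values
print(adjust_contrast(row, 0.0))
print(adjust_contrast(row, 1.0))
print(adjust_contrast(row, 4.0))
```

With a high value such as 4, most pixels end up clamped at 0 or 255, which is probably why that image works well as input for the black-and-white conversion.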
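In the next step we work with the image at contrast 4.
Convert the image from RGB mode to black-and-white ("1") or grayscale ("L") mode. The code is as follows: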
enhancer = ImageEnhance.Contrast(image)
image = enhancer.enhance(4)
# save() returns None, so convert first and save in a separate step
image = image.convert('1')
image.save("E:\\code_binary.png")
This gives:
[Image: the binarized captcha]
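One thing worth knowing: by default, convert('1') applies Floyd-Steinberg dithering rather than a plain cutoff, which can itself scatter isolated dots. A fixed threshold is a common alternative. The sketch below shows that idea in pure Python on a tiny grid of RGB tuples. The names `to_gray` and `binarize` and the threshold of 128 are assumptions for illustration, and the gray formula only approximates the luma weights Pillow documents for "L" mode.

```python
def to_gray(rgb):
    """Approximate luma: 0.299 R + 0.587 G + 0.114 B, in integer math."""
    r, g, b = rgb
    return (r * 299 + g * 587 + b * 114) // 1000


def binarize(pixels, threshold=128):
    """Map every pixel to 255 (white) or 0 (black) using a fixed threshold."""
    return [
        [255 if to_gray(p) >= threshold else 0 for p in row]
        for row in pixels
    ]


# Hypothetical 2x2 image: white, red, black, light gray
sample = [
    [(255, 255, 255), (255, 0, 0)],
    [(0, 0, 0), (200, 200, 200)],
]
print(binarize(sample))
```

Whether dithering or a fixed threshold works better depends on the captcha, so it is worth trying both.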
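The next step is to remove noise from the black-and-white image.
For each black pixel, count the black pixels in its 3×3 neighborhood. If fewer than 3 of the surrounding pixels are black (that is, fewer than 4 black pixels in the neighborhood counting the pixel itself), treat it as noise and set its value to 255 (white).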
# Remove noise
image = Image.open("E:\\code_binary.png")
width = image.size[0]
height = image.size[1]
def remove_noise(image, x, y, width, height):
    # Note: getpixel takes a tuple
    loc = image.getpixel((x,y))
    # 255 is white
    if loc == 255:
        return
    loc_x = x
    loc_y = y
    black_num = 0
    for x in range(loc_x - 1, loc_x + 2):
        for y in range(loc_y - 1, loc_y + 2):
            if x >= 0 and y >= 0 and x < width and y < height:
                if image.getpixel((x,y)) == 0:
                    black_num = black_num + 1
    if black_num < 4:
        image.putpixel((loc_x, loc_y), 255)
    return
for x in range(width):
    for y in range(height):
        remove_noise(image, x, y, width, height)
image.save("E:\\code_remove_noise.png")
This gives:
[Image: the captcha after noise removal]
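The same rule can be tried on a small grid of 0/255 values without PIL. This is only a sketch: `remove_noise_grid` and the sample grid are made up for illustration. It also differs from remove_noise in one respect: it reads from the original grid and writes to a copy, so removing one pixel never changes the count for its neighbors, whereas the in-place version depends on scan order.

```python
def remove_noise_grid(grid, min_black=4):
    """Whiten black pixels whose 3x3 neighborhood (including the pixel
    itself) contains fewer than min_black black pixels."""
    height = len(grid)
    width = len(grid[0])
    result = [row[:] for row in grid]  # write to a copy, read from grid
    for y in range(height):
        for x in range(width):
            if grid[y][x] != 0:
                continue  # only black pixels can be noise
            black = 0
            for ny in range(max(0, y - 1), min(height, y + 2)):
                for nx in range(max(0, x - 1), min(width, x + 2)):
                    if grid[ny][nx] == 0:
                        black += 1
            if black < min_black:
                result[y][x] = 255
    return result


W, B = 255, 0
grid = [
    [B, W, W, W, W],  # isolated black pixel in the corner
    [W, W, B, B, W],  # 2x2 black block, part of a "stroke"
    [W, W, B, B, W],
    [W, W, W, W, W],
]
cleaned = remove_noise_grid(grid)
for row in cleaned:
    print(row)
```

Here the isolated corner pixel is whitened, while the 2x2 block survives because each of its pixels sees exactly 4 black pixels in its neighborhood.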
Running recognition on this image gives:
[Image: recognition result after preprocessing]
The complete code is as follows:
import tesserocr
from PIL import Image,ImageEnhance
image = Image.open('E:\\code.jpg')
# # Try different contrast values; the image with contrast 4 is chosen
# enhancer = ImageEnhance.Contrast(image)
# for i in range(9):
#     enhancer.enhance(i*0.5).save("E:\\code_contrast_"+str(i*0.5)+".png")
enhancer = ImageEnhance.Contrast(image)
image = enhancer.enhance(4)
# convert changes "RGB" to another mode: "1" is a bilevel image with only black and white; "L" is grayscale, 8 bits per pixel, 0 is black and 255 is white
image.convert('1').save("E:\\code_binary.png")
# Remove noise
image = Image.open("E:\\code_binary.png")
width = image.size[0]
height = image.size[1]
def remove_noise(image, x, y, width, height):
    # Note: getpixel takes a tuple
    loc = image.getpixel((x,y))
    # 255 is white
    if loc == 255:
        return
    loc_x = x
    loc_y = y
    black_num = 0
    for x in range(loc_x - 1, loc_x + 2):
        for y in range(loc_y - 1, loc_y + 2):
            if x >= 0 and y >= 0 and x < width and y < height:
                if image.getpixel((x,y)) == 0:
                    black_num = black_num + 1
    if black_num < 4:
        image.putpixel((loc_x, loc_y), 255)
    return
for x in range(width):
    for y in range(height):
        remove_noise(image, x, y, width, height)
image.save("E:\\code_remove_noise.png")
result = tesserocr.image_to_text(image)
print(result)