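Detection is one of the core tasks in computer vision and has a wide range of applications. Detection techniques can help people catch errors that the naked eye easily misses, and they can help self-driving cars perceive their surroundings. Widespread use of automated detection will undoubtedly bring gains in both efficiency and safety.
This is the third post in the series. The full outline is as follows:
- Understanding color models and drawing shapes on images (basic image-processing operations).
- Basic image processing and filtering techniques.
- From feature detection to face detection.
- Contour detection.
The earlier posts covered several color models and how to draw shapes on images, as well as common image-processing techniques such as blurring, gradients, erosion, and dilation. This post applies those techniques to image feature detection and face detection.
It relies on the image-processing techniques introduced in the first two posts of the series.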
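Edge Detection
Edge detection essentially finds pixels where the image changes sharply or is discontinuous; connecting these pixels into line segments yields edges. In fact, the previous post already introduced a basic edge-detection technique: gradient filtering with the Sobel and Laplacian operators. By computing the derivative of the pixel values along a given direction, a gradient filter outlines the edges in an image.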
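To make the gradient-filter idea concrete, here is a minimal NumPy sketch that slides the horizontal Sobel kernel over a tiny synthetic image. The helper `filter2d_valid` and the test image are illustrative assumptions, not code from this series; OpenCV's own Sobel filter also handles borders and data types differently.

```python
import numpy as np

def filter2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding) and sum the element-wise products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# Sobel kernel for the derivative in the horizontal direction
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# Synthetic 5x6 image: dark left half (0), bright right half (10)
step = np.zeros((5, 6), dtype=float)
step[:, 3:] = 10.0

grad_x = filter2d_valid(step, sobel_x)
print(grad_x)
```

The response is zero in the flat areas and large only in the columns where the intensity jumps, which is exactly where the vertical edge lies.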
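The Canny algorithm is another edge-detection technique and one of the most popular today. It has four steps: noise reduction, computing the gradient magnitude and direction, non-maximum suppression, and hysteresis thresholding.
First, noise is reduced with a Gaussian blur. Next, the Sobel operator is used to obtain the image gradient. Then, using that gradient, each pixel is compared with its neighbors along the gradient direction to check whether it is a local maximum. If it is not, its value is set to zero (completely suppressed, black). This process is called non-maximum suppression.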
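The suppression step can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions: the gradient angle is given in degrees in image coordinates and snapped to four directions (0, 45, 90, 135), border pixels are left at zero, and the function name is hypothetical. It is not how `cv2.Canny` is implemented internally.

```python
import numpy as np

def non_max_suppression(magnitude, angle_deg):
    """Keep a pixel only if it is at least as large as both neighbors along its gradient direction."""
    h, w = magnitude.shape
    out = np.zeros_like(magnitude)
    # (row, col) step for each quantized direction; the opposite neighbor is the negated step
    offsets = {0: (0, 1), 45: (1, 1), 90: (1, 0), 135: (1, -1)}
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Fold the angle into [0, 180) and snap it to the nearest of the four directions
            q = (int(round((angle_deg[r, c] % 180) / 45.0)) % 4) * 45
            dr, dc = offsets[q]
            m = magnitude[r, c]
            if m >= magnitude[r + dr, c + dc] and m >= magnitude[r - dr, c - dc]:
                out[r, c] = m
    return out

# A thick vertical edge: the gradient points horizontally (angle 0) and peaks in the middle column
magnitude = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (5, 1))
angle = np.zeros((5, 5))
thin = non_max_suppression(magnitude, angle)
print(thin)
```

Only the middle column survives, so the three-pixel-wide response is thinned to a one-pixel-wide edge.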
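If the pixel is confirmed as a local maximum, it moves on to the fourth step, the final decision stage that determines whether a previously detected edge is a real edge. This stage is called hysteresis thresholding, and it requires two thresholds: a lower threshold and an upper threshold.
Two thresholds define three ranges. A pixel whose gradient magnitude is above the upper threshold is classified as an edge pixel. A pixel whose magnitude is below the lower threshold is classified as a non-edge pixel and discarded. A pixel whose magnitude falls between the two thresholds is kept only if it is connected to a confirmed edge pixel; otherwise it is discarded.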
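The three-range rule can be sketched as a small flood fill. This is a minimal illustration that assumes 8-connectivity and strict/non-strict comparisons chosen for the example; the function name is hypothetical and the exact comparisons in `cv2.Canny` may differ.

```python
import numpy as np
from collections import deque

def hysteresis_threshold(magnitude, low, high):
    """Return a boolean edge map: strong pixels plus weak pixels connected to them."""
    strong = magnitude > high
    weak = (magnitude >= low) & ~strong
    edges = strong.copy()
    queue = deque(zip(*np.nonzero(strong)))
    h, w = magnitude.shape
    while queue:
        r, c = queue.popleft()
        # Visit the 8 neighbors and promote weak pixels that touch an accepted edge pixel
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and weak[rr, cc] and not edges[rr, cc]:
                    edges[rr, cc] = True
                    queue.append((rr, cc))
    return edges

# One strong pixel (200), two weak pixels chained to it, and one isolated weak pixel
mag = np.array([[200, 120, 120, 0, 120]])
result = hysteresis_threshold(mag, low=100, high=150)
print(result)
```

The two weak pixels next to the strong one are kept, while the isolated weak pixel at the end is discarded.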
<pre>import cv2
import numpy as np
import matplotlib.pyplot as plt

# Load the image and convert from BGR (OpenCV's default) to RGB for plotting
img = cv2.imread('images/giraffe.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
# Canny detection without blurring
edges = cv2.Canny(image=img, threshold1=127, threshold2=127)
# Show the original image and the edge map side by side
plt.figure(figsize = (20, 20))
plt.subplot(1, 2, 1); plt.imshow(img)
plt.axis('off')
plt.subplot(1, 2, 2); plt.imshow(edges)
plt.axis('off')
plt.show()
</pre>
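The example above used a single mid-range value (127) for both thresholds and applied no blurring, so the result is not very good. Let's try different threshold settings: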
<pre># Set the lower and upper thresholds from the median pixel value
med_val = np.median(img)
lower = int(max(0, .7 * med_val))
upper = int(min(255, 1.3 * med_val))
</pre>
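To see how blurring affects edge detection, we use two kernel sizes, (5x5) and (9x9), and two threshold settings: the one computed above, and a variant that raises the upper threshold by 100. That gives four combinations, as follows: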
<pre># Blurring with ksize = 5
img_k5 = cv2.blur(img, ksize = (5, 5))
# Canny detection with different thresholds
edges_k5 = cv2.Canny(img_k5, threshold1 = lower, threshold2 = upper)
edges_k5_2 = cv2.Canny(img_k5, lower, upper+100)
# Blurring with ksize = 9
img_k9 = cv2.blur(img, ksize = (9, 9))
# Canny detection with different thresholds
edges_k9 = cv2.Canny(img_k9, lower, upper)
edges_k9_2 = cv2.Canny(img_k9, lower, upper+100)
# Plot the images
images = [edges_k5, edges_k5_2, edges_k9, edges_k9_2]
plt.figure(figsize = (20, 15))
for i in range(4):
    plt.subplot(2, 2, i+1)
    plt.imshow(images[i])
    plt.axis('off')
plt.show()
</pre>
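As the results show, blurring helps reduce noise. The (9x9) kernel gave better results than the (5x5) kernel, and the larger upper threshold also produced better edge-detection results.
Corner Detection
Corner detection is another detection technique widely used in object detection, motion detection, video object tracking, and similar areas. What is a corner in image processing, and how should we define it? Here we treat a corner as a junction where edges meet. How do we find them? The most basic approach you might think of is to find all the edges first and then locate the points where they intersect. In practice there are more efficient methods: Harris corner detection and Shi-Tomasi corner detection. Let's look at both.
Both algorithms work as follows. First, they look for points where the pixel intensity changes significantly in all directions. They then build a matrix for each point and extract its eigenvalues. A score computed from those eigenvalues decides whether the point is a corner. The mathematical expressions are shown below.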
[Figure: scoring functions for Harris and Shi-Tomasi corner detection]
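The scoring functions commonly given for these two detectors can be written as follows. This is the standard textbook notation, offered as an assumption about what the figure shows: $I_x$ and $I_y$ are the image derivatives, $w$ is a window function over the neighborhood $W$, $\lambda_1$ and $\lambda_2$ are the eigenvalues of $M$, and $k$ is the free parameter of the Harris detector.

```latex
M = \sum_{(x, y) \in W} w(x, y)
    \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}

R_{\text{Harris}} = \det(M) - k \, (\operatorname{trace} M)^2
                  = \lambda_1 \lambda_2 - k \, (\lambda_1 + \lambda_2)^2

R_{\text{Shi-Tomasi}} = \min(\lambda_1, \lambda_2)
```

In both cases a large positive score means that both eigenvalues are large, that is, the intensity changes strongly in every direction around the point.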
Now let's look at the code. First, the image needs to be converted to grayscale. Harris corner detection is available in OpenCV through the cv2.cornerHarris() function.
<pre>img = cv2.imread('images/desk.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
# Apply Harris corner detection
dst = cv2.cornerHarris(img_gray, blockSize = 2, ksize = 3, k = .04)
</pre>
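The blockSize parameter sets the size of the neighborhood window considered for each pixel, ksize is the aperture size of the Sobel operator used to compute the derivatives, and k is the free parameter of the Harris detector, corresponding to k in the formula above. The output is the score R, which we use to identify corners.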
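To show where the score R comes from, here is a minimal NumPy sketch of a Harris-style response. It is an illustration under simplifying assumptions: gradients come from `np.gradient` rather than a Sobel filter, the neighborhood is a fixed 3x3 box, and the function name is hypothetical, so the numbers will not match `cv2.cornerHarris`.

```python
import numpy as np

def harris_response(gray, k=0.04):
    """Harris-style score R = det(M) - k * trace(M)^2 with a 3x3 summation window."""
    gray = gray.astype(float)
    # np.gradient returns the derivative along rows first, then along columns
    iy, ix = np.gradient(gray)
    ixx, iyy, ixy = ix * ix, iy * iy, ix * iy

    def box_sum(a):
        # Sum each pixel's 3x3 neighborhood, zero-padded at the borders
        p = np.pad(a, 1)
        return sum(p[r:r + a.shape[0], c:c + a.shape[1]]
                   for r in range(3) for c in range(3))

    sxx, syy, sxy = box_sum(ixx), box_sum(iyy), box_sum(ixy)
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic image: a bright square occupying the bottom-right quadrant
gray = np.zeros((10, 10))
gray[5:, 5:] = 1.0
R = harris_response(gray)
print(R[5, 5], R[8, 4], R[1, 1])  # corner, edge, flat region
```

The score is positive at the corner of the square, negative along its edge, and zero in the flat region, which is why thresholding R picks out corners.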
<pre># Spot the detected corners: mark pixels scoring above 1% of the maximum in red
img_2 = img.copy()
img_2[dst>0.01*dst.max()]=[255,0,0]
# Plot the image
plt.figure(figsize = (20, 20))
plt.subplot(1, 2, 1); plt.imshow(img)
plt.axis('off')
plt.subplot(1, 2, 2); plt.imshow(img_2)
plt.axis('off')
</pre>
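Below is the code for Shi-Tomasi corner detection, which uses cv2.goodFeaturesToTrack(). The maxCorners parameter sets the maximum number of corners to return, minDistance sets the minimum distance between corners, and qualityLevel sets the minimum quality a corner must reach. Once the corners are detected, we mark them with circles, as follows: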
<pre># Apply Shi-Tomasi corner detection
corners = cv2.goodFeaturesToTrack(img_gray, maxCorners = 50,
                                  qualityLevel = 0.01,
                                  minDistance = 10)
# Convert the corner coordinates to integers
corners = corners.astype(int)
# Spot the detected corners
img_2 = img.copy()
for i in corners:
    x, y = map(int, i.ravel())
    cv2.circle(img_2, center = (x, y),
               radius = 5, color = (255, 0, 0), thickness = -1)
# Plot the image
plt.figure(figsize = (20, 20))
plt.subplot(1, 2, 1); plt.imshow(img)
plt.axis('off')
plt.subplot(1, 2, 2); plt.imshow(img_2)
plt.axis('off')
</pre>
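The roles of maxCorners, qualityLevel, and minDistance can be illustrated with a small greedy selection over a score map. This is a sketch of the general idea only: the function name, the toy score map, and the exact comparison rules are assumptions, and the real function also computes the corner scores itself.

```python
import numpy as np

def select_corners(score, max_corners, quality_level, min_distance):
    """Greedily pick the strongest points that are at least `min_distance` apart."""
    # Discard candidates whose score is below a fraction of the best score
    threshold = quality_level * score.max()
    rows, cols = np.nonzero(score >= threshold)
    # Visit the remaining candidates from strongest to weakest
    order = np.argsort(-score[rows, cols], kind="stable")
    chosen = []
    for i in order:
        p = (int(cols[i]), int(rows[i]))  # (x, y), the order used for drawing
        far_enough = all((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 >= min_distance ** 2
                         for q in chosen)
        if far_enough:
            chosen.append(p)
            if len(chosen) == max_corners:
                break
    return chosen

score = np.zeros((20, 20))
score[2, 3] = 1.0      # strongest corner
score[2, 5] = 0.9      # strong, but only 2 pixels away from the strongest one
score[15, 10] = 0.5    # weaker corner far away
score[10, 10] = 0.001  # below 1% of the best score

picked = select_corners(score, max_corners=50, quality_level=0.01, min_distance=10)
print(picked)
```

The second candidate is dropped because a stronger corner lies within the minimum distance, and the last one is dropped by the quality threshold.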
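Face Detection
Face detection is a technique for determining whether an image contains faces and where they are. It differs from face recognition, which identifies a person from their face. Face detection therefore cannot tell us whose face it is.
Face detection is essentially a classification task: a model is trained to classify whether the target object is present, and that is how detection is achieved. The Haar feature-based cascade classifier is one of the commonly used face-detection models in OpenCV. It has been pre-trained on thousands of images. Four concepts are key to understanding the algorithm: Haar feature extraction, integral images, AdaBoost, and cascade classifiers.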
[Figure: examples of Haar-like features]
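Haar-like features are digital image features used in object detection; examples are shown in the figure above. The name comes from their intuitive similarity to Haar wavelets, which were first proposed by Alfred Haar. During detection, a window slides over the image and the filters are convolved with it to check whether a region contains the feature we are looking for, as shown below: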
[Figure: a Haar-like feature kernel applied to a region of the image]
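So how exactly do we decide whether a given region contains the feature? Consider the figure above. With a particular kernel (dark upper half, bright lower half), we compute the mean pixel value of each half and take the difference between the two. If the result exceeds a threshold (say 0.5), we conclude that the region contains the feature we are looking for. This process is repeated for every kernel as the window slides across the image.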
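The check described above fits in a few lines. This is a minimal sketch that assumes pixel values normalized to the range 0 to 1 (so that a threshold of 0.5 makes sense) and a two-rectangle feature split at the middle row; the function name and the sample window are made up for illustration.

```python
import numpy as np

def two_rect_feature(window):
    """Mean of the bottom (bright) half minus the mean of the top (dark) half."""
    half = window.shape[0] // 2
    return window[half:].mean() - window[:half].mean()

# A 4x4 window whose top half is dark and bottom half is bright
window = np.array([[0.1, 0.2, 0.1, 0.0],
                   [0.0, 0.1, 0.2, 0.1],
                   [0.9, 1.0, 0.8, 0.9],
                   [1.0, 0.9, 1.0, 0.9]])

score = two_rect_feature(window)
is_feature = score > 0.5
print(score, is_feature)

# A uniform window has no dark/bright contrast, so the feature is absent
flat_score = two_rect_feature(np.full((4, 4), 0.5))
```

Here the top half averages 0.1 and the bottom half 0.925, so the difference clears the threshold, while the uniform window scores zero.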
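Although this computation is not complicated, repeating it over the whole image is expensive. That is the main problem the integral image solves. An integral image is an image representation designed to make feature evaluation faster and more efficient.
In the figure below, the left side shows the pixel values of the original image and the right side shows the values of the integral image. Starting from the top-left corner, we accumulate the pixel values inside a given rectangle. In the integral image, the cumulative sum of the pixels inside the dashed box is stored at the position of the box's bottom-right corner.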
[Figure: pixel values of the original image (left) and of the integral image (right)]
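With this precomputed table, the sum of the pixel values in any region can be obtained easily from the values of a few sub-rectangles (the red, orange, blue, and purple boxes in the figure above).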
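The lookup trick can be sketched with NumPy's cumulative sums. The helper names are hypothetical, and the table is given an extra leading row and column of zeros so that the four-corner formula needs no special cases at the image border.

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows and columns, with a leading row and column of zeros."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] using only four table lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(1, 17).reshape(4, 4)  # pixel values 1..16
ii = integral_image(img)

inner = rect_sum(ii, 1, 1, 3, 3)      # the central 2x2 block: 6 + 7 + 10 + 11
print(inner, img[1:3, 1:3].sum())
```

However large the rectangle is, its sum costs four lookups and three additions or subtractions, which is what makes evaluating many Haar-like features affordable.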
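So the integral image goes some way toward solving the problem of excessive computation. But that is not enough; there is still room to optimize. When the detection window sits over blank background with no target or face, running the full detection wastes computation. This is where AdaBoost and the cascade classifier come in, reducing the computation further.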
[Figure: stages of the cascade classifier]
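The figure above shows the stages of a cascade classifier, which is built up step by step with the Haar-like features ordered across the stages. Basic features are evaluated in the early stages, and the later stages evaluate more complex features only on regions that still look promising. At each stage, an AdaBoost model is trained as an ensemble of weak classifiers. If a sub-window is classified as "not face-like" at one stage, it is rejected and never reaches the next. This way only the windows that survived the previous stage need further consideration, which makes detection much faster.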
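The early-rejection logic can be sketched in plain Python. Everything here is a toy assumption made for illustration: the feature names, thresholds, and weights are invented, and each stage is modeled as a weighted vote of weak classifiers in the spirit of AdaBoost rather than taken from a trained model.

```python
def run_cascade(window_features, stages):
    """Return (is_face, stages_evaluated); stop at the first stage that rejects the window."""
    for index, (weak_classifiers, stage_threshold) in enumerate(stages, start=1):
        # Each weak classifier is (feature_name, feature_threshold, weight);
        # it votes with its weight when the feature value exceeds its threshold
        total = sum(weight
                    for name, feat_threshold, weight in weak_classifiers
                    if window_features[name] > feat_threshold)
        if total < stage_threshold:
            return False, index
    return True, len(stages)

# Hypothetical two-stage cascade: a cheap first stage, a more detailed second stage
stages = [
    ([("eyes_darker_than_cheeks", 0.5, 1.0)], 1.0),
    ([("nose_bridge_brighter", 0.3, 0.6), ("mouth_darker", 0.4, 0.6)], 1.0),
]

background = {"eyes_darker_than_cheeks": 0.1, "nose_bridge_brighter": 0.9, "mouth_darker": 0.9}
face_like = {"eyes_darker_than_cheeks": 0.8, "nose_bridge_brighter": 0.5, "mouth_darker": 0.6}

print(run_cascade(background, stages))
print(run_cascade(face_like, stages))
```

The background window is rejected after a single cheap stage, while the face-like window passes through both stages; since most windows in an image are background, most of the work is skipped.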
Where Is Our Hero?
Now let's use the cascade classifier described above to detect the face of a Marvel hero: Captain Marvel.
[Figure: the Captain Marvel image used for detection]
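We only need part of the image, the head. First, we extract the region of interest around Captain Marvel's face, then convert it to grayscale. We use a single channel because we only care about changes in pixel intensity.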
<pre>cap_mavl = cv2.imread('images/captin_marvel.jpg')
# Find the region of interest
roi = cap_mavl[50:350, 200:550]
roi = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
plt.imshow(roi, cmap = 'gray')
</pre>
The following code loads the Haar cascade classifier.
<pre># Load Cascade filter
face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalface_default.xml')
</pre>
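Next, we create a function that detects faces and draws a rectangle around each one. To detect faces, we call the detectMultiScale() method of the face_cascade classifier loaded above. For each detected region it returns four values (x, y, width, height), which we use to draw a rectangle at that location. scaleFactor specifies how much the image size is reduced at each image scale, and minNeighbors specifies how many neighboring detections each candidate rectangle needs in order to be retained. Now let's apply the function to the image and look at the result.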
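To get a feel for what scaleFactor controls, the sketch below lists the image sizes a detector would visit if it shrank the image by that factor at every step. It is an illustration only: the function name is hypothetical, and the stopping rule (a minimum side of 24 pixels) is an assumption chosen for the example, not a description of OpenCV's internals.

```python
def pyramid_sizes(width, height, scale_factor=1.1, min_size=24):
    """List the (width, height) of each pyramid level until a side drops below `min_size`."""
    sizes = []
    scale = 1.0
    while True:
        w, h = int(width / scale), int(height / scale)
        if w < min_size or h < min_size:
            break
        sizes.append((w, h))
        scale *= scale_factor
    return sizes

# Halving the image at every step gives only a few levels
print(pyramid_sizes(100, 100, scale_factor=2.0))

# A 350x300 region (the size of the ROI above) with a fine and a coarse scale factor
fine = pyramid_sizes(350, 300, scale_factor=1.1)
coarse = pyramid_sizes(350, 300, scale_factor=1.5)
print(len(fine), len(coarse))
```

A scale factor close to 1 examines many more scales, which tends to find more faces at the cost of more computation; a larger factor is faster but can skip the scale at which a face would have matched.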
<pre># Create the face detecting function
def detect_face(img):
    img_2 = img.copy()
    face_rects = face_cascade.detectMultiScale(img_2,
                                               scaleFactor = 1.1,
                                               minNeighbors = 3)
    # Draw a white rectangle around each detected face
    for (x, y, w, h) in face_rects:
        cv2.rectangle(img_2, (x, y), (x+w, y+h), (255, 255, 255), 3)
    return img_2
# Detect the face
roi_detected = detect_face(roi)
plt.imshow(roi_detected, cmap = 'gray')
plt.axis('off')
</pre>
As you can see, the Haar cascade classifier does a good job of detecting the face. Next, let's try an image containing several faces.
<pre># Load the image file and convert the color mode
avengers = cv2.imread('images/avengers.jpg')
avengers = cv2.cvtColor(avengers, cv2.COLOR_BGR2GRAY)
# Detect the face and plot the result
detected_avengers = detect_face(avengers)
plt.imshow(detected_avengers, cmap = 'gray')
plt.axis('off')
</pre>
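Clearly the result is not entirely accurate: some non-face objects are detected and some real faces are missed. Interestingly, it successfully detected Spider-Man but mistook the hands of Captain America and Black Widow for eyes. Face detection generally works better when the facial features in the image are clearly visible.
Try Detecting Your Own Face
Next, here is how to detect faces from a webcam. The approach is similar to the one above, and the code is shown below. Press the ESC key to stop detection and exit.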
<pre>import cv2
import numpy as np
# Step 1. Define detect function
face_cascade = cv2.CascadeClassifier('haarcascades/haarcascade_frontalface_default.xml')
def detect_face(img):
    img_copy = img.copy()
    face_rects = face_cascade.detectMultiScale(img_copy)
    for (x, y, w, h) in face_rects:
        cv2.rectangle(img_copy, (x, y), (x+w, y+h), (255, 255, 255), 3)
    return img_copy
# Step 2. Call the cam
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:  # stop if no frame could be read from the camera
        break
    frame = detect_face(frame)
    cv2.imshow('Video Face Detection', frame)
    c = cv2.waitKey(1)
    if c == 27:  # ESC key
        break
cap.release()
cv2.destroyAllWindows()
</pre>
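Summary
This post covered classical methods for edge detection, corner detection, and face detection. The next post will cover contour detection and related techniques. Stay tuned.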