Histogram of Oriented Gradients


Author: 原上的小木屋 | Published: 2020-06-06 15:15

    HOG (Histogram of Oriented Gradients) is a way of representing the features of an image. A feature here is a set of vectors that describes the state of the image.

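    • In image recognition (what the image shows) and detection (where an object is in the image), we need to:
    1. obtain features from the image (feature extraction);
    2. recognize and detect based on those features (recognition and detection).
    • HOG is obtained with the following algorithm:
    1. Convert the image to grayscale, then compute the brightness gradient in the x and y directions:
    • x direction: gx=I(x+1,y)−I(x−1,y)
    • y direction: gy=I(x,y+1)−I(x,y−1)
    2. Determine the gradient magnitude and the gradient direction from gx and gy:
    • gradient magnitude: mag=\sqrt{gx^2+gy^2}
    • gradient direction: ang=\arctan\frac{gy}{gx}
    3. Quantize the gradient direction over [0,180] into 9 equal bins. That is, [0,20] becomes index 0, [20,40] becomes index 1, and so on up to index 8.
    4. Divide the image into regions of N×N pixels (each region is called a cell) and, within each cell, build a histogram of the indices obtained in step 3.
    5. A group of C×C cells is called a block. The histograms of the cells in each block are normalized with the formula below, where the sum runs over every histogram value in the block. Because the window moves one cell at a time, a cell takes part in the normalization of several blocks. Usually ϵ=1:
    • h(t)=\frac{h(t)}{\sqrt{\sum h(t)^2+\epsilon}}
    • The steps above yield the HOG feature.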
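    • In short, the first three steps are simple and routine: convert the image to grayscale, compute the gradients in the x and y directions, combine them into a gradient magnitude matrix and a gradient direction matrix, and quantize the direction matrix into the nine values 0-8.
    • Step 4 is a little harder. The image is split into 8×8-pixel cells; a 240×240 image, for example, becomes 30 cells high by 30 cells wide, 900 cells in total. In each cell, the gradient magnitudes are accumulated into the nine direction bins 0-8 given by the quantized direction map. This is much like an ordinary intensity histogram with 256 bins for 0-255, except that here there are only 9 bins.
    • Step 5 also takes some effort to follow. Building on step 4, each cell is normalized over a 3×3 group of nine cells: following the formula, the histogram of each cell from step 4 is divided by a norm computed over the nine cells around it.
    • Step 6 draws the directions: from each gradient direction, compute the start and end coordinates, set the line width and line color, and draw the line.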
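To make steps 1 to 3 concrete, here is a minimal worked example for a single pixel. The 3×3 patch values are made up purely for illustration; only the formulas come from the description above.

```python
import numpy as np

# A tiny 3x3 grayscale patch (illustrative values, not from any real image).
gray = np.array([[10, 10, 10],
                 [10, 20, 40],
                 [10, 50, 90]], dtype=np.float32)

# Central differences at the centre pixel (x=1, y=1); arrays are indexed [y, x].
gx = gray[1, 2] - gray[1, 0]   # I(x+1, y) - I(x-1, y) = 40 - 10 = 30
gy = gray[2, 1] - gray[0, 1]   # I(x, y+1) - I(x, y-1) = 50 - 10 = 40

# Step 2: magnitude and direction.
mag = np.sqrt(gx ** 2 + gy ** 2)
ang = np.degrees(np.arctan(gy / gx))
if ang < 0:
    ang += 180                 # fold negative angles into [0, 180)

# Step 3: 20-degree bins, index 0..8.
bin_index = min(int(ang // 20), 8)

print(mag, ang, bin_index)
```

The magnitude is 50 (a 3-4-5 triangle scaled by 10) and the direction is about 53 degrees, which falls in bin 2.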
    import cv2  # OpenCV, used for image I/O and display
    import numpy as np
    # HOG steps 1-3: grayscale, gradients, magnitude/orientation, quantization
    def HOG_step1(img):
        # Grayscale; OpenCV stores channels in B, G, R order
        def BGR2GRAY(img):
            gray = 0.2126 * img[..., 2] + 0.7152 * img[..., 1] + 0.0722 * img[..., 0]
            return gray
        # First-order differences in x and y (a simple derivative filter, similar in spirit to Sobel)
        def get_gradXY(gray):
            H, W = gray.shape
            # pad one pixel on every side by replicating the border values
            gray = np.pad(gray, (1, 1), 'edge')
            # gx = I(x+1, y) - I(x-1, y)
            gx = gray[1:H+1, 2:] - gray[1:H+1, :W]
            # gy = I(x, y+1) - I(x, y-1)
            gy = gray[2:, 1:W+1] - gray[:H, 1:W+1]
            # replace 0 with a tiny value, because gx is the divisor when the angle is computed
            gx[gx == 0] = 1e-6
            return gx, gy
        # Gradient magnitude and orientation
        def get_MagGrad(gx, gy):
            magnitude = np.sqrt(gx ** 2 + gy ** 2)
            # arctan returns values in (-pi/2, pi/2)
            gradient = np.arctan(gy / gx)
            # shift negative angles by pi so every orientation lies in [0, pi)
            gradient[gradient < 0] += np.pi
            return magnitude, gradient
        # Quantize the orientation into 9 bins of 20 degrees each
        def quantization(gradient):
            # np.int was removed in NumPy 1.24; the built-in int is equivalent
            gradient_quantized = np.zeros_like(gradient, dtype=int)
            d = np.pi / 9  # bin width: 20 degrees
            for i in range(9):
                # map each angle to a bin index 0-8
                gradient_quantized[np.where((gradient >= d * i) & (gradient <= d * (i + 1)))] = i
            return gradient_quantized
        # 1. BGR -> Gray
        gray = BGR2GRAY(img)
        # 1. Gray -> Gradient x and y
        gx, gy = get_gradXY(gray)
        # 2. get gradient magnitude and angle
        magnitude, gradient = get_MagGrad(gx, gy)
        # 3. Quantization
        gradient_quantized = quantization(gradient)
        return magnitude, gradient_quantized
    # Read image
    img = cv2.imread("123.jpg").astype(np.float32)
    # get HOG step1
    magnitude, gradient_quantized = HOG_step1(img)
    # Write gradient magnitude to file, scaled to 0-255
    _magnitude = (magnitude / magnitude.max() * 255).astype(np.uint8)
    cv2.imwrite("out_mag.jpg", _magnitude)
    # Write gradient angle to file
    H, W, _ = img.shape
    out = np.zeros((H, W, 3), dtype=np.uint8)
    # one color for each bin index 0-8
    colors = [[255, 0, 0], [0, 255, 0], [0, 0, 255], [255, 255, 0], [255, 0, 255], [0, 255, 255],
              [127, 127, 0], [127, 0, 127], [0, 127, 127]]
    # draw the quantized orientation map, one color per bin
    for i in range(9):
        out[gradient_quantized == i] = colors[i]
    cv2.imwrite("out_gra.jpg", out)
    cv2.imshow("result", out)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

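    To summarize the code above:

    1. Convert the image to grayscale.
    2. Take first-order differences of the image in the x and y directions.
    3. Build the magnitude matrix and the gradient direction matrix from the two matrices obtained in step 2.
    4. Quantize the gradient direction matrix, assigning each pixel a label from 0 to 8.
    5. Display each of the labels 0 to 8 in a different color.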
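The quantization in step 4 can also be written without a Python loop. The sketch below is one possible alternative, not the article's method: `quantize_vectorized` is a hypothetical helper name, and it assumes the angles already lie in [0, π]. `quantize_loop` reproduces the loop-based binning so the two can be compared.

```python
import numpy as np

def quantize_loop(gradient):
    """Loop-based binning in the style of the article's code (bins 0..8)."""
    q = np.zeros_like(gradient, dtype=int)
    d = np.pi / 9
    for i in range(9):
        q[(gradient >= d * i) & (gradient <= d * (i + 1))] = i
    return q

def quantize_vectorized(gradient):
    """Hypothetical loop-free variant: floor-divide by the bin width, clip to 8."""
    d = np.pi / 9
    return np.minimum((gradient // d).astype(int), 8)

# Compare both versions on random angles in [0, pi).
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, np.pi, size=(64, 64))
same = np.array_equal(quantize_loop(angles), quantize_vectorized(angles))
print(same)
```

The clip to 8 handles an angle of exactly π, which would otherwise land in a tenth bin.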
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    # get HOG step2
    def HOG_step2(img):
        # Grayscale
        def BGR2GRAY(img):  # convert BGR to grayscale
            gray = 0.2126 * img[..., 2] + 0.7152 * img[..., 1] + 0.0722 * img[..., 0]
            return gray
        # Magnitude and gradient
        def get_gradXY(gray):  # gradients in the x and y directions
            H, W = gray.shape
            # padding before grad
            gray = np.pad(gray, (1, 1), 'edge')
            # get grad x
            gx = gray[1:H+1, 2:] - gray[1:H+1, :W]
            # get grad y
            gy = gray[2:, 1:W+1] - gray[:H, 1:W+1]
            # replace 0 with a tiny value so that gy / gx never divides by zero
            gx[gx == 0] = 1e-6
            return gx, gy
        # get magnitude and gradient
        def get_MagGrad(gx, gy):  # gradient magnitude and orientation matrices
            # get gradient maginitude
            magnitude = np.sqrt(gx ** 2 + gy ** 2)
            # get gradient angle
            gradient = np.arctan(gy / gx)
            gradient[gradient < 0] = np.pi / 2 + gradient[gradient < 0] + np.pi / 2
            return magnitude, gradient
        # Gradient histogram
        def quantization(gradient):  # quantize the orientation matrix into 9 bins
            # prepare quantization table
            gradient_quantized = np.zeros_like(gradient, dtype=int)  # np.int was removed in NumPy 1.24
            # quantization base
            d = np.pi / 9
            # quantization
            for i in range(9):
                gradient_quantized[np.where((gradient >= d * i) & (gradient <= d * (i + 1)))] = i
            return gradient_quantized  
        # get gradient histogram
        def gradient_histogram(gradient_quantized, magnitude, N=8):  # takes the quantized orientations, the magnitudes and the cell size N
            # get shape
            H, W = magnitude.shape
            # get cell num
            cell_N_H = H // N
            cell_N_W = W // N
            # one 9-bin histogram per cell: height and width shrink by a factor of N, the third axis holds the 9 orientation bins
            histogram = np.zeros((cell_N_H, cell_N_W, 9), dtype=np.float32)
            # each pixel
            for y in range(cell_N_H):
                for x in range(cell_N_W):
                    for j in range(N):
                        for i in range(N):
                            # pixel (y * N + j, x * N + i) belongs to cell (y, x); y = x = j = i = 0 is the top-left pixel of the first cell
                            histogram[y, x, gradient_quantized[y * N + j, x * N + i]] += magnitude[y * N + j, x * N + i]
            return histogram
        # 1. BGR -> Gray
        gray = BGR2GRAY(img)
        # 1. Gray -> Gradient x and y
        gx, gy = get_gradXY(gray)
        # 2. get gradient magnitude and angle
        magnitude, gradient = get_MagGrad(gx, gy)
        # 3. Quantization
        gradient_quantized = quantization(gradient)
        # 4. Gradient histogram
        histogram = gradient_histogram(gradient_quantized, magnitude)
        return histogram
    # Read image
    img = cv2.imread("123.jpg").astype(np.float32)
    # get HOG step2
    histogram = HOG_step2(img)       
    # write histogram to file
    for i in range(9):  # show each of the 9 orientation channels
        plt.subplot(3,3,i+1)
        plt.imshow(histogram[..., i])
        plt.axis('off')
        plt.xticks(color="None")
        plt.yticks(color="None")
    plt.savefig("out.png")
    plt.show()
    

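    To summarize the code above:

    1. Convert the image to grayscale.
    2. Take first-order differences of the image in the x and y directions.
    3. Build the magnitude matrix and the gradient direction matrix from the two matrices obtained in step 2.
    4. Quantize the gradient direction matrix, assigning each pixel a label from 0 to 8.
    5. With N=8, each 8×8 block of pixels forms one cell. The gradient magnitude of every pixel in a cell is added to the bin given by that pixel's direction index. Since there are nine gradient directions, the third dimension of histogram has size 9.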
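The per-cell accumulation in step 5 can be sketched without the four nested loops. This is an illustrative alternative under the assumption that each pixel belongs to cell `(y // N, x // N)`; `cell_histogram` is a hypothetical name, and `np.add.at` is used because it accumulates correctly when the same index occurs many times.

```python
import numpy as np

def cell_histogram(gradient_quantized, magnitude, N=8, bins=9):
    """Sum gradient magnitudes into one `bins`-bin histogram per N x N cell."""
    H, W = magnitude.shape
    ch, cw = H // N, W // N
    # Crop to a whole number of cells.
    q = gradient_quantized[:ch * N, :cw * N]
    m = magnitude[:ch * N, :cw * N]
    # Row and column coordinate of every pixel.
    ys, xs = np.indices(q.shape)
    hist = np.zeros((ch, cw, bins), dtype=np.float64)
    # Unbuffered add: each pixel contributes its magnitude to
    # (cell row, cell column, orientation bin).
    np.add.at(hist, (ys // N, xs // N, q), m)
    return hist

# Small synthetic input (random values, for illustration only).
rng = np.random.default_rng(1)
q_demo = rng.integers(0, 9, size=(16, 24))
m_demo = rng.random((16, 24))
hist_demo = cell_histogram(q_demo, m_demo)
print(hist_demo.shape)
```

A 16×24 input with N=8 gives 2×3 cells, and the total of all bins equals the total gradient magnitude, which is a handy sanity check.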
    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    # get HOG
    def HOG(img):
        # Grayscale
        def BGR2GRAY(img):  # convert BGR to grayscale
            gray = 0.2126 * img[..., 2] + 0.7152 * img[..., 1] + 0.0722 * img[..., 0]
            return gray
        # Magnitude and gradient
        def get_gradXY(gray):
            H, W = gray.shape
            # padding before grad
            gray = np.pad(gray, (1, 1), 'edge')
            # get grad x
            gx = gray[1:H + 1, 2:] - gray[1:H + 1, :W]
            # get grad y
            gy = gray[2:, 1:W + 1] - gray[:H, 1:W + 1]
            # replace 0 with a tiny value so that gy / gx never divides by zero
            gx[gx == 0] = 1e-6
            return gx, gy
        # get magnitude and gradient
        def get_MagGrad(gx, gy):
            # get gradient magnitude
            magnitude = np.sqrt(gx ** 2 + gy ** 2)
            # get gradient angle
            gradient = np.arctan(gy / gx)
            gradient[gradient < 0] = np.pi / 2 + gradient[gradient < 0] + np.pi / 2
            return magnitude, gradient
        # Gradient histogram
        def quantization(gradient):
            # prepare quantization table
            gradient_quantized = np.zeros_like(gradient, dtype=int)  # np.int was removed in NumPy 1.24
            # quantization base
            d = np.pi / 9
            # quantize the gradient orientation
            for i in range(9):
                gradient_quantized[np.where((gradient >= d * i) & (gradient <= d * (i + 1)))] = i
            return gradient_quantized
        # get gradient histogram
        def gradient_histogram(gradient_quantized, magnitude, N=8):
            # get shape
            H, W = magnitude.shape
            # get cell num
            cell_N_H = H // N
            cell_N_W = W // N
            histogram = np.zeros((cell_N_H, cell_N_W, 9), dtype=np.float32)
            # each pixel
            for y in range(cell_N_H):
                for x in range(cell_N_W):
                    for j in range(N):
                        for i in range(N):
                            histogram[y, x, gradient_quantized[y * N + j, x * N + i]] += magnitude[y * N + j, x * N + i]
            return histogram
        # histogram normalization, using the formula given at the top of the article
        def normalization(histogram, C=3, epsilon=1):
            cell_N_H, cell_N_W, _ = histogram.shape
            ## each histogram
            for y in range(cell_N_H):
                for x in range(cell_N_W):
                    # for i in range(9):
                    histogram[y, x] /= np.sqrt(np.sum(histogram[max(y - 1, 0): min(y + 2, cell_N_H),
                                                      max(x - 1, 0): min(x + 2, cell_N_W)] ** 2) + epsilon)
            return histogram
        # 1. BGR -> Gray
        gray = BGR2GRAY(img)
        # 1. Gray -> Gradient x and y
        gx, gy = get_gradXY(gray)
        # 2. get gradient magnitude and angle
        magnitude, gradient = get_MagGrad(gx, gy)
        # 3. Quantization
        gradient_quantized = quantization(gradient)
        # 4. Gradient histogram
        histogram = gradient_histogram(gradient_quantized, magnitude)
        # 5. Histogram normalization
        histogram = normalization(histogram)
        return histogram
    # Read image
    img = cv2.imread("123.jpg").astype(np.float32)
    # get HOG
    histogram = HOG(img)
    # Write result to file
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.imshow(histogram[..., i])
        plt.axis('off')
        plt.xticks(color="None")
        plt.yticks(color="None")
    plt.savefig("out.png")
    plt.show()
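One detail of the normalization step is worth a closer look. The `normalization` function above divides `histogram[y, x]` in place, so cells visited later compute their 3×3 norm from neighbours that have already been normalized. The article does not say whether that is intended. The sketch below is a variant, under the assumption that every norm should be taken from the un-normalized histograms; `normalize_blocks` is a hypothetical name.

```python
import numpy as np

def normalize_blocks(histogram, epsilon=1.0):
    """Normalize each cell by the L2 norm of its 3x3 neighbourhood.

    All norms are computed from a copy of the input, so the result does
    not depend on the order in which cells are visited.
    """
    ch, cw, _ = histogram.shape
    src = histogram.astype(np.float64)   # astype returns a new array
    out = np.empty_like(src)
    for y in range(ch):
        for x in range(cw):
            # Neighbourhood clipped at the image border.
            block = src[max(y - 1, 0):min(y + 2, ch),
                        max(x - 1, 0):min(x + 2, cw)]
            out[y, x] = src[y, x] / np.sqrt(np.sum(block ** 2) + epsilon)
    return out

# With an all-ones input the norms are easy to check by hand:
# a centre cell sees 9 cells x 9 bins = 81 ones, a corner cell sees 4 x 9 = 36.
demo = normalize_blocks(np.ones((3, 3, 9), dtype=np.float32))
print(demo[1, 1, 0], demo[0, 0, 0])
```

The centre value is 1/√82 and the corner value is 1/√37, matching the formula with ϵ=1.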
    

    The final complete code

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    # get HOG
    def HOG(img):
        # Grayscale
        def BGR2GRAY(img):  # convert BGR to grayscale
            gray = 0.2126 * img[..., 2] + 0.7152 * img[..., 1] + 0.0722 * img[..., 0]
            return gray
        # Magnitude and gradient
        def get_gradXY(gray):  # gradients in the x and y directions
            H, W = gray.shape
            # padding before grad
            gray = np.pad(gray, (1, 1), 'edge')
            # get grad x
            gx = gray[1:H + 1, 2:] - gray[1:H + 1, :W]
            # get grad y
            gy = gray[2:, 1:W + 1] - gray[:H, 1:W + 1]
            # replace 0 with a tiny value so that gy / gx never divides by zero
            gx[gx == 0] = 1e-6
            return gx, gy
        # get magnitude and gradient
        def get_MagGrad(gx, gy):  # gradient magnitude and orientation
            # get gradient maginitude
            magnitude = np.sqrt(gx ** 2 + gy ** 2)
            # get gradient angle
            gradient = np.arctan(gy / gx)
            gradient[gradient < 0] = np.pi / 2 + gradient[gradient < 0] + np.pi / 2
            return magnitude, gradient
        # Gradient histogram
        def quantization(gradient):  # quantize the gradient orientation
            # prepare quantization table
            gradient_quantized = np.zeros_like(gradient, dtype=int)  # np.int was removed in NumPy 1.24
            # quantization base
            d = np.pi / 9
            # quantization
            for i in range(9):
                gradient_quantized[np.where((gradient >= d * i) & (gradient <= d * (i + 1)))] = i
            return gradient_quantized
        # get gradient histogram
        def gradient_histogram(gradient_quantized, magnitude, N=8):  # per-cell gradient histogram
            # get shape
            H, W = magnitude.shape
            # get cell num
            cell_N_H = H // N
            cell_N_W = W // N
            histogram = np.zeros((cell_N_H, cell_N_W, 9), dtype=np.float32)
            # each pixel
            for y in range(cell_N_H):
                for x in range(cell_N_W):
                    for j in range(N):
                        for i in range(N):
                            histogram[y, x, gradient_quantized[y * N + j, x * N + i]] += magnitude[y * N + j, x * N + i]
            return histogram
        # histogram normalization
        def normalization(histogram, C=3, epsilon=1):  # normalize each cell over its 3x3 neighbourhood
            cell_N_H, cell_N_W, _ = histogram.shape
            ## each histogram
            for y in range(cell_N_H):
                for x in range(cell_N_W):
                    # for i in range(9):
                    histogram[y, x] /= np.sqrt(np.sum(histogram[max(y - 1, 0): min(y + 2, cell_N_H),
                                                      max(x - 1, 0): min(x + 2, cell_N_W)] ** 2) + epsilon)
            return histogram
        # 1. BGR -> Gray
        gray = BGR2GRAY(img)
        # 1. Gray -> Gradient x and y
        gx, gy = get_gradXY(gray)
        # 2. get gradient magnitude and angle
        magnitude, gradient = get_MagGrad(gx, gy)
        # 3. Quantization
        gradient_quantized = quantization(gradient)
        # 4. Gradient histogram
        histogram = gradient_histogram(gradient_quantized, magnitude)
        # 5. Histogram normalization
        histogram = normalization(histogram)
        return histogram
    # draw HOG
    def draw_HOG(img, histogram):  # overlay the gradient histograms on the grayscale image
        # Grayscale
        def BGR2GRAY(img):
            gray = 0.2126 * img[..., 2] + 0.7152 * img[..., 1] + 0.0722 * img[..., 0]
            return gray
        def draw(gray, histogram, N=8):
            # get shape
            H, W = gray.shape
            cell_N_H, cell_N_W, _ = histogram.shape
            ## Draw
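            # gray is not padded here, so copy it whole; slicing from index 1 would drop the first row and column
            out = gray.copy().astype(np.uint8)
            for y in range(cell_N_H):  # draw lines for every cell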
                for x in range(cell_N_W):
                    cx = x * N + N // 2
                    cy = y * N + N // 2
                    x1 = cx + N // 2 - 1
                    y1 = cy
                    x2 = cx - N // 2 + 1
                    y2 = cy
                    # scale the cell histogram so that its largest bin is 1
                    h = histogram[y, x].copy()
                    if h.max() > 0:  # flat cells are all zero; skipping them avoids 0/0
                        h /= h.max()
                    for c in range(9):  # one line per orientation bin
                        # angle = (20 * c + 10 - 90) / 180. * np.pi
                        # centre of bin c, in radians
                        angle = (20 * c + 10) / 180. * np.pi
                        rx = int(np.sin(angle) * (x1 - cx) + np.cos(angle) * (y1 - cy) + cx)
                        ry = int(np.cos(angle) * (x1 - cx) - np.cos(angle) * (y1 - cy) + cy)
                        lx = int(np.sin(angle) * (x2 - cx) + np.cos(angle) * (y2 - cy) + cx)
                        ly = int(np.cos(angle) * (x2 - cx) - np.cos(angle) * (y2 - cy) + cy)
                        # line brightness is the HOG value of this bin
                        color = int(255. * h[c])
                        # draw a 1-pixel-wide line
                        cv2.line(out, (lx, ly), (rx, ry), (color, color, color), thickness=1)
            return out
        # get gray
        gray = BGR2GRAY(img)
        # draw HOG
        out = draw(gray, histogram)
        return out
    # Read image
    img = cv2.imread("123.jpg").astype(np.float32)
    # get HOG
    histogram = HOG(img)
    # draw HOG
    out = draw_HOG(img, histogram)
    # Save result
    cv2.imwrite("out.jpg", out)
    cv2.imshow("result", out)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
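Since a feature is ultimately used as a vector, the `(cells high, cells wide, 9)` histogram array is typically flattened before being handed to a classifier. The sketch below shows only that flattening; `hog_feature_vector` is a hypothetical helper, and the all-zero array stands in for the output of `HOG(img)` on a 240×240 image.

```python
import numpy as np

def hog_feature_vector(histogram):
    """Flatten a (cells_h, cells_w, bins) histogram array into a 1-D vector."""
    return np.asarray(histogram, dtype=np.float32).reshape(-1)

# Stand-in for HOG(img) on a 240x240 image with 8x8 cells: 30 x 30 cells, 9 bins.
dummy_histogram = np.zeros((30, 30, 9), dtype=np.float32)
feature = hog_feature_vector(dummy_histogram)
print(feature.shape)  # 30 * 30 * 9 = 8100 values
```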
    

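    Color tracking

    • Color tracking is a method for extracting the regions of an image that have a specific color.
    • However, the RGB color space contains 256^3 colors, which makes it very hard to specify a color range by hand, so the image is first converted to HSV.
    • The HSV transform, mentioned earlier, converts RGB into hue (Hue), saturation (Saturation) and value (Value).
    • Saturation: the smaller it is, the whiter the color; the larger it is, the more vivid the color (0≤S≤1).
    • Value: the higher it is, the closer to white; the lower it is, the closer to black (0≤V≤1).
    • Hue: the color expressed as an angle from 0 to 360 degrees, corresponding to colors as in the table below.
    Red   Yellow   Green   Cyan   Blue   Magenta   Red
    0°    60°      120°    180°   240°   300°      360°
    • In other words, to track blue, convert to HSV, pick the positions where 180≤H≤260, and set them to 255.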
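As a quick sanity check of the hue table and the 180≤H≤260 range, Python's standard `colorsys` module can convert single RGB triples. This is only a cross-check, not the conversion the article uses; `hue_degrees` and `is_blue` are hypothetical helper names, and the inputs are RGB values in [0, 1].

```python
import colorsys

def hue_degrees(r, g, b):
    """Hue of an RGB colour (components in [0, 1]) as an angle in degrees."""
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)   # colorsys returns h in [0, 1)
    return h * 360.0

def is_blue(r, g, b, lo=180.0, hi=260.0):
    """True if the hue lies in the blue range used in the text."""
    return lo <= hue_degrees(r, g, b) <= hi

print(hue_degrees(0, 0, 1))   # pure blue, near 240 degrees
print(is_blue(0, 0, 1), is_blue(1, 0, 0))
```

Pure blue sits at about 240 degrees, inside the range, while pure red at 0 degrees is rejected.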
    import cv2
    import numpy as np
    # BGR -> HSV (H in degrees)
    def BGR2HSV(_img):
        img = _img.copy() / 255.
        hsv = np.zeros_like(img, dtype=np.float32)
        # per-pixel max and min over the B, G, R channels
        max_v = np.max(img, axis=2).copy()
        min_v = np.min(img, axis=2).copy()
        min_arg = np.argmin(img, axis=2)
        delta = max_v - min_v
        # gray pixels have max == min; use 1 as divisor there to avoid 0/0
        safe = np.where(delta == 0, 1., delta)
        # H
        ## if min == B
        ind = np.where(min_arg == 0)
        hsv[..., 0][ind] = 60 * (img[..., 1][ind] - img[..., 2][ind]) / safe[ind] + 60
        ## if min == R
        ind = np.where(min_arg == 2)
        hsv[..., 0][ind] = 60 * (img[..., 0][ind] - img[..., 1][ind]) / safe[ind] + 180
        ## if min == G
        ind = np.where(min_arg == 1)
        hsv[..., 0][ind] = 60 * (img[..., 2][ind] - img[..., 0][ind]) / safe[ind] + 300
        ## gray pixels have no defined hue; set it to 0 last so it is not overwritten
        hsv[..., 0][np.where(delta == 0)] = 0
        # S (computed as max - min, as in the original code)
        hsv[..., 1] = delta
        # V
        hsv[..., 2] = max_v
        return hsv
    # make mask: 255 where the hue is in the blue range, 0 elsewhere
    def get_mask(hsv):
        mask = np.zeros_like(hsv[..., 0])
        mask[np.logical_and((hsv[..., 0] >= 180), (hsv[..., 0] <= 260))] = 255
        return mask
    # Read image
    img = cv2.imread("imori.jpg").astype(np.float32)
    # BGR -> HSV
    hsv = BGR2HSV(img)
    # color tracking
    mask = get_mask(hsv)
    out = mask.astype(np.uint8)
    # Save result
    cv2.imwrite("out.png", out)
    cv2.imshow("result", out)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
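The mask marks where the tracked color is; to pull out the matching pixels themselves, the mask can be applied to the original image. The following is a minimal sketch of that idea, assuming a single-channel mask with values 0 or 255 and an H×W×3 image; `apply_mask` is a hypothetical helper and the tiny arrays are made up for illustration.

```python
import numpy as np

def apply_mask(img, mask):
    """Keep the pixels of `img` where `mask` is non-zero; black elsewhere."""
    keep = mask > 0                 # boolean H x W array
    out = np.zeros_like(img)
    out[keep] = img[keep]           # copies all channels of the selected pixels
    return out

# Tiny 2x2 three-channel image and a diagonal mask (illustrative values).
demo_img = np.arange(12, dtype=np.float32).reshape(2, 2, 3)
demo_mask = np.array([[255, 0], [0, 255]], dtype=np.uint8)
demo_out = apply_mask(demo_img, demo_mask)
print(demo_out[0, 0], demo_out[0, 1])
```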
    

    Article link: https://www.haomeiwen.com/subject/wzhwzhtx.html