A Very Detailed Kaggle Walkthrough (2): Image Preprocessing

Author: 海盗船长_coco | Published: 2020-03-20 15:30


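Image Preprocessing

Each picture goes through the following operations:
1. Conversion to black and white
2. A series of affine transformations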

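I. Conversion to Black and White

In early experiments, the author found that the model reached about the same accuracy when comparing two color images as when comparing two black-and-white images. Comparing a color image with a black-and-white image, however, lowered accuracy considerably. The simplest solution is to convert every image to black and white, which does not reduce accuracy even when both original images are in color.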
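As a small illustration of this step, the sketch below converts an RGB image to a single-channel grayscale image with Pillow's `convert('L')`, the same call used later in `read_cropped_image`. The synthetic 8x4 image is an assumption made only so the snippet runs without any data files.

```python
import numpy as np
from PIL import Image

# A small synthetic RGB image stands in for a real whale photo (assumption).
rgb = Image.new("RGB", (8, 4), color=(200, 100, 50))

# Mode 'L' is Pillow's 8-bit, single-channel grayscale mode.
gray = rgb.convert("L")

# Converting an image that is already grayscale leaves its mode unchanged,
# so color and black-and-white sources end up in the same format.
gray_again = gray.convert("L")

arr = np.array(gray)
print(gray.mode, arr.shape, arr.dtype)
```

The resulting array has shape (height, width) with no channel axis, which is why the reading code later reshapes it to (384, 384, 1).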

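II. Affine Transformation

As the figure below shows, the pictures we are given contain a large number of background pixels. We first use the bounding box to mark out the rectangular region containing the whale, then use an affine transformation to map that region to a square image with a resolution of 384x384x1 (black and white has a single channel). The rectangular region has a width-to-height ratio of 2.15, close to the average width-to-height ratio of all the pictures. The rectangle is slightly larger than the bounding box predicted by the bounding box model.
During training, data augmentation is performed by adding a random transformation composed of zoom, shift, rotation and shear. This random transformation is skipped at test time.
Finally, the image is normalized to zero mean and unit variance.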
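The normalization step can be sketched on its own as follows. The helper name `normalize` and the small `eps` term are assumptions added for illustration; `eps` only keeps the division defined when an image is completely uniform.

```python
import numpy as np


def normalize(img, eps=1e-7):
    """Shift an image to zero mean and scale it to (almost exactly) unit variance."""
    img = img.astype(float)
    img = img - img.mean()
    # eps is an illustrative guard: a constant image has std == 0.
    return img / (img.std() + eps)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(384, 384, 1))  # random stand-in for a cropped picture
out = normalize(img)
flat = normalize(np.full((4, 4, 1), 128))       # uniform image: stays finite, all zeros

print(out.mean(), out.std())
```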
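P.S. The bounding box file is also in the Baidu Cloud resources shared in the previous post. The data was mainly obtained from the predictions of a network model; for details see https://www.kaggle.com/martinpiotte/bounding-box-model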
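The way the crop region is derived from a bounding box (add a margin, clamp to the picture, then widen or heighten to the 2.15 ratio) can be sketched as a standalone function. `expand_bbox` is a hypothetical helper that mirrors the arithmetic used in `read_cropped_image`; the picture size (600, 400) in the example is made up.

```python
def expand_bbox(bbox, size, crop_margin=0.05, anisotropy=2.15):
    """Return the crop region (x0, y0, x1, y1) for a bounding box.

    bbox: (x0, y0, x1, y1) predicted box; size: (width, height) of the picture.
    """
    x0, y0, x1, y1 = bbox
    size_x, size_y = size
    dx, dy = x1 - x0, y1 - y0
    # Add a margin on every side to compensate for bounding box inaccuracy.
    x0 -= dx * crop_margin
    x1 += dx * crop_margin + 1
    y0 -= dy * crop_margin
    y1 += dy * crop_margin + 1
    # Clamp to the picture boundaries.
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, size_x), min(y1, size_y)
    # Grow the shorter side symmetrically until width / height == anisotropy.
    dx, dy = x1 - x0, y1 - y0
    if dx > dy * anisotropy:
        pad = 0.5 * (dx / anisotropy - dy)
        y0, y1 = y0 - pad, y1 + pad
    else:
        pad = 0.5 * (dy * anisotropy - dx)
        x0, x1 = x0 - pad, x1 + pad
    return x0, y0, x1, y1


wide = expand_bbox((35, 71, 523, 291), size=(600, 400))  # box wider than 2.15:1
tall = expand_bbox((10, 10, 50, 200), size=(300, 300))   # box taller than 2.15:1
print(wide)
print(tall)
```

Note that the final growth step happens after clamping, so the region can extend past the picture edge (the second example ends up with a negative x0). In the reading code those out-of-picture pixels are filled with the constant `cval` passed to the affine transform.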

Pictures cropped with the bounding box
Next, add bounding box loading to the DataSource class from the previous post:

import pandas as pd


class DataSource():

    def __init__(self, TRAIN_DF, SUB_DF, HASH_PATH, SIZE_PATH, BBOX_PATH):
        super(DataSource, self).__init__()
        # Read the dataset description (the lookups from the previous post are set up here)
        self.picture_2_bbox = self.get_bbox(BBOX_PATH)  # bounding box of each picture

    def get_bbox(self, bbox_path):
        # Read the bounding box CSV into a dict: picture file name -> (x0, y0, x1, y1)
        bbox = {p: (x0, y0, x1, y1) for _, p, x0, y0, x1, y1 in pd.read_csv(bbox_path).to_records()}
        return bbox

Sample entries of picture_2_bbox, as (picture file name, (x0, y0, x1, y1)) pairs:

[('72c3ce75c.jpg', (0, 0, 1045, 389)), ('a7ad640ee.jpg', (35, 71, 523, 291)),
('df2b6c364.jpg', (12, 15, 1033, 308)), ('26013fcb5.jpg', (3, 1, 1024, 292)),
('09eff7b37.jpg', (11, 8, 926, 334))]
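A minimal sketch of how `get_bbox` arrives at such a mapping, assuming a CSV whose columns are the picture name followed by x0, y0, x1, y1. The header names and the in-memory CSV below are placeholders for illustration; only the column order matters to the dict comprehension.

```python
import io

import pandas as pd

# In-memory stand-in for the bounding box file (header names are assumed).
csv_text = """Image,x0,y0,x1,y1
72c3ce75c.jpg,0,0,1045,389
a7ad640ee.jpg,35,71,523,291
"""

df = pd.read_csv(io.StringIO(csv_text))

# to_records() yields (index, Image, x0, y0, x1, y1); the leading "_" drops the index.
bbox = {p: (x0, y0, x1, y1) for _, p, x0, y0, x1, y1 in df.to_records()}

print(list(bbox.items()))
```

The coordinates come back as NumPy integers rather than plain Python ints, which is fine for the arithmetic that follows.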

In util.py, add a function that builds the random transformation, combining rotation, shear, zoom and shift.

import numpy as np


# Data augmentation: build an affine transformation as a 3x3 homogeneous matrix
def build_transform(rotation, shear, height_zoom, width_zoom, height_shift, width_shift):
    rotation = np.deg2rad(rotation)  # convert degrees to radians
    shear = np.deg2rad(shear)
    rotation_matrix = np.array(
        [[np.cos(rotation), np.sin(rotation), 0], [-np.sin(rotation), np.cos(rotation), 0], [0, 0, 1]])  # rotation
    shear_matrix = np.array([[1, np.sin(shear), 0], [0, np.cos(shear), 0], [0, 0, 1]])  # shear
    zoom_matrix = np.array([[1.0 / height_zoom, 0, 0], [0, 1.0 / width_zoom, 0], [0, 0, 1]])  # zoom
    shift_matrix = np.array([[1, 0, -height_shift], [0, 1, -width_shift], [0, 0, 1]])  # shift
    return np.dot(np.dot(rotation_matrix, shear_matrix), np.dot(zoom_matrix, shift_matrix))
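To get a feel for what `build_transform` returns, the sketch below repeats the function so the snippet is self-contained and then checks a few special cases: neutral parameters give the identity, a pure shift moves the origin, and a zoom value below 1 scales coordinates up. The parameter values are arbitrary examples.

```python
import numpy as np


def build_transform(rotation, shear, height_zoom, width_zoom, height_shift, width_shift):
    # Same composition as in util.py: rotation @ shear @ zoom @ shift.
    rotation = np.deg2rad(rotation)
    shear = np.deg2rad(shear)
    rotation_matrix = np.array(
        [[np.cos(rotation), np.sin(rotation), 0], [-np.sin(rotation), np.cos(rotation), 0], [0, 0, 1]])
    shear_matrix = np.array([[1, np.sin(shear), 0], [0, np.cos(shear), 0], [0, 0, 1]])
    zoom_matrix = np.array([[1.0 / height_zoom, 0, 0], [0, 1.0 / width_zoom, 0], [0, 0, 1]])
    shift_matrix = np.array([[1, 0, -height_shift], [0, 1, -width_shift], [0, 0, 1]])
    return np.dot(np.dot(rotation_matrix, shear_matrix), np.dot(zoom_matrix, shift_matrix))


identity = build_transform(0, 0, 1, 1, 0, 0)          # no rotation, shear, zoom or shift
shifted = build_transform(0, 0, 1, 1, 3, 4) @ np.array([0, 0, 1])
zoomed = build_transform(0, 0, 0.8, 0.8, 0, 0) @ np.array([10, 10, 1])
rotated = build_transform(5, 0, 1, 1, 0, 0)

print(identity)
print(shifted)  # the origin moves by (-3, -4)
print(zoomed)   # coordinates are scaled by 1 / 0.8
```

Because the matrix is applied to output coordinates to find where to sample the source picture, a zoom value below 1 makes the sampled source region larger, so the whale appears smaller in the result.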

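Add the picture-reading methods to the DataSource class. Training pictures receive a random transformation as they are read, for data augmentation; test pictures are read without it.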

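    # Read a picture cropped to its bounding box.
    # Needs at module level: import random, import numpy as np, from PIL import Image,
    # and affine_transform (presumably from scipy.ndimage, whose signature matches the call below).
    def read_cropped_image(self, picture, augment, crop_margin=0.05, anisotropy=2.15, img_shape=(384, 384, 1)):
        """
        @param picture: the name (or hash id) of the picture to read
        @param augment: True/False if data augmentation should be performed
        @param crop_margin: the margin added around the bounding box to compensate for bounding box inaccuracy
        @param anisotropy: target width-to-height ratio of the cropped region
        @param img_shape: shape of the returned array (height, width, channels)
        @return a numpy array with the transformed image
        """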
        # If an image id was given, convert to filename
        if picture in self.hash_2_picture:
            picture = self.hash_2_picture[picture]
        size_x, size_y = self.picture_2_size[picture]

        # Determine the region of the original image we want to capture based on the bounding box.
        x0, y0, x1, y1 = self.picture_2_bbox[picture]
        # if picture in rotate:     # no need to rotate pictures here
        #     x0, y0, x1, y1 = size_x - x1, size_y - y1, size_x - x0, size_y - y0
        dx = x1 - x0
        dy = y1 - y0
        x0 -= dx * crop_margin
        x1 += dx * crop_margin + 1
        y0 -= dy * crop_margin
        y1 += dy * crop_margin + 1
        # Clamp the region to the picture boundaries
        x0 = 0 if x0 < 0 else x0
        x1 = size_x if x1 > size_x else x1
        y0 = 0 if y0 < 0 else y0
        y1 = size_y if y1 > size_y else y1
        # Grow the shorter side so that width / height equals anisotropy
        dx = x1 - x0
        dy = y1 - y0
        if dx > dy * anisotropy:
            dy = 0.5 * (dx / anisotropy - dy)
            y0 -= dy
            y1 += dy
        else:
            dx = 0.5 * (dy * anisotropy - dx)
            x0 -= dx
            x1 += dx
        # Generate the transformation matrix
        trans = np.array([[1, 0, -0.5 * img_shape[0]], [0, 1, -0.5 * img_shape[1]], [0, 0, 1]])
        trans = np.dot(np.array([[(y1 - y0) / img_shape[0], 0, 0], [0, (x1 - x0) / img_shape[1], 0], [0, 0, 1]]),
                       trans)  # scale
        if augment:  # data augmentation
            trans = np.dot(build_transform(
                random.uniform(-5, 5),
                random.uniform(-5, 5),
                random.uniform(0.8, 1.0),
                random.uniform(0.8, 1.0),
                random.uniform(-0.05 * (y1 - y0), 0.05 * (y1 - y0)),
                random.uniform(-0.05 * (x1 - x0), 0.05 * (x1 - x0))
            ), trans)
        trans = np.dot(np.array([[1, 0, 0.5 * (y1 + y0)], [0, 1, 0.5 * (x1 + x0)], [0, 0, 1]]), trans)

        # Read the image, transform to black and white and convert to numpy array
        img = np.array(Image.open(expand_path(picture)).convert('L'))

        # Apply affine transformation
        matrix = trans[:2, :2]
        offset = trans[:2, 2]
        # img = img.reshape(img.shape[:-1])
        img = affine_transform(img, matrix, offset, output_shape=img_shape[:-1], order=1, mode='constant',
                               cval=np.average(img))
        img = img.reshape(img_shape).astype(float)

        # Normalize to zero mean and unit variance
        img -= np.mean(img, keepdims=True)
        img /= np.std(img, keepdims=True) + 1e-7  # small epsilon guards against a uniform image (std == 0)
        return img

    # Read an image for validation, i.e. without data augmentation.
    def read_for_validation(self, img_path, img_size=(384, 384, 1), anisotropy=2.15):
        return self.read_cropped_image(img_path, augment=False, img_shape=img_size, anisotropy=anisotropy)

    # Read an image for training, i.e. including a random affine transformation
    def read_for_training(self, img_path, img_size=(384, 384, 1), anisotropy=2.15):
        return self.read_cropped_image(img_path, augment=True, img_shape=img_size, anisotropy=anisotropy)
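To see what the matrix built in `read_cropped_image` does when `augment` is False, the sketch below rebuilds the same three factors and applies them to a few output pixel coordinates. `crop_matrix` is a hypothetical helper and the region values are made up. The code relies on the convention of `scipy.ndimage.affine_transform`, where an output coordinate `o` is sampled from the input at `matrix @ o + offset`.

```python
import numpy as np


def crop_matrix(x0, y0, x1, y1, img_shape=(384, 384, 1)):
    """Map output (row, col) coordinates to source-picture (row, col) coordinates."""
    h, w = img_shape[0], img_shape[1]
    # 1. Move the output center to the origin.
    to_center = np.array([[1, 0, -0.5 * h], [0, 1, -0.5 * w], [0, 0, 1]])
    # 2. Scale output pixels to the size of the crop region.
    scale = np.array([[(y1 - y0) / h, 0, 0], [0, (x1 - x0) / w, 0], [0, 0, 1]])
    # 3. Move the origin to the center of the crop region.
    to_region = np.array([[1, 0, 0.5 * (y1 + y0)], [0, 1, 0.5 * (x1 + x0)], [0, 0, 1]])
    return to_region @ scale @ to_center


# Example region: 430 wide and 200 high, i.e. a 2.15 width-to-height ratio.
trans = crop_matrix(x0=10.0, y0=20.0, x1=440.0, y1=220.0)

top_left = trans @ np.array([0, 0, 1])
center = trans @ np.array([192, 192, 1])
bottom_right = trans @ np.array([384, 384, 1])

print(top_left[:2], center[:2], bottom_right[:2])
```

Rows come first in array coordinates, which is why the y values appear before the x values throughout the matrices. During training, the random matrix from `build_transform` is inserted between steps 2 and 3, so rotation, shear, zoom and shift all act around the center of the crop region.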

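The first picture is the original, the second is a training picture with the random transformation applied, and the third is a test picture without the random transformation.

Original picture, training picture, test picture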

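Summary

That covers image preprocessing: a bounding box model predicts the whale's bounding box, the predicted rectangular region is then mapped to the target size, and random transformations provide data augmentation during training but are skipped at test time.
Bounding box prediction is of course important in its own right. It determines how accurate the whale bounding boxes are, which in turn affects the quality of the training images. The network model that predicts the bounding boxes will be covered in a separate post.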
