Fast R-CNN

Author: 初七123 | Published 2018-06-18 15:05

    Introduction

    R-CNN trains in multiple stages and is very slow at test time
    SPPnet markedly improves efficiency via spatial pyramid pooling

    The proposed method improves on both R-CNN and SPPnet:
    1.Higher detection quality (mAP) than R-CNN, SPPnet
    2.Training is single-stage, using a multi-task loss
    3.Training can update all network layers
    4.No disk storage is required for feature caching

    Fast R-CNN architecture and training

    The RoI pooling layer uses max pooling to convert the features inside any valid region of interest into a small feature map with a fixed spatial extent of H ×W (e.g., 7×7), where H and W are layer hyper-parameters that are independent of any particular RoI

    RoI pooling converts the features inside each proposal into a fixed-size spatial grid
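A minimal NumPy sketch of this RoI max-pooling (the function name and grid arithmetic are my own; the paper's layer is implemented inside its Caffe framework):

```python
import numpy as np

def roi_max_pool(feature_map, roi, output_size=(7, 7)):
    """Max-pool the features inside one RoI to a fixed H x W grid.

    feature_map: (C, H, W) array of conv features.
    roi: (x1, y1, x2, y2) box in feature-map coordinates.
    output_size: (H, W) of the pooled output, independent of the RoI size.
    """
    x1, y1, x2, y2 = roi
    ph, pw = output_size
    roi_h = max(y2 - y1 + 1, 1)
    roi_w = max(x2 - x1 + 1, 1)
    out = np.empty((feature_map.shape[0], ph, pw))
    for i in range(ph):
        # Divide the RoI into a ph x pw grid of sub-windows.
        ys = y1 + int(np.floor(i * roi_h / ph))
        ye = y1 + int(np.ceil((i + 1) * roi_h / ph))
        for j in range(pw):
            xs = x1 + int(np.floor(j * roi_w / pw))
            xe = x1 + int(np.ceil((j + 1) * roi_w / pw))
            # Max-pool each sub-window across its spatial extent.
            out[:, i, j] = feature_map[:, ys:ye, xs:xe].max(axis=(1, 2))
    return out
```

Whatever the RoI's size, the output is always C × 7 × 7, which is what lets arbitrary proposals feed the fixed-size fully connected layers.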

    In Fast R-CNN training, stochastic gradient descent (SGD) mini-batches are sampled hierarchically, first by sampling N images and then by sampling R/N RoIs from each image.

    Fast R-CNN samples hierarchically: first pick N images, then pick R/N RoIs from each image to form the mini-batch

    Mini-batch sampling

    1. Each SGD mini-batch is constructed from N = 2 images, chosen uniformly at random
    2. We use mini-batches of size R = 128, sampling 64 RoIs from each image
    3. We take 25% of the RoIs from object proposals that have intersection over union (IoU) overlap with a ground-truth bounding box of at least 0.5.
    4. The remaining RoIs are sampled from object proposals that have a maximum IoU with ground truth in the interval [0.1, 0.5), following [11].
    5. The lower threshold of 0.1 appears to act as a heuristic for hard example mining [8]
    6. During training, images are horizontally flipped with probability 0.5
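Steps 2–5 for a single image can be sketched as follows (the function name and rng plumbing are assumptions; only the thresholds and counts come from the paper):

```python
import numpy as np

def sample_rois(ious, rois_per_image=64, fg_fraction=0.25,
                fg_thresh=0.5, bg_lo=0.1, bg_hi=0.5, rng=None):
    """Sample foreground/background RoIs for one image.

    ious: (num_proposals,) max IoU of each proposal with any ground-truth box.
    Returns indices of the sampled RoIs (foreground first, then background).
    """
    rng = rng or np.random.default_rng()
    # Foreground: IoU >= 0.5 with some ground-truth box.
    fg = np.where(ious >= fg_thresh)[0]
    # Background: max IoU in [0.1, 0.5) -- the 0.1 floor acts as
    # a crude form of hard example mining.
    bg = np.where((ious >= bg_lo) & (ious < bg_hi))[0]
    n_fg = min(int(round(fg_fraction * rois_per_image)), len(fg))
    n_bg = min(rois_per_image - n_fg, len(bg))
    fg_idx = rng.choice(fg, n_fg, replace=False)
    bg_idx = rng.choice(bg, n_bg, replace=False)
    return np.concatenate([fg_idx, bg_idx])
```

With the defaults, each of the N = 2 images contributes 64 RoIs, 25% of them foreground, giving the R = 128 mini-batch.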

    Multi-task loss
    Each training RoI is labeled with a ground-truth class u and a ground-truth bounding-box regression target v. We use a multi-task loss L on each labeled RoI to jointly train for classification and bounding-box regression.

    Each training RoI is labeled with a ground-truth class and a bounding-box regression target; a multi-task loss jointly trains the classification and regression heads.

    u is the ground-truth class of the RoI (u = 0 means background, i.e., no object)
    t is the predicted bounding-box offsets
    v is the ground-truth regression target
    λ balances the two loss terms; the paper uses λ = 1
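Putting these symbols together, the loss is L(p, u, t^u, v) = L_cls(p, u) + λ[u ≥ 1] L_loc(t^u, v), with L_cls a log loss and L_loc a smooth L1 over the four box offsets. A NumPy sketch (the function names are mine; the formulas follow the paper):

```python
import numpy as np

def smooth_l1(x):
    """Smooth L1 loss: 0.5*x^2 if |x| < 1, else |x| - 0.5."""
    ax = np.abs(x)
    return np.where(ax < 1, 0.5 * x**2, ax - 0.5)

def multi_task_loss(p, u, t_u, v, lam=1.0):
    """L(p, u, t^u, v) = L_cls(p, u) + lam * [u >= 1] * L_loc(t^u, v).

    p: (K+1,) softmax class probabilities; u: ground-truth class (0 = background).
    t_u: (4,) predicted box offsets for class u; v: (4,) regression targets.
    """
    l_cls = -np.log(p[u])  # log loss for the true class
    l_loc = smooth_l1(np.asarray(t_u) - np.asarray(v)).sum()
    # The Iverson bracket [u >= 1] switches regression off for background RoIs.
    return l_cls + lam * (u >= 1) * l_loc
```

Note that background RoIs (u = 0) have no ground-truth box, so only the classification term contributes to their loss.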

    Back-propagation through RoI pooling layers

    In words, for each mini-batch RoI r and for each pooling output unit y_rj, the partial derivative ∂L/∂y_rj is accumulated if i is the argmax selected for y_rj by max pooling. In back-propagation, the partial derivatives ∂L/∂y_rj are already computed by the backwards function of the layer on top of the RoI pooling layer.
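In code, this routing rule amounts to scattering each upstream gradient back to the input cell that won the max, summing over all RoIs and output units (a sketch; the argmax indices would be cached during the forward pass, and the helper name is mine):

```python
import numpy as np

def roi_pool_backward(x_shape, argmax_idx, grad_out):
    """Route gradients through RoI max pooling.

    argmax_idx: (num_rois, C, PH, PW) flat indices into the input marking
                which input element won the max for each pooled unit y_rj.
    grad_out:   (num_rois, C, PH, PW) upstream gradients dL/dy_rj.
    Returns dL/dx with shape x_shape.
    """
    grad_in = np.zeros(int(np.prod(x_shape)))
    # dL/dx_i = sum over all (r, j) where i was the argmax for y_rj.
    # np.add.at accumulates even when the same index appears repeatedly,
    # which happens when overlapping RoIs select the same input cell.
    np.add.at(grad_in, argmax_idx.ravel(), grad_out.ravel())
    return grad_in.reshape(x_shape)
```

Because RoIs from the same image overlap, a single input element can receive gradient contributions from many pooled outputs; the accumulation handles that naturally.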

    Fast R-CNN detection

    Large fully connected layers are easily accelerated by compressing them with truncated SVD

    In this technique, a layer parameterized by the u × v weight matrix W is approximately factorized as W ≈ U Σ_t Vᵀ, where U is a u × t matrix of the first t left-singular vectors of W, Σ_t is a t × t diagonal matrix of the top t singular values, and V is a v × t matrix of the first t right-singular vectors.
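This splits one FC layer (uv parameters) into two smaller ones (t(u + v) parameters): the first uses Σ_t Vᵀ as its weights, the second uses U. A NumPy sketch of the factorization (the helper name is mine):

```python
import numpy as np

def truncated_svd_factorize(W, t):
    """Compress a u x v FC weight matrix: W ≈ W2 @ W1.

    W1 = diag(s_t) @ V_t.T is the first (t x v) layer's weights;
    W2 = U_t is the second (u x t) layer's weights.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W1 = np.diag(s[:t]) @ Vt[:t]   # first layer: t x v
    W2 = U[:, :t]                  # second layer: u x t
    return W1, W2
```

When t is much smaller than min(u, v), this cuts both the parameter count and the matrix-multiply cost of the layer, at a small approximation error.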

    Main results

    1. State-of-the-art mAP on VOC07, 2010, and 2012
    2. Fast training and testing compared to R-CNN, SPPnet
    3. Fine-tuning conv layers in VGG16 improves mAP

    Accuracy


    Timing


    Design evaluation

    Does multi-task training help?
    Yes — training with the joint classification and regression loss yields a higher mAP than training the classifier alone

    Scale invariance: to brute force or finesse?
    Single-scale (brute-force) processing performs nearly as well as multi-scale pyramids while being much faster


    Do SVMs outperform softmax?
    No — the softmax classifier trained end-to-end slightly outperforms post-hoc SVMs, so the separate SVM stage is unnecessary

    Do we need more training data?
    Yes — enlarging the training set with additional VOC data improves mAP

    Are more proposals always better?
    No — mAP rises and then falls slightly as the proposal count grows; flooding the detector with proposals does not improve accuracy
