2020-07-07 Point-Set Anchors for

Author: Joyner2018 | Published: 2020-07-07 23:24

    paper reading

    title

    Point-Set Anchors for Object Detection, Instance Segmentation and Pose Estimation

    author

    Fangyun Wei, Xiao Sun, Hongyang Li, Jingdong Wang, and Stephen Lin

    affiliation

    Microsoft Research Asia
    Peking University

    paper link

    https://arxiv.org/pdf/2007.02846.pdf
    To appear in ECCV 2020

    dataset

    MS-COCO

    abstract

    A recent approach for object detection and human pose estimation is to regress bounding boxes or human keypoints from a central point on the object or person. While this center-point regression is simple and efficient, we argue that the image features extracted at a central point contain limited information for predicting distant keypoints or bounding box boundaries, due to object deformation and scale/orientation variation. To facilitate inference, we propose to instead perform regression from a set of points placed at more advantageous positions. This point set is arranged to reflect a good initialization for the given task, such as modes in the training data for pose estimation, which lie closer to the ground truth than the central point and provide more informative features for regression. As the utility of a point set depends on how well its scale, aspect ratio and rotation matches the target, we adopt the anchor box technique of sampling these transformations to generate additional point-set candidates. We apply this proposed framework, called Point-Set Anchors, to object detection, instance segmentation, and human pose estimation. Our results show that this general-purpose approach can achieve performance competitive with state-of-the-art methods for each of these tasks.
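    The abstract describes sampling scale, aspect ratio and rotation to turn one template point set into many candidates. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the function name `make_point_set_anchors`, the square template, and the sampled values are all assumptions chosen for illustration.

```python
import numpy as np

def make_point_set_anchors(template, scales, aspect_ratios, rotations_deg):
    """Return an array of shape (K, N, 2) holding K transformed copies of the template.

    template: (N, 2) array of (x, y) points centred on the origin.
    """
    template = np.asarray(template, dtype=float)
    anchors = []
    for s in scales:
        for r in aspect_ratios:
            # Stretch x by sqrt(r) and y by 1/sqrt(r): width/height is
            # multiplied by r while the enclosed area stays the same.
            stretch = np.array([np.sqrt(r), 1.0 / np.sqrt(r)])
            for deg in rotations_deg:
                theta = np.deg2rad(deg)
                rot = np.array([[np.cos(theta), -np.sin(theta)],
                                [np.sin(theta),  np.cos(theta)]])
                anchors.append((template * stretch * s) @ rot.T)
    return np.stack(anchors)

# Hypothetical 4-point template: a unit square centred on the origin.
template = np.array([[-0.5, -0.5], [0.5, -0.5], [0.5, 0.5], [-0.5, 0.5]])
anchors = make_point_set_anchors(template,
                                 scales=[32, 64],
                                 aspect_ratios=[0.5, 1.0, 2.0],
                                 rotations_deg=[-30, 0, 30])
print(anchors.shape)  # (18, 4, 2): 2 scales x 3 ratios x 3 rotations
```

    For pose estimation the template would presumably be a typical pose from the training data rather than a square; the transformation loop would stay the same.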

    contributions

    1. A new object representation named Point-Set Anchors, which can be seen as a generalization and extension of classical box anchors. Point-set anchors can further provide informative features and better task-specific initializations for shape regression.
    2. A network based on point-set anchors called PointSetNet, which is a modification of RetinaNet [23] that simply replaces the anchor boxes with the proposed point-set anchors and also attaches a parallel regression branch. Variants of this network are applied to object detection, human pose estimation, and also instance segmentation, for which the problem of defining specific regression targets is addressed.
    3. It is shown that the proposed general-purpose approach achieves performance competitive with state-of-the-art methods on object detection, instance segmentation and pose estimation.

    framework

    The proposed framework replaces the bounding boxes (bbox) used in earlier object detectors with point sets.
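    To make the replacement concrete, the sketch below places origin-centred point-set anchors at feature-map locations and refines each point with a predicted offset. It illustrates anchor-plus-offset decoding in general, under assumed array shapes; `decode_point_sets` and `points_to_box` are hypothetical helpers, not functions from the paper or its code.

```python
import numpy as np

def decode_point_sets(anchor_shapes, centers, offsets):
    """Turn anchors plus predicted offsets into final point sets.

    anchor_shapes: (K, N, 2) origin-centred anchors with N points each.
    centers:       (M, 2) image coordinates of M feature-map locations.
    offsets:       (M, K, N, 2) per-point offsets (assumed network output).
    Returns an array of shape (M, K, N, 2).
    """
    anchor_shapes = np.asarray(anchor_shapes, dtype=float)
    centers = np.asarray(centers, dtype=float)
    offsets = np.asarray(offsets, dtype=float)
    # Broadcasting copies every anchor shape to every location.
    placed = centers[:, None, None, :] + anchor_shapes[None, :, :, :]
    return placed + offsets

def points_to_box(points):
    """Tightest axis-aligned box (x_min, y_min, x_max, y_max) around one point set."""
    points = np.asarray(points, dtype=float)
    return np.concatenate([points.min(axis=0), points.max(axis=0)])

# One square anchor (K=1, N=4) at one location (M=1); only one point is moved.
anchor_shapes = np.array([[[-1.0, -1.0], [1.0, -1.0], [1.0, 1.0], [-1.0, 1.0]]])
centers = np.array([[10.0, 20.0]])
offsets = np.zeros((1, 1, 4, 2))
offsets[0, 0, 2] = [0.5, 0.25]

shapes = decode_point_sets(anchor_shapes, centers, offsets)
box = points_to_box(shapes[0, 0])
print(box)  # [ 9.   19.   11.5  21.25]
```

    Unlike a box anchor, which regresses four numbers from a single location, every point here carries its own offset, so the same decoding step can serve contours and keypoints.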



    Three matching strategies between point-set anchor and the ground-truth mask contour for instance segmentation.
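    The three strategies are not spelled out in these notes. As one simple possibility, the sketch below matches each anchor point to the closest point on a sampled ground-truth contour and uses the difference as the regression target. This nearest-point rule is an assumption for illustration and may differ from the strategies in the figure; `nearest_point_targets` is a hypothetical name.

```python
import numpy as np

def nearest_point_targets(anchor_points, contour_points):
    """Match each anchor point to its nearest contour point.

    anchor_points:  (N, 2) points of one placed point-set anchor.
    contour_points: (C, 2) points sampled along the ground-truth mask contour.
    Returns (matched, offsets), both (N, 2), with offsets = matched - anchor_points.
    """
    anchor_points = np.asarray(anchor_points, dtype=float)
    contour_points = np.asarray(contour_points, dtype=float)
    # Pairwise Euclidean distances, shape (N, C).
    diff = anchor_points[:, None, :] - contour_points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    matched = contour_points[dist.argmin(axis=1)]
    return matched, matched - anchor_points

anchor_pts = np.array([[0.0, 0.0], [10.0, 0.0]])
contour_pts = np.array([[1.0, 1.0], [9.0, -1.0], [5.0, 5.0]])
matched, target_offsets = nearest_point_targets(anchor_pts, contour_pts)
print(target_offsets)  # [[ 1.  1.] [-1. -1.]]
```

    A rule like this can map several anchor points to the same contour point, which is one reason a method might compare alternative matching strategies.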

    loss function

    [figure: loss function]
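    The loss equation itself is not reproduced in these notes. Since PointSetNet is described as a modification of RetinaNet with an added regression branch, a plausible form is a focal classification loss plus an L1 loss on point offsets, both normalised by the number of positive anchors. The sketch below shows that assumed form only; the weight `lam` and the defaults `alpha=0.25`, `gamma=2.0` (the usual focal-loss settings) are assumptions, not values taken from the paper.

```python
import numpy as np

def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-9):
    """Element-wise binary focal loss for probabilities p and labels y in {0, 1}."""
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    y = np.asarray(y)
    p_t = np.where(y == 1, p, 1.0 - p)            # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def total_loss(probs, labels, pred_offsets, target_offsets, lam=1.0):
    """Assumed combined loss: focal classification + lam * L1 regression.

    probs, labels:                (A,) one score and one 0/1 label per anchor.
    pred_offsets, target_offsets: (A, N, 2) per-point offsets per anchor.
    Regression is computed on positive anchors only.
    """
    labels = np.asarray(labels)
    pos = labels == 1
    n_pos = max(int(pos.sum()), 1)                # avoid division by zero
    cls = focal_loss(probs, labels).sum() / n_pos
    reg = np.abs(np.asarray(pred_offsets)[pos] - np.asarray(target_offsets)[pos]).sum() / n_pos
    return cls + lam * reg

# Two anchors with two points each; anchor 0 is positive, anchor 1 is background.
probs = np.array([1.0, 0.0])
labels = np.array([1, 0])
target = np.zeros((2, 2, 2))
pred = np.zeros((2, 2, 2))
pred[0] += 1.0                                    # every coordinate of anchor 0 is off by 1
loss = total_loss(probs, labels, pred, target)
print(loss)
```

    With near-perfect classification, the value is dominated by the regression term: four coordinates, each off by 1, summed for the single positive anchor.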

    performance


    The performance gain is small.



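    These comparison experiments use different backbones, and HRNet outperforms the others, so the results can hardly show that the proposed method is better than the other methods.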

    conclusion

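    1. This paper proposes Point-Set Anchors, a generalization and extension of classical bounding boxes that can serve as a general framework for object detection, instance segmentation, and pose estimation.
    2. This paper proposes PointSetNet, a network built on RetinaNet that replaces the anchor boxes with point-set anchors and attaches a parallel branch for keypoint regression.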

    references

    [1] Peng, S., Jiang, W., Pi, H., Li, X., Bao, H., Zhou, X.: Deep Snake for Real-Time Instance Segmentation. In: CVPR 2020 (oral)
    [2] Bolya, D., Zhou, C., Xiao, F., Lee, Y.J.: Yolact: real-time instance segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 9157-9166 (2019)
    [3] Chen, X., Girshick, R., He, K., Dollár, P.: Tensormask: A foundation for dense object segmentation. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 2061-2069 (2019)
    [4] Duan, K., Bai, S., Xie, L., Qi, H., Huang, Q., Tian, Q.: Centernet: Keypoint triplets for object detection. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 6569-6578 (2019)
    [5] Huang, Z., Huang, L., Gong, Y., Huang, C., Wang, X.: Mask scoring r-cnn. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 6409-6418 (2019)
    [6] Nie, X., Feng, J., Zhang, J., Yan, S.: Single-stage multi-person pose machines. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 6951-6960 (2019)
    [7] Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 5693-5703 (2019)
    [8] Tian, Z., Shen, C., Chen, H., He, T.: Fcos: Fully convolutional one-stage object detection. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 9627-9636 (2019)
    [9] Xie, E., Sun, P., Song, X., Wang, W., Liu, X., Liang, D., Shen, C., Luo, P.: Polarmask: Single shot instance segmentation with polar representation. arXiv preprint arXiv:1909.13226 (2019)
    [10] Yang, Z., Liu, S., Hu, H., Wang, L., Lin, S.: Reppoints: Point set representation for object detection. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 9657-9666 (2019)
    [11] Yang, Z., Xu, Y., Xue, H., Zhang, Z., Urtasun, R., Wang, L., Lin, S., Hu, H.: Dense reppoints: Representing visual objects with dense point sets. arXiv preprint arXiv:1912.11473 (2019)
    [12] Zhou, X., Zhuo, J., Krähenbühl, P.: Bottom-up object detection by grouping extreme and center points. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 850-859 (2019)
