Detection in Crowded Scenes

Author: 小松qxs | Published 2020-05-19 22:15

title	Detection in Crowded Scenes: One Proposal, Multiple Predictions
url	https://arxiv.org/pdf/2003.09163.pdf
Motivation	Improve human detection in crowded scenes with a method that is simple and almost cost-free.
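Content	Contributions:
1. Each proposal predicts a set of instances rather than a single one.
2. An EMD loss supervises the instance set prediction.
3. Set NMS is used for post-processing.
4. An optional refinement module (RM) handles potential false positives (FP).

Existing approaches to the crowd problem:
1. NMS variants: Soft-NMS, Softer-NMS, different NMS thresholds for different bounding boxes, Adaptive-NMS.
2. Loss functions for crowded detection: Aggregation Loss (pulls proposals closer to the gt) and Repulsion Loss (adds a penalty term when a proposal overlaps multiple gt boxes). These losses help in crowded scenes, but NMS still limits performance there.
3. Re-scoring: RelationNet (works well on COCO without NMS but poorly on CrowdHuman, because different predictions come from very close proposals, so their features and relations are also very similar) and part-based detectors.

Proposed method: Multiple Instance Prediction, where one proposal is matched to multiple gt boxes.
1. Instance set prediction: each prediction consists of c (class label with confidence) and l (relative coordinates).
2. EMD loss (K = 2 in the experiments).
3. Set NMS: before one box suppresses another, check whether the two boxes come from the same proposal; if yes, skip the suppression.
4. Refinement module: because one proposal is matched to multiple gt boxes, there are more predictions and therefore a higher risk of producing FP; the RM targets these.
5. Discussion: relation to previous methods:
(1) The double-person detector models person pairs in the DPM.
(2) MultiBox predicts all instances in an image patch; YOLO v1/v2 predict all instances centered at a certain location. Neither is proposal-based.
(3) https://arxiv.org/pdf/1506.04878.pdf uses an LSTM to decode the instance boxes in each grid of the image. Similar to the EMD loss, it uses a Hungarian Loss for multiple instance supervision, and its post-processing merges the predictions produced by adjacent grids. The method does not use proposals, so it has difficulty detecting objects of various sizes/shapes (pedestrians or general objects), and the LSTM is complex and hard to integrate into a detection framework.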
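The Set NMS rule described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `iou` and `set_nms`, the `(x1, y1, x2, y2)` box format, and the `proposal_ids` argument (one integer per box naming the proposal that produced it) are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def set_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    proposal_ids: Sequence[int],  # hypothetical: id of the source proposal
    iou_thresh: float = 0.5,
) -> List[int]:
    """Greedy NMS that never suppresses a box from the same proposal.

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    suppressed = [False] * len(boxes)
    keep: List[int] = []
    for pos, i in enumerate(order):
        if suppressed[i]:
            continue
        keep.append(i)
        for j in order[pos + 1:]:
            if suppressed[j]:
                continue
            # The only change from ordinary greedy NMS: boxes predicted by
            # the same proposal are assumed to be distinct instances.
            if proposal_ids[i] == proposal_ids[j]:
                continue
            if iou(boxes[i], boxes[j]) > iou_thresh:
                suppressed[j] = True
    return keep
```

Giving every box a distinct proposal id reduces this sketch to ordinary greedy NMS, which makes the effect of the same-proposal check easy to compare on the same inputs.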
Experiments	Evaluation metrics:
1. Average Precision (AP).
2. MR^-2: log-average miss rate over false positives per image (FPPI) in [10^-2, 10^0]; lower is better. It is sensitive to FP, especially high-scoring FP.
3. Jaccard Index (JI): reflects the counting ability of a detector.

Detailed settings: ResNet-50 + FPN + RoIAlign, NMS IoU threshold 0.5.

Experiments on CrowdHuman

Main results and ablation study:
1. Without the RM, AP and JI both improve substantially, which means more true positives are detected; MR^-2 also improves, which means no additional FP are introduced.
2. With the RM added, AP and JI improve slightly and MR^-2 improves more, which suggests the RM reduces FP.

Comparisons with various NMS strategies:
1. Raising the NMS threshold (0.5 -> 0.6) increases recall and AP, but MR^-2 gets worse because more FP are kept.
2. Soft-NMS: AP increases; JI and MR^-2 are unchanged.

Comparisons with previous works: GossipNet and RelationNet, which are representative works categorized into advanced NMS and re-scoring approaches respectively.

Analysis on recalls

Experiments on CityPersons

Qualitative results

Experiments on COCO: COCO contains little crowdedness, so the COCO results are used to check two points:
1) whether our method generalizes well to multi-class detection problems;
2) whether the proposed approach is robust to different crowdedness, especially to isolated instances.
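To make the MR^-2 metric concrete, here is a minimal sketch of a log-average miss rate over FPPI in [10^-2, 10^0]. It follows the common pedestrian-detection convention of sampling nine reference points evenly spaced in log space and taking their geometric mean; that convention, the function name `log_average_miss_rate`, the clamp value `1e-10`, and the assumption that detections have already been matched to the gt (the `is_tp` flags) are choices made for this example, not details stated in the note.

```python
import math
from typing import List, Sequence


def log_average_miss_rate(
    scores: Sequence[float],
    is_tp: Sequence[bool],   # assumed: detections already matched to gt
    num_gt: int,
    num_images: int,
    num_refs: int = 9,       # assumed: nine reference FPPI points
) -> float:
    """Geometric mean of the miss rate sampled at FPPI in [1e-2, 1e0]."""
    # Sweep the score threshold from high to low, tracking FPPI and miss rate.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fppi: List[float] = []
    miss: List[float] = []
    tp = fp = 0
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        fppi.append(fp / num_images)
        miss.append(1.0 - tp / num_gt)

    # Reference FPPI values evenly spaced in log space over [1e-2, 1e0].
    refs = [10 ** (-2 + 2 * k / (num_refs - 1)) for k in range(num_refs)]

    sampled: List[float] = []
    for r in refs:
        m = 1.0  # if no operating point has FPPI <= r, everything is missed
        for f, mr in zip(fppi, miss):
            if f > r:
                break  # fppi is non-decreasing, so later points are larger
            m = mr
        sampled.append(max(m, 1e-10))  # clamp to keep log() finite

    return math.exp(sum(math.log(m) for m in sampled) / len(sampled))
```

Because the reference points concentrate on the low-FPPI end of the curve, a single high-scoring false positive shifts several sampled miss rates at once, which matches the note's remark that the metric is especially sensitive to high-scoring FP.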
Thoughts