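This is one of the few re-ID papers accepted to CVPR 2019 so far. It comes from Sun Yat-sen University and a Tencent lab, and the code has been released (https://github.com/KovenYu/MAR).
The idea is to bring in an auxiliary dataset to mine the latent label information of unlabeled samples, which effectively improves unsupervised re-ID performance.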
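1. What problem does the paper solve, and what solution does it propose?
1) In unsupervised re-ID there are no pairwise-labeled images across cameras, so it is hard to learn discriminative information.
Solution to problem 1: the paper proposes soft multilabel learning for unsupervised re-ID. The main idea is to compare each unlabeled person with the persons in a reference (auxiliary) dataset, and thereby learn a soft multilabel for every unlabeled person.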
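2) How does soft multilabel learning mine the latent label information in an unlabeled re-ID dataset?
Solution to problem 2: the paper proposes multilabel reference learning (MAR), in which three components, soft multilabel-guided hard negative mining, cross-view consistent soft multilabel learning, and reference agent learning, work together to mine the hidden label information.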
soft multilabel-guided hard negative mining
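The paper uses the soft multilabel to tell apart persons who look alike but are in fact different. Because the soft multilabel essentially encodes a sample's comparative characteristics, it describes a person by more than visual appearance alone. Two images of the same person should not only be visually similar but should also have similar relative comparative characteristics with respect to the reference persons. If an image pair is visually similar but dissimilar in its comparative characteristics, it is probably a negative pair.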
cross-view consistent soft multilabel learning
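In re-ID most image pairs are cross-view, that is, the two images of a pair are captured by different cameras. The paper therefore proposes cross-view consistent soft multilabel learning, so that the learned soft multilabels stay consistent across cameras.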
reference agent learning
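Because target-dataset samples have to be compared against auxiliary-dataset samples on a large scale, the paper introduces reference agents: each reference person is represented by one learned agent, so a target sample is compared with the agents rather than with every reference image.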
2. How is soft multilabel learning implemented?
It consists of three main parts:
- soft multilabel-guided hard negative mining
- cross-view consistent soft multilabel learning
- reference agent learning
How they work together is shown in the figure.
MAR framework
Problem setting
Our goal is to learn a soft multilabel function l(x, z) and a feature embedding f(x).
soft multilabel-guided hard negative mining
Soft multilabel function
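A minimal sketch of the soft multilabel function, assuming it is a softmax over the inner products between a feature f(x) and one agent vector per reference person. The function name, the toy agents, and the absence of any temperature or scale factor are assumptions made for illustration, not details taken from the released code.

```python
import math


def soft_multilabel(feature, agents):
    """Softmax over inner products between one feature and every reference agent.

    Assumption: features and agents are plain lists of floats of equal length.
    """
    logits = [sum(f * a for f, a in zip(feature, agent)) for agent in agents]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: three hypothetical reference agents in a 2-D feature space.
agents = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
y = soft_multilabel([0.8, 0.6], agents)
print(y)  # one likelihood per reference person, summing to 1
```

Each entry of the result can be read as how much the unlabeled person resembles the corresponding reference person.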
Assumption 1
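The paper makes an assumption here: a pair of unlabeled person images with high feature similarity is called a similar pair. If a similar pair also has highly similar comparative characteristics, it is probably a positive pair; otherwise it is probably a hard negative pair.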
The paper proposes the soft multilabel agreement to measure how similar the comparative characteristics of a sample pair are.
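One way to picture the agreement, assuming it is the sum of element-wise minima of two soft multilabels. When both vectors sum to 1, this equals 1 minus half of their L1 distance. The function name and the toy vectors are illustrative only.

```python
def soft_multilabel_agreement(y_i, y_j):
    """Sum of element-wise minima of two soft multilabels (assumed definition)."""
    return sum(min(a, b) for a, b in zip(y_i, y_j))


y_a = [0.7, 0.2, 0.1]
y_b = [0.1, 0.2, 0.7]

agreement_ab = soft_multilabel_agreement(y_a, y_b)
# Equivalent form for vectors that each sum to 1: 1 - ||y_a - y_b||_1 / 2
l1_form = 1.0 - sum(abs(a - b) for a, b in zip(y_a, y_b)) / 2.0
print(agreement_ab, l1_form)
```

Identical soft multilabels give an agreement of 1, and soft multilabels with disjoint support give 0.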
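Define the mining ratio as p, so that the pM pairs with the highest feature similarity are taken as the similar pairs (M being the total number of pairs).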
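The mining step can be sketched roughly as follows, assuming feature similarity is an inner product and that similar pairs are split into positives and hard negatives by a fixed agreement threshold. The `agreement_threshold` argument, the helper names, and the toy data are hypothetical; the paper may choose the threshold differently.

```python
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def agreement(y_i, y_j):
    return sum(min(a, b) for a, b in zip(y_i, y_j))


def mine_pairs(features, multilabels, p, agreement_threshold):
    """Keep the top p fraction of pairs by feature similarity, then split them
    into likely positives and likely hard negatives by soft multilabel agreement."""
    pairs = list(combinations(range(len(features)), 2))
    pairs.sort(key=lambda ij: dot(features[ij[0]], features[ij[1]]), reverse=True)
    n_similar = max(1, round(p * len(pairs)))  # the pM most similar pairs
    positives, hard_negatives = [], []
    for i, j in pairs[:n_similar]:
        if agreement(multilabels[i], multilabels[j]) >= agreement_threshold:
            positives.append((i, j))
        else:
            hard_negatives.append((i, j))
    return positives, hard_negatives


# Toy data: images 0, 1, 2 look alike, but image 2 has a different soft multilabel.
features = [[1.0, 0.0], [0.99, 0.14], [0.98, -0.2], [0.0, 1.0]]
multilabels = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.1, 0.8], [0.3, 0.4, 0.3]]
positives, hard_negatives = mine_pairs(features, multilabels, p=0.5, agreement_threshold=0.5)
print(positives, hard_negatives)
```

In the toy data, image 2 is visually close to images 0 and 1 but compares very differently against the reference persons, so its pairs end up among the hard negatives.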
This leads to the soft Multilabel-guided Discriminative embedding Learning (MDL) loss.
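A rough sketch of such a loss, assuming it has the contrastive form -log(P / (P + N)), where P and N are the mean values of exp(-squared distance) over the mined positive pairs and hard negative pairs respectively. The exact form should be checked against the paper; the function names and toy features are illustrative.

```python
import math


def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mdl_loss(features, positives, hard_negatives):
    """Assumed form: -log(P / (P + N)), with P and N the mean exp(-squared distance)
    over positive pairs and hard negative pairs. Both pair lists must be non-empty."""
    pos = sum(math.exp(-sq_dist(features[i], features[j])) for i, j in positives) / len(positives)
    neg = sum(math.exp(-sq_dist(features[i], features[j])) for i, j in hard_negatives) / len(hard_negatives)
    return -math.log(pos / (pos + neg))


# Good embedding: the positive pair coincides, the hard negative pair is far apart.
good = mdl_loss([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]], [(0, 1)], [(0, 2)])
# Bad embedding: the positive pair is far apart, the hard negative pair coincides.
bad = mdl_loss([[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0]], [(0, 1)], [(0, 2)])
print(good, bad)
```

Minimizing a loss of this shape pulls positive pairs together and pushes hard negative pairs apart in the embedding space.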
cross-view consistent soft multilabel learning
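Motivation: from the viewpoint of data distributions, given the reference and target samples, the distribution of the comparative characteristics should depend only on the distribution of person appearance in the target domain and should be independent of the camera.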
The paper therefore proposes the Cross-view consistent soft Multilabel Learning (CML) loss.
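A hedged sketch of a cross-view consistency penalty, assuming it compares the per-dimension mean and standard deviation of the log soft multilabels in each camera view with those of the whole target set, and sums the squared differences. This moment-matching form, the function names, and the toy data are assumptions for illustration.

```python
import math


def mean_std(vectors):
    """Per-dimension mean and (population) standard deviation."""
    n, dims = len(vectors), len(vectors[0])
    means = [sum(v[k] for v in vectors) / n for k in range(dims)]
    stds = [math.sqrt(sum((v[k] - means[k]) ** 2 for v in vectors) / n) for k in range(dims)]
    return means, stds


def cml_loss(multilabels, camera_ids):
    """Sum over camera views of the squared gaps between the view statistics and the
    overall statistics of the log soft multilabels (entries must be positive)."""
    logs = [[math.log(p) for p in y] for y in multilabels]
    mu, sigma = mean_std(logs)
    loss = 0.0
    for cam in sorted(set(camera_ids)):
        view = [l for l, c in zip(logs, camera_ids) if c == cam]
        mu_v, sigma_v = mean_std(view)
        loss += sum((a - b) ** 2 for a, b in zip(mu_v, mu))
        loss += sum((a - b) ** 2 for a, b in zip(sigma_v, sigma))
    return loss


# Both cameras see the same soft multilabel distribution: the penalty vanishes.
consistent = cml_loss([[0.7, 0.3], [0.4, 0.6], [0.7, 0.3], [0.4, 0.6]], [0, 0, 1, 1])
# Each camera is biased towards a different reference person: the penalty is positive.
biased = cml_loss([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]], [0, 0, 1, 1])
print(consistent, biased)
```

A penalty like this discourages the soft multilabels from encoding which camera took the image.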
reference agent learning
Agent Learning loss
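A minimal sketch of an agent learning loss, assuming it is a cross-entropy that asks the soft multilabel of each labeled reference image to put its mass on that image's own identity. Averaging over images, the function names, and the toy data are assumptions; only the agent learning term named in the heading is covered here.

```python
import math


def soft_multilabel(feature, agents):
    """Softmax over inner products between one feature and every reference agent."""
    logits = [sum(f * a for f, a in zip(feature, agent)) for agent in agents]
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def agent_learning_loss(ref_features, ref_labels, agents):
    """Mean negative log-likelihood of the true reference identity (assumed form)."""
    total = 0.0
    for feature, label in zip(ref_features, ref_labels):
        total += -math.log(soft_multilabel(feature, agents)[label])
    return total / len(ref_features)


agents = [[1.0, 0.0], [0.0, 1.0]]          # one hypothetical agent per reference person
ref_features = [[1.0, 0.0], [0.0, 1.0]]    # features of two labeled reference images
aligned = agent_learning_loss(ref_features, [0, 1], agents)
swapped = agent_learning_loss(ref_features, [1, 0], agents)
print(aligned, swapped)
```

The loss is lower when each agent sits close to the features of its own reference person, which is what makes an agent a usable stand-in for that person.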
Deep soft multilabel reference learning
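Assuming the overall objective is the MDL loss plus weighted CML and RAL terms, the combination can be sketched as below. The two weight names and all numeric values are placeholders chosen for illustration, not the settings used in the paper.

```python
def mar_loss(l_mdl, l_cml, l_ral, lambda_cml, lambda_ral):
    """Weighted sum of the three loss terms (assumed overall objective)."""
    return l_mdl + lambda_cml * l_cml + lambda_ral * l_ral


# Placeholder loss values and weights, for illustration only.
total = mar_loss(l_mdl=0.5, l_cml=2.0, l_ral=1.0, lambda_cml=0.1, lambda_ral=0.25)
print(total)
```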
3. How well does it work?
- Comparison experiments
Comparison with the state of the art
- Ablation study
It demonstrates two points:
- 1 the soft multilabel guidance is effective
- 2 both CML and RAL are indispensable to MAR
Visual results
Changing the number of reference persons affects the experimental results.
Effect of changing lambda