Person Re-identification via Rec

Author: __Vision | Published 2017-03-24 17:44

The fourth post in this series on ECCV 2016 person re-identification papers.

The rough idea of this paper is to aggregate simple features such as color and LBP through an LSTM network into a highly discriminative representation.

Advantages:

First, it allows discriminative information of frame-wise data to propagate along the temporal direction, and discriminative information could be accumulated from the first LSTM node to the deepest one, thus yielding a highly discriminative sequence level human representation.
Second, during feature propagation, this framework can prevent non-informative information from reaching the deep nodes, therefore it is robust to noisy features.
Third, the proposed fusion network is simple yet efficient, which is able to deal with sequences with variable length.
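
To make the three points above concrete, here is a minimal pure-Python sketch of a standard LSTM cell run over a sequence of frame features. It is not the paper's implementation: the weights are random toy values, the helper names (make_params, lstm_step, aggregate) are made up for illustration, and using the last hidden state as the sequence-level feature is an assumption of this sketch.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_params(input_dim, hidden_dim, seed=0):
    """Random toy weights (assumption: untrained) for gates i, f, o and candidate g."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        name: (mat(hidden_dim, input_dim + hidden_dim), [0.0] * hidden_dim)
        for name in "ifog"
    }


def affine(W, b, v):
    """Compute W @ v + b with plain lists."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(params, x, h, c):
    """One standard LSTM step: the gates decide what enters, stays in, and leaves the cell."""
    z = x + h  # concatenate the frame feature and the previous hidden state
    i = [sigmoid(a) for a in affine(*params["i"], z)]    # input gate
    f = [sigmoid(a) for a in affine(*params["f"], z)]    # forget gate
    o = [sigmoid(a) for a in affine(*params["o"], z)]    # output gate
    g = [math.tanh(a) for a in affine(*params["g"], z)]  # candidate cell update
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def aggregate(params, frames, hidden_dim):
    """Run the cell over any number of frames; return the last hidden state (assumption)."""
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for x in frames:
        h, c = lstm_step(params, x, h, c)
    return h
```

Because the same cell is applied once per frame, a 3-frame sequence and a 7-frame sequence both end up as a vector of length hidden_dim, which is what makes variable-length input easy to handle.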

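One strand of traditional methods is metric learning. This paper instead fuses simple features into a deep representation, so a simple metric such as cosine similarity already gives good results. I also think re-ID work should focus more on how to extract highly discriminative features.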
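
As an illustration of how simple the matching step becomes with a cosine metric, here is a small pure-Python sketch that ranks gallery identities against a probe feature. The function names and the toy vectors are hypothetical and only serve the example; real features would be the sequence-level vectors produced by the network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def rank_gallery(probe, gallery):
    """Return (person_id, score) pairs sorted from most to least similar."""
    scores = [(pid, cosine_similarity(probe, feat)) for pid, feat in gallery.items()]
    return sorted(scores, key=lambda s: s[1], reverse=True)


# Toy example: "a" points in the same direction as the probe, so it ranks first.
probe = [1.0, 0.0, 1.0]
gallery = {"a": [2.0, 0.0, 2.0], "b": [0.0, 1.0, 0.0], "c": [1.0, 1.0, 1.0]}
ranking = rank_gallery(probe, gallery)
```

No metric parameters are learned here; all the discriminative power has to come from the features themselves, which is the point made above.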

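The traditional re-ID pipeline usually extracts features with a CNN, or extracts low-level features directly from single frames, feeds them into a metric-learning layer for training, and then uses the resulting model and metric for prediction. Feature extraction in this pipeline considers only spatial information and ignores temporal information, so it is a poor fit for video (multi-shot) input.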

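The features used in this paper are hand-crafted (color, LBP), because a CNN needs a large amount of training data, while existing training sets are fairly small, so a CNN would overfit easily.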

Main idea

Feature extraction

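The input to the LSTM is hand-crafted features. Each image is resized to 128x64, the patch (kernel) size is 16x8, and the overlap is 8 and 4, so one frame yields 15*15 patches. LBP has 256 dimensions; adding 3 each for HSV and Lab gives 262 dimensions in total. Each time step (the paper uses 10 time steps) therefore takes a 262x225-dimensional input, and each time step outputs a 512-dimensional vector. The information at each node is computed with the formulas below (i is the input gate, o the output gate, f the forget gate):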
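
A quick aside before the gate formulas: the dimension arithmetic in this paragraph can be checked with a few lines of Python. The sketch assumes the stride equals the patch size minus the overlap and that no padding is used; the helper name patches_along is made up for illustration.

```python
def patches_along(size, patch, overlap):
    """Number of patches along one axis, assuming stride = patch - overlap and no padding."""
    stride = patch - overlap
    return (size - patch) // stride + 1


rows = patches_along(128, 16, 8)  # image height 128, patch height 16, overlap 8 -> 15
cols = patches_along(64, 8, 4)    # image width 64, patch width 8, overlap 4 -> 15
num_patches = rows * cols         # 225 patches per frame

feat_per_patch = 256 + 3 + 3      # LBP (256) + HSV (3) + Lab (3) = 262
input_dim = feat_per_patch * num_patches  # 262 * 225 = 58950 values per time step

print(rows, cols, num_patches, feat_per_patch, input_dim)
# prints: 15 15 225 262 58950
```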