Paper | Tracking everything in

Author: 与阳光共进早餐 | Published: 2023-12-21 04:09

1 Preface

https://arxiv.org/pdf/2207.12978.pdf
ECCV 2022
Task: large-scale Multiple Object Tracking (MOT)

2 Introduction

MOT task: estimate the trajectories of objects in a video sequence.

Limitation 1: common MOT benchmarks [16,32,11] only consider tracking objects from a few predefined categories, e.g., pedestrians and cars, and existing MOT methods do not perform well on a large number of categories.

Limitation 2: current MOT metrics are not well suited to the multi-category setting and can be refined.

Current MOT models and metrics are mainly designed for single-category multiple-object tracking. When extended to large-scale multi-category MOT, methods simply detect and classify each object, then associate only objects that carry the same label. Association therefore relies heavily on the classification results.

Thus, when classification is inaccurate, as it often is in large-scale multi-category MOT, both existing models and evaluation metrics need to be improved.

This paper:
To expand tracking to a more general scenario, we propose that classification should be disentangled from tracking, in both evaluation and model design, for multi-category MOT.

1) a new metric, Track Every Thing Accuracy (TETA);
2) a new model, Track Every Thing tracker (TETer).

Experiments:
Evaluated on the large-scale multi-category tracking datasets TAO and BDD100K.

3 Tracking-Every-Thing Metric

3.1 Limitations for Large-scale MOT Evaluation

How to handle classification: (1) Simply associating objects that share a label depends on the classification being correct. (2) The most naive alternative, ignoring the classification results altogether, lets the head classes of a long-tailed dataset dominate the evaluation.
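The second point can be seen with a toy calculation. The sketch below is only an illustration, not the paper's evaluation code: the class names, the counts, and the two helper functions are hypothetical. It compares a pooled, class-agnostic score with a per-class average on a long-tailed set of results.

```python
from collections import defaultdict

def pooled_accuracy(results):
    """Class-agnostic score: every object counts equally, whatever its class."""
    correct = sum(1 for _, ok in results if ok)
    return correct / len(results)

def per_class_mean_accuracy(results):
    """Score each class separately, then average, so rare classes
    weigh as much as head classes."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for cls, ok in results:
        totals[cls] += 1
        hits[cls] += int(ok)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Hypothetical toy data: 98 'person' objects tracked correctly,
# 2 'zebra' objects both lost.
results = [("person", True)] * 98 + [("zebra", False)] * 2

pooled = pooled_accuracy(results)            # dominated by the head class
balanced = per_class_mean_accuracy(results)  # exposes the failure on the rare class
print(pooled, balanced)
```

The pooled score stays high even though the rare class fails completely, which is the sense in which head classes dominate a class-agnostic evaluation.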

Incomplete annotations: large-scale datasets are not exhaustively annotated, so how can false positive (FP) predictions be identified and penalized?

3.2 Tracking-Every-Thing Accuracy (TETA)

TETA consists of three parts:

a localization score
an association score
a classification score

Scoring the three parts separately lets each aspect of tracking be evaluated properly.
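The notes above do not give the formula, so the following is a minimal sketch under two assumptions: each sub-score is a Jaccard-style ratio TP / (TP + FP + FN), and the three sub-scores are combined by an unweighted mean. The function names and all numbers are hypothetical; check the paper for the exact definitions.

```python
def jaccard_score(tp, fp, fn):
    """TP / (TP + FP + FN); returns 0.0 when there is nothing to count."""
    denom = tp + fp + fn
    return tp / denom if denom else 0.0

def teta(loc_a, assoc_a, cls_a):
    """Assumed combination: unweighted mean of the three sub-scores."""
    return (loc_a + assoc_a + cls_a) / 3.0

# Hypothetical counts: localization is good, classification is poor.
loc_a = jaccard_score(tp=80, fp=10, fn=10)
cls_a = jaccard_score(tp=30, fp=35, fn=35)
assoc_a = 0.7  # hypothetical association score, given directly

score = teta(loc_a, assoc_a, cls_a)
print(loc_a, assoc_a, cls_a, score)
```

Reporting the three numbers side by side, and not only their mean, shows at a glance which aspect is holding a tracker back.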

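To avoid wrongly penalizing predictions of objects that were simply never annotated, predictions that are not assigned to any cluster (a cluster groups the predictions that overlap a given ground-truth box) are ignored during evaluation.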
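A rough sketch of this idea follows. It assumes that a cluster is formed around each ground-truth box from the predictions whose IoU with it reaches a threshold; the 0.5 threshold, the box format (x1, y1, x2, y2), and the function names are assumptions made for illustration, not the paper's implementation.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def split_predictions(gt_boxes, pred_boxes, thresh=0.5):
    """Assign each prediction to the ground-truth box it overlaps most,
    provided the IoU reaches thresh.

    Returns (clusters, ignored): clusters maps a ground-truth index to the
    prediction indices in its cluster; ignored lists the predictions that
    fall in no cluster and are left out of the evaluation.
    """
    clusters = {g: [] for g in range(len(gt_boxes))}
    ignored = []
    for p, pb in enumerate(pred_boxes):
        best_g, best_iou = None, thresh
        for g, gb in enumerate(gt_boxes):
            overlap = iou(pb, gb)
            if overlap >= best_iou:
                best_g, best_iou = g, overlap
        if best_g is None:
            ignored.append(p)
        else:
            clusters[best_g].append(p)
    return clusters, ignored

# One annotated object, two overlapping predictions, and one prediction
# far away (possibly a real but unannotated object).
gt = [(0, 0, 10, 10)]
preds = [(0, 0, 10, 10), (1, 1, 10, 10), (50, 50, 60, 60)]
clusters, ignored = split_predictions(gt, preds)
print(clusters, ignored)
```

The far-away prediction is neither rewarded nor punished, while both overlapping predictions stay in the cluster and remain subject to scoring.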

4 Tracking-Every-Thing Tracker

Framework: (figure not reproduced here)

4.1 Class-Agnostic Localization

The observation is that the bottleneck of the detection model lies in classification, not in localization.

Thus, the paper first performs class-agnostic localization.

4.2 Associating Everything

Common cues for association: location, appearance, and class.
Location (motion): motion is irregular, so it is unreliable (x)
Class: many objects do not belong to any predefined category (x)
Appearance: objects in different classes usually have different appearances (selected as the main cue)

Instead of using class information as a "hard" prior, the class information is used in a "soft" way through contrastive learning.
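To make the "soft" use of class information concrete, here is a minimal sketch of a supervised-contrastive-style loss for a single query embedding. It is not the loss used in the paper: the function names, the temperature of 0.1, and the toy embeddings are all assumptions. Class labels only decide which exemplars count as positives during training; nothing forces a hard class decision at association time.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def class_contrastive_loss(query, exemplars, labels, query_label, temperature=0.1):
    """Supervised-contrastive-style loss for one query embedding.

    Exemplars sharing the query's class are positives, all others negatives.
    loss = -mean over positives of log(exp(s_pos / t) / sum_all exp(s / t))
    """
    q = normalize(query)
    sims = [dot(q, normalize(e)) / temperature for e in exemplars]
    log_denom = math.log(sum(math.exp(s) for s in sims))
    pos = [s for s, lab in zip(sims, labels) if lab == query_label]
    if not pos:
        return 0.0  # no positive exemplar for this query
    return -sum(s - log_denom for s in pos) / len(pos)

# Hypothetical 2-D embeddings: one 'cat' exemplar and one 'dog' exemplar.
exemplars = [[1.0, 0.0], [0.0, 1.0]]
labels = ["cat", "dog"]

# A 'cat' query close to the cat exemplar gives a small loss;
# a 'cat' query that sits near the dog exemplar gives a large one.
loss_good = class_contrastive_loss([0.9, 0.1], exemplars, labels, "cat")
loss_bad = class_contrastive_loss([0.1, 0.9], exemplars, labels, "cat")
print(loss_good, loss_bad)
```

Minimizing such a loss pulls embeddings of same-class objects together and pushes other classes apart, so class similarity ends up encoded in the embedding space.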

Once the Class Exemplar Matching (CEM) embeddings are learned, association is done by comparing the similarity of objects' class exemplars across frames.
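Similarity-based association can be sketched as follows. This is a generic greedy matcher on cosine similarity, assumed here purely for illustration; the matching rule, the 0.5 threshold, and the function names are not taken from the paper.

```python
import math

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def associate(track_embs, det_embs, min_sim=0.5):
    """Greedy one-to-one matching by descending cosine similarity.

    Returns {detection index: track index}. Detections left unmatched
    would start new tracks in a full tracker.
    """
    pairs = sorted(
        ((cosine(t, d), ti, di)
         for ti, t in enumerate(track_embs)
         for di, d in enumerate(det_embs)),
        reverse=True,
    )
    matches, used_tracks = {}, set()
    for sim, ti, di in pairs:
        if sim < min_sim:
            break  # all remaining pairs are even less similar
        if ti in used_tracks or di in matches:
            continue
        matches[di] = ti
        used_tracks.add(ti)
    return matches

# Hypothetical embeddings: two existing tracks, three new detections.
tracks = [[1.0, 0.0], [0.0, 1.0]]
dets = [[0.1, 0.9], [0.95, 0.05], [-1.0, -1.0]]
matches = associate(tracks, dets)
print(matches)
```

Detection 2 resembles neither track, so it stays unmatched. No class label is compared at any point; only the embedding similarity decides the match.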
