Paper | Detecting Twenty-thousan

Author: 与阳光共进早餐 | Published: 2023-12-13 08:12

Preface

  • Venue: ECCV 2022
  • Model name: Detic
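  • Summary: like the earlier OVD-Net, this paper tackles the open-vocabulary problem from the pretraining angle, but its approach is simpler and blunter: it adds the ImageNet classes directly to training. Strictly speaking this is not true open-vocabulary detection, but it can cover roughly twenty thousand classes.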

1. Introduction:

  1. OD has two subtasks: 1) finding boxes (localization); 2) naming the boxes (classification)

  2. Previous works couple these two subtasks;

  3. However, detection benchmarks are much smaller than classification benchmarks.

As the figure shows, LVIS (OD) has far fewer images and far fewer categories than ImageNet (CLS).

image.png

This paper:

proposes Detic, a detector with image classes, which uses image-level supervision in addition to detection supervision.

  • decouple the localization and classification sub-problems;

  • use image-level labels to train the classifier and broaden the vocabulary of the detector;

illustration:

image.png

standard OD: needs ground-truth boxes and labels;

weakly supervised OD: assigns image-level labels to predicted boxes, which is error-prone;

this paper: assigns image-level labels to the max-size proposal.
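To make the difference between the last two strategies concrete, here is a minimal sketch. The function names, the `(x1, y1, x2, y2)` box format, and the toy numbers are assumptions for illustration, not the paper's code. Prediction-based assignment depends on the current classifier scores, while max-size assignment depends only on box geometry.

```python
from typing import Sequence, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def assign_by_prediction(scores: Sequence[Sequence[float]], label: int) -> int:
    """Weakly supervised style: pick the proposal that the model currently
    scores highest for the image-level label (wrong scores give wrong boxes)."""
    return max(range(len(scores)), key=lambda i: scores[i][label])


def assign_by_max_size(boxes: Sequence[Box]) -> int:
    """Max-size style: pick the largest proposal, ignoring the scores."""
    return max(range(len(boxes)), key=lambda i: box_area(boxes[i]))


# Toy proposals and per-class scores (hypothetical values).
boxes = [(10, 10, 50, 50), (0, 0, 200, 150), (120, 30, 160, 90)]
scores = [[0.1, 0.7], [0.3, 0.2], [0.6, 0.9]]

chosen_by_score = assign_by_prediction(scores, label=1)
chosen_by_size = assign_by_max_size(boxes)
```

In this toy case the two rules pick different proposals, which is the point: the size-based choice does not change when the classifier is still poorly trained.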

2. Method

2.1 preliminary

  • detection dataset D_{det}, with class set C_{det}

  • image classification dataset D_{cls}, with class set C_{cls}

  • testing dataset with class set C_{test}.

  • C_{det}, C_{cls}, and C_{test} may or may not overlap.

traditional OD: C_{test} = C_{det}, D_{cls} = \emptyset

OVD: allows C_{test} \neq C_{det}
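The set relations above can be written out with plain Python sets. This is only a sketch; the class names and helper functions are hypothetical and chosen for illustration.

```python
def is_traditional_setting(c_det: set, c_test: set, num_cls_images: int) -> bool:
    """Traditional OD: test classes equal detection classes and D_cls is empty."""
    return c_test == c_det and num_cls_images == 0


def novel_classes(c_det: set, c_test: set) -> set:
    """Test classes that never appear with box annotations."""
    return c_test - c_det


# Hypothetical class sets; they may or may not overlap.
c_det = {"person", "car"}
c_cls = {"car", "zebra", "kite"}
c_test = {"person", "car", "zebra"}

novel = novel_classes(c_det, c_test)
# Novel test classes that at least have image-level labels in D_cls.
novel_with_image_labels = novel & c_cls
```

Here "zebra" is a test class without any box annotation, so only the image-level labels from D_{cls} can supervise it.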

2.2 Detic

the whole idea is quite simple.

  • use both the detection dataset D_{det} and the classification dataset D_{cls} to train the detection model.
image.png
  1. sample a batch from both D_{det} and D_{cls}.

  2. if the image belongs to D_{det}, the loss is the typical OD loss: RPN loss + box regression loss + classification loss

  3. if the image belongs to D_{cls}, the loss is the max-size loss: the proposal with the largest size is taken as the object region, and the classification loss is calculated on it against the image-level label.

image.png
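The three steps can be sketched as a per-image loss dispatch. This is a hedged sketch in pure Python, not the authors' implementation: the sample dictionary layout, the field names, and the use of binary cross-entropy for the classification term are all assumptions, and the detection losses are passed in as precomputed placeholder numbers.

```python
import math
from typing import Dict, Sequence, Set, Tuple

Box = Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2)


def box_area(box: Box) -> float:
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def bce_with_logits(logits: Sequence[float], positives: Set[int]) -> float:
    """Mean binary cross-entropy over classes (assumed form of the cls loss).
    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|))."""
    total = 0.0
    for c, z in enumerate(logits):
        y = 1.0 if c in positives else 0.0
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def max_size_loss(boxes: Sequence[Box],
                  logits_per_box: Sequence[Sequence[float]],
                  image_labels: Set[int]) -> float:
    """Classification loss on the largest proposal, using image-level labels."""
    idx = max(range(len(boxes)), key=lambda i: box_area(boxes[i]))
    return bce_with_logits(logits_per_box[idx], image_labels)


def image_loss(sample: Dict) -> float:
    """Dispatch on the dataset the image came from (hypothetical sample format)."""
    if sample["source"] == "det":
        # Placeholder values standing in for the usual detector losses.
        return sample["rpn_loss"] + sample["reg_loss"] + sample["cls_loss"]
    return max_size_loss(sample["boxes"], sample["logits"], sample["labels"])


det_sample = {"source": "det", "rpn_loss": 0.5, "reg_loss": 0.25, "cls_loss": 0.25}
cls_sample = {
    "source": "cls",
    "boxes": [(0, 0, 10, 10), (0, 0, 100, 100)],
    "logits": [[5.0, -5.0], [0.0, 0.0]],  # per-proposal class logits
    "labels": {0},                         # image-level label set
}
batch_loss = image_loss(det_sample) + image_loss(cls_sample)
```

For the classification image, only the second (largest) proposal contributes, even though the first proposal has a more confident score for the labelled class.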
