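1 Preface
Please do not repost without permission, thanks~~
I recently wanted to see how long-tailed recognition deals with imbalanced datasets, so I collected and wrote down the material I found most useful.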
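2 Overview
2.1 Problem introduction
Most of the benchmarks we use are class-balanced (every class has the same number of labeled samples), but objects in the real world are far more likely to follow a class-imbalanced distribution: common classes have many samples and rare classes have few. The figure below gives a more intuitive picture.
Long-tailed recognition addresses the recognition problem when the data follows this kind of long-tailed distribution.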
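To make the shape of such a distribution concrete, here is a minimal sketch that generates per-class sample counts decaying exponentially from the head class to the tail class. The function name `long_tailed_counts`, its parameters, and the example numbers are my own assumptions for illustration. Exponential decay is just one common way to build a long-tailed split and is not taken from any specific resource listed in this post.

```python
def long_tailed_counts(n_max, num_classes, imbalance_ratio):
    """Per-class sample counts that decay exponentially from head to tail.

    n_max:           sample count of the most frequent (head) class
    num_classes:     number of classes (assumed >= 2)
    imbalance_ratio: head count divided by tail count (hypothetical knob)
    """
    counts = []
    for i in range(num_classes):
        # frac goes from 0 (head class) to 1 (rarest tail class)
        frac = i / (num_classes - 1)
        counts.append(round(n_max * (1.0 / imbalance_ratio) ** frac))
    return counts


# Illustrative numbers only: 10 classes, head class 100x larger than the tail.
counts = long_tailed_counts(n_max=5000, num_classes=10, imbalance_ratio=100)
print(counts)
```

The printed list starts at the head-class count and shrinks class by class, which is the "long tail" that the methods below try to cope with.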
2.2 Recommended resources
Here are two links that I think are very good:
- A good Chinese blog: https://zhuanlan.zhihu.com/p/158638078
- Long-tailed paper list: https://github.com/zwzhang121/Awesome-of-Long-Tailed-Recognition
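3 Typical paper list
Following the four existing families of methods (re-sampling, re-weighting, transfer learning, others), and taking into account the resources above, citation counts, and whether code has been released, I put together the list below for anyone who needs it~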
3.1 re-sampling
These methods reach balance by changing how often samples are drawn. They split into two sub-categories: under-sampling the head classes and over-sampling the tail classes.
paper: Decoupling Representation and Classifier for Long-Tailed Recognition, ICLR 2020 (300+ stars)
- paper: https://arxiv.org/abs/1910.09217
- code: https://github.com/facebookresearch/classifier-balancing
paper: BBN: Bilateral-Branch Network with Cumulative Learning for Long-Tailed Visual Recognition, CVPR 2020 (300+ stars)
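The two re-sampling flavours described in this subsection can be sketched with the standard library alone. This is a generic illustration and not the sampling code of either paper above. The helper names `class_balanced_weights` and `undersample_head` and the toy label list are my own assumptions.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Over-sampling: weight each sample by 1 / (size of its class),
    so every class has the same total probability of being drawn."""
    class_counts = Counter(labels)
    return [1.0 / class_counts[y] for y in labels]


def undersample_head(labels, rng):
    """Under-sampling: keep only as many samples per class as the
    rarest class has, and return the kept sample indices."""
    n_min = min(Counter(labels).values())
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)
    kept = []
    for idxs in by_class.values():
        kept.extend(rng.sample(idxs, n_min))
    return sorted(kept)


rng = random.Random(0)
labels = [0] * 900 + [1] * 90 + [2] * 10  # toy long-tailed label list

# Over-sample the tail: draw indices with replacement using class-balanced weights.
weights = class_balanced_weights(labels)
drawn = rng.choices(range(len(labels)), weights=weights, k=3000)
drawn_counts = Counter(labels[i] for i in drawn)

# Under-sample the head: every class is cut down to the tail-class size.
kept = undersample_head(labels, rng)
kept_counts = Counter(labels[i] for i in kept)

print(drawn_counts, kept_counts)
```

Over-sampling keeps all the data but repeats tail samples many times, while under-sampling throws away most head samples. That trade-off is one reason the papers above combine re-sampling with other training strategies.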
3.2 re-weighting
These methods act mainly on the classification loss, by weighting the loss.
paper: Class-Balanced Loss Based on Effective Number of Samples, CVPR 2019 (300+ stars)
paper: Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss, NeurIPS 2019 (300+ stars)
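As a rough illustration of loss re-weighting, the sketch below computes per-class weights from the "effective number of samples" idea, E_n = (1 - beta^n) / (1 - beta), and applies them to a plain cross-entropy loss. It is a simplified standard-library version written for this post and not the official implementation. The function names, the choice beta = 0.999, the normalization that makes the weights sum to the number of classes, and the toy logits are all assumptions.

```python
import math


def effective_number_weights(class_counts, beta=0.999):
    """Class weights proportional to 1 / E_n, where
    E_n = (1 - beta**n) / (1 - beta) and n is the class sample count.
    Weights are rescaled to sum to the number of classes (an assumption)."""
    effective_num = [(1.0 - beta ** n) / (1.0 - beta) for n in class_counts]
    raw = [1.0 / e for e in effective_num]
    scale = len(class_counts) / sum(raw)
    return [w * scale for w in raw]


def weighted_cross_entropy(logits, target, class_weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_prob = logits[target] - log_sum_exp
    return -class_weights[target] * log_prob


class_counts = [900, 90, 10]  # toy head / medium / tail classes
weights = effective_number_weights(class_counts, beta=0.999)

logits = [2.0, 0.5, 0.1]  # toy model output that favours the head class
loss_head = weighted_cross_entropy(logits, target=0, class_weights=weights)
loss_tail = weighted_cross_entropy(logits, target=2, class_weights=weights)
print(weights, loss_head, loss_tail)
```

Tail classes receive larger weights, so a mistake on a rare class costs more than the same mistake on a common class.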
3.3 transfer learning
These methods aim to transfer knowledge from the head classes to the tail classes.
paper: Large-Scale Long-Tailed Recognition in an Open World, CVPR 2019 (500+ stars)
paper: Deep Representation Learning on Long-tailed Data: A Learnable Embedding Augmentation Perspective, CVPR 2020
paper: Learning From Multiple Experts: Self-paced Knowledge Distillation for Long-tailed Classification, ECCV 2020
3.4 others
paper: Long-tailed Recognition by Routing Diverse Distribution-Aware Experts, arXiv 2020
paper: ResLT: Residual Learning for Long-tailed Recognition, arXiv 2021
- https://arxiv.org/abs/2101.10633
- https://github.com/FPNAS/ResLT (not yet released as of 2021-02-24)