Learning in the Frequency Domain

Author: Cat丹 | Published: 2020-03-12 16:10

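What happens if a CNN's input is changed from 3 RGB channels to 192 frequency-domain channels (3 color components × 64 DCT coefficients per 8×8 block)? The recent paper "Learning in the Frequency Domain" from Alibaba DAMO Academy puts this idea into practice and reports good results on image classification and instance segmentation.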

Specifically for ImageNet classification with the same input size, the proposed method achieves 1.41% and 0.66% top-1 accuracy improvements on ResNet-50 and MobileNetV2, respectively. Even with half input size, the proposed method still improves the top-1 accuracy on ResNet-50 by 1%. In addition, we observe a 0.8% average precision improvement on Mask R-CNN for instance segmentation on the COCO dataset.

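Key points:

Advantage: the input size is reduced, and less information is lost than in the downsampling performed by the first layer of a conventional CNN
The algorithm flow is as follows:
RGB image -> YCbCr -> DCT -> frequency channel rearrangement -> frequency channel selection -> CNN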

DCT.png
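
The preprocessing flow above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes JPEG-style 8×8 blocks, the JFIF RGB-to-YCbCr weights, no chroma subsampling, and an image whose sides are multiples of 8. The function names (`rgb_to_ycbcr`, `dct_matrix`, `to_frequency_channels`, `select_channels`) are hypothetical, and the fixed `keep` index list stands in for the learned channel selection.

```python
import numpy as np

BLOCK = 8  # assumed JPEG-style block size; 3 components * 8 * 8 = 192 channels


def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB array in [0, 255] to YCbCr (JFIF weights)."""
    m = np.array([[0.299, 0.587, 0.114],
                  [-0.168736, -0.331264, 0.5],
                  [0.5, -0.418688, -0.081312]])
    ycbcr = rgb.astype(np.float64) @ m.T
    ycbcr[..., 1:] += 128.0  # center the two chroma components
    return ycbcr


def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II matrix D, so that D @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d


def to_frequency_channels(rgb):
    """RGB (H, W, 3) -> frequency tensor (H/8, W/8, 192)."""
    h, w, _ = rgb.shape
    assert h % BLOCK == 0 and w % BLOCK == 0, "sides must be multiples of 8"
    ycbcr = rgb_to_ycbcr(rgb)
    d = dct_matrix()
    # Cut into 8x8 blocks: (H/8, 8, W/8, 8, 3) -> (H/8, W/8, 3, 8, 8)
    blocks = ycbcr.reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK, 3)
    blocks = blocks.transpose(0, 2, 4, 1, 3)
    # 2-D DCT of every block: D @ X @ D^T on the last two axes
    coeffs = d @ blocks @ d.T
    # Rearrangement: coefficients of the same frequency become one channel,
    # ordered as component * 64 + row_frequency * 8 + column_frequency
    return coeffs.reshape(h // BLOCK, w // BLOCK, 3 * BLOCK * BLOCK)


def select_channels(freq, keep):
    """Keep only the listed frequency channels (placeholder for learned selection)."""
    return freq[..., keep]


image = np.full((448, 448, 3), 128, dtype=np.uint8)  # flat gray test image
freq = to_frequency_channels(image)
print(freq.shape)  # (56, 56, 192)
```

For a flat image only the DC channel of each component is non-zero, which gives a quick sanity check of the rearrangement: channels 0, 64 and 128 hold the Y, Cb and Cr averages (scaled by 8), and all other channels are approximately zero.
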
The selected frequency channels directly replace the downsampling operations of layer0:

resnet.png
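
To see why the frequency channels can stand in for the downsampling stem, it helps to compare spatial sizes. The sketch below is plain shape arithmetic, not code from the paper. It assumes the usual ResNet stem (7×7 stride-2 convolution with padding 3, followed by a 3×3 stride-2 max-pool with padding 1) and an 8×8 block DCT; the input sizes 224 and 448 are chosen for illustration, and the helper names are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet_stem_output(size):
    """Side length after the assumed stem: 7x7/2 conv, then 3x3/2 max-pool."""
    after_conv = conv_out(size, kernel=7, stride=2, padding=3)
    return conv_out(after_conv, kernel=3, stride=2, padding=1)


def dct_output(size, block=8):
    """Side length of the frequency tensor produced by a block DCT."""
    return size // block


print(resnet_stem_output(224))  # 56
print(dct_output(448))          # 56
print(dct_output(224))          # 28
```

Under these assumptions, a 448×448 image in the frequency domain has the same 56×56 spatial size as a 224×224 image after the stem, so the frequency tensor can be fed to the layers that follow the stem while starting from a larger image.
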
Channel selection is implemented by a small network, and its loss is added directly to the accuracy (classification) loss

select.png
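
A forward-only NumPy sketch of such a selection network is given below. The post only says that a small network does the selection and that its loss is added to the accuracy loss; everything else here is an assumption made for illustration: the gate pools each channel to a scalar, applies a hypothetical per-channel linear layer (`w`, `b`) to produce keep/drop logits, samples a hard decision with Gumbel noise, and penalizes the number of kept channels with a weight `lam`. No training or gradient computation is shown.

```python
import numpy as np


def gate_logits(freq, w, b):
    """Global average pool per channel, then 2 logits (drop, keep) per channel.

    freq: (H, W, C) frequency tensor; w, b: (C, 2) hypothetical gate parameters.
    """
    squeezed = freq.mean(axis=(0, 1))  # (C,)
    return squeezed[:, None] * w + b   # (C, 2)


def sample_gates(logits, rng):
    """Sample hard 0/1 gates by adding Gumbel noise and taking the argmax."""
    u = rng.uniform(low=1e-9, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    return (logits + gumbel).argmax(axis=1)  # 1 = keep channel, 0 = drop


def total_loss(acc_loss, gates, lam):
    """Accuracy loss plus a penalty on the number of selected channels."""
    return acc_loss + lam * gates.sum()


rng = np.random.default_rng(0)
freq = rng.normal(size=(56, 56, 192))
w = rng.normal(size=(192, 2))
b = np.zeros((192, 2))

gates = sample_gates(gate_logits(freq, w, b), rng)
selected = freq * gates  # dropped channels are zeroed out
loss = total_loss(acc_loss=1.0, gates=gates, lam=0.1)
```

A larger `lam` pushes the gate toward keeping fewer channels, trading accuracy against input size. At inference time the sampled gates would typically be replaced by a fixed set of kept channels.
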

Paper and commentary

paper
Commentary


Article link: https://www.haomeiwen.com/subject/jyepjhtx.html
