Learning in the Frequency Domain

Author: Cat丹 | Published: 2020-03-12 16:10

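    What happens if the input to a CNN is changed from 3 RGB channels to 192 frequency-domain channels? The recent paper "Learning in the Frequency Domain" from Alibaba DAMO Academy puts this idea into practice and reports good results on image classification and instance segmentation.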

    Specifically for ImageNet classification with the same input size, the proposed method achieves 1.41% and 0.66% top-1 accuracy improvements on ResNet-50 and MobileNetV2, respectively. Even with half input size, the proposed method still improves the top-1 accuracy on ResNet-50 by 1%. In addition, we observe a 0.8% average precision improvement on Mask R-CNN for instance segmentation on the COCO dataset.

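    Key points:

    • Advantage: the input size is reduced, and less information is lost than in the first-layer downsampling of a conventional CNN

    • The algorithm pipeline is as follows:
      RGB image -> YCbCr -> DCT -> frequency channel rearrangement -> frequency channel selection -> CNN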


      DCT.png
    • The selected frequency channels directly replace the downsampling operations of layer0:


      resnet.png
    • Channel selection is implemented by a small network, whose loss is added directly to the accuracy (acc) loss


      select.png
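
    The preprocessing part of the pipeline above (RGB -> YCbCr -> DCT -> channel rearrangement -> channel selection) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the paper's implementation. It assumes 8x8 blocks, which is consistent with 192 = 3 x 64 channels, the JPEG/JFIF full-range YCbCr conversion, and no chroma subsampling. The function names and the `keep` index list are hypothetical. The learned selection network is not reproduced here: a fixed list of channel indices stands in for it.

```python
import numpy as np

BLOCK = 8  # assumed block size: 3 components x 8*8 coefficients = 192 channels


def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB array in [0, 255] to YCbCr (JPEG/JFIF full range)."""
    m = np.array([[0.299, 0.587, 0.114],
                  [-0.168736, -0.331264, 0.5],
                  [0.5, -0.418688, -0.081312]])
    ycbcr = rgb.astype(np.float64) @ m.T
    ycbcr[..., 1:] += 128.0  # centre the two chroma components
    return ycbcr


def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II basis as an (n, n) matrix; row k holds frequency k."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d


def to_frequency_channels(rgb):
    """Map an (H, W, 3) image to an (H/8, W/8, 192) frequency-domain tensor."""
    h, w, _ = rgb.shape
    assert h % BLOCK == 0 and w % BLOCK == 0, "H and W must be multiples of 8"
    d = dct_matrix()
    ycbcr = rgb_to_ycbcr(rgb)
    # Split into non-overlapping blocks: (H/8, 8, W/8, 8, 3)
    blocks = ycbcr.reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK, 3)
    # 2-D DCT of every block: coeff[u, v] = sum_ij d[u, i] * block[i, j] * d[v, j]
    coeffs = np.einsum("ui,aibjc,vj->abcuv", d, blocks, d)
    # Rearrangement: each (component, u, v) frequency becomes one channel,
    # with channel index = component * 64 + u * 8 + v.
    return coeffs.reshape(h // BLOCK, w // BLOCK, 3 * BLOCK * BLOCK)


def select_channels(freq, keep):
    """Static stand-in for the learned selection: keep only the listed channels."""
    return freq[..., keep]


# Usage: a 448x448 image becomes a 56x56 map with 192 channels.
image = np.random.randint(0, 256, size=(448, 448, 3), dtype=np.uint8)
freq = to_frequency_channels(image)
print(freq.shape)  # (56, 56, 192)

# Hypothetical choice: keep the 8 lowest-index coefficients of each component.
keep = [c * 64 + f for c in range(3) for f in range(8)]
print(select_channels(freq, keep).shape)  # (56, 56, 24)
```

    Each 8x8 block collapses to a single spatial position, so the spatial resolution is already reduced by a factor of 8 before the network sees the data. That is why the early downsampling of the CNN can be dropped.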

    Original paper and commentary


          Article link: https://www.haomeiwen.com/subject/jyepjhtx.html