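What happens if a CNN's input is changed from 3 RGB channels to 192 frequency-domain channels? A new paper from Alibaba DAMO Academy, "Learning in the Frequency Domain", puts this idea into practice and reports good results on image classification and instance segmentation.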
Specifically for ImageNet classification with the same input size, the proposed method achieves 1.41% and 0.66% top-1 accuracy improvements on ResNet-50 and MobileNetV2, respectively. Even with half input size, the proposed method still improves the top-1 accuracy on ResNet-50 by 1%. In addition, we observe a 0.8% average precision improvement on Mask R-CNN for instance segmentation on the COCO dataset.
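Key points:
- Advantage: the input size is reduced, and less information is lost than in the first-layer downsampling of a conventional CNN.
- The pipeline is as follows:
  RGB image -> YCbCr -> DCT -> frequency channel rearrangement -> frequency channel selection -> CNN
  [Figure: DCT.png]
- The selected frequency channels directly replace the downsampling operations in layer0:
  [Figure: resnet.png]
- Channel selection is implemented by a small network, and its loss is added directly to the accuracy (task) loss.
  [Figure: select.png]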
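The preprocessing steps in the pipeline above (RGB -> YCbCr -> DCT -> channel rearrangement) can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's code: the function names are made up here, the color conversion uses the standard JPEG (BT.601 full-range) coefficients, and chroma subsampling, normalization, and channel selection are left out.

```python
import numpy as np

BLOCK = 8  # JPEG-style 8x8 blocks (assumption: same block size as JPEG)

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB array in [0, 255] to YCbCr (JPEG full-range BT.601)."""
    m = np.array([[0.299, 0.587, 0.114],
                  [-0.168736, -0.331264, 0.5],
                  [0.5, -0.418688, -0.081312]])
    ycbcr = rgb.astype(np.float64) @ m.T
    ycbcr[..., 1:] += 128.0  # center the two chroma components
    return ycbcr

def dct_matrix(n=BLOCK):
    """Orthonormal DCT-II matrix D, so that D @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]   # frequency index
    i = np.arange(n)[None, :]   # sample index
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

def to_frequency_channels(rgb):
    """Map an (H, W, 3) RGB image to (H/8, W/8, 192) frequency channels."""
    h, w, _ = rgb.shape
    assert h % BLOCK == 0 and w % BLOCK == 0, "H and W must be multiples of 8"
    d = dct_matrix()
    ycbcr = rgb_to_ycbcr(rgb)
    # Cut into 8x8 blocks: (H/8, 8, W/8, 8, 3) -> (H/8, W/8, 3, 8, 8)
    blocks = ycbcr.reshape(h // BLOCK, BLOCK, w // BLOCK, BLOCK, 3)
    blocks = blocks.transpose(0, 2, 4, 1, 3)
    # 2-D DCT of every block: D @ X @ D^T
    coeffs = d @ blocks @ d.T
    # Rearrangement: each (component, frequency) pair becomes one channel
    return coeffs.reshape(h // BLOCK, w // BLOCK, 3 * BLOCK * BLOCK)

image = np.random.default_rng(0).integers(0, 256, size=(448, 448, 3))
freq = to_frequency_channels(image)
print(freq.shape)
```

Each 8x8 block contributes 64 coefficients per color component, so 3 components give the 192 channels mentioned at the top, and a 448x448 image shrinks to a 56x56 spatial grid.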
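The last point, a small selection network whose loss is added to the task loss, can be illustrated with a forward-only NumPy sketch. Everything here is an assumption made for illustration: the gate is a single linear layer with a sigmoid (a simplification, not necessarily the paper's exact mechanism), the weights `w` and `b` and the penalty weight `lam` are hypothetical, and the task loss is a placeholder number.

```python
import numpy as np

def channel_gate(x, w, b):
    """Return one keep-probability in (0, 1) per frequency channel.

    x: (H, W, C) frequency channels; w: (C, C) and b: (C,) are hypothetical gate weights.
    """
    pooled = x.mean(axis=(0, 1))           # global average pool -> (C,)
    logits = pooled @ w + b
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid

def total_loss(task_loss, probs, lam=0.1):
    """Task loss plus a penalty on the expected number of kept channels."""
    return task_loss + lam * probs.sum()

rng = np.random.default_rng(0)
x = rng.normal(size=(56, 56, 192))             # stand-in for DCT channels
w = rng.normal(scale=0.01, size=(192, 192))    # hypothetical gate weights
b = np.zeros(192)

probs = channel_gate(x, w, b)
mask = probs > 0.5                 # hard keep/drop decision at inference
x_selected = x[..., mask]          # only the kept channels reach the CNN
loss = total_loss(task_loss=1.0, probs=probs)  # 1.0 is a placeholder task loss
```

The penalty term pushes the gate to keep fewer channels, while the task loss pushes it to keep the ones that matter for accuracy; `lam` trades one against the other.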