Paper: Pyramid Attention Network for

Author: chunleiml | Published: 2018-06-29 10:52 | Views: 127

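    Paper: https://arxiv.org/abs/1805.10180
    A pyramid attention model for semantic segmentation, recently published jointly by Face++, Beijing Institute of Technology, and Peking University.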

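    This model is written for 2D networks because it relies on global pooling. (Keras provides GlobalAveragePooling and GlobalMaxPooling layers in 1D, 2D and 3D variants, so the same idea can in principle be carried over to 3D networks.) Of the two pooling types, the authors found that GlobalAveragePooling works better.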
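
    To make the two pooling operations concrete, here is a minimal NumPy sketch of what global average pooling and global max pooling compute on a channels-last tensor. The function names are illustrative and are not Keras APIs; the Keras layers are assumed to reduce over the same spatial axes.

```python
import numpy as np

def global_average_pool(x):
    """Average over height and width of a (batch, height, width, channels) array."""
    return x.mean(axis=(1, 2))

def global_max_pool(x):
    """Maximum over height and width of a (batch, height, width, channels) array."""
    return x.max(axis=(1, 2))

# One image of 2x2 pixels with 2 channels.
x = np.array([[[[1.0, 10.0], [2.0, 20.0]],
               [[3.0, 30.0], [4.0, 40.0]]]])

gap = global_average_pool(x)  # one value per channel: the spatial mean
gmp = global_max_pool(x)      # one value per channel: the spatial maximum
print(gap.shape, gap, gmp)
```

    Either way the spatial dimensions collapse, leaving one number per channel that summarizes the whole feature map.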

    The model consists of two main parts: Feature Pyramid Attention (FPA) and Global Attention Upsample (GAU).

    FPA closely resembles the Spatial Pyramid Pooling module used in DeepLab.
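
    In the implementation given later in this post, the FPA branch downsamples three times with 4x4 max pooling and then upsamples by 4 at each step, concatenating with the feature map of the matching level. The sketch below only traces spatial sizes, assuming Keras' default 'valid' pooling (which floors the division). The helper names are hypothetical.

```python
def fpa_spatial_sizes(size, pool=4, levels=3):
    """Side length of the feature map at each pyramid level, starting from the input."""
    sizes = [size]
    for _ in range(levels):
        sizes.append(sizes[-1] // pool)  # 'valid' pooling floors the result
    return sizes

def fpa_shapes_match(size, pool=4, levels=3):
    """True if upsampling each level by `pool` lands exactly on the level above it."""
    sizes = fpa_spatial_sizes(size, pool, levels)
    return sizes[-1] > 0 and all(
        sizes[i + 1] * pool == sizes[i] for i in range(levels)
    )

ok_128 = fpa_shapes_match(128)   # 128 -> 32 -> 8 -> 2, every level divides evenly
ok_100 = fpa_shapes_match(100)   # 100 -> 25 -> 6 -> 1, 6 * 4 != 25
print(fpa_spatial_sizes(128), ok_128, ok_100)
```

    Under these assumptions the input height and width need to be multiples of 64, otherwise the concatenations in the upsampling path would see mismatched shapes.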


    FPA.png

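    The Global Attention Upsample (GAU) module applies a 3×3 convolution to the low-level features to reduce the number of channels in the CNN feature maps. The global context generated from the high-level features passes through a 1×1 convolution, batch normalization, and a non-linearity in turn, and is then multiplied with the low-level features. Finally, the high-level features are added to the weighted low-level features, and the result is upsampled step by step.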
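
    The data flow described above can be sketched in NumPy. This is a simplified illustration, not the paper's implementation: the 3×3 convolution on the low-level features is assumed to have been applied already, batch normalization is omitted, ReLU is assumed as the non-linearity, and nearest-neighbour repetition stands in for upsampling. The function name, the `w_1x1` weights and the `scale` parameter are hypothetical.

```python
import numpy as np

def global_attention_upsample(low, high, w_1x1, scale):
    """low:   (H, W, C) low-level features, already reduced to C channels.
    high:  (h, w, C) high-level features, with H == h * scale and W == w * scale.
    w_1x1: (C, C) weights of the 1x1 convolution applied to the pooled vector."""
    context = high.mean(axis=(0, 1))             # global average pooling -> (C,)
    weights = np.maximum(context @ w_1x1, 0.0)   # 1x1 conv followed by ReLU
    weighted_low = low * weights                 # broadcast over H and W
    # Nearest-neighbour upsampling of the high-level features to (H, W, C).
    high_up = high.repeat(scale, axis=0).repeat(scale, axis=1)
    return weighted_low + high_up

low = np.ones((4, 4, 2))
high = np.array([[[1.0, 2.0], [3.0, 4.0]],
                 [[5.0, 6.0], [7.0, 8.0]]])
out = global_attention_upsample(low, high, np.eye(2), scale=2)
print(out.shape, out[0, 0], out[3, 3])
```

    Here the pooled context is [4, 5], so every low-level pixel is scaled per channel by those values before the upsampled high-level features are added.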


    GAU.png
    The overall architecture combines the Feature Pyramid Attention (FPA) module and the Global Attention Upsample (GAU) module.
    FAN.png

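    The authors summarize the roles of the two modules as follows: the FPA module provides pixel-level attention and enlarges the receptive field through its pyramid structure, while the GAU module uses high-level feature maps to guide the low-level features in recovering pixel localization.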
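
    The implementation in this post enlarges the receptive field with dilated 3×3 convolutions rather than larger kernels. As a quick check of how far each one reaches, the standard formula for the effective size of a dilated kernel is k + (k - 1)(d - 1). The helper name below is illustrative.

```python
def effective_kernel_size(kernel_size, dilation_rate):
    """Span, in input pixels, covered by a dilated convolution kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation_rate - 1)

# Dilation rates that appear in the code of this post, all with 3x3 kernels.
rates = [1, 2, 3, 4, 6]
sizes = {d: effective_kernel_size(3, d) for d in rates}
print(sizes)
```

    Note that the variable names in the code (conv3, conv5, conv7, conv9) do not always equal these spans: a 3×3 kernel with dilation 4 covers 9 pixels, and with dilation 6 it covers 13.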

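    The experimental results show that the proposed method achieved state-of-the-art performance on the PASCAL VOC 2012 semantic segmentation task at the time of publication.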

    Code implementation:

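        # Imports assume Keras 2.x (multi_gpu_model is not available in later releases).
        from keras.layers import (Input, Conv2D, MaxPooling2D, UpSampling2D,
                                  GlobalAveragePooling2D, Multiply, concatenate)
        from keras.models import Model
        from keras.optimizers import Adam
        from keras.utils import multi_gpu_model

        # The functions below are methods of a model class whose definition
        # (including img_rows, img_cols and dice_coef_loss) is not shown in this post.
        def Inception_dilation(self, inputs, f):
            # Four parallel 3x3 convolutions with dilation rates 1, 2, 4 and 6,
            # concatenated along the channel axis (4 * f output channels).
            conv_args = dict(padding='same', activation='selu', kernel_initializer='he_normal')
            conv3 = Conv2D(f, (3, 3), **conv_args)(inputs)
            conv5 = Conv2D(f, (3, 3), dilation_rate=(2, 2), **conv_args)(inputs)
            conv7 = Conv2D(f, (3, 3), dilation_rate=(4, 4), **conv_args)(inputs)
            conv9 = Conv2D(f, (3, 3), dilation_rate=(6, 6), **conv_args)(inputs)
            return concatenate([conv3, conv5, conv7, conv9], axis=3)
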
        def FeaturePyramidAttention(self, inputs, f):
            # f: number of channels per branch; the output has 3 * f channels.
            # This is the post's own variant: dilated 3x3 convolutions merged by concatenation.
            # Three 4x4 poolings mean the input height and width must be multiples of 64.
            conv_args = dict(padding='same', activation='selu', kernel_initializer='he_normal')

            # 1x1 branch at full resolution
            conv1 = Conv2D(f, (1, 1), **conv_args)(inputs)

            # Downsampling path of the pyramid
            conv7 = Conv2D(f, (3, 3), dilation_rate=(4, 4), **conv_args)(inputs)
            pool1 = MaxPooling2D(pool_size=(4, 4))(conv7)
            conv5 = Conv2D(f, (3, 3), dilation_rate=(3, 3), **conv_args)(pool1)
            pool2 = MaxPooling2D(pool_size=(4, 4))(conv5)
            conv3 = Conv2D(f, (3, 3), dilation_rate=(2, 2), **conv_args)(pool2)
            pool3 = MaxPooling2D(pool_size=(4, 4))(conv3)
            conv2 = Conv2D(f, (3, 3), **conv_args)(pool3)

            # Upsampling path: upsample, reduce to f channels, merge with the matching level
            up1 = UpSampling2D(size=(4, 4))(conv2)
            up1 = Conv2D(f, (1, 1), **conv_args)(up1)
            up1 = concatenate([up1, conv3], axis=3)

            up2 = UpSampling2D(size=(4, 4))(up1)
            up2 = Conv2D(f, (1, 1), **conv_args)(up2)
            up2 = concatenate([up2, conv5], axis=3)

            up3 = UpSampling2D(size=(4, 4))(up2)
            up3 = Conv2D(f, (1, 1), **conv_args)(up3)
            up3 = concatenate([up3, conv7], axis=3)
            return concatenate([up3, conv1], axis=3)
        
        def GlobalAttentionUpsample(self, inputs_low, inputs_high, f):
            # inputs_low: low-level feature input
            # inputs_high: high-level feature input; it must have 3 * f channels
            conv3 = Conv2D(f * 3, (3, 3), padding='same', activation='selu', kernel_initializer='he_normal')(inputs_low)
            # Global context vector of shape (batch, channels)
            gap = GlobalAveragePooling2D()(inputs_high)
            # The 1x1 convolution on the pooled vector is left disabled, as in the original code:
            # conv1 = Conv2D(f*4, (1, 1), padding='same', activation='selu', kernel_initializer='he_normal')(gap)
            # Channel-wise weighting: gap is broadcast over the spatial axes of conv3
            conv1conv3 = Multiply()([gap, conv3])
            # This variant concatenates with the high-level features instead of adding them
            return concatenate([conv1conv3, inputs_high], axis=3)

        def PAN(self):
            # Single-channel input image
            inputs = Input((self.img_rows, self.img_cols, 1))

            # Backbone: dilated Inception blocks with concatenated skip connections.
            # No pooling is used here, so all feature maps keep the input resolution.
            conv1 = self.Inception_dilation(inputs, 4)
            res1 = concatenate([inputs, conv1], axis=3)
            conv2 = self.Inception_dilation(res1, 4)
            conv2 = self.Inception_dilation(conv2, 4)
            res2 = concatenate([res1, conv2], axis=3)
            conv3 = self.Inception_dilation(res2, 4)
            conv3 = self.Inception_dilation(conv3, 4)
            res3 = concatenate([res2, conv3], axis=3)
            conv4 = self.Inception_dilation(res3, 4)
            conv4 = self.Inception_dilation(conv4, 4)

            # Decoder: FPA on the deepest features, then three GAU stages.
            # The last argument of each GAU call is one third of the channel count
            # of its high-level input (12, 36 and 108 channels respectively).
            FPA = self.FeaturePyramidAttention(conv4, 4)
            GAU1 = self.GlobalAttentionUpsample(conv3, FPA, 4)
            GF1 = concatenate([FPA, GAU1], axis=3)

            GAU2 = self.GlobalAttentionUpsample(conv2, GF1, 12)
            GF2 = concatenate([GF1, GAU2], axis=3)

            GAU3 = self.GlobalAttentionUpsample(conv1, GF2, 36)
            GF3 = concatenate([GF2, GAU3], axis=3)

            conv8 = Conv2D(4, (1, 1), padding='same', activation='selu', kernel_initializer='he_normal')(GF3)
            # Sigmoid output: a one-channel probability mask
            conv9 = Conv2D(1, 1, activation='sigmoid')(conv8)

            model = Model(inputs=inputs, outputs=conv9)
            # Replicate the model on 2 GPUs for training
            parallel_model = multi_gpu_model(model, gpus=2)
            parallel_model.compile(optimizer=Adam(lr=0.001), loss=self.dice_coef_loss, metrics=['accuracy'])
            # Save the architecture of the single-device template model
            with open('seg_liver2D_pan.json', 'w') as json_file:
                json_file.write(model.to_json())
            return parallel_model
    

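    Note: recent Keras versions no longer provide the merge(...) helper; use concatenate for channel-wise concatenation instead.
    For example:

    up2 = merge([up2, conv5], mode='concat', concat_axis=3)
    # becomes
    up2 = concatenate([up2, conv5], axis=3)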
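
    Because every merge in the code above is a channel concatenation, the channel count grows at each decoder stage, which is why PAN passes 4, 12 and 36 as the last argument of GlobalAttentionUpsample. The following sketch only does the bookkeeping, under the assumption that the code behaves as written in this post. The helper names are hypothetical.

```python
def fpa_channels(f):
    """FPA output: up3 (f) + conv7 (f) + conv1 (f) channels."""
    return 3 * f

def gau_channels(high_channels, f):
    """GAU output: weighted low-level features (3 * f) concatenated with the high-level input."""
    low_channels = 3 * f
    if low_channels != high_channels:
        raise ValueError("Multiply needs 3 * f to equal the high-level channel count")
    return low_channels + high_channels

def trace_decoder(f, stages=3):
    """Channel counts of FPA, GF1, GF2, GF3 and the f argument used at each GAU stage."""
    high = fpa_channels(f)
    feature_channels, gau_args = [high], []
    for _ in range(stages):
        gau_f = high // 3
        gau_args.append(gau_f)
        high = high + gau_channels(high, gau_f)  # GF = concat(high, GAU)
        feature_channels.append(high)
    return feature_channels, gau_args

feature_channels, gau_args = trace_decoder(4)
print(feature_channels, gau_args)
```

    The merged features triple in width at every stage, so the final 1×1 convolution in PAN has to reduce a fairly wide tensor back down to 4 channels.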
    
