IR_DHDN_keras: Implementing DHDN from Scratch (Building the Architecture)

Author: zestloveheart | Published: 2019-07-09 14:41

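    Introduction

    Code (git repository) | Analysis of the method

    About this work

    I have recently been working on deep-learning-based image denoising and skimmed the NTIRE 2019 challenge. I read one paper in detail, Densely Connected Hierarchical Network for Image Denoising; more papers can be downloaded from CVF Open Access: CVPR2019.
    The previous article analyses how the paper works. This one shows how to build a denoising model from scratch with keras (tensorflow). It covers data loading, model construction, and evaluation.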

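    Overview of the approach

    The easiest way to implement a model is to find runnable code on GitHub that already loads and preprocesses a dataset and can train and test, then swap in your own data source and model structure. That needs the fewest changes and is the most efficient route. In practice it often fails because the code is written in a different language or framework, or because the published code is too complex to modify directly, so the model has to be implemented from scratch.
    Implementing a network involves these steps: fix the dataset format and write the data loading code; preprocess the data for the task, here by adding noise and augmenting; read the paper closely to understand the model structure; and write the validation code for the chosen evaluation metrics.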

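    Implementing the DHDN structure

    DHDN has a tidy structure. Seen as a whole it is an extension of UNet with a contracting path and an expanding path, and each path is made up of several levels. The paper names three basic modules: DCR, downsampling, and upsampling. Each level consists of several DCR blocks, and levels are connected by downsampling and upsampling.
    After this analysis the task is clear: four basic structures (DCR, level, downsample, upsample) and one backbone network (contracting path plus expanding path).
    Each basic module is written in a decorator-like style, as a function that returns an inner wrapper function, so that it is called in the same way as a layer in the keras functional API.
    Wrapping each module in tf.name_scope groups its ops together in TensorBoard, which makes the visualisation easier to read. The resulting structure diagram from TensorBoard is shown first.


    [Figure: DHDN structure in TensorBoard (DHDN结构图tensorboard.png)]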

    Imports

    from keras.layers import Input,PReLU,Conv2D,Add,Concatenate
    from keras.layers import MaxPooling2D,UpSampling2D
    from keras.models import Model
    from keras import backend as K
    import keras 
    import tensorflow as tf
    

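    Defining the counter

    To make the structure clearer in TensorBoard, a dictionary records how many times each kind of block has been created, and that count is used as the numeric suffix of the block's name.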

    
    def init_name_counter():
        name_counter = {}
        name_counter['DCR'] = 0
        name_counter['level'] = 0
        name_counter['down'] = 0
        name_counter['up'] = 0
        return name_counter
    name_counter = init_name_counter()
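
    The same bookkeeping can be sketched more compactly. This is only an illustration and not the code used in the rest of the article: make_namer is a hypothetical helper that hands out 'DCR0', 'DCR1', ... and advances the count in one step.

```python
from collections import defaultdict


def make_namer():
    """Return a function that yields 'kind0', 'kind1', ... per block kind."""
    counter = defaultdict(int)  # unseen kinds start at 0

    def next_name(kind):
        name = kind + str(counter[kind])
        counter[kind] += 1
        return name

    return next_name


namer = make_namer()
print(namer('DCR'), namer('DCR'), namer('up'))  # DCR0 DCR1 up0
```

    A closure keeps the counter private, so building a second model with a fresh namer restarts the numbering at 0.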
    

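    DCR block

    DCR can be implemented along the lines of the dense block in DenseNet, with the growth rate set to half of the original number of filters, followed by a shortcut connection.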

    def DCR(filter):
        def wrapper(inputs):
            with tf.name_scope('DCR'+str(name_counter['DCR'])):
                name_counter['DCR']+=1
                origin_input = inputs
                # dense part: two 3x3 convs with growth rate filter//2,
                # each output is concatenated onto the running input
                for _ in range(2):
                    x = Conv2D(filter//2,3,padding='same')(inputs)
                    x = PReLU(shared_axes=[1, 2])(x)
                    inputs = Concatenate()([inputs,x])
                # map back to `filter` channels so the residual Add is valid
                x = Conv2D(filter,3,padding='same')(inputs)
                x = PReLU(shared_axes=[1, 2])(x)
                # shortcut connection from the block input
                x = Add()([origin_input,x])
                x = PReLU(shared_axes=[1, 2])(x)
            return x
        return wrapper
    

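    The structure of DCR0 is shown below. The block input feeds the convs and concats of the dense part as well as the final residual add, and the block output goes to the next add.


    [Figure: DCR]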
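
    To make the channel bookkeeping of the DCR block explicit, here is a small sketch in plain Python (no keras needed). dcr_channels is a hypothetical helper that only mirrors the DCR code above: two concatenations that each add filter//2 channels, then a convolution back to filter channels so that the residual Add sees matching shapes.

```python
def dcr_channels(filters, growth_steps=2):
    """Trace the channel count after each step of one DCR block."""
    growth = filters // 2
    channels = [filters]                      # block input
    for _ in range(growth_steps):
        # Concatenate()([inputs, x]) adds `growth` channels
        channels.append(channels[-1] + growth)
    channels.append(filters)                  # final 3x3 conv, ready for Add
    return channels


print(dcr_channels(128))  # [128, 192, 256, 128]
```

    The last entry equals the first, which is why DCR(filter) can only be applied to an input that already has filter channels.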

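    Level structure

    Each level consists of several DCR blocks. Its input is the feature map that comes out of a downsampling (or upsampling) step, and its output is the feature map that goes into the next downsampling (or upsampling) step.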

    def level_block(filter):
        def wrapper(inputs):
            with tf.name_scope('level'+str(name_counter['level'])):
                # two DCR blocks, each wrapped in an extra residual Add
                for _ in range(2):
                    x = DCR(filter)(inputs)
                    inputs = Add()([inputs,x])
                name_counter['level']+=1
            return inputs
        return wrapper
    

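    The structure of level0 is shown below. The input feeds both the DCR and the add. The output goes two ways: one copy is downsampled and passed to the next level, the other is concatenated in the last level.


    [Figure: level_block]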

    Downsampling

    Downsampling uses ordinary max pooling followed by a convolution.

    def downsampling_block(filter,factor=2):
        def wrapper(inputs):
            with tf.name_scope('down'+str(name_counter['down'])):
                # pool_size and strides are both `factor`, so H and W shrink by `factor`
                x = MaxPooling2D(factor,factor)(inputs)
                # the conv sets the channel count for the next level
                x = Conv2D(filter,3,padding='same')(x)
                x = PReLU(shared_axes=[1, 2])(x)
                name_counter['down']+=1
            return x
        return wrapper
    

    The structure is shown below.


    [Figure: downsample.png]

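    Upsampling

    Upsampling uses sub-pixel interpolation; see the paper Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network for details. In short, it uses tf.depth_to_space() to move information from the channel dimension into the spatial dimensions, which reduces the number of channels and increases the spatial size.
    A sub-pixel implementation comes first, taken from the git repo fengwang/subpixel_conv2d, followed by the implementation of the upsampling block.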

    from keras.layers import Layer
    import tensorflow as tf
    
    # from keras.utils.generic_utils import get_custom_objects
    
    class SubpixelConv2D(Layer):
        """ Subpixel Conv2D Layer
    
        upsampling a layer from (h, w, c) to (h*r, w*r, c/(r*r)),
        where r is the scaling factor, default to 4
    
        # Arguments
        upsampling_factor: the scaling factor
    
        # Input shape
            Arbitrary. Use the keyword argument `input_shape`
            (tuple of integers, does not include the samples axis)
            when using this layer as the first layer in a model.
    
        # Output shape
            the second and the third dimension are increased by a factor of
            `upsampling_factor`; the last dimension is decreased by a factor of
            `upsampling_factor^2`.
    
        # References
            Real-Time Single Image and Video Super-Resolution Using an Efficient
            Sub-Pixel Convolutional Neural Network Shi et Al. https://arxiv.org/abs/1609.05158
        """
    
        def __init__(self, upsampling_factor=4, **kwargs):
            super(SubpixelConv2D, self).__init__(**kwargs)
            self.upsampling_factor = upsampling_factor
    
        def build(self, input_shape):
            last_dim = input_shape[-1]
            factor = self.upsampling_factor * self.upsampling_factor
            if last_dim % (factor) != 0:
                raise ValueError('Channel ' + str(last_dim) + ' should be of '
                                 'integer times of upsampling_factor^2: ' +
                                 str(factor) + '.')
    
        def call(self, inputs, **kwargs):
            return tf.depth_to_space( inputs, self.upsampling_factor )
    
        def get_config(self):
            config = { 'upsampling_factor': self.upsampling_factor, }
            base_config = super(SubpixelConv2D, self).get_config()
            return dict(list(base_config.items()) + list(config.items()))
    
        def compute_output_shape(self, input_shape):
            factor = self.upsampling_factor * self.upsampling_factor
            input_shape_1 = None
            if input_shape[1] is not None:
                input_shape_1 = input_shape[1] * self.upsampling_factor
            input_shape_2 = None
            if input_shape[2] is not None:
                input_shape_2 = input_shape[2] * self.upsampling_factor
            dims = [ input_shape[0],
                     input_shape_1,
                     input_shape_2,
                     int(input_shape[3]/factor)
                   ]
            return tuple( dims )
    
    # get_custom_objects().update({'SubpixelConv2D': SubpixelConv2D})
    
    if __name__ == '__main__':
        from keras.layers import Input
        from keras.models import Model, load_model
        ip = Input(shape=(32, 32, 16))
        x = SubpixelConv2D(upsampling_factor=4)(ip)
        model = Model(ip, x)
        model.summary()
        # model.save( 'model.h5' )
    
        # print( '*'*80 )
        # nm = load_model( 'model.h5' )
        # print( 'new model loaded successfully' )
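
    What the layer does to the data can be shown without TensorFlow. The following is a plain-Python sketch of the rearrangement for a single image in H x W x C layout, assuming the channel ordering that tf.depth_to_space documents for NHWC input; depth_to_space here is a hypothetical stand-in written for illustration, not the TensorFlow function.

```python
def depth_to_space(image, r):
    """Rearrange an H x W x C nested list into (H*r) x (W*r) x (C // r**2)."""
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    if channels % (r * r) != 0:
        raise ValueError("channels must be divisible by r*r")
    out_channels = channels // (r * r)
    out = [[[None] * out_channels for _ in range(width * r)]
           for _ in range(height * r)]
    for h in range(height):
        for w in range(width):
            for i in range(r):          # row offset inside the r x r patch
                for j in range(r):      # column offset inside the patch
                    for c in range(out_channels):
                        src = (i * r + j) * out_channels + c
                        out[h * r + i][w * r + j][c] = image[h][w][src]
    return out


# One pixel with 4 channels becomes a 2x2 patch with 1 channel.
print(depth_to_space([[[1, 2, 3, 4]]], 2))  # [[[1], [2]], [[3], [4]]]
```

    Each input pixel spreads its channels over an r x r patch, so a convolution that outputs filter*r*r channels followed by this shuffle yields filter channels at r times the resolution.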
    
    
    

    Upsampling block implementation

    def upsampling_block(filter,factor=2):
        # from subpixel import SubpixelConv2D
        from model.subpixel import SubpixelConv2D
        def wrapper(inputs):
            with tf.name_scope('up'+str(name_counter['up'])):
                # expand to filter*factor^2 channels so that the sub-pixel shuffle
                # leaves `filter` channels at `factor` times the resolution
                x = Conv2D(filter*factor*factor,3,padding='same')(inputs)
                x = PReLU(shared_axes=[1, 2])(x)
                # The paper uses sub-pixel interpolation. This implementation may differ
                # from it, because I have not studied sub-pixel interpolation in detail,
                # and it may be less efficient.
                x = SubpixelConv2D(upsampling_factor=factor)(x)
                # x = UpSampling2D(factor)(x)
                name_counter['up']+=1
            return x
        return wrapper
    
    [Figure: upsample]

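    Overall structure

    A few parameters are set first. input_shape has to be fixed here because PReLU needs a known shape up front and raises an error otherwise; I do not know whether the shape can be left unspecified so that the network accepts inputs of any size.
    Then the input layer, the contracting path, the bottom level (which differs slightly from the other levels), the expanding path, and the final output layer are defined in turn.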
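
    On the question of dynamic input sizes, a sketch of how the PReLU slope tensor gets its shape may help. prelu_alpha_shape is a hypothetical helper that mirrors my understanding of the keras PReLU layer (one slope per input element, with every axis listed in shared_axes collapsed to size 1); it is an assumption for illustration, not keras code.

```python
def prelu_alpha_shape(input_shape, shared_axes=None):
    """Shape of the learnable slope tensor for a (batch, H, W, C) input."""
    shape = list(input_shape[1:])       # drop the batch axis
    for axis in shared_axes or []:
        shape[axis - 1] = 1             # one slope shared along this axis
    return tuple(shape)


print(prelu_alpha_shape((None, 64, 64, 128)))              # (64, 64, 128)
print(prelu_alpha_shape((None, None, None, 128), [1, 2]))  # (1, 1, 128)
print(prelu_alpha_shape((None, None, None, 128)))          # (None, None, 128)
```

    If this picture is right, only the plain PReLU() needs known spatial dimensions. With shared_axes=[1, 2] the slope shape depends on the channel count alone, so an input such as Input(shape=(None, None, input_channel)) may work with the blocks above; this has not been tested here.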

    def DHDN():
        input_channel = 3
        input_shape = (64,64,input_channel)
        init_filter = 128
        level_number = 3
        
        inputs = Input(shape=input_shape)
        x = Conv2D(init_filter,1,padding='same')(inputs)
    
        # contracting path
        level_outputs = []
        for i in range(level_number):
            # c is every level's output to expanding path before downsampling
            c = level_block(init_filter*2**i)(x)
            level_outputs.append(c)
            x = downsampling_block(init_filter*2**(i+1),2)(c)
    
        # bottom level
        c = level_block(init_filter*2**(level_number))(x)
        x = Concatenate()([x,c])
    
        # expanding path
        for i in range(level_number-1,-1,-1):
            x = upsampling_block(init_filter*2**i,2)(x)
            x = Concatenate()([level_outputs[i],x])
            x = level_block(init_filter*2**(i+1))(x)
    
        # last level
        x = Conv2D(input_channel,1,padding='same')(x)
        outputs = x
    
        model = Model(inputs=inputs,outputs=outputs)
        model.summary()
        return model
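
    The filter arguments in DHDN() are easier to check with a shape trace. The sketch below follows the same loops in plain Python and records the spatial size and channel count after each stage; dhdn_shapes is a hypothetical helper, and it assumes the default settings of the function above (64x64 input, init_filter=128, three levels, factor 2).

```python
def dhdn_shapes(size=64, init_filter=128, level_number=3):
    """Return (stage, spatial size, channels) for each stage of the network."""
    trace = [("first conv", size, init_filter)]
    s, ch = size, init_filter
    skips = []
    # contracting path
    for i in range(level_number):
        skips.append(ch)                  # level_block keeps the channel count
        trace.append(("level %d (down)" % i, s, ch))
        s, ch = s // 2, ch * 2            # max pooling, conv to twice the filters
        trace.append(("downsample %d" % i, s, ch))
    # bottom level
    trace.append(("bottom level", s, ch))
    ch *= 2                               # Concatenate()([x, c])
    trace.append(("bottom concat", s, ch))
    # expanding path
    for i in range(level_number - 1, -1, -1):
        f = init_filter * 2 ** i
        s, ch = s * 2, f                  # conv to 4*f channels, then sub-pixel (r=2)
        trace.append(("upsample %d" % i, s, ch))
        ch += skips[i]                    # concat with the contracting-path output
        trace.append(("level %d (up)" % i, s, ch))
    return trace


for stage in dhdn_shapes():
    print(stage)
```

    Every level on the expanding path sees twice the channels of its upsampling output because of the concatenation, which is why it is built with level_block(init_filter*2**(i+1)).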
    

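    TensorBoard visualisation

    First build the model, then use tf.summary.FileWriter to save the model graph under E:/TensorBoard.
    After running the program, open cmd or bash and run tensorboard --logdir=E:/TensorBoard
    Open localhost:6006 to see the model structure; clicking a block shows its internal structure.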

    if __name__ == "__main__":
        DHDN()
        with tf.Session() as sess:
            # sess.run(tf.global_variables_initializer())
            writer = tf.summary.FileWriter("E:/TensorBoard",sess.graph)
            writer.close()
    
    

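    Model parameter analysis

    PReLU carries a large number of learnable parameters. With PReLU() the model has 236,024,835 parameters in total; with PReLU(shared_axes=[1, 2]) it has 217,767,939.
    The paper reports 168M parameters.
    The saved model is 831M, about four times the parameter count, which is what you would expect when the parameters are stored as float32 (4 bytes each).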
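
    These totals can be reproduced by hand. The sketch below counts parameters in plain Python, assuming the layer layout of the DHDN() function above, the default bias term on every convolution, and a 64x64 input; all helper names are hypothetical.

```python
def conv_params(c_in, c_out, k):
    return k * k * c_in * c_out + c_out           # weights + biases


def prelu_params(channels, spatial, shared):
    # shared_axes=[1, 2]: one slope per channel; otherwise one per element
    return channels if shared else channels * spatial * spatial


def dcr_params(f, s, shared):
    g = f // 2                                    # growth rate
    total = conv_params(f, g, 3) + prelu_params(g, s, shared)
    total += conv_params(f + g, g, 3) + prelu_params(g, s, shared)
    total += conv_params(f + 2 * g, f, 3) + prelu_params(f, s, shared)
    return total + prelu_params(f, s, shared)     # PReLU after the Add


def level_params(f, s, shared):
    return 2 * dcr_params(f, s, shared)


def dhdn_params(shared, size=64, in_ch=3, init_filter=128, levels=3):
    total = conv_params(in_ch, init_filter, 1)    # first 1x1 conv
    s = size
    for i in range(levels):                       # contracting path
        f = init_filter * 2 ** i
        total += level_params(f, s, shared)
        s //= 2
        total += conv_params(f, 2 * f, 3) + prelu_params(2 * f, s, shared)
    f = init_filter * 2 ** levels                 # bottom level
    total += level_params(f, s, shared)
    ch = 2 * f                                    # bottom concat
    for i in range(levels - 1, -1, -1):           # expanding path
        f = init_filter * 2 ** i
        total += conv_params(ch, 4 * f, 3) + prelu_params(4 * f, s, shared)
        s *= 2
        ch = 2 * f                                # sub-pixel output + skip
        total += level_params(ch, s, shared)
    return total + conv_params(ch, in_ch, 1)      # last 1x1 conv


print(dhdn_params(shared=True))                   # 217767939
print(dhdn_params(shared=False))                  # 236024835
print(round(dhdn_params(shared=True) * 4 / 2 ** 20))  # 831 (MiB as float32)
```

    Both totals match the figures reported above, and the float32 size of the shared-axes variant comes to about 831 MiB, in line with the size of the saved model.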


    Article link: https://www.haomeiwen.com/subject/pvtxkctx.html