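Introduction
Work description
I have recently been working on deep-learning-based image denoising and skimmed the NTIRE 2019 challenge. I read one paper in detail, Densely Connected Hierarchical Network for Image Denoising; more papers can be downloaded from CVF Open Access: CVPR2019.
The principles of the paper are analyzed in the previous post. This post describes how to build a denoising model from scratch with keras (tensorflow). It covers data loading, model construction, and evaluation.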
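Implementation overview
The easiest way to implement a model is to find runnable code on github that already has the basics (dataset loading and preprocessing, training, and testing), then swap in your own data source and model structure. This needs the fewest changes and is the most efficient approach. However, the existing code may use a different language or framework, or the published model code may be too complex to modify directly. In those cases you cannot build on existing code and have to implement the model from scratch.
Implementing a network involves the following steps: decide on the dataset format and write the data loading method; preprocess the data for the task at hand (here, adding noise and augmentation); read the paper carefully to understand the model structure; and write the validation method according to the evaluation metric.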
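The noise and validation steps above can be sketched without any framework. This is a minimal, dependency-free illustration, not the code used in this post: it assumes additive Gaussian noise and PSNR as the evaluation metric (neither is specified above), and the helper names `add_gaussian_noise` and `psnr` are hypothetical. In practice the same logic would operate on NumPy arrays.

```python
import math
import random

def add_gaussian_noise(pixels, sigma, seed=None):
    """Return a noisy copy of `pixels` (floats in [0, 255]), clipped to the valid range."""
    rng = random.Random(seed)
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]

def psnr(clean, noisy, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(clean) != len(noisy):
        raise ValueError("inputs must have the same length")
    mse = sum((c - n) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)

# A flat grey 64x64 patch, flattened to a list (assumed toy data).
clean = [128.0] * 4096
noisy = add_gaussian_noise(clean, sigma=25.0, seed=0)
print(round(psnr(clean, noisy), 2))
```

A denoiser is doing its job when the PSNR between its output and the clean image is higher than the PSNR between the noisy input and the clean image.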
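DHDN structure implementation
The DHDN structure is clean. Overall it is an extension of UNet with a contracting path and an expanding path, and each path is made up of several levels. The paper names three basic modules: DCR, downsampling, and upsampling. Each level consists of several DCR blocks, and levels are connected by downsampling and upsampling.
After this analysis the task is clear: four basic structures (DCR, level, downsample, upsample) and one backbone network (the contracting and expanding paths).
The basic modules are defined as functions that return a wrapper function (a decorator-style closure), so that they are used in the same way as the keras functional API.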
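The closure pattern can be shown without keras. The sketch below is only an illustration: `scale_block` and `shift_block` are hypothetical stand-ins for real blocks, and plain lists stand in for tensors. The point is the call shape `block(config)(inputs)`, which matches how keras layers are applied in the functional API.

```python
def scale_block(factor):
    """Configure the block first; return a callable that is applied to the input later."""
    def wrapper(inputs):
        return [v * factor for v in inputs]
    return wrapper

def shift_block(offset):
    """Second toy block with the same two-step call shape."""
    def wrapper(inputs):
        return [v + offset for v in inputs]
    return wrapper

# Same call shape as a keras layer: Block(config)(tensor)
x = [1, 2, 3]
x = scale_block(2)(x)
x = shift_block(1)(x)
print(x)  # [3, 5, 7]
```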
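When visualizing the model with Tensorboard, tf.name_scope can collapse each basic module into a single node, which makes the graph easier to read. The final structure as rendered by tensorboard is shown first.
[Figure: DHDN structure diagram in tensorboard]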
Imports
from keras.layers import Input,PReLU,Conv2D,Add,Concatenate
from keras.layers import MaxPooling2D,UpSampling2D
from keras.models import Model
from keras import backend as K
import keras
import tensorflow as tf
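Defining the counter
To make the structure display more clearly in tensorboard, a dictionary records how many times each structure has been instantiated, and that count is used as a numeric suffix in the module name.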
def init_name_counter():
    # Number of instances created so far for each kind of block.
    name_counter = {}
    name_counter['DCR'] = 0
    name_counter['level'] = 0
    name_counter['down'] = 0
    name_counter['up'] = 0
    return name_counter

name_counter = init_name_counter()
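As a small illustration of how the counter turns into scope names, the sketch below uses a hypothetical helper `next_scope_name` (not part of the code in this post, which builds the name inline inside each block) to show the names produced as blocks are created.

```python
def next_scope_name(counter, kind):
    """Return a unique scope name such as 'DCR0' and advance the counter for `kind`."""
    name = kind + str(counter[kind])
    counter[kind] += 1
    return name

counter = {'DCR': 0, 'level': 0, 'down': 0, 'up': 0}
names = [next_scope_name(counter, 'DCR') for _ in range(3)]
names.append(next_scope_name(counter, 'level'))
print(names)  # ['DCR0', 'DCR1', 'DCR2', 'level0']
```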
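DCR module
DCR can be implemented along the lines of the Dense block in DenseNet. The growth rate is set to half of the original filter count, and a shortcut connection is added at the end.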
def DCR(filter):
    def wrapper(inputs):
        with tf.name_scope('DCR'+str(name_counter['DCR'])):
            name_counter['DCR']+=1
            origin_input = inputs
            # Dense part: each conv adds filter//2 channels to the running concat.
            for _ in range(2):
                x = Conv2D(filter//2,3,padding='same')(inputs)
                x = PReLU(shared_axes=[1, 2])(x)
                inputs = Concatenate()([inputs,x])
            # Project back to `filter` channels so the residual add is valid.
            x = Conv2D(filter,3,padding='same')(inputs)
            x = PReLU(shared_axes=[1, 2])(x)
            x = Add()([origin_input,x])
            x = PReLU(shared_axes=[1, 2])(x)
        return x
    return wrapper
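The structure of DCR0 is shown below. The initial input feeds the conv and concat layers of the dense block as well as the final residual add, and the output goes on to the next add.
[Figure: DCR block]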
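To make the channel bookkeeping of the DCR block concrete, here is a small sketch that follows the channel count through the stages of `DCR(filter)` as defined above. The helper `dcr_channel_trace` is hypothetical and only does arithmetic; it does not build any layers.

```python
def dcr_channel_trace(filters, growth_steps=2):
    """List the channel count after each stage of a DCR block."""
    trace = [("input", filters)]
    channels = filters
    for step in range(growth_steps):
        channels += filters // 2  # Conv2D(filters // 2) output is concatenated
        trace.append(("concat%d" % step, channels))
    trace.append(("conv", filters))  # final Conv2D(filters) restores the width
    trace.append(("add", filters))   # Add() needs matching channel counts
    return trace

for stage, channels in dcr_channel_trace(128):
    print(stage, channels)
```

With 128 input channels the concatenations grow the tensor to 192 and then 256 channels before the last convolution brings it back to 128, which is why the input must already have `filter` channels for the residual add to work.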
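Level structure
Each level consists of several DCR blocks. Its input is the feature map that comes out of the preceding downsampling (or upsampling) step, and its output is the feature map that goes into the next downsampling (or upsampling) step.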
def level_block(filter):
    def wrapper(inputs):
        with tf.name_scope('level'+str(name_counter['level'])):
            # Two DCR blocks, each wrapped in its own residual add.
            for _ in range(2):
                x = DCR(filter)(inputs)
                inputs = Add()([inputs,x])
            name_counter['level']+=1
        return inputs
    return wrapper
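The structure of level0 is shown below. The initial input feeds both the DCR and the add. The output goes two ways: one copy is downsampled and passed to the next level, and the other is concatenated in the last level.
[Figure: level_block]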
Downsampling
Downsampling uses ordinary max pooling followed by a convolution.
def downsampling_block(filter,factor=2):
    def wrapper(inputs):
        with tf.name_scope('down'+str(name_counter['down'])):
            # Halve the spatial size, then set the channel count with a 3x3 conv.
            x = MaxPooling2D(factor,factor)(inputs)
            x = Conv2D(filter,3,padding='same')(x)
            x = PReLU(shared_axes=[1, 2])(x)
            name_counter['down']+=1
        return x
    return wrapper
The structure is shown below.
[Figure: downsampling block]
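Upsampling
Upsampling uses sub-pixel interpolation; see the paper Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network for details. In short, it relies on tf.depth_to_space(), which moves information from the channel dimension into the spatial dimensions, reducing the number of channels and increasing the spatial size.
A subpixel implementation is given first, taken from the git repo fengwang/subpixel_conv2d, followed by the implementation of the upsampling block.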
from keras.layers import Layer  # exposed as keras.engine.Layer in older Keras releases
import tensorflow as tf
# from keras.utils.generic_utils import get_custom_objects

class SubpixelConv2D(Layer):
    """ Subpixel Conv2D Layer
    upsampling a layer from (h, w, c) to (h*r, w*r, c/(r*r)),
    where r is the scaling factor, default to 4
    # Arguments
    upsampling_factor: the scaling factor
    # Input shape
    Arbitrary. Use the keyword argument `input_shape`
    (tuple of integers, does not include the samples axis)
    when using this layer as the first layer in a model.
    # Output shape
    the second and the third dimension increased by a factor of
    `upsampling_factor`; the last dimension decreased by a factor of
    `upsampling_factor^2`.
    # References
    Real-Time Single Image and Video Super-Resolution Using an Efficient
    Sub-Pixel Convolutional Neural Network Shi et Al. https://arxiv.org/abs/1609.05158
    """

    def __init__(self, upsampling_factor=4, **kwargs):
        super(SubpixelConv2D, self).__init__(**kwargs)
        self.upsampling_factor = upsampling_factor

    def build(self, input_shape):
        # The channel count must be divisible by upsampling_factor^2.
        last_dim = input_shape[-1]
        factor = self.upsampling_factor * self.upsampling_factor
        if last_dim % (factor) != 0:
            raise ValueError('Channel ' + str(last_dim) + ' should be of '
                             'integer times of upsampling_factor^2: ' +
                             str(factor) + '.')

    def call(self, inputs, **kwargs):
        return tf.depth_to_space( inputs, self.upsampling_factor )

    def get_config(self):
        config = { 'upsampling_factor': self.upsampling_factor, }
        base_config = super(SubpixelConv2D, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

    def compute_output_shape(self, input_shape):
        factor = self.upsampling_factor * self.upsampling_factor
        # Spatial dimensions may be None for dynamically sized inputs.
        input_shape_1 = None
        if input_shape[1] is not None:
            input_shape_1 = input_shape[1] * self.upsampling_factor
        input_shape_2 = None
        if input_shape[2] is not None:
            input_shape_2 = input_shape[2] * self.upsampling_factor
        dims = [ input_shape[0],
                 input_shape_1,
                 input_shape_2,
                 input_shape[3] // factor
               ]
        return tuple( dims )

# get_custom_objects().update({'SubpixelConv2D': SubpixelConv2D})

if __name__ == '__main__':
    from keras.layers import Input
    from keras.models import Model, load_model
    ip = Input(shape=(32, 32, 16))
    x = SubpixelConv2D(upsampling_factor=4)(ip)
    model = Model(ip, x)
    model.summary()
    # model.save( 'model.h5' )
    # print( '*'*80 )
    # nm = load_model( 'model.h5' )
    # print( 'new model loaded successfully' )
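To see what the channel-to-space rearrangement does, the sketch below reimplements it in plain Python for a single image stored as nested lists in height x width x channel order. It is an illustrative stand-in, not the TensorFlow implementation; the index mapping is intended to mirror the channels-last behaviour of tf.depth_to_space, and the toy input is an assumption.

```python
def depth_to_space(image, block_size):
    """Rearrange an H x W x C nested list into (H*r) x (W*r) x (C / r^2)."""
    r = block_size
    height, width, depth = len(image), len(image[0]), len(image[0][0])
    if depth % (r * r) != 0:
        raise ValueError("channel count must be a multiple of block_size ** 2")
    out_depth = depth // (r * r)
    out = [[[None] * out_depth for _ in range(width * r)] for _ in range(height * r)]
    for h in range(height):
        for w in range(width):
            for i in range(r):
                for j in range(r):
                    # Each group of out_depth channels becomes one output pixel
                    # inside the r x r block that replaces input pixel (h, w).
                    start = (i * r + j) * out_depth
                    out[h * r + i][w * r + j] = image[h][w][start:start + out_depth]
    return out

pixel = [[[1, 2, 3, 4]]]  # 1 x 1 x 4
print(depth_to_space(pixel, 2))  # [[[1], [2]], [[3], [4]]]
```

One pixel with four channels becomes a 2 x 2 block with one channel, which is why the upsampling block has to produce four times as many channels as it wants to keep when the factor is 2.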
Upsampling block implementation
def upsampling_block(filter,factor=2):
    # from subpixel import SubpixelConv2D
    # Assumes the SubpixelConv2D layer above is saved as model/subpixel.py.
    from model.subpixel import SubpixelConv2D
    def wrapper(inputs):
        with tf.name_scope('up'+str(name_counter['up'])):
            # Expand to filter*factor**2 channels so that `filter` channels
            # remain after the sub-pixel rearrangement.
            x = Conv2D(filter*factor**2,3,padding='same')(inputs)
            x = PReLU(shared_axes=[1, 2])(x)
            # The paper uses sub-pixel interpolation. I have not studied its
            # details, so this implementation may differ and may be less efficient.
            x = SubpixelConv2D(upsampling_factor=factor)(x)
            # x = UpSampling2D(factor)(x)
            name_counter['up']+=1
        return x
    return wrapper
[Figure: upsampling block]
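Overall structure
A set of parameters is defined first. input_shape has to be fixed here because PReLU needs to know the shape in advance and otherwise raises an error. I do not know whether there is a way to avoid fixing the shape so that the network can accept inputs of dynamic size.
Then the input layer, the contracting path, the bottom level (which differs slightly from the other levels), the expanding path, and the final output layer are defined in turn.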
def DHDN():
    input_channel = 3
    input_shape = (64,64,input_channel)
    init_filter = 128
    level_number = 3
    inputs = Input(shape=input_shape)
    x = Conv2D(init_filter,1,padding='same')(inputs)
    # contracting path
    level_outputs = []
    for i in range(level_number):
        # c is each level's output, kept for the expanding path before downsampling
        c = level_block(init_filter*2**i)(x)
        level_outputs.append(c)
        x = downsampling_block(init_filter*2**(i+1),2)(c)
    # bottom level
    c = level_block(init_filter*2**(level_number))(x)
    x = Concatenate()([x,c])
    # expanding path
    for i in range(level_number-1,-1,-1):
        x = upsampling_block(init_filter*2**i,2)(x)
        x = Concatenate()([level_outputs[i],x])
        x = level_block(init_filter*2**(i+1))(x)
    # last level
    x = Conv2D(input_channel,1,padding='same')(x)
    outputs = x
    # Keras 2 expects the keyword arguments `inputs` and `outputs`.
    model = Model(inputs=inputs,outputs=outputs)
    model.summary()
    return model
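The shapes flowing through DHDN() can be checked with plain arithmetic before building the graph. The sketch below uses a hypothetical helper, `dhdn_shape_trace`, that follows the spatial size and channel count at the input of each level block, using the same defaults as the function above (64 x 64 input, 128 initial filters, 3 levels). The stage labels are illustrative and do not match the tensorboard scope names.

```python
def dhdn_shape_trace(size=64, init_filter=128, level_number=3):
    """Follow (stage, spatial size, channels) through the network built by DHDN()."""
    trace = []
    channels = init_filter
    skips = []
    # Contracting path: a level block keeps the shape, downsampling halves the size.
    for i in range(level_number):
        trace.append(("level%d" % i, size, channels))
        skips.append(channels)
        size, channels = size // 2, init_filter * 2 ** (i + 1)
    # Bottom level: the block output is concatenated with its own input.
    trace.append(("bottom", size, channels))
    channels *= 2
    # Expanding path: sub-pixel upsampling doubles the size, then the skip is concatenated.
    for i in range(level_number - 1, -1, -1):
        size, channels = size * 2, init_filter * 2 ** i
        channels += skips[i]
        trace.append(("up%d" % i, size, channels))
    return trace

for stage, size, channels in dhdn_shape_trace():
    print(stage, size, channels)
```

The channel count after each concatenation equals init_filter*2**(i+1), which is exactly the filter count passed to level_block in the expanding path, so the residual adds inside each DCR line up.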
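Tensorboard visualization
First build the model, then use tf.summary.FileWriter to save the model graph under E:/TensorBoard.
After running the program, open cmd or bash and run tensorboard --logdir=E:/TensorBoard
Open localhost:6006 to see the model structure; click any block to show its internal structure.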
if __name__ == "__main__":
    DHDN()
    with tf.Session() as sess:
        # sess.run(tf.global_variables_initializer())
        # Write the graph that now holds the model so tensorboard can display it.
        writer = tf.summary.FileWriter("E:/TensorBoard",sess.graph)
        writer.close()
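Model parameter analysis
PReLU carries a large number of learnable parameters. With PReLU() the total parameter count is 236,024,835; with PReLU(shared_axes=[1, 2]) it is 217,767,939.
The parameter count given in the paper is 168M.
The saved model is 831M, about four times the parameter count, which is what storing each parameter as float32 (4 bytes) implies.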
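The size estimate can be reproduced with a few lines of arithmetic. This sketch assumes that the "M" in the file size means mebibytes and that the file holds little besides the float32 weights; both are assumptions rather than facts stated above.

```python
BYTES_PER_FLOAT32 = 4

params_shared = 217_767_939    # PReLU(shared_axes=[1, 2])
params_unshared = 236_024_835  # PReLU()

# Expected weight size in MiB when every parameter is a float32.
size_mib = params_shared * BYTES_PER_FLOAT32 / 2 ** 20
# Parameters saved by sharing PReLU slopes across height and width.
extra_prelu_params = params_unshared - params_shared

print(round(size_mib, 1))   # 830.7
print(extra_prelu_params)   # 18256896
```

The result of about 830.7 is consistent with the 831M file, and sharing the PReLU slopes over the spatial axes removes roughly 18 million parameters.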