For mobile deployment, the most mature approach is the per-style-per-model (PSPM) family, in which one model is trained per style. These models depend on VGG16 only during training, so VGG16 can be discarded at deployment time and only the standalone generator network needs to ship.
The Neural-Style-Transfer-Papers repository surveys a large number of image style transfer papers and groups them into per-style-per-model, multiple-style-per-model, and arbitrary-style-per-model categories. It lists roughly four per-style-per-model papers, each with published implementation code. The plan here is to read all of that code, compare the generator network architectures, and look for directions and opportunities for optimization.
1. Perceptual Losses for Real-Time Style Transfer and Super-Resolution
The first paper is by Justin Johnson, Alexandre Alahi, and Li Fei-Fei of Stanford University: [Paper] (ECCV 2016). The Neural-Style-Transfer-Papers repository links three implementations of this paper; the TensorFlow one is examined here.
Its image generation network is structured as follows. The preds line looked puzzling at first, but it simply rescales the tanh output from (-1, 1) to a range centered on 127.5, roughly covering (and slightly overshooting) the [0, 255] pixel range:
def net(image):
    conv1 = _conv_layer(image, 32, 9, 1)
    conv2 = _conv_layer(conv1, 64, 3, 2)
    conv3 = _conv_layer(conv2, 128, 3, 2)
    resid1 = _residual_block(conv3, 3)
    resid2 = _residual_block(resid1, 3)
    resid3 = _residual_block(resid2, 3)
    resid4 = _residual_block(resid3, 3)
    resid5 = _residual_block(resid4, 3)
    conv_t1 = _conv_tranpose_layer(resid5, 64, 3, 2)
    conv_t2 = _conv_tranpose_layer(conv_t1, 32, 3, 2)
    conv_t3 = _conv_layer(conv_t2, 3, 9, 1, relu=False)
    preds = tf.nn.tanh(conv_t3) * 150 + 255./2
    return preds
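A quick numerical check of that mapping (plain NumPy, mine rather than the repo's) confirms the range; the small overshoot beyond [0, 255] is presumably clipped when the image is written out:

import numpy as np

t = np.tanh(np.linspace(-5.0, 5.0, 101))   # tanh output lies in (-1, 1)
preds = t * 150 + 255. / 2                 # the same mapping as in net() above
print(preds.min(), preds.max())            # about -22.4 and 277.4, centered on 127.5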
This is the convolution layer, much like the implementations found in other networks:
def _conv_layer(net, num_filters, filter_size, strides, relu=True):
    weights_init = _conv_init_vars(net, num_filters, filter_size)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d(net, weights_init, strides_shape, padding='SAME')
    net = _instance_norm(net)
    if relu:
        net = tf.nn.relu(net)
    return net
This is the transposed-convolution (deconvolution) layer. It upsamples directly with conv2d_transpose rather than using image interpolation, so checkerboard artifacts are to be expected:
def _conv_tranpose_layer(net, num_filters, filter_size, strides):
    weights_init = _conv_init_vars(net, num_filters, filter_size, transpose=True)
    batch_size, rows, cols, in_channels = [i.value for i in net.get_shape()]
    new_rows, new_cols = int(rows * strides), int(cols * strides)
    new_shape = [batch_size, new_rows, new_cols, num_filters]
    tf_shape = tf.stack(new_shape)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d_transpose(net, weights_init, tf_shape, strides_shape, padding='SAME')
    net = _instance_norm(net)
    return tf.nn.relu(net)
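The standard remedy for those artifacts is to replace the transposed convolution with interpolation-based upsampling followed by an ordinary convolution (the "resize-convolution" described in Odena et al.'s Deconvolution and Checkerboard Artifacts). Below is a minimal sketch reusing the _conv_layer helper above; the name _resize_conv_layer is mine, not from the repo, and tf is the same TensorFlow 1.x import the surrounding code assumes:

def _resize_conv_layer(net, num_filters, filter_size, strides):
    # Upsample with nearest-neighbor interpolation first...
    _, rows, cols, _ = [i.value for i in net.get_shape()]
    net = tf.image.resize_nearest_neighbor(net, [rows * strides, cols * strides])
    # ...then apply an ordinary stride-1 convolution. Every output pixel now
    # receives the same number of kernel contributions, which removes the
    # uneven overlap that produces the checkerboard pattern.
    return _conv_layer(net, num_filters, filter_size, 1)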
This is the residual block, built from two convolution layers plus an identity shortcut:
def _residual_block(net, filter_size=3):
    tmp = _conv_layer(net, 128, filter_size, 1)
    return net + _conv_layer(tmp, 128, filter_size, 1, relu=False)
This is the instance normalization implementation:
def _instance_norm(net, train=True):
    batch, rows, cols, channels = [i.value for i in net.get_shape()]
    var_shape = [channels]
    mu, sigma_sq = tf.nn.moments(net, [1, 2], keep_dims=True)
    shift = tf.Variable(tf.zeros(var_shape))
    scale = tf.Variable(tf.ones(var_shape))
    epsilon = 1e-3
    normalized = (net - mu) / (sigma_sq + epsilon)**(.5)
    return scale * normalized + shift
2. Texture Networks: Feed-forward Synthesis of Textures and Stylized Images
The second paper comes from the Prisma team and improves on the original work, Gatys's seminal A Neural Algorithm of Artistic Style: [Paper] (ICML 2016). The Neural-Style-Transfer-Papers repository links both a Torch and a TensorFlow implementation of this paper; the TensorFlow one is examined here.
Like the previous paper, this one turns the original global-optimization problem into training a feed-forward network that approximates the optimum. With Gatys's method, stylizing each content image requires running an iterative optimization from scratch; both papers listed here instead train a feed-forward generator once, after which any new content image can be pushed through the generator in a single pass to obtain the pre-trained style.
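The contrast can be made concrete with a schematic TF 1.x snippet; the quadratic loss below is a runnable stand-in for the real VGG-based perceptual loss, and transform_net is a placeholder name, not a function from either repo:

import tensorflow as tf

content = tf.constant(0.7)

# Gatys et al.: the *image* itself is the optimization variable, and every
# new photo requires hundreds of gradient steps like these.
img = tf.Variable(0.0)
loss = tf.square(img - content)              # stand-in for the perceptual loss
step = tf.train.AdamOptimizer(0.1).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(200):
        sess.run(step)
    print(sess.run(img))                     # converges toward `content`

# Feed-forward methods pay that cost once, at training time, by fitting a
# generator; at test time stylization is a single forward pass:
#     stylized = transform_net(content_image)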
Its image generation network is architected as follows, essentially the same design as the first paper's implementation:
def network(input_image):
    ops = {}
    image = tf.placeholder(tf.float32, shape=None, name='image-placeholder')
    ops['preprocessing'] = tf.div(image, 255)
    ops['preprocessing'] = tf.expand_dims(ops['preprocessing'], 0)
    ops['pad_2'] = pad(ops['preprocessing'], 4)
    ops['conv_3'] = conv(ops['pad_2'], [1, 1, 1, 1], [9, 9, 3, 32])
    ops['norm_4'] = norm(ops['conv_3'], [32])
    ops['relu_5'] = tf.nn.relu(ops['norm_4'])
    ops['conv_6'] = conv(ops['relu_5'], [1, 2, 2, 1], [3, 3, 32, 64])
    ops['norm_7'] = norm(ops['conv_6'], [64])
    ops['relu_8'] = tf.nn.relu(ops['norm_7'])
    ops['conv_9'] = conv(ops['relu_8'], [1, 2, 2, 1], [3, 3, 64, 128])
    ops['norm_10'] = norm(ops['conv_9'], [128])
    ops['relu_11'] = tf.nn.relu(ops['norm_10'])
    ops['res_block_11'] = ops['relu_11']
    for i in range(12, 17):
        ops['res_block_' + str(i)] = res_block(ops['res_block_' + str(i-1)])
    ops['conv_transpose_17'] = conv_transpose(
        ops['res_block_16'], [1, 2, 2, 1], [3, 3, 64, 128], ops['conv_6'])
    ops['norm_18'] = norm(ops['conv_transpose_17'], [64])
    ops['relu_19'] = tf.nn.relu(ops['norm_18'])
    ops['conv_transpose_20'] = conv_transpose(
        ops['relu_19'], [1, 2, 2, 1], [3, 3, 32, 64], ops['conv_3'])
    ops['norm_21'] = norm(ops['conv_transpose_20'], [32])
    ops['relu_22'] = tf.nn.relu(ops['norm_21'])
    ops['pad_23'] = pad(ops['relu_22'], 1)
    ops['conv_24'] = conv(ops['pad_23'], [1, 1, 1, 1], [3, 3, 32, 3])
    ops['squeeze'] = tf.squeeze(ops['conv_24'])
    vgg_mean_0 = tf.constant(103.939)
    vgg_mean_1 = tf.constant(116.779)
    vgg_mean_2 = tf.constant(123.68)
    red, green, blue = tf.split(ops['squeeze'], num_or_size_splits=3, axis=2)
    ops['bgr'] = tf.concat([blue + vgg_mean_2, green + vgg_mean_1, red + vgg_mean_0], 2)
    # TensorBoard output
    tf.summary.FileWriter("./tb/", tf.get_default_graph()).close()
    # Run session
    sess = tf.Session()
    saver = tf.train.Saver()
    saver.restore(sess, 'model/texture_net.chkp')
    output = sess.run(ops['bgr'], feed_dict={image: input_image})
    sess.close()
    return output
The convolution layer is defined below. It is much like the first paper's, except that it uses padding='VALID'; the explicit pad() calls in the network above appear to compensate for this (for example, padding 4 pixels before the 9x9 kernel offsets the 4 pixels per side that a VALID convolution trims):
def conv(input, strides, shape_filter):
    filter = tf.Variable(tf.truncated_normal(shape_filter, stddev=0.1), name='filter')
    return tf.nn.conv2d(input, filter, strides, padding='VALID', use_cudnn_on_gpu=None)
The normalization looks like plain Batch Normalization at first glance, but although it goes through tf.nn.batch_normalization, the moments are computed over the spatial axes [1, 2] only, per sample and per channel, so this is effectively instance normalization again:
def norm(input, shape_parameter):
    scale = tf.Variable(tf.truncated_normal(shape_parameter, stddev=0.1), name='scale')
    offset = tf.Variable(tf.truncated_normal(shape_parameter, stddev=0.1), name='offset')
    epsilon = 1e-5
    mean, var = tf.nn.moments(input, [1, 2], keep_dims=True)
    return tf.nn.batch_normalization(input, mean, var, offset, scale, epsilon)
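The distinction comes down to the axes passed to tf.nn.moments. A quick TF 1.x check (shapes here are illustrative, not from the repo) shows that axes [0, 1, 2] pool statistics over the whole batch (batch norm), while axes [1, 2], as in norm() above, keep one statistic per sample per channel (instance norm):

import tensorflow as tf

x = tf.random_normal([4, 32, 32, 64])  # NHWC: batch of 4 feature maps

# Batch norm statistics: pooled over the batch and spatial axes.
bn_mean, bn_var = tf.nn.moments(x, axes=[0, 1, 2], keep_dims=True)
# Instance norm statistics: spatial axes only, kept separate per sample.
in_mean, in_var = tf.nn.moments(x, axes=[1, 2], keep_dims=True)

print(bn_mean.get_shape())  # (1, 1, 1, 64)  -> one mean per channel
print(in_mean.get_shape())  # (4, 1, 1, 64)  -> one mean per sample per channel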
The transposed convolution is likewise a plain deconvolution, with the output shape taken from the corresponding encoder tensor:
def conv_transpose(input, strides, shape_filter, corresponding_tensor):
    filter = tf.Variable(tf.truncated_normal(shape_filter, stddev=0.1), name='filter')
    shape = tf.shape(corresponding_tensor)
    outputshape = tf.stack([shape[0], shape[1], shape[2], shape[3]])
    return tf.nn.conv2d_transpose(input, filter, outputshape, strides, padding='VALID')
3. Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks
The third paper is at [Paper] (ECCV 2016). Unlike the previous two, it uses a generative adversarial network for style transfer, though the results do not seem especially good. The Neural-Style-Transfer-Papers repository links only a Torch implementation of this paper (it says Torch, so I expected Python, but it turned out to be Lua).
Below is the generator part of the network:
netG:add(nn.SpatialFullConvolution(opt.netEnco_vgg_nOutputPlane, opt.nf * 8, 3, 3, 1, 1, 1, 1)) -- x 1
netG:add(nn.SpatialBatchNormalization(opt.nf * 8)):add(nn.ReLU(true))
netG:add(nn.SpatialFullConvolution(opt.nf * 8, opt.nf * 4, 4, 4, 2, 2, 1, 1)) -- x 2
netG:add(nn.SpatialBatchNormalization(opt.nf * 4)):add(nn.ReLU(true))
netG:add(nn.SpatialFullConvolution(opt.nf * 4, opt.nf * 2, 4, 4, 2, 2, 1, 1)) -- x 4
netG:add(nn.SpatialBatchNormalization(opt.nf * 2)):add(nn.ReLU(true))
netG:add(nn.SpatialFullConvolution(opt.nf * 2, opt.nc, 4, 4, 2, 2, 1, 1)) -- x 8
netG:add(nn.Tanh())
netG:apply(weights_init)
The following appears to be the discriminator. It is hard to follow at first, but the inline comments give it away: instead of judging the whole image, it classifies each neural patch of a VGG feature map. The final 1x1 convolution produces one real/fake score per spatial position, and the Reshape flattens those scores into individual classification examples for the loss:
table.insert(netS, nn.Sequential())
netS[i_netS]:add(nn.LeakyReLU(0.2, true))
netS[i_netS]:add(nn.SpatialConvolution(opt.netS_vgg_nOutputPlane[i_netS], opt.nf * 4, 4, 4, 2, 2, 1, 1)) -- x 1/2
netS[i_netS]:add(nn.SpatialBatchNormalization(opt.nf * 4)):add(nn.LeakyReLU(0.2, true))
netS[i_netS]:add(nn.SpatialConvolution(opt.nf * 4, opt.nf * 8, 4, 4, 2, 2, 1, 1)) -- x 1/4
netS[i_netS]:add(nn.SpatialBatchNormalization(opt.nf * 8)):add(nn.LeakyReLU(0.2, true))
netS[i_netS]:add(nn.SpatialConvolution(opt.nf * 8, 1, 1, 1)) -- classify each neural patch using convolutional operation
netS[i_netS]:add(nn.Reshape(opt.batchSize * opt.netS_blocksize[i_netS] * opt.netS_blocksize[i_netS], 1, 1, 1, false)) -- reshape the classification result for computing loss
netS[i_netS]:add(nn.View(1):setNumInputDims(3))
netS[i_netS]:apply(weights_init)
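For readers more comfortable with TensorFlow, here is a rough TF 1.x sketch of the same patch-level idea; it omits the batch normalization layers, and the names and sizes are illustrative, none of them taken from the MGANs repo:

import tensorflow as tf

def patch_discriminator(feat, nf=64):
    # `feat` stands in for a VGG feature map of the stylized image.
    h = tf.layers.conv2d(feat, nf * 4, 4, strides=2, padding='same')
    h = tf.nn.leaky_relu(h, alpha=0.2)
    h = tf.layers.conv2d(h, nf * 8, 4, strides=2, padding='same')
    h = tf.nn.leaky_relu(h, alpha=0.2)
    # 1x1 convolution: one real/fake score per spatial position, i.e. one
    # score per neural patch of the input feature map.
    scores = tf.layers.conv2d(h, 1, 1)
    # Flatten so that every patch becomes its own classification example.
    return tf.reshape(scores, [-1, 1])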
4. Summary
According to this repository's listing, the newest per-style-per-model papers date from 2016. The first two papers' implementations are largely the same, and also close to the code I am currently using. Judging purely by output quality, applications like Prisma still look somewhat better than these open-source implementations, so there should be room for improvement; the vendors simply have not published their methods.