Write Your Own Prisma

Author: 不会停的蜗牛 | Source: published 2016-10-01 07:00, read 4924 times

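Link to Sirajology's video

Have you tried Prisma, the app that was so popular a while ago? After reading this article, you can write a mini version of Prisma yourself.

The idea originated on the Google Research Blog.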
Here's the initial Google DeepDream blog post:

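They trained a deep neural network on a large amount of image data so that it could recognize the objects in a picture. Then they fed in a new image and had the network not only recognize it but also modify it toward what the network had learned.

Then another team published a similar paper.

They trained a model on famous paintings, then fed in an everyday photo and, by amplifying certain features, modified it to look more like the style of those paintings.

The principle is to use a Convolutional Neural Network to learn the style of one image and then convert another image into that style.

The tools used are Python and the Keras package; the address of the author's source code is given at the end of the article.

Import the required packages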

from scipy.misc import imread, imresize, imsave
from scipy.optimize import fmin_l_bfgs_b
from sklearn.preprocessing import normalize
import numpy as np
import time
import os
import argparse
import h5py

from keras.models import Sequential
from keras.layers.convolutional import Convolution2D, ZeroPadding2D, AveragePooling2D
from keras import backend as K

Define the three image variables

#Define base image, style image, and result image paths
args = parser.parse_args()
base_image_path = args.base_image_path
style_reference_image_path = args.style_reference_image_path
result_prefix = args.result_prefix
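
The snippet above reads from a `parser` that is not shown in this article. A minimal sketch of what such a parser might look like, assuming three positional arguments named after the attributes used above (the help strings and the sample file names are made up for illustration):

```python
import argparse


def build_parser():
    """Hypothetical parser; argument names mirror the attributes read from `args`."""
    parser = argparse.ArgumentParser(description="Neural style transfer sketch")
    parser.add_argument("base_image_path", type=str,
                        help="Path to the image whose content is kept")
    parser.add_argument("style_reference_image_path", type=str,
                        help="Path to the image whose style is learned")
    parser.add_argument("result_prefix", type=str,
                        help="Prefix for the saved result files")
    return parser


parser = build_parser()
# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = parser.parse_args(["photo.jpg", "painting.jpg", "out"])
print(args.base_image_path, args.style_reference_image_path, args.result_prefix)
```

The real script also reads options such as `args.img_size` and `args.style_weight`; they would be added in the same way with `parser.add_argument`.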

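Reference the precomputed VGG16 weights

These weights were trained in advance to recognize everyday images, and they serve as the starting point for the model.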

#Get the weights file
weights_path = r"vgg16_weights.h5"

Define booleans that decide whether to resize the image

#Init bools to decide whether or not to resize
rescale_image = strToBool(args.rescale_image)
maintain_aspect_ratio = strToBool(args.maintain_aspect_ratio)
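
The helper `strToBool` is used above but its body is not shown here. Command-line values arrive as strings, so something has to turn text like "True" into a real boolean. A minimal sketch, assuming the helper accepts a few common spellings of true (the accepted spellings are an assumption, not the author's code):

```python
def strToBool(value):
    """Hypothetical helper: treat common spellings of 'true' as True, all else as False."""
    return str(value).strip().lower() in ("true", "t", "yes", "y", "1")


print(strToBool("True"), strToBool("false"))
```

Calling plain `bool("False")` would not work for this purpose, because any non-empty string is truthy in Python.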

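Next, initialize the style-content weights

What are the style-content weights?

As a neural network learns, different layers learn different things. When recognizing a dog, for example, one layer learns edges, the next learns shapes, the one after that learns more complex shapes, and the last learns the whole dog.

Work on networks that learn artistic style found that the lower layers capture style, such as texture, color, and structure, while the higher layers capture content, such as concrete objects like the sun. Because the CNN separates content from style, different effects require different weightings of the two.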

# Init variables for style and content weights. 
total_variation_weight = args.tv_weight
style_weight = args.style_weight * args.style_scale
content_weight = args.content_weight
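
To see how these weights shape the result, here is a minimal sketch of a weighted sum of the three loss terms. The function name `total_loss` and all the numbers are made up for illustration; the article's own `get_total_loss` is not shown, so this only assumes that it combines the terms linearly.

```python
def total_loss(content_loss, style_loss, tv_loss,
               content_weight, style_weight, total_variation_weight):
    """Weighted sum of the three terms (an assumed, simplified form)."""
    return (content_weight * content_loss
            + style_weight * style_loss
            + total_variation_weight * tv_loss)


# Same raw losses, two different weightings (illustrative values only).
style_heavy = total_loss(2.0, 3.0, 1.0,
                         content_weight=0.025, style_weight=1.0,
                         total_variation_weight=0.0)
content_heavy = total_loss(2.0, 3.0, 1.0,
                           content_weight=1.0, style_weight=0.025,
                           total_variation_weight=0.0)
print(style_heavy, content_heavy)
```

With a large style weight the optimizer is pushed mostly to reduce the style term, so the output looks more like the painting; with a large content weight it stays closer to the photo.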

Then set the image dimensions and define the tensors that represent the three images: the base image, the style image, and the output image.

# Init dimensions of the generated picture.
img_width = img_height = args.img_size
assert img_height == img_width, 'Due to the use of the Gram matrix, width and height must match.'
img_WIDTH = img_HEIGHT = 0
aspect_ratio = 0

# get tensor representations of our images
base_image = K.variable(preprocess_image(base_image_path, True))
style_reference_image = K.variable(preprocess_image(style_reference_image_path))

# this will contain our generated image
combination_image = K.placeholder((1, 3, img_width, img_height))
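
The placeholder shape `(1, 3, img_width, img_height)` shows that the network expects a batch dimension first and the color channels second. The article's `preprocess_image` and `deprocess_image` are not shown; the sketch below covers only the layout conversion such helpers would need, using NumPy and hypothetical function names. Any color normalization the real helpers perform is left out.

```python
import numpy as np


def to_network_layout(img):
    """(height, width, 3) image -> (1, 3, height, width) float32 batch."""
    channels_first = np.transpose(img.astype("float32"), (2, 0, 1))
    return np.expand_dims(channels_first, axis=0)


def to_image_layout(x):
    """(3, height, width) array -> (height, width, 3) uint8 image."""
    channels_last = np.transpose(x, (1, 2, 0))
    return np.clip(channels_last, 0, 255).astype("uint8")


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3)).astype("uint8")  # stand-in for a loaded photo
batch = to_network_layout(img)
restored = to_image_layout(batch[0])
print(batch.shape, restored.shape)
```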

Then combine them into a single tensor

# combine the 3 images into a single Keras tensor
input_tensor = K.concatenate([base_image,
                              style_reference_image,
                              combination_image], axis=0)
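
The effect of concatenating along `axis=0` is easy to check with NumPy arrays standing in for the Keras tensors (the image size of 8 and the fill values are arbitrary):

```python
import numpy as np

size = 8  # arbitrary small size for the sketch
base = np.zeros((1, 3, size, size), dtype="float32")
style = np.ones((1, 3, size, size), dtype="float32")
combination = np.full((1, 3, size, size), 0.5, dtype="float32")

# Stack the three single-image batches into one batch of three images.
batch = np.concatenate([base, style, combination], axis=0)
print(batch.shape)  # (3, 3, 8, 8)
```

Index 0 of the batch is the base image, index 1 the style image, and index 2 the generated image, which is how the loss code can later pick out each one from a layer's output.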

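They are put into a single tensor because that is easier for the network to process: all three images pass through the network as one batch, which keeps the computation on these high-dimensional images tractable.

Build the 31-layer neural network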

# build the VGG16 network with our 3 images as input
first_layer = ZeroPadding2D((1, 1))
first_layer.set_input(input_tensor, shape=(3, 3, img_width, img_height))

model = Sequential()
model.add(first_layer)
model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
model.add(ZeroPadding2D((1, 1)))
model.add(Convolution2D(64, 3, 3, activation='relu'))
model.add(AveragePooling2D((2, 2), strides=(2, 2)))

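...

There are three kinds of layers:
Convolution2D layer: has learnable filters. Each filter has a receptive field, so a neuron is connected only to a local region of the layer's input rather than to every neuron.

ZeroPadding layer: controls the size of the output.

Pooling layer: summarizes small windows of its input, which shrinks the feature maps, reduces the computation downstream, and helps avoid overfitting.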
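
How the three layer types interact is easiest to see by tracking the spatial size through the first block of the network. This is a minimal sketch in plain Python, assuming an input size of 400 (an arbitrary choice) and unpadded ("valid") convolutions, which is why the explicit zero padding is needed:

```python
def zero_pad(size, pad):
    """Padding adds `pad` pixels on both sides."""
    return size + 2 * pad


def conv(size, kernel, stride=1):
    """Unpadded convolution shrinks the map by kernel - 1 (for stride 1)."""
    return (size - kernel) // stride + 1


def pool(size, window, stride):
    """Pooling keeps one value per window."""
    return (size - window) // stride + 1


size = 400                              # assumed input width/height
after_conv1 = conv(zero_pad(size, 1), 3)
after_conv2 = conv(zero_pad(after_conv1, 1), 3)
after_pool = pool(after_conv2, 2, 2)
print(after_conv1, after_conv2, after_pool)  # 400 400 200
```

Padding by 1 exactly cancels the shrinkage of a 3x3 convolution, so only the pooling layers reduce the resolution, halving it each time.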

The activation function is ReLU, which is cheaper to compute than sigmoid.
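
A small NumPy sketch of the two activations, for comparison. ReLU needs only a comparison per element, while sigmoid needs an exponential; in addition, the sigmoid gradient never exceeds 0.25, whereas the ReLU gradient is 1 for positive inputs. The sample inputs are arbitrary.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


x = np.array([-2.0, 0.0, 2.0])
relu_out = relu(x)
sigmoid_out = sigmoid(x)

# Gradients of each activation at the same points.
relu_grad = (x > 0).astype("float64")
sigmoid_grad = sigmoid_out * (1.0 - sigmoid_out)
print(relu_out, sigmoid_out, relu_grad, sigmoid_grad)
```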

The parameters of each layer are:

Once the model is defined, load the VGG16 weights

# load the weights of the VGG16 networks
load_weights(weights_path, model)

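Define the loss function: it measures the difference between the prediction and the target, here combining the style, content, and total variation losses into a single scalar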

# get the symbolic outputs of each "key" layer (we gave them unique names).
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])

# get the loss (we combine style, content, and total variation loss into a single scalar)
loss = get_total_loss(outputs_dict)
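
The body of `get_total_loss` is not shown here. As a rough illustration of what the style and content terms usually look like in neural style transfer, the NumPy sketch below computes a Gram-matrix style loss and a squared-error content loss on random feature maps. The function names, the normalization constant, and the random data are assumptions, not the author's code.

```python
import numpy as np


def gram_matrix(features):
    """(channels, height, width) -> (channels, channels) channel correlations."""
    flat = features.reshape(features.shape[0], -1)
    return flat @ flat.T


def style_loss(style, combination):
    """Squared difference of Gram matrices, with a commonly used normalization."""
    channels, height, width = style.shape
    size = height * width
    diff = gram_matrix(style) - gram_matrix(combination)
    return float(np.sum(diff ** 2) / (4.0 * channels ** 2 * size ** 2))


def content_loss(base, combination):
    """Squared difference of the raw feature maps."""
    return float(np.sum((combination - base) ** 2))


rng = np.random.default_rng(0)
features = rng.standard_normal((4, 5, 5))
flipped = features[:, ::-1, ::-1]  # same values, different spatial arrangement

print(style_loss(features, flipped), content_loss(features, flipped))
```

Flipping the feature maps moves every value to a new position, so the content loss is large, yet the Gram matrix barely changes because it sums over all positions. That is why the Gram matrix captures style (which features occur together) while ignoring where things are in the image.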

Get the gradients

# get the gradients of the generated image wrt the loss
grads = K.gradients(loss, combination_image)
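
`K.gradients` derives the gradient symbolically, so there is nothing to code by hand. To make concrete what "the gradient of the loss with respect to the image" means, here is a small NumPy sketch with a toy squared-error loss (an assumption, far simpler than the real loss), checked against finite differences:

```python
import numpy as np


def loss_fn(image, target):
    """Toy loss: squared distance between the image and a target."""
    return float(np.sum((image - target) ** 2))


def grad_fn(image, target):
    """Analytic gradient of the toy loss with respect to each pixel."""
    return 2.0 * (image - target)


def numerical_grad(f, x, eps=1e-6):
    """Central finite differences, one pixel at a time."""
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step.flat[i] = eps
        grad.flat[i] = (f(x + step) - f(x - step)) / (2.0 * eps)
    return grad


rng = np.random.default_rng(0)
image = rng.standard_normal((2, 3))
target = rng.standard_normal((2, 3))

analytic = grad_fn(image, target)
numeric = numerical_grad(lambda x: loss_fn(x, target), image)
print(np.max(np.abs(analytic - numeric)))
```

The gradient has the same shape as the image: one number per pixel, saying how the loss changes if that pixel is nudged.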

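Finally, run the optimization, using back propagation to obtain the gradients. The algorithm used here is limited-memory BFGS (L-BFGS), which minimizes the loss function and is memory efficient. Rather than updating the network weights, it updates the pixels of the generated image.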

#combine loss and gradient
f_outputs = combine_loss_and_gradient(loss, grads)

# Run scipy-based optimization (L-BFGS) over the pixels of the generated image to minimize the neural style loss
# 5 Step process
x, num_iter = prepare_image()
for i in range(num_iter):

    #Step 1 - Record iterations
    print('Start of iteration', (i+1))
    start_time = time.time()

    #Step 2 - Perform l_bfgs optimization function using loss and gradient
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
                                     fprime=evaluator.grads, maxfun=20)
    print('Current loss value:', min_val)

    #Step 3 - Get the generated image
    img = deprocess_image(x.reshape((3, img_width, img_height)))

    #Step 4 - Maintain aspect ratio
    if maintain_aspect_ratio and not rescale_image:
        img_ht = int(img_width * aspect_ratio)
        print("Rescaling Image to (%d, %d)" % (img_width, img_ht))
        img = imresize(img, (img_width, img_ht), interp=args.rescale_method)
    if rescale_image:
        print("Rescaling Image to (%d, %d)" % (img_WIDTH, img_HEIGHT))
        img = imresize(img, (img_WIDTH, img_HEIGHT), interp=args.rescale_method)

Finally, rescale and save the image

    #Step 5 - Save the generated image
    fname = result_prefix + '_at_iteration_%d.png' % (i+1)
    imsave(fname, img)
    end_time = time.time()
    print('Image saved as', fname)
    print('Iteration %d completed in %ds' % (i+1, end_time - start_time))
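
The loop above depends on an `evaluator` object that is not shown in this article. The sketch below is a toy stand-in, assuming a simple squared-error loss in place of the neural style loss: it computes the loss and gradient together, caches the gradient, and hands both to `fmin_l_bfgs_b` through separate methods, in the same calling pattern as the loop above.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b


class Evaluator:
    """Toy stand-in: computes loss and gradient in one pass and caches the gradient."""

    def __init__(self, target):
        self.target = target
        self._last_x = None
        self._grad = None

    def _compute(self, x):
        diff = x - self.target
        self._last_x = x.copy()
        self._grad = 2.0 * diff
        return float(np.sum(diff ** 2))

    def loss(self, x):
        return self._compute(x)

    def grads(self, x):
        # Recompute only if the optimizer asks about a different point.
        if self._last_x is None or not np.array_equal(x, self._last_x):
            self._compute(x)
        return self._grad.copy()


target = np.array([1.0, -2.0, 3.0])  # made-up values to converge to
evaluator = Evaluator(target)
x = np.zeros(3)

for i in range(3):
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(),
                                     fprime=evaluator.grads, maxfun=20)
    print("Iteration", i + 1, "loss:", min_val)
```

In the real script one forward and backward pass through the network yields both values, so caching avoids running the network twice per step.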

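This algorithm can also be applied to video.

I also came across an article, 《我是如何用TensorFlow 做出属于自己的Prisma的？》 (How I Built My Own Prisma with TensorFlow).

If you're interested, try writing it yourself.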

The code for this video is here:
Here's the initial Google DeepDream blog post:
A Deepdream web app:
The Neural Style Paper:

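I'm 不会停的蜗牛 (Alice)
A full-time homemaker born after 1985
I love artificial intelligence and I'm a doer
Working on improving my creativity, thinking, and learning skills
Likes, follows, and comments are welcome!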


Comments

庞贝船长: I've been waiting for exactly this :-)
73b646c01637: Did you actually run the code? I got an error when I ran it.
Traceback (most recent call last):
File "Network.py", line 294, in <module>
load_weights(weights_path, model)
File "Network.py", line 85, in load_weights
model.layers[k].set_weights(weights)
File "/home/shiyanlou/Code/new_project/venv/local/lib/python2.7/site-packages/keras/engine/topology.py", line 889, in set_weights
'provided weight shape ' + str(w.shape))
Exception: Layer weight shape (64, 5, 3, 3) not compatible with provided weight shape (64, 3, 3, 3)
Any idea how to fix this?
31227bb0267a: @贤贤易色就是我 Have you gotten it to run yet? I still can't. I'm using the code from GitHub, and this is my error:
Using Theano backend.
Traceback (most recent call last):
File "Network.py", line 296, in <module>
load_weights(weights_path, model)
File "Network.py", line 80, in load_weights
for k in range(f.attrs['nb_layers']):
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/feedstock_root/build_artefacts/work/h5py-2.6.0/h5py/_objects.c:2696)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/feedstock_root/build_artefacts/work/h5py-2.6.0/h5py/_objects.c:2654)
File "/home/qinzibo/anaconda2/envs/keras/lib/python2.7/site-packages/h5py/_hl/attrs.py", line 58, in __getitem__
attr = h5a.open(self._id, self._e(name))
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/feedstock_root/build_artefacts/work/h5py-2.6.0/h5py/_objects.c:2696)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/feedstock_root/build_artefacts/work/h5py-2.6.0/h5py/_objects.c:2654)
File "h5py/h5a.pyx", line 77, in h5py.h5a.open (/feedstock_root/build_artefacts/work/h5py-2.6.0/h5py/h5a.c:2184)
KeyError: "Can't open attribute (Can't locate attribute: 'nb_layers')"
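cloudway: One key point is missing: the style is computed from the correlations in the lower layers.
闫_锋: To add to that: the Gram matrix computes the correlations between channels, and the core of it all is the loss function.
cloudway: @jxSnow I ran the ones written in Lua myself. Sorry, I haven't tried this one.
jxSnow: @cloudway Were you able to get the code running? I ran into this error: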
Traceback (most recent call last):
File "Network.py", line 294, in <module>
load_weights(weights_path, model)
File "Network.py", line 85, in load_weights
model.layers[k].set_weights(weights)
File "/home/shiyanlou/Code/new_project/venv/local/lib/python2.7/site-packages/keras/engine/topology.py", line 889, in set_weights
'provided weight shape ' + str(w.shape))
Exception: Layer weight shape (64, 5, 3, 3) not compatible with provided weight shape (64, 3, 3, 3)
