I. Introduction

1. Content reconstructions:

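CNNs are widely used for object recognition, and as they are trained they learn to abstract an image. By the deep layers, the image is represented by information about what it actually contains, and much of the pixel-level detail has been discarded. The deeper the layer, the better the representation captures the main content of the image, so we call the representation extracted by the deep layers of the CNN the content representation.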

2. Style reconstructions:

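To obtain the style of an image, we use a feature space that captures its texture. This feature space is built on top of the filter responses in each layer of the network. It consists of the correlations between the different filter responses over the spatial extent of the feature maps.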
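The correlations between filter responses described above can be sketched as a Gram matrix: each filter's feature map is flattened into a vector, and entry (i, j) is the inner product of the vectors for filters i and j. This is a minimal pure-Python illustration, not the paper's code; the function name `gram_matrix` and the toy values are assumptions made for the example.

```python
def gram_matrix(features):
    """Return the N x N matrix of inner products between N flattened feature maps.

    features: list of N lists, each holding the M spatial responses of one filter.
    """
    n = len(features)
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy input: 2 filters, each with a 2x2 feature map flattened to length 4.
toy_features = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 3.0, 1.0, 0.0],
]
G = gram_matrix(toy_features)
print(G)  # [[5.0, 2.0], [2.0, 10.0]]
```

Because the sum runs over all spatial positions, the result records which filters fire together but not where in the image they fire, which is why it behaves like a texture descriptor.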

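The key finding of the paper is that the style and content representations are separable. That is, the style can be taken from one image and the content from another, and the two can then be combined to form a new image.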
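One way to picture this combination is as a weighted sum of two losses measured on a generated image: a content loss against one reference image and a style loss against another. The sketch below works on a single layer and uses plain Python lists as stand-ins for feature maps. The names `content_loss`, `style_loss`, `total_loss`, the weights `alpha` and `beta`, and the normalisation constants are assumptions for illustration and are not stated in these notes.

```python
def gram(features):
    """Inner products between every pair of flattened feature maps."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features] for fi in features]


def content_loss(gen, ref):
    """Half the squared difference between two sets of feature maps."""
    return 0.5 * sum(
        (g - r) ** 2 for row_g, row_r in zip(gen, ref) for g, r in zip(row_g, row_r)
    )


def style_loss(gen, ref):
    """Squared difference between Gram matrices, scaled by filter count and map size."""
    n, m = len(gen), len(gen[0])
    g_gen, g_ref = gram(gen), gram(ref)
    sq = sum((a - b) ** 2 for ra, rb in zip(g_gen, g_ref) for a, b in zip(ra, rb))
    return sq / (4.0 * n ** 2 * m ** 2)


def total_loss(gen, content_ref, style_ref, alpha=1.0, beta=1000.0):
    """Weighted sum; alpha and beta are hypothetical weights trading content against style."""
    return alpha * content_loss(gen, content_ref) + beta * style_loss(gen, style_ref)


# Toy features: 1 filter with 2 spatial positions (values are made up).
generated = [[1.0, 0.0]]
zeros = [[0.0, 0.0]]
print(total_loss(generated, zeros, zeros))  # 63.0
```

In a full pipeline the generated image would be updated by gradient descent on this total, with the features recomputed by the network at each step; that loop is omitted here.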

II. Method

1. Architecture

(1) style reconstructions

We used the feature space provided by the 16 convolutional and 5 pooling layers of the 19 layer VGG-Network. We do not use any of the fully connected layers.
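As a quick check of the layer counts quoted above, the convolutional part of the 19-layer VGG network can be written as a flat list and counted. The list below follows the commonly published VGG-19 layout (integers are output channel counts of 3x3 convolutions, "M" marks a pooling layer); the variable names are chosen for this example only.

```python
# Commonly published VGG-19 feature-extractor layout.
# Integers: output channels of a convolutional layer. "M": a pooling layer.
VGG19_FEATURES = [
    64, 64, "M",
    128, 128, "M",
    256, 256, 256, 256, "M",
    512, 512, 512, 512, "M",
    512, 512, 512, 512, "M",
]

num_conv = sum(1 for layer in VGG19_FEATURES if layer != "M")
num_pool = VGG19_FEATURES.count("M")
num_fully_connected = 3  # present in the classifier, unused here

print(num_conv, num_pool)              # 16 5
print(num_conv + num_fully_connected)  # 19
```

The "19" in the network's name counts the 16 convolutional layers plus the 3 fully connected layers; dropping the fully connected layers leaves exactly the 16 convolutional and 5 pooling layers mentioned in the quote.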

(2) image synthesis

We found that replacing the max-pooling operation by average pooling improves the gradient flow and one obtains slightly more appealing results, which is why the images shown were generated with average pooling.
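To make the difference between the two pooling operations concrete, here is a small sketch of non-overlapping 2x2 pooling on a toy feature map. The helper `pool2x2` and the input values are hypothetical; a real implementation would use a deep learning framework's pooling layers.

```python
def pool2x2(fmap, reduce_fn):
    """Apply non-overlapping 2x2 pooling (stride 2) to a 2-D list of floats."""
    rows, cols = len(fmap), len(fmap[0])
    return [
        [
            reduce_fn([fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1]])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def mean(values):
    return sum(values) / len(values)


# Toy 4x4 feature map (made-up values).
fmap = [
    [1.0, 3.0, 0.0, 2.0],
    [5.0, 7.0, 4.0, 2.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 8.0, 1.0, 1.0],
]

max_pooled = pool2x2(fmap, max)
avg_pooled = pool2x2(fmap, mean)
print(max_pooled)  # [[7.0, 4.0], [8.0, 1.0]]
print(avg_pooled)  # [[4.0, 2.0], [2.0, 1.0]]
```

Max pooling passes through only the largest value in each window, so during backpropagation only that one position receives gradient. Average pooling depends on all four values, so every position receives a share, which is one plausible reading of the "improves the gradient flow" remark in the quote.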