Reading Notes: Image Style Transfer Using C


Author: Jeffery_李俊峰 | Published: 2018-12-04 11:31


Figure 1. Image representations in a Convolutional Neural Network (CNN). A given input image is represented as a set of filtered images at each processing stage in the CNN. While the number of different filters increases along the processing hierarchy, the size of the filtered images is reduced by some downsampling mechanism (e.g. max-pooling), leading to a decrease in the total number of units per layer of the network.

Content Reconstructions. We can visualise the information at different processing stages in the CNN by reconstructing the input image from only knowing the network’s responses in a particular layer. We reconstruct the input image from layers ‘conv1_2’ (a), ‘conv2_2’ (b), ‘conv3_2’ (c), ‘conv4_2’ (d) and ‘conv5_2’ (e) of the original VGG-Network. We find that reconstruction from lower layers is almost perfect (a–c). In higher layers of the network, detailed pixel information is lost while the high-level content of the image is preserved (d,e).

Style Reconstructions. On top of the original CNN activations we use a feature space that captures the texture information of an input image. The style representation computes correlations between the different features in different layers of the CNN. We reconstruct the style of the input image from a style representation built on different subsets of CNN layers (‘conv1_1’ (a), ‘conv1_1’ and ‘conv2_1’ (b), ‘conv1_1’, ‘conv2_1’ and ‘conv3_1’ (c), ‘conv1_1’, ‘conv2_1’, ‘conv3_1’ and ‘conv4_1’ (d), ‘conv1_1’, ‘conv2_1’, ‘conv3_1’, ‘conv4_1’ and ‘conv5_1’ (e)). This creates images that match the style of a given image on an increasing scale while discarding information of the global arrangement of the scene.





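Unlike content reconstruction, the style features here are produced from the outputs of several layers.

This part explains each of the variables in the optimisation objective.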
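To make the multi-layer style features concrete, here is a minimal sketch of the correlation (Gram matrix) computation, G[i][j] = sum over positions k of F[i][k] * F[j][k]. It assumes each layer's response has already been flattened to "filters by positions". The function names and toy numbers are my own illustration, not the paper's code, and plain Python lists stand in for real CNN activations.

```python
def gram_matrix(features):
    """features: list of N feature maps, each flattened to M floats.

    Returns the N x N matrix of inner products between feature maps.
    """
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def style_representation(layer_features):
    """Style features use several layers: one Gram matrix per layer."""
    return [gram_matrix(f) for f in layer_features]


# Toy activations (hypothetical numbers, not real VGG outputs).
layer1 = [[1.0, 2.0, 3.0], [0.0, 1.0, 0.0]]      # 2 filters, 3 positions
layer2 = [[1.0, 1.0], [2.0, 0.0], [0.0, 3.0]]    # 3 filters, 2 positions
grams = style_representation([layer1, layer2])

# Moving every filter's responses to new positions in the same way
# leaves the Gram matrix unchanged, so spatial layout is discarded.
layer1_shuffled = [[3.0, 1.0, 2.0], [0.0, 0.0, 1.0]]
same_style = gram_matrix(layer1_shuffled) == grams[0]
```

The last two lines illustrate why this representation keeps texture but drops the global arrangement of the scene: the sum over positions does not care where a response occurred.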

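These two experiments show that, in content reconstruction, what is captured depends on the level of the content representation being matched: higher layers capture higher-level content (the objects in the image and their arrangement) without constraining the exact pixel values. That is what we want. Although we have to represent the content of the target image, we are not simply laying a coat of colour over it; we need to re-render the target image while keeping the content it expresses. In the style reconstruction experiment, matching style features from more layers, reaching up to higher layers, preserves more of the local image structure and gives a smoother, more continuous visual experience.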


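In short, the source image on the left is fed into the network to obtain the style features (produced jointly from the outputs of several layers, in a feature space built from the correlations between different filter responses), and the target image on the right is fed into the network to obtain the content features (produced from the output of a single layer). The objective function is then built from these two sets of features.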
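For reference, this is the objective as I understand it from the paper, written in its notation. Here p is the content (target) image, a is the style (source) image, x is the image being generated, F^l, P^l and S^l are the layer-l feature maps of x, p and a (S^l is my own symbol for the style image's features), G^l and A^l are the Gram matrices of x and a, N_l is the number of filters in layer l, and M_l is the size of each feature map. Treat it as a sketch to check against the paper rather than a verbatim copy.

```latex
\begin{aligned}
\mathcal{L}_{\text{content}}(\vec{p}, \vec{x}, l)
  &= \frac{1}{2} \sum_{i,j} \left( F^{l}_{ij} - P^{l}_{ij} \right)^{2} \\
G^{l}_{ij} &= \sum_{k} F^{l}_{ik} F^{l}_{jk},
\qquad A^{l}_{ij} = \sum_{k} S^{l}_{ik} S^{l}_{jk} \\
E_{l} &= \frac{1}{4 N_{l}^{2} M_{l}^{2}} \sum_{i,j} \left( G^{l}_{ij} - A^{l}_{ij} \right)^{2} \\
\mathcal{L}_{\text{style}}(\vec{a}, \vec{x}) &= \sum_{l} w_{l} E_{l} \\
\mathcal{L}_{\text{total}}(\vec{p}, \vec{a}, \vec{x})
  &= \alpha \, \mathcal{L}_{\text{content}}(\vec{p}, \vec{x})
   + \beta \, \mathcal{L}_{\text{style}}(\vec{a}, \vec{x})
\end{aligned}
```

The content term uses a single layer, while the style term sums over several layers with weights w_l, which matches the distinction drawn in the text.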



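The experiments that follow vary the relative weights of the content loss and the style loss, change the layer used for the content features, study different initialisations of the image x, and apply the method to photorealistic style transfer.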
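As a rough illustration of the weighting experiment, the sketch below evaluates a weighted sum of a content loss and a style loss on tiny hand-made feature maps for two different ratios alpha/beta. All names and numbers are hypothetical, plain Python lists stand in for CNN activations, and the normalisation follows my reading of the paper's loss definitions.

```python
def gram(F):
    """Inner products between every pair of flattened feature maps."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in F] for fi in F]


def squared_diff(A, B):
    """Sum of squared element-wise differences of two equal-shape matrices."""
    return sum((a - b) ** 2 for ra, rb in zip(A, B) for a, b in zip(ra, rb))


def content_loss(F, P):
    """Half the squared error between generated and content features."""
    return 0.5 * squared_diff(F, P)


def style_loss(x_layers, a_layers, layer_weights):
    """Weighted sum over layers of normalised Gram-matrix differences."""
    total = 0.0
    for F, S, w in zip(x_layers, a_layers, layer_weights):
        n, m = len(F), len(F[0])  # number of filters, size of each map
        total += w * squared_diff(gram(F), gram(S)) / (4.0 * n ** 2 * m ** 2)
    return total


def total_loss(alpha, beta, F, P, x_layers, a_layers, layer_weights):
    return (alpha * content_loss(F, P)
            + beta * style_loss(x_layers, a_layers, layer_weights))


# Toy features (hypothetical): one content layer, one style layer.
F = [[1.0, 2.0], [3.0, 4.0]]           # generated image, content layer
P = [[1.0, 0.0], [3.0, 4.0]]           # content image, same layer
x_layers = [[[1.0, 0.0], [0.0, 1.0]]]  # generated image, style layers
a_layers = [[[1.0, 1.0], [1.0, 1.0]]]  # style image, style layers
weights = [1.0]

# A smaller alpha/beta ratio puts more emphasis on matching the style.
for alpha, beta in [(1.0, 10.0), (1.0, 1000.0)]:
    loss = total_loss(alpha, beta, F, P, x_layers, a_layers, weights)
    print(f"alpha/beta = {alpha / beta:g}: total loss = {loss}")
```

In the actual method this loss would be minimised with respect to the pixels of x by gradient descent; the sketch only evaluates it to show how the two weights trade off.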



