[Paper notes] A guide to convolution ari

Author: 醒醒去睡吧 | Published: 2018-10-25 00:57


Chapter 1. Introduction

1.1 Discrete convolutions

$N$ : number of spatial dimensions (an $N$-D convolution)
$n$ : number of output feature maps
$m$ : number of input feature maps
$k_j$ : kernel size along axis j
$i_j$ : input size along axis j
$s_j$ : stride (distance between two consecutive positions of the kernel) along axis j
$p_j$ : zero padding (number of zeros concatenated at the beginning and at the end of an axis) along axis j

1.2 Pooling

Pooling operations reduce the size of feature maps by using some function to summarize subregions, such as taking the average or the maximum value.

Chapter 2. Convolution arithmetic

The analysis of the relationship between convolutional layer properties is eased by the fact that they don’t interact across axes. Because of that, this chapter will focus on the following simplified setting:

2-D discrete convolutions ( $N = 2$ ),
square inputs ( $i_1 = i_2 = i$ ),
square kernel size ( $k_1 = k_2 = k$ ),
same strides along both axes ( $s_1 = s_2 = s$ ),
same zero padding along both axes ( $p_1 = p_2 = p$ ).

Note: the results outlined here also generalize to the $N$ -D and non-square cases.

2.1 No zero padding, unit strides ( $p=0, s=1$ )

Relationship 1. For any $i$ and $k$ , and for $s = 1$ and $p = 0$ , $o = (i - k) + 1$ .

2.2 Zero padding, unit strides ( $p>0, s=1$ )

Relationship 2. For any $i$ , $k$ and $p$ , and for $s = 1$ ,
$o = (i - k) + 2p + 1$ .

2.2.1 Half (same) padding ( $p=\frac{k-1}{2}$ )

Relationship 3. For any $i$ and for $k$ odd ( $k = 2n + 1$ , $n\in \mathbb{N}$ ), $s = 1$ and $p = n$ , $o= (i+2p)-k+1=(i+2n)-(2n+1)+1=i$ .

2.2.2 Full padding ( $p=k-1$ )

Relationship 4. For any $i$ and $k$ , and for $p = k - 1$ and $s = 1$ , $o = i + 2(k - 1) - (k - 1)= i + (k - 1)$ .
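The three unit-stride padding regimes above (none, half, full) can be checked on a small 1-D example. This is a minimal sketch, not code from the paper: the helper `conv1d_unit_stride` and the sample values are hypothetical, and the kernel is applied without flipping, as is usual in deep learning.

```python
def conv1d_unit_stride(x, w, p=0):
    """Slide kernel w over x with stride 1, after zero padding x by p on each side.

    Hypothetical helper for illustration; the kernel is not flipped.
    """
    padded = [0.0] * p + list(x) + [0.0] * p
    k = len(w)
    return [
        sum(padded[j + t] * w[t] for t in range(k))
        for j in range(len(padded) - k + 1)
    ]


x = [1.0, 2.0, 3.0, 4.0, 5.0]  # i = 5
w = [1.0, 0.0, -1.0]           # k = 3
k = len(w)

no_pad = conv1d_unit_stride(x, w, p=0)             # Relationship 1: o = i - k + 1
half = conv1d_unit_stride(x, w, p=(k - 1) // 2)    # Relationship 3: o = i
full = conv1d_unit_stride(x, w, p=k - 1)           # Relationship 4: o = i + (k - 1)

print(len(no_pad), len(half), len(full))  # 3 5 7
```

Only the output lengths matter here; they match $i-k+1=3$ , $i=5$ and $i+(k-1)=7$ .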

2.3 No zero padding, non-unit strides ( $p=0, s>1$ )

Relationship 5. For any $i$ , $k$ and $s$ , and for $p = 0$ ,
$o =\lfloor\frac{i - k}{s}\rfloor+1$ .

2.4 Zero padding, non-unit strides ( $p>0, s>1$ )

Relationship 6. For any $i$ , $k$ , $p$ and $s$ ,
$o =\lfloor\frac{i+2p-k}{s}\rfloor+1$ .
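Relationship 6 subsumes Relationships 1 to 5, so a single helper is enough to compute every output size in this chapter. The sketch below is illustrative; the function name `conv_output_size` and the sample sizes are assumptions, not part of the paper.

```python
def conv_output_size(i, k, s=1, p=0):
    """Output size along one axis (Relationship 6): floor((i + 2p - k) / s) + 1."""
    if i + 2 * p < k:
        raise ValueError("kernel is larger than the padded input")
    return (i + 2 * p - k) // s + 1


print(conv_output_size(4, 3))            # 2  (p = 0, s = 1)
print(conv_output_size(5, 4, p=2))       # 6  (p > 0, s = 1)
print(conv_output_size(5, 3, s=2))       # 2  (p = 0, s > 1)
print(conv_output_size(5, 3, s=2, p=1))  # 3  (p > 0, s > 1)
print(conv_output_size(6, 3, s=2, p=1))  # 3  (same as i = 5 because of the floor)
```

The last two calls show the effect of the floor: inputs of size 5 and 6 produce the same output size, which is why Chapter 4 needs the extra parameter $a$ to recover the original input size.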

Chapter 3. Pooling arithmetic

Pooling does not involve zero padding ( $p=0$ ).

Relationship 7. For any $i$ , $k$ and $s$ ,
$o =\lfloor\frac{i - k}{s}\rfloor+1$ .
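As a rough illustration of Relationship 7, here is a plain-Python max pooling over a square input. The helper `max_pool2d` and the 5x5 test input are hypothetical and chosen only to make the sizes easy to follow.

```python
def max_pool2d(x, k, s):
    """Max pooling over a square 2-D list x with window size k and stride s (no padding)."""
    i = len(x)
    o = (i - k) // s + 1  # Relationship 7
    return [
        [
            max(x[r * s + dr][c * s + dc] for dr in range(k) for dc in range(k))
            for c in range(o)
        ]
        for r in range(o)
    ]


x = [[r * 5 + c for c in range(5)] for r in range(5)]  # 5x5 input holding 0..24

pooled_s1 = max_pool2d(x, k=3, s=1)
pooled_s2 = max_pool2d(x, k=3, s=2)

print(len(pooled_s1), len(pooled_s2))  # 3 2
print(pooled_s2)                       # [[12, 14], [22, 24]]
```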

Chapter 4. Transposed convolution arithmetic

The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution.
Note: as with regular convolutions, transposed convolution properties don’t interact across axes, so the rest of this chapter reuses the simplified setting of Chapter 2.

4.1 Convolution as a matrix operation
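One way to make the matrix view concrete is to unroll a 3x3 kernel over a flattened 4x4 input (unit stride, no padding), which gives a sparse 4x16 matrix $C$ . Multiplying by $C$ performs the convolution, and multiplying by $C^T$ maps a 4-dimensional vector back to 16 dimensions. The helper names and the kernel values below are hypothetical; this is a sketch, not an efficient implementation.

```python
def conv_matrix(w, i):
    """Unroll a k x k kernel w into a matrix acting on a flattened i x i input (s = 1, p = 0)."""
    k = len(w)
    o = i - k + 1
    rows = []
    for r in range(o):
        for c in range(o):
            row = [0.0] * (i * i)
            for dr in range(k):
                for dc in range(k):
                    row[(r + dr) * i + (c + dc)] = w[dr][dc]
            rows.append(row)
    return rows


def matvec(m, v):
    """Matrix-vector product on plain lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    return [list(col) for col in zip(*m)]


w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
x = [float(v) for v in range(16)]  # flattened 4x4 input

C = conv_matrix(w, i=4)
y = matvec(C, x)                # forward pass: 16 values -> 4 values (a 2x2 output)
back = matvec(transpose(C), y)  # transposed pass: 4 values -> 16 values

print(len(C), len(C[0]), len(y), len(back))  # 4 16 4 16
```

Note that `back` only has the shape of the original input; the transposed operation restores connectivity and size, not the input values.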

4.2 Transposed convolution

4.3 No zero padding, unit strides, transposed ( $p=0, s=1, C^T$ )

Relationship 8. A convolution described by $s = 1$ , $p = 0$ and $k$ has an associated transposed convolution described by $k' = k$ , $s' = s$ and $p' = k - 1$ and its output size is $o' =i' + (k - 1)$ :

$i\xrightarrow[]{\quad k, s=1, p=0\quad} o=i-k+1$
$i'=o=i-k+1\xrightarrow[]{\; k'=k, s'=s, p'=k-1\;} o'=i'+2p'-k'+1=i$

4.4 Zero padding, unit strides, transposed ( $p>0, s=1, C^T$ )

Relationship 9. A convolution described by $s = 1$ , $k$ and $p$ has an associated transposed convolution described by $k' = k$ , $s' = s$ and $p' = k - p - 1$ and its output size is
$o' = i' + (k - 1) - 2p$ .

$i\xrightarrow[]{\quad k, s=1, p\quad} o=i+2p-k+1$
$i'=o\xrightarrow[]{\; k'=k, s'=s, p'\;} o'=i'+2p'-k'+1=i\implies p'=k-p-1$

4.4.1 Half (same) padding, transposed ( $p=\frac{k-1}{2}, C^T$ )

Relationship 10. A convolution described by $k = 2n+1$ , $n\in \mathbb{N}$ , $s = 1$ and $p = n$ has an associated transposed convolution described by $k'= k$ , $s'= s$ and $p' = k-p-1=(2p+1)-p-1=p$ and its output size is $o'=i'$ .

4.4.2 Full padding, transposed ( $p=k-1, C^T$ )

Relationship 11. A convolution described by $s = 1$ , $k$ and $p =k-1$ has an associated transposed convolution described by $k' = k$ , $s' = s$ and $p' = k-p-1=0$ and its output size is $o'=i'-(k-1)$ .
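The unit-stride cases (Relationships 8 to 11) all follow from $p' = k - p - 1$ , so they can be checked with a round trip: apply the convolution size formula, then the transposed one, and confirm the original input size comes back. The function names are hypothetical, and the sketch assumes $p \le k - 1$ so that $p'$ is non-negative.

```python
def conv_size_unit_stride(i, k, p=0):
    """Relationship 2: o = (i - k) + 2p + 1."""
    return i + 2 * p - k + 1


def transposed_size_unit_stride(i_t, k, p=0):
    """Size of the transposed convolution: a direct convolution with k' = k, s' = 1, p' = k - p - 1."""
    p_t = k - p - 1
    return conv_size_unit_stride(i_t, k, p_t)


cases = [
    ("no padding", 4, 3, 0),    # Relationship 8
    ("arbitrary padding", 5, 4, 2),  # Relationship 9
    ("half padding", 5, 3, 1),  # Relationship 10
    ("full padding", 5, 3, 2),  # Relationship 11
]
for name, i, k, p in cases:
    o = conv_size_unit_stride(i, k, p)
    back = transposed_size_unit_stride(o, k, p)
    print(f"{name}: i={i} -> o={o} -> o'={back}")
```

In every case `back` equals the original `i`, which is the defining property of the associated transposed convolution.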

4.5 No zero padding, non-unit strides, transposed ( $p=0, s>1, C^T$ )

Relationship 12. A convolution described by $p=0$ , $k$ and $s$ and whose input size is such that $i-k$ is a multiple of $s$ , has an associated transposed convolution described by $\tilde{i'}$ , $k' = k$ , $s' = 1$ and $p' = k-1$ , where $\tilde{i'}$ is the size of the stretched input obtained by adding $s-1$ zeros between each input unit, and its output size is $o'= s(i'-1)+k$ .

$i\xrightarrow[]{\quad k, s, p=0\quad} o=\lfloor\frac{i-k}{s}\rfloor+1$
$\tilde{i'}=i'+(s-1)(i'-1)=s(i'-1)+1\xrightarrow[]{\; k'=k, s'=1, p'=k-1\;} o'=\lfloor\frac{\tilde{i'}+2p'-k'}{s'}\rfloor+1=s(i'-1)+k$

4.6 Zero padding, non-unit strides, transposed ( $p>0, s>1, C^T$ )

Relationship 13. A convolution described by $p$ , $k$ and $s$ and whose input size is such that $i+2p-k$ is a multiple of $s$ , has an associated transposed convolution described by $\tilde{i'}$ , $k' = k$ , $s' = 1$ and $p' = k-p-1$ , where $\tilde{i'}$ is the size of the stretched input obtained by adding $s-1$ zeros between each input unit, and its output size is $o'= s(i'-1)+k-2p$ .

$i\xrightarrow[]{\quad k, s, p\quad} o=\lfloor\frac{i+2p-k}{s}\rfloor+1$
$\tilde{i'}=i'+(s-1)(i'-1)=s(i'-1)+1\xrightarrow[]{\; k'=k, s'=1, p'=k-p-1\;} o'=\lfloor\frac{\tilde{i'}+2p'-k'}{s'}\rfloor+1=s(i'-1)+k-2p$
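The stretched-input construction in Relationships 12 and 13 can be sketched in 1-D: insert $s-1$ zeros between input units, pad by $p' = k-p-1$ , and run a unit-stride convolution. All helper names here are hypothetical. The kernel is flipped before the direct convolution so that the values match multiplication by $C^T$ ; the output sizes do not depend on that choice.

```python
def stretch(x, s):
    """Insert s - 1 zeros between consecutive input units."""
    out = []
    for idx, v in enumerate(x):
        if idx > 0:
            out.extend([0.0] * (s - 1))
        out.append(v)
    return out


def conv1d_unit_stride(x, w, p):
    """Unit-stride sliding window with zero padding p on both sides (kernel not flipped)."""
    padded = [0.0] * p + list(x) + [0.0] * p
    k = len(w)
    return [
        sum(padded[j + t] * w[t] for t in range(k))
        for j in range(len(padded) - k + 1)
    ]


def transposed_conv1d(x, w, s, p):
    """Transposed convolution emulated as a direct one with k' = k, s' = 1, p' = k - p - 1."""
    k = len(w)
    flipped = list(reversed(w))  # flipping makes the values agree with C^T
    return conv1d_unit_stride(stretch(x, s), flipped, p=k - p - 1)


x = [1.0, 2.0, 3.0]  # i' = 3
w = [1.0, 1.0, 1.0]  # k = 3

out_p0 = transposed_conv1d(x, w, s=2, p=0)  # expected size s(i'-1)+k = 7
out_p1 = transposed_conv1d(x, w, s=2, p=1)  # expected size s(i'-1)+k-2p = 5

print(len(stretch(x, 2)), len(out_p0), len(out_p1))  # 5 7 5
print(out_p0)  # [1.0, 1.0, 3.0, 2.0, 5.0, 3.0, 3.0]
```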

Relationship 14. A convolution described by $p$ , $k$ and $s$ has an associated transposed convolution described by $a$ , $\tilde{i'}$ , $k' = k$ , $s' = 1$ and $p' = k-p-1$ , where $\tilde{i'}$ is the size of the stretched input obtained by adding $s-1$ zeros between each input unit, and $a=(i+2p-k) \bmod s$ represents the number of zeros added to the bottom and right edges of the input. Its output size is $o'= s(i'-1)+a+k-2p$ .
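Relationship 14 can be checked numerically: for a fixed $k$ , $s$ and $p$ , every input size $i$ should be recovered once $a$ is taken into account. The two function names below are hypothetical, and the parameter values are arbitrary examples.

```python
def conv_output_size(i, k, s, p):
    """Relationship 6: floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def transposed_output_size(i_t, k, s, p, a=0):
    """Relationship 14: o' = s(i' - 1) + a + k - 2p."""
    return s * (i_t - 1) + a + k - 2 * p


k, s, p = 3, 2, 1
for i in range(5, 9):
    o = conv_output_size(i, k, s, p)
    a = (i + 2 * p - k) % s  # zeros added to the bottom and right edges
    recovered = transposed_output_size(o, k, s, p, a)
    print(f"i={i} -> o={o}, a={a} -> o'={recovered}")
```

With $a = 0$ the formula reduces to Relationship 13. Without $a$ , the inputs $i=5$ and $i=6$ would both map back to 5, because they share the same convolution output size.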

Chapter 5. Miscellaneous convolutions

5.1 Dilated convolutions

Dilated convolutions are used to cheaply increase the receptive field of output units without increasing the kernel size. They insert $d-1$ spaces between kernel elements, so that $d = 1$ corresponds to a regular convolution.
A kernel of size $k$ dilated by a factor $d$ has an effective size $\hat{k}=k+(k-1)(d-1)$ .

Relationship 15. For any $i$ , $k$ , $p$ and $s$ , and for a dilation rate $d$ , $o=\lfloor\frac{i+2p-\hat{k}}{s}\rfloor+1=\lfloor\frac{i+2p-k-(k-1)(d-1)}{s}\rfloor+1$ .
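Relationship 15 is Relationship 6 with $k$ replaced by the effective kernel size $\hat{k}$ . A minimal sketch, with hypothetical function names and example sizes:

```python
def effective_kernel_size(k, d):
    """Effective size of a kernel of size k dilated by a factor d."""
    return k + (k - 1) * (d - 1)


def dilated_conv_output_size(i, k, s=1, p=0, d=1):
    """Relationship 15: floor((i + 2p - k_hat) / s) + 1."""
    k_hat = effective_kernel_size(k, d)
    return (i + 2 * p - k_hat) // s + 1


print(effective_kernel_size(3, 2))           # 5
print(dilated_conv_output_size(7, 3, d=2))   # 3
print(dilated_conv_output_size(7, 3, d=1))   # 5  (d = 1 is a regular convolution)
```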
