[Paper notes] A guide to convolution ari


Author: 醒醒去睡吧 | Published: 2018-10-25 00:57

    Github

    Chapter 1. Introduction

    1.1 Discrete convolutions

    N: number of axes of the convolution (an N-D convolution)
    n: number of output feature maps
    m: number of input feature maps
    k_j: kernel size along axis j
    i_j: input size along axis j
    s_j: stride (distance between two consecutive positions of the kernel) along axis j
    p_j: zero padding (number of zeros concatenated at the beginning and at the end of an axis) along axis j

    1.2 Pooling

    Pooling operations reduce the size of feature maps by using some function to summarize subregions, such as taking the average or the maximum value.

    Chapter 2. Convolution arithmetic

    The analysis of the relationship between convolutional layer properties is eased by the fact that they don’t interact across axes. Because of that, this chapter will focus on the following simplified setting:

    • 2-D discrete convolutions (N = 2),
    • square inputs (i_1 = i_2 = i),
    • square kernel size (k_1 = k_2 = k),
    • same strides along both axes (s_1 = s_2 = s),
    • same zero padding along both axes (p_1 = p_2 = p).

    Note: the results outlined here also generalize to the N-D and non-square cases.

    2.1 No zero padding, unit strides (p=0, s=1)

    Relationship 1. For any i and k, and for s = 1 and p = 0, o = (i - k) + 1.

    2.2 Zero padding, unit strides (p>0, s=1)

    Relationship 2. For any i, k and p, and for s = 1,
    o = (i - k) + 2p + 1.
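As a quick sanity check on Relationships 1 and 2, the sketch below counts kernel placements directly on a zero-padded axis and compares the count with the closed form. The helper name `count_positions` is hypothetical, and only one axis is considered because the axes do not interact.

```python
def count_positions(i: int, k: int, p: int = 0) -> int:
    """Count unit-stride placements of a size-k kernel on a size-i axis
    padded with p zeros on each side."""
    padded = i + 2 * p
    # A placement starting at `start` is valid if the kernel fits entirely
    # inside the padded axis.
    return sum(1 for start in range(padded) if start + k <= padded)


# Relationship 1 (p = 0) and Relationship 2 (p > 0), both with s = 1.
for i, k, p in [(4, 3, 0), (5, 4, 2), (5, 3, 1)]:
    assert count_positions(i, k, p) == (i - k) + 2 * p + 1
```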

    2.2.1 Half (same) padding (p=\frac{k-1}{2})

    Relationship 3. For any i and for k odd (k = 2n + 1, n\in N), s = 1 and p = n, o = (i + 2p) - k + 1 = (i + 2n) - (2n + 1) + 1 = i.

    2.2.2 Full padding (p=k-1)

    Relationship 4. For any i and k, and for p = k - 1 and s = 1, o = i + 2(k - 1) - (k - 1)= i + (k - 1).
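Half and full padding are both special cases of Relationship 2, which the following sketch checks numerically. The function name `unit_stride_output_size` is an assumption made for illustration.

```python
def unit_stride_output_size(i: int, k: int, p: int) -> int:
    """Relationship 2: output size for stride 1 and zero padding p."""
    return (i - k) + 2 * p + 1


for i in range(1, 10):
    for k in (1, 3, 5, 7):  # odd kernel sizes, so (k - 1) // 2 is exact
        # Relationship 3: half (same) padding keeps the size unchanged.
        assert unit_stride_output_size(i, k, p=(k - 1) // 2) == i
        # Relationship 4: full padding grows the size by k - 1.
        assert unit_stride_output_size(i, k, p=k - 1) == i + (k - 1)
```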

    2.3 No zero padding, non-unit strides (p=0, s>1)

    Relationship 5. For any i, k and s, and for p = 0,
    o =\lfloor\frac{i - k}{s}\rfloor+1.

    2.4 Zero padding, non-unit strides (p>0, s>1)

    Relationship 6. For any i, k, p and s,
    o =\lfloor\frac{i+2p-k}{s}\rfloor+1.
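Relationship 6 covers all the earlier cases, so it can serve as a single helper. The sketch below pairs it with a naive 1-D convolution (a cross-correlation, as is common in deep learning code) to confirm that the number of outputs matches the formula. Both function names are hypothetical, and the 1-D setting is a simplification of the 2-D one used in these notes.

```python
def conv_output_size(i: int, k: int, s: int = 1, p: int = 0) -> int:
    """Relationship 6: o = floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def conv1d(x, w, s=1, p=0):
    """Naive 1-D cross-correlation with zero padding p and stride s."""
    padded = [0] * p + list(x) + [0] * p
    k = len(w)
    return [
        sum(padded[start + j] * w[j] for j in range(k))
        for start in range(0, len(padded) - k + 1, s)
    ]


x = [1, 2, 3, 4, 5]
w = [1, 0, -1]
y = conv1d(x, w, s=2, p=1)
assert len(y) == conv_output_size(len(x), len(w), s=2, p=1)
```

Floor division (`//`) is what implements the floor in the formula; it drops the last partial step when i + 2p - k is not a multiple of s.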

    Chapter 3. Pooling arithmetic

    Pooling does not involve zero padding (p=0).

    Relationship 7. For any i, k and s,
    o =\lfloor\frac{i - k}{s}\rfloor+1.
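The same arithmetic can be seen on a small example. This is a minimal sketch of 1-D max pooling without padding; `max_pool1d` is a hypothetical helper, not a library function.

```python
def max_pool1d(x, k: int, s: int):
    """Max pooling over windows of size k taken every s positions (no padding)."""
    return [max(x[start:start + k]) for start in range(0, len(x) - k + 1, s)]


x = [1, 3, 2, 5, 4, 6, 0]
pooled = max_pool1d(x, k=3, s=2)
# Relationship 7: o = floor((i - k) / s) + 1
assert len(pooled) == (len(x) - 3) // 2 + 1
```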

    Chapter 4. Transposed convolution arithmetic

    The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution.
    Note: transposed convolution properties don’t interact across axes.
    The following sections use the same simplified setting as Chapter 2.

    4.1 Convolution as a matrix operation

    4.2 Transposed convolution
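One way to picture the matrix view is the minimal sketch below, which assumes a 1-D, unit-stride, unpadded convolution (a simplification of the 2-D setting used in these notes). The convolution is written as a sparse matrix C, so the forward pass multiplies by C and the transposed convolution multiplies by C^T. All helper names are hypothetical.

```python
def conv_matrix(w, i: int):
    """Build the (o x i) matrix C of a 1-D, stride-1, unpadded convolution."""
    k = len(w)
    o = i - k + 1
    return [
        [w[col - row] if 0 <= col - row < k else 0 for col in range(i)]
        for row in range(o)
    ]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


C = conv_matrix([1, 2, 3], i=5)       # 3 x 5
y = matvec(C, [1, 0, 0, 0, 1])        # forward pass: size 5 -> size 3
x_back = matvec(transpose(C), y)      # transposed pass: size 3 -> size 5
assert (len(y), len(x_back)) == (3, 5)
```

Multiplying by C^T restores the shape of the input, not its values, which is why the transposed convolution is not an inverse.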

    4.3 No zero padding, unit strides, transposed (p=0, s=1, C^T)

    Relationship 8. A convolution described by s = 1, p = 0 and k has an associated transposed convolution described by k' = k, s' = s and p' = k - 1 and its output size is o' =i' + (k - 1):

    i\xrightarrow[]{\quad k, s=1, p=0\quad} o=i-k+1
    i'=o=i-k+1\xrightarrow[]{\; k'=k, s'=s, p'=k-1\;} o'=i'+2p'-k'+1=i

    4.4 Zero padding, unit strides, transposed (p>0, s=1, C^T)

    Relationship 9. A convolution described by s = 1, k and p has an associated transposed convolution described by k' = k, s' = s and p' = k - p - 1, and its output size is
    o' = i' + (k - 1) - 2p.

    i\xrightarrow[]{\quad k, s=1, p\quad} o=i+2p-k+1
    i'=o\xrightarrow[]{\; k'=k, s'=s, p'\;} o'=i'+2p'-k'+1=i\implies p'=k-p-1
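The choice p' = k - p - 1 can be checked by a round trip on sizes: a unit-stride convolution followed by its transposed counterpart should return the original input size. The sketch below assumes p ≤ k - 1 so that p' is non-negative; the function names are hypothetical. Setting p = 0 recovers Relationship 8.

```python
def conv_output_size(i: int, k: int, s: int = 1, p: int = 0) -> int:
    """o = floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def transposed_padding(k: int, p: int) -> int:
    """Padding p' of the equivalent direct convolution (unit strides)."""
    return k - p - 1


for i in range(3, 12):
    for k in (1, 2, 3, 5):
        for p in range(0, k):  # p <= k - 1 keeps p' >= 0
            o = conv_output_size(i, k, p=p)
            if o < 1:
                continue  # kernel does not fit; skip this configuration
            o_t = conv_output_size(o, k, p=transposed_padding(k, p))
            assert o_t == i                        # round trip restores i
            assert o_t == o + (k - 1) - 2 * p      # Relationship 9
```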

    4.4.1 Half (same) padding, transposed (p=\frac{k-1}{2}, C^T)

    Relationship 10. A convolution described by k = 2n + 1, n\in N, s = 1 and p = n has an associated transposed convolution described by k' = k, s' = s and p' = k - p - 1 = (2p + 1) - p - 1 = p, and its output size is o' = i'.

    4.4.2 Full padding, transposed (p=k-1, C^T)

    Relationship 11. A convolution described by s = 1, k and p = k - 1 has an associated transposed convolution described by k' = k, s' = s and p' = k - p - 1 = 0, and its output size is o' = i' - (k - 1).
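Relationships 10 and 11 follow from Relationship 9 by substituting the padding. A small check, assuming the hypothetical helper name below:

```python
def transposed_unit_stride_output_size(i_t: int, k: int, p: int) -> int:
    """Relationship 9: o' = i' + (k - 1) - 2p for s = 1."""
    return i_t + (k - 1) - 2 * p


for n in range(0, 4):
    k = 2 * n + 1
    for i_t in range(k, k + 5):
        # Relationship 10: half padding (p = n) leaves the size unchanged.
        assert transposed_unit_stride_output_size(i_t, k, p=n) == i_t

for k in range(1, 6):
    for i_t in range(k, k + 5):  # i' = i + k - 1 >= k under full padding
        # Relationship 11: full padding (p = k - 1) shrinks the size by k - 1.
        assert transposed_unit_stride_output_size(i_t, k, p=k - 1) == i_t - (k - 1)
```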

    4.5 No zero padding, non-unit strides, transposed (p=0, s>1, C^T)

    Relationship 12. A convolution described by p = 0, k and s, and whose input size is such that i - k is a multiple of s, has an associated transposed convolution described by \tilde{i'}, k' = k, s' = 1 and p' = k - 1, where \tilde{i'} is the size of the stretched input obtained by adding s - 1 zeros between each input unit, and its output size is o' = s(i' - 1) + k.

    i\xrightarrow[]{\quad k, s, p=0\quad} o=\lfloor\frac{i-k}{s}\rfloor+1
    \tilde{i'}=i'+(s-1)(i'-1)=s(i'-1)+1\xrightarrow[]{\; k'=k, s'=1, p'=k-1\;} o'=\lfloor\frac{\tilde{i'}+2p'-k'}{s'}\rfloor+1=s(i'-1)+k
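The stretching step is easy to get wrong by one, so here is a minimal sketch of it on a 1-D input, together with the resulting output size. The helpers `stretch` and `transposed_output_size` are hypothetical names, and p = 0 is assumed as in this section.

```python
def stretch(x, s: int):
    """Insert s - 1 zeros between consecutive elements of x."""
    out = []
    for idx, value in enumerate(x):
        if idx > 0:
            out.extend([0] * (s - 1))
        out.append(value)
    return out


def transposed_output_size(i_t: int, k: int, s: int) -> int:
    """Size after a unit-stride convolution with p' = k - 1 on the stretched input."""
    stretched = s * (i_t - 1) + 1
    p_t = k - 1
    return stretched + 2 * p_t - k + 1


x = [1, 2, 3]
assert stretch(x, 2) == [1, 0, 2, 0, 3]
assert len(stretch(x, 2)) == 2 * (len(x) - 1) + 1
# Relationship 12: o' = s(i' - 1) + k
assert transposed_output_size(len(x), k=3, s=2) == 2 * (len(x) - 1) + 3
```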

    4.6 Zero padding, non-unit strides, transposed (p>0, s>1, C^T)

    Relationship 13. A convolution described by p, k and s, and whose input size is such that i + 2p - k is a multiple of s, has an associated transposed convolution described by \tilde{i'}, k' = k, s' = 1 and p' = k - p - 1, where \tilde{i'} is the size of the stretched input obtained by adding s - 1 zeros between each input unit, and its output size is o' = s(i' - 1) + k - 2p.

    i\xrightarrow[]{\quad k, s, p\quad} o=\lfloor\frac{i+2p-k}{s}\rfloor+1
    \tilde{i'}=i'+(s-1)(i'-1)=s(i'-1)+1\xrightarrow[]{\; k'=k, s'=1, p'=k-p-1\;} o'=\lfloor\frac{\tilde{i'}+2p'-k'}{s'}\rfloor+1=s(i'-1)+k-2p

    Relationship 14. A convolution described by p, k and s has an associated transposed convolution described by a, \tilde{i'}, k' = k, s' = 1 and p' = k - p - 1, where \tilde{i'} is the size of the stretched input obtained by adding s - 1 zeros between each input unit, and a = (i + 2p - k) mod s is the number of zeros added to the bottom and right edges of the input. Its output size is o' = s(i' - 1) + a + k - 2p.
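The role of a is to disambiguate input sizes that the floor in Relationship 6 maps to the same output size. The sketch below checks that, with a included, the transposed output size recovers i for every tested configuration. Function names are hypothetical, and p ≤ k - 1 is assumed so that p' stays non-negative.

```python
def conv_output_size(i: int, k: int, s: int, p: int) -> int:
    """Relationship 6: o = floor((i + 2p - k) / s) + 1."""
    return (i + 2 * p - k) // s + 1


def transposed_output_size(i_t: int, k: int, s: int, p: int, a: int = 0) -> int:
    """Relationship 14: o' = s(i' - 1) + a + k - 2p."""
    return s * (i_t - 1) + a + k - 2 * p


for i in range(5, 20):
    for k in (2, 3, 4):
        for s in (1, 2, 3):
            for p in range(0, k):
                a = (i + 2 * p - k) % s
                o = conv_output_size(i, k, s, p)
                assert transposed_output_size(o, k, s, p, a) == i
```

With a = 0 the same function reduces to Relationship 13, and with a = 0 and p = 0 to Relationship 12.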

    Chapter 5. Miscellaneous convolutions

    5.1 Dilated convolutions

    Dilated convolutions are used to cheaply increase the receptive field of output units without increasing the kernel size. Usually d - 1 spaces are inserted between kernel elements, so that d = 1 corresponds to a regular convolution.
    A kernel of size k dilated by a factor d has an effective size \hat{k} = k + (k - 1)(d - 1).

    Relationship 15. For any i, k, p and s, and for a dilation rate d, o=\lfloor\frac{i+2p-\hat{k}}{s}\rfloor+1=\lfloor\frac{i+2p-k-(k-1)(d-1)}{s}\rfloor+1.
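Relationship 15 is Relationship 6 with k replaced by the effective kernel size, as the following sketch illustrates. The function names are assumptions made for this example.

```python
def effective_kernel_size(k: int, d: int) -> int:
    """Effective size of a size-k kernel dilated by d: k + (k - 1)(d - 1)."""
    return k + (k - 1) * (d - 1)


def dilated_conv_output_size(i: int, k: int, s: int = 1, p: int = 0, d: int = 1) -> int:
    """Relationship 15: o = floor((i + 2p - k_hat) / s) + 1."""
    return (i + 2 * p - effective_kernel_size(k, d)) // s + 1


# d = 1 is a regular convolution; d = 2 makes a 3-wide kernel span 5 positions.
assert effective_kernel_size(3, 1) == 3
assert effective_kernel_size(3, 2) == 5
assert dilated_conv_output_size(7, 3, d=2) == 3
```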

