
Winograd Convolution Derivation - From 1D to 2D

Author: MatrixOnEarth | Published 2020-03-05 09:00

1D Winograd Convolution

The 1D Winograd algorithm has been discussed in many articles, and quite thoroughly, so we will not repeat the derivation here and only state the results.

Input: a 4-element signal \vec{d} = [d_{0}, d_{1}, d_{2}, d_{3}]^{T}
Kernel: a 3-element vector \vec{k} = [k_{0}, k_{1}, k_{2}]^{T}
Output: a 2-element signal \vec{r} = [r_0, r_1]^T
\vec{r} can be written as:
\vec{r} = A^T[(G\vec{k}) \odot (B^T\vec{d})]
where:
G = \left[ \begin{matrix} 1 & 0 & 0 \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ 0 & 0 & 1 \end{matrix} \right]_{4\times3}
B^{T} = \left[ \begin{matrix} 1 & 0 & -1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & -1 \end{matrix} \right]_{4\times4}
A^{T}= \left[ \begin{matrix} 1 & 1 & 1 & 0 \\ 0 & 1 & -1 & -1 \end{matrix} \right]_{2\times4}
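As a quick sanity check, here is a minimal numpy sketch (ours, not part of the original derivation) that verifies the F(2, 3) formula and the three matrices above against a direct sliding-window computation. Note that, as is conventional in the Winograd literature, "convolution" here means correlation, with no kernel flip:

```python
import numpy as np

# F(2, 3) transform matrices, copied from the formulas above.
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
Bt = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
At = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)

d = np.random.rand(4)  # 4-element input signal
k = np.random.rand(3)  # 3-tap kernel

# Winograd: 4 elementwise multiplies instead of the direct method's 6.
r = At @ ((G @ k) * (Bt @ d))
assert np.allclose(r, np.correlate(d, k, mode='valid'))
```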

2D Winograd Convolution

2D Winograd can be obtained by extending 1D Winograd, so to solve the 2D problem we first revisit the problem that the 1D algorithm solves. To restate it:
Consider a 1D convolution with a kernel of size 3, producing 2 output points at a time; this problem is denoted F(2, 3).
Since the output has 2 points and the kernel has 3, the corresponding input must have 4 points, so the problem is stated as:

Input: a 4-element signal \vec{d} = [d_{0}, d_{1}, d_{2}, d_{3}]^{T}
Kernel: a 3-element vector \vec{k} = [k_{0}, k_{1}, k_{2}]^{T}
The convolution can therefore be written in matrix-multiplication form:
\left[ \begin{matrix} d_{0} & d_{1} & d_{2} \\ d_{1} & d_{2} & d_{3} \end{matrix} \right] \left[ \begin{matrix} k_{0} \\ k_{1} \\ k_{2} \end{matrix} \right] = \left[ \begin{matrix} r_{0} \\ r_{1} \end{matrix} \right] = D\vec{k}
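In code, the matrix D is simply the two sliding windows of \vec{d} stacked on top of each other. A small illustrative numpy sketch (ours):

```python
import numpy as np

d = np.random.rand(4)
k = np.random.rand(3)

# The 2x3 matrix D above: one sliding window of d per output point.
D = np.stack([d[0:3], d[1:4]])
assert np.allclose(D @ k, np.correlate(d, k, mode='valid'))
```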

Keep this form in mind: it is the problem the Winograd algorithm solves, and the 2D algorithm below will be reduced to it.
We now define the 2D convolution problem by extending the 1D one by a dimension:
Consider a 2D convolution with a 3x3 kernel, producing 2x2 output points at a time; this problem is denoted F(2x2, 3x3).
Since the output is 2x2 and the kernel is 3x3, the corresponding input must be 4x4, so the problem is stated as:

Input: D = \left[ \begin{matrix} d_{00} & d_{01} & d_{02} & d_{03} \\ d_{10} & d_{11} & d_{12} & d_{13} \\ d_{20} & d_{21} & d_{22} & d_{23} \\ d_{30} & d_{31} & d_{32} & d_{33} \end{matrix}\right]
Kernel: K = \left[ \begin{matrix} k_{00} & k_{01} & k_{02} \\ k_{10} & k_{11} & k_{12} \\ k_{20} & k_{21} & k_{22} \end{matrix}\right]
The convolution can therefore be written in matrix-multiplication form:
\left[ \begin{matrix} d_{00} & d_{01} & d_{02} & d_{10} & d_{11} & d_{12} & d_{20} & d_{21} & d_{22} \\ d_{01} & d_{02} & d_{03} & d_{11} & d_{12} & d_{13} & d_{21} & d_{22} & d_{23} \\ d_{10} & d_{11} & d_{12} & d_{20} & d_{21} & d_{22} & d_{30} & d_{31} & d_{32} \\ d_{11} & d_{12} & d_{13} & d_{21} & d_{22} & d_{23} & d_{31} & d_{32} & d_{33} \\ \end{matrix} \right] \left[ \begin{matrix} k_{00} \\ k_{01} \\ k_{02} \\ k_{10} \\ k_{11} \\ k_{12} \\ k_{20} \\ k_{21} \\ k_{22} \\ \end{matrix} \right] = \left[ \begin{matrix} r_{00} \\ r_{01} \\ r_{10} \\ r_{11} \\ \end{matrix} \right]
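This 4x9 matrix is exactly what an im2col lowering produces: one flattened 3x3 window of the input per output point. A small numpy sketch (ours) to make that concrete:

```python
import numpy as np

D = np.random.rand(4, 4)  # 4x4 input tile
K = np.random.rand(3, 3)  # 3x3 kernel

# Each row of the 4x9 matrix above is one flattened 3x3 window of D.
im2col = np.stack([D[i:i+3, j:j+3].ravel() for i in (0, 1) for j in (0, 1)])
r = im2col @ K.ravel()  # [r00, r01, r10, r11]

# Direct 2D correlation over each window, for reference.
ref = np.array([np.sum(D[i:i+3, j:j+3] * K) for i in (0, 1) for j in (0, 1)])
assert np.allclose(r, ref)
```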

In this equation we can already see the shadow of the 1D convolution, and it becomes even clearer once we partition the matrices into blocks.
\left[\begin{array}{ccc|ccc|ccc} d_{00} & d_{01} & d_{02} & d_{10} & d_{11} & d_{12} & d_{20} & d_{21} & d_{22} \\ d_{01} & d_{02} & d_{03} & d_{11} & d_{12} & d_{13} & d_{21} & d_{22} & d_{23} \\ \hline d_{10} & d_{11} & d_{12} & d_{20} & d_{21} & d_{22} & d_{30} & d_{31} & d_{32} \\ d_{11} & d_{12} & d_{13} & d_{21} & d_{22} & d_{23} & d_{31} & d_{32} & d_{33} \\ \end{array} \right] \left[ \begin{matrix} k_{00} \\ k_{01} \\ k_{02} \\ \hline k_{10} \\ k_{11} \\ k_{12} \\ \hline k_{20} \\ k_{21} \\ k_{22} \\ \end{matrix} \right] = \left[ \begin{matrix} r_{00} \\ r_{01} \\ \hline r_{10} \\ r_{11} \\ \end{matrix} \right]
To make it more obvious still, we write it as a block matrix multiplication:
\left[ \begin{matrix} D_{00} & D_{10} & D_{20}\\ D_{10} & D_{20} & D_{30} \end{matrix} \right] \left[ \begin{matrix} \vec{k_{0}} \\ \vec{k_{1}} \\ \vec{k_{2}} \end{matrix} \right] = \left[ \begin{matrix} \vec{r_{0}} \\ \vec{r_{1}} \end{matrix} \right]
We have thus derived a formula for the 2D convolution that has exactly the same shape as the 1D one; the scalars of the 1D case have simply become small matrices and vectors.

For Practitioners

For practitioners, this form is already enough to start writing code.
From 1D Winograd we know the equation can be rewritten in Winograd form, as follows:
\left[ \begin{matrix} D_{00} & D_{10} & D_{20}\\ D_{10} & D_{20} & D_{30} \end{matrix} \right] \left[ \begin{matrix} \vec{k_{0}} \\ \vec{k_{1}} \\ \vec{k_{2}} \end{matrix} \right] = \left[ \begin{matrix} \vec{r_{0}} \\ \vec{r_{1}} \end{matrix} \right] = \left[ \begin{matrix} M_{0} + M_{1} + M_{2} \\ M_{1} - M_{2} - M_{3} \end{matrix} \right]
where:
\begin{align*} & M_{0} = (D_{00} - D_{20})\vec{k_{0}} \\ & M_{1} = (D_{10} + D_{20})\frac{\vec{k_{0}} + \vec{k_{1}} + \vec{k_{2}}}{2} \\ & M_{2} = (D_{20} - D_{10})\frac{\vec{k_{0}} - \vec{k_{1}} + \vec{k_{2}}}{2} \\ & M_{3} = (D_{10} - D_{30})\vec{k_{2}} \end{align*}
Note that each of the four M's can in turn be computed with the 1D F(2, 3) Winograd algorithm, so 2D Winograd is a nested algorithm, as the sketch below illustrates.
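Here is a minimal numpy sketch of that nested scheme (function and variable names are ours). Because the blocks are linear in the data, e.g. (D_{00} - D_{20})\vec{k_0} is just the 1D Winograd convolution of the row combination D[0] - D[2] with kernel row K[0]:

```python
import numpy as np

# F(2, 3) transform matrices from the 1D section.
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
Bt = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
At = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)

def winograd_1d(d, k):
    """1D Winograd F(2, 3): 4-sample input, 3-tap kernel, 2 outputs."""
    return At @ ((G @ k) * (Bt @ d))

D = np.random.rand(4, 4)
K = np.random.rand(3, 3)

# Each M_i is itself a 1D F(2, 3) Winograd convolution of a row
# combination of D with a row combination of K.
M0 = winograd_1d(D[0] - D[2], K[0])
M1 = winograd_1d(D[1] + D[2], (K[0] + K[1] + K[2]) / 2)
M2 = winograd_1d(D[2] - D[1], (K[0] - K[1] + K[2]) / 2)
M3 = winograd_1d(D[1] - D[3], K[2])

R = np.stack([M0 + M1 + M2, M1 - M2 - M3])  # rows: [r00, r01], [r10, r11]
ref = np.array([[np.sum(D[i:i+3, j:j+3] * K) for j in (0, 1)]
                for i in (0, 1)])
assert np.allclose(R, ref)
```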

For Theorists

For a theorist with higher aspirations, stopping at a recursion one can merely program is not satisfying; what they want is a final closed-form expression. It is in fact simple: tidying up the equation above so that the output becomes a standard 2x2 matrix, we have:
\left[\begin{matrix} \vec{r_{0}} , \vec{r_{1}} \end{matrix} \right] = \left[ \begin{matrix} M_{0} + M_{1} + M_{2}, M_{1} - M_{2} - M_{3} \end{matrix}\right]
which can be written as:
\left[ \begin{matrix} \vec{r_{0}} , \vec{r_{1}} \end{matrix} \right] = \left[ \begin{matrix} M_{0}, M_{1}, M_{2}, M_{3} \end{matrix} \right] \left[ \begin{matrix} 1 & 0 \\ 1 & 1 \\ 1 & -1 \\ 0 & -1 \end{matrix} \right]
Applying the 1D Winograd formula \vec{r} = A^T[(G\vec{k}) \odot (B^T\vec{d})] to each M (here \vec{k_i} denotes row i of K and \vec{d_i} row i of D, each written as a column vector) gives:
\begin{align*} \left[ \begin{matrix} \vec{r_{0}} , \vec{r_{1}} \end{matrix} \right] &= \left[ \begin{matrix} M_{0}, M_{1}, M_{2}, M_{3} \end{matrix} \right] A \\ &= \left[ \begin{matrix}A^T[(G\vec{k_0}) \odot (B^T(\vec{d_0} - \vec{d_2}))], A^T[(G\frac{\vec{k_0} + \vec{k_1} + \vec{k_2}}{2}) \odot (B^T(\vec{d_1} + \vec{d_2}))], A^T[(G\frac{\vec{k_0} - \vec{k_1} + \vec{k_2}}{2}) \odot (B^T(\vec{d_2} - \vec{d_1}))], A^T[(G\vec{k_2}) \odot (B^T(\vec{d_1} - \vec{d_3}))] \end{matrix} \right]A \\ &=A^T\left[ \begin{matrix}(G\vec{k_0}) \odot (B^T(\vec{d_0} - \vec{d_2})), (G\frac{\vec{k_0} + \vec{k_1} + \vec{k_2}}{2}) \odot (B^T(\vec{d_1} + \vec{d_2})), (G\frac{\vec{k_0} - \vec{k_1} + \vec{k_2}}{2}) \odot (B^T(\vec{d_2} - \vec{d_1})), (G\vec{k_2}) \odot (B^T(\vec{d_1} - \vec{d_3})) \end{matrix} \right]A \end{align*}
Note that terms like (G\vec{k_0}) are 4-element column vectors, and taking Hadamard products column by column commutes with concatenating the columns into a matrix, so:
\begin{align*} \left[\begin{matrix} \vec{r_{0}} , \vec{r_{1}} \end{matrix} \right] &=A^T\left[ \begin{matrix}(G\vec{k_0}) \odot (B^T(\vec{d_0} - \vec{d_2})), (G\frac{\vec{k_0} + \vec{k_1} + \vec{k_2}}{2}) \odot (B^T(\vec{d_1} + \vec{d_2})), (G\frac{\vec{k_0} - \vec{k_1} + \vec{k_2}}{2}) \odot (B^T(\vec{d_2} - \vec{d_1})), (G\vec{k_2}) \odot (B^T(\vec{d_1} - \vec{d_3})) \end{matrix} \right]A\\ &=A^T\left[ \begin{matrix}(G[ \vec{k_0}, \frac{\vec{k_0} + \vec{k_1} + \vec{k_2}}{2}, \frac{\vec{k_0} - \vec{k_1} + \vec{k_2}}{2}, \vec{k_2}]) \odot (B^T[\vec{d_0} - \vec{d_2}, \vec{d_1} + \vec{d_2}, \vec{d_2} - \vec{d_1}, \vec{d_1} - \vec{d_3}]) \end{matrix} \right]A \\ &=A^T\left[ \begin{matrix}(G[ \vec{k_0}, \vec{k_1}, \vec{k_2}] \left[ \begin{matrix} 1 & \frac{1}{2} & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & -\frac{1}{2} & 0 \\ 0 & \frac{1}{2} & \frac{1}{2} & 1 \end{matrix} \right]) \odot (B^T[\vec{d_0}, \vec{d_1}, \vec{d_2}, \vec{d_3}]\left[ \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -1 & 1 \\ -1 & 1 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{matrix} \right]) \end{matrix} \right]A \\ &=A^T\left[ \begin{matrix}(G[ \vec{k_0}, \vec{k_1}, \vec{k_2}]G^T) \odot (B^T[\vec{d_0}, \vec{d_1}, \vec{d_2}, \vec{d_3}]B) \end{matrix} \right]A\\ &=A^T\left[ \begin{matrix}(GKG^T) \odot (B^TDB) \end{matrix} \right]A\end{align*}
This completes the proof. (Strictly speaking, [\vec{k_0}, \vec{k_1}, \vec{k_2}] = K^T and [\vec{d_0}, \vec{d_1}, \vec{d_2}, \vec{d_3}] = D^T, so the last line computes the transpose of the 2x2 output tile; transposing both sides leaves the sandwich form unchanged and yields R = A^T[(GKG^T) \odot (B^TDB)]A for the output tile R itself.)
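The closed form is easy to verify numerically. A minimal numpy sketch (ours): the kernel transform U = GKG^T is computed once and reused for every input tile, which is where the practical savings come from:

```python
import numpy as np

# F(2x2, 3x3) transform matrices, the same G, B^T, A^T as in 1D.
G = np.array([[1.0, 0.0, 0.0],
              [0.5, 0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0, 0.0, 1.0]])
Bt = np.array([[1, 0, -1, 0],
               [0, 1, 1, 0],
               [0, -1, 1, 0],
               [0, 1, 0, -1]], dtype=float)
At = np.array([[1, 1, 1, 0],
               [0, 1, -1, -1]], dtype=float)

D = np.random.rand(4, 4)  # 4x4 input tile
K = np.random.rand(3, 3)  # 3x3 kernel

U = G @ K @ G.T           # transformed kernel, reusable across all tiles
V = Bt @ D @ Bt.T         # transformed input tile (Bt.T is B)
R = At @ (U * V) @ At.T   # 16 elementwise multiplies instead of 36

ref = np.array([[np.sum(D[i:i+3, j:j+3] * K) for j in (0, 1)]
                for i in (0, 1)])
assert np.allclose(R, ref)
```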

References

  1. Fast Algorithms for Convolutional Neural Networks
  2. Fast Algorithms for Signal Processing
  3. Going beyond Full Utilization: The Inside Scoop on Nervana’s Winograd Kernels
  4. 卷积神经网络中的Winograd快速卷积算法 (note: the 2D Winograd derivation in that article is incorrect)
