Matrix Derivatives and Least Squares

Author: 热爱生活的大川 | Published: 2019-09-28 20:56

    I. Matrix Derivatives

    1. Taking a matrix derivative means differentiating entry by entry. With x a scalar:
      F \in R^{a \times b}, X \in R^{m \times n}
      \frac{\partial{F}}{\partial{X}} = \left[\frac{\partial{F}}{\partial{x_{ij}}}\right]_{m \times n}
      \frac{\partial{F}}{\partial{x}} = \left[\frac{\partial{f_{ij}}}{\partial{x}}\right]_{a \times b}

    2. The trace of a matrix has the following properties:

      • tr(AB)=tr(BA)
      • tr(A^T)=tr(A)
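
    The two points above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper name entrywise_grad, the test function f(X) = tr(X^T X), and the random matrices are illustrative choices that do not come from the original article. It covers only the scalar-valued case of point 1, where the derivative has the same shape as X.

```python
import numpy as np

def entrywise_grad(f, X, eps=1e-6):
    """Estimate [df/dx_ij] (same shape as X) by central differences.

    Hypothetical helper for illustration: f maps a matrix to a scalar.
    """
    G = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        E = np.zeros_like(X)
        E[idx] = eps                      # perturb a single entry x_ij
        G[idx] = (f(X + E) - f(X - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 3))
S = rng.standard_normal((4, 4))

# Point 1: the derivative of a scalar f(X) with respect to X has the shape of X.
# For f(X) = tr(X^T X) = sum of x_ij^2, each entry of the derivative is 2 * x_ij.
G = entrywise_grad(lambda M: np.trace(M.T @ M), A)
grad_ok = G.shape == A.shape and np.allclose(G, 2 * A, atol=1e-6)

# Point 2: trace identities. AB is 3x3 and BA is 4x4, yet their traces agree.
cyclic_ok = np.isclose(np.trace(A @ B), np.trace(B @ A))
transpose_ok = np.isclose(np.trace(S.T), np.trace(S))

print(grad_ok, cyclic_ok, transpose_ok)
```

    Central differences are used here only because they are easy to read; they approximate the derivative and are not a substitute for the analytic formulas.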

    From these, the following identities can be derived. Let x=(x_i)_{m \times 1}:

    1. \frac{\partial{x^TA}}{\partial{x}} = A
    2. \frac{\partial{tr(AB)}}{\partial{A}} = B^T
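    3. \frac{\partial{tr(ABA^TC)}}{\partial{A}} = C^TAB^T+CAB, which amounts to differentiating with respect to the factor A and the factor A^T separately, then adding the results
    4. \frac{\partial{x^TAy}}{\partial{A}} = \frac{\partial{tr(x^TAy)}}{\partial{A}} = xy^T, since the numerator is a scalar and can be treated as the trace of a 1 \times 1 matrix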
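
    As a sanity check on the four identities, they can be compared against finite differences. This is a rough sketch assuming NumPy; num_grad, the matrix sizes, and the random test matrices (including D, which plays the role of B in identity 2 so that the shapes match) are assumptions made for the example, not part of the original text. Identity 1 is checked in the layout where entry (i, j) holds the derivative of the j-th component of x^T A with respect to x_i.

```python
import numpy as np

def num_grad(f, A, eps=1e-6):
    """Central-difference estimate of d f(A) / d A for scalar-valued f."""
    G = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        E = np.zeros_like(A)
        E[idx] = eps
        G[idx] = (f(A + E) - f(A - E)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
m, n, eps = 3, 4, 1e-6
A = rng.standard_normal((m, n))
B = rng.standard_normal((n, n))
C = rng.standard_normal((m, m))
D = rng.standard_normal((n, m))   # second factor in tr(AD), illustrative
x = rng.standard_normal((m, 1))
y = rng.standard_normal((n, 1))

# Identity 1: d(x^T A)/dx = A, with J[i, j] = d(x^T A)_j / d x_i.
J = np.zeros((m, n))
for i in range(m):
    e = np.zeros((m, 1))
    e[i] = eps
    J[i, :] = (((x + e).T @ A) - ((x - e).T @ A)).ravel() / (2 * eps)
ok1 = np.allclose(J, A, atol=1e-6)

# Identity 2: d tr(AD)/dA = D^T.
ok2 = np.allclose(num_grad(lambda M: np.trace(M @ D), A), D.T, atol=1e-6)

# Identity 3: d tr(A B A^T C)/dA = C^T A B^T + C A B.
g3 = num_grad(lambda M: np.trace(M @ B @ M.T @ C), A)
ok3 = np.allclose(g3, C.T @ A @ B.T + C @ A @ B, atol=1e-6)

# Identity 4: d(x^T A y)/dA = x y^T.
g4 = num_grad(lambda M: (x.T @ M @ y).item(), A)
ok4 = np.allclose(g4, x @ y.T, atol=1e-6)

print(ok1, ok2, ok3, ok4)
```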

    II. Least Squares

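    Let X \in R^{m \times n} be the data matrix (one sample per row), with corresponding labels y \in R^{m \times 1}.
    Introduce a parameter vector \theta \in R^{n \times 1} and form the prediction \hat{y}=X\theta. Minimizing the objective L=\frac{1}{2}(y-X\theta)^T(y-X\theta) gives \theta=(X^TX)^{-1}X^Ty, provided X^TX is invertible.
    Derivation: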
    \begin{align}
    \frac{\partial{L}}{\partial\theta} & = \frac{1}{2}\frac{\partial{tr((y-X\theta)^T(y-X\theta))}}{\partial{\theta}} \\
    & = \frac{1}{2}\frac{\partial\left[tr(\theta^TX^TX\theta) - 2\,tr(\theta^TX^Ty) + tr(y^Ty)\right]}{\partial{\theta}} \\
    & = X^TX\theta - X^Ty \\
    X^TX\theta - X^Ty & = 0 \quad \text{(set the gradient to zero)} \\
    \theta & = (X^TX)^{-1}X^Ty
    \end{align}
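
    The closed-form result can be tried on synthetic data. The sketch below assumes NumPy; the sample sizes, theta_true, and the noise level are made up for illustration. It solves the normal equations X^TX\theta = X^Ty with np.linalg.solve instead of forming the inverse explicitly, which is usually the numerically safer choice, then checks that the gradient X^TX\theta - X^Ty vanishes at the solution and compares against np.linalg.lstsq.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3                                  # illustrative sizes
X = rng.standard_normal((m, n))               # data matrix, one sample per row
theta_true = np.array([[2.0], [-1.0], [0.5]]) # hypothetical ground truth
y = X @ theta_true + 0.01 * rng.standard_normal((m, 1))

# Normal equations: (X^T X) theta = X^T y, assuming X^T X is invertible.
theta = np.linalg.solve(X.T @ X, X.T @ y)

# The gradient of L = 1/2 (y - X theta)^T (y - X theta) should be ~0 here.
grad = X.T @ X @ theta - X.T @ y

# Cross-check against NumPy's least-squares solver.
theta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(theta.ravel())
print(np.abs(grad).max())
```

    When X^TX is singular or badly conditioned, the formula does not apply as written; np.linalg.lstsq still returns a minimum-norm least-squares solution in that case.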

    Article link: https://www.haomeiwen.com/subject/xjitpctx.html