
Author: 越狱_29c6 | Published 2019-12-01 12:54

    <h1><a id="_1"></a>Linear Algebra</h1>
    <h2><a id="_3"></a>1. Basics</h2>
    <ol>
    <li>All vectors in this book are column vectors:</li>
    </ol>
    <p>\mathbf{\vec x}=(x_1,x_2,\cdots,x_n)^T=\begin{bmatrix}x_1\\ x_2\\ \vdots\\ x_n\end{bmatrix}</p>
    <p>All matrices \mathbf X\in \mathbb R^{m\times n} in this book are written as:</p>
    <p>\mathbf X = \begin{bmatrix} x_{1,1} &amp; x_{1,2} &amp; \cdots &amp; x_{1,n}\\ x_{2,1} &amp; x_{2,2} &amp; \cdots &amp; x_{2,n}\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ x_{m,1} &amp; x_{m,2} &amp; \cdots &amp; x_{m,n} \end{bmatrix}</p>
    <p>This is abbreviated as (x_{i,j})_{m\times n} or [x_{i,j}]_{m\times n}.</p>
    <ol start="2">
    <li>The <code>F</code> (Frobenius) norm of a matrix: for \mathbf A=(a_{i,j})_{m\times n}, the <code>F</code> norm is ||\mathbf A||_F=\sqrt{\sum_{i,j}a_{i,j}^{2}}.</li>
    </ol>
    <p>It generalizes the L_2 norm of vectors to matrices.</p>
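    <p>As a concrete illustration, the F norm can be computed directly from the definition. A minimal pure-Python sketch (the name <code>frobenius_norm</code> is chosen here for illustration, not from the text):</p>

```python
import math

def frobenius_norm(A):
    # F norm: square root of the sum of all squared entries
    return math.sqrt(sum(a * a for row in A for a in row))

A = [[1.0, 2.0],
     [3.0, 4.0]]
print(frobenius_norm(A))  # sqrt(1 + 4 + 9 + 16) = sqrt(30)
```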
    <ol start="3">
    <li>The trace of a matrix: for \mathbf A=(a_{i,j})_{m\times n}, the trace of \mathbf A is tr(\mathbf A)=\sum_{i}a_{i,i}.</li>
    </ol>
    <p>Properties of the trace:</p>
    <ul>
    <li>The <code>F</code> norm of \mathbf A equals the square root of the trace of \mathbf A\mathbf A^T:</li>
    </ul>
    <p>||\mathbf A||_F=\sqrt{tr(\mathbf A \mathbf A^{T})}</p>
    <ul>
    <li>The trace of \mathbf A equals the trace of \mathbf A^T:</li>
    </ul>
    <p>tr(\mathbf A)=tr(\mathbf A^{T})</p>
    <ul>
    <li>Commutativity: assume</li>
    </ul>
    <p>\mathbf A\in \mathbb R^{m\times n},\mathbf B\in \mathbb R^{n\times m}</p>
    <p>, then tr(\mathbf A\mathbf B)=tr(\mathbf B\mathbf A).</p>
    <ul>
    <li>Cyclic property:</li>
    </ul>
    <p>tr(\mathbf A\mathbf B\mathbf C)=tr(\mathbf C\mathbf A\mathbf B)=tr(\mathbf B\mathbf C\mathbf A)</p>
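    <p>The commutativity property tr(\mathbf A\mathbf B)=tr(\mathbf B\mathbf A) is easy to check numerically. A small sketch with naive list-of-lists matrices (helper names are illustrative):</p>

```python
def matmul(A, B):
    # naive matrix product of an (m x k) and a (k x n) matrix
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3],
     [4, 5, 6]]        # 2 x 3
B = [[7, 8],
     [9, 10],
     [11, 12]]         # 3 x 2
print(trace(matmul(A, B)))  # 212
print(trace(matmul(B, A)))  # 212: tr(AB) = tr(BA)
```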
    <h2><a id="_55"></a>2. Vector Operations</h2>
    <ol>
    <li>A set of vectors \mathbf{\vec v}_1,\mathbf{\vec v}_2,\cdots,\mathbf{\vec v}_n is linearly dependent if there exist real numbers a_1,a_2,\cdots,a_n, not all zero, such that:</li>
    </ol>
    <p>\sum_{i=1}^{n}a_i\mathbf{\vec v}_i=\mathbf{\vec 0}</p>
    <p>A set of vectors \mathbf{\vec v}_1,\mathbf{\vec v}_2,\cdots,\mathbf{\vec v}_n is linearly independent if and only if a_i=0,i=1,2,\cdots,n is the only choice for which:</p>
    <p>\sum_{i=1}^{n}a_i\mathbf{\vec v}_i=\mathbf{\vec 0}</p>
    <ol start="2">
    <li>
    <p>The maximum number of linearly independent vectors contained in a vector space is called the dimension of that space.</p>
    </li>
    <li>
    <p>The dot product of three-dimensional vectors:</p>
    </li>
    </ol>
    <p>\mathbf{\vec u}\cdot\mathbf{\vec v} =u_xv_x+u_yv_y+u_zv_z = |\mathbf{\vec u}||\mathbf{\vec v}|\cos(\mathbf{\vec u},\mathbf{\vec v})</p>
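    <p>The identity above also gives a way to recover the angle between two vectors. A minimal sketch (function names are illustrative):</p>

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def angle(u, v):
    # recover the angle from the cosine identity above
    return math.acos(dot(u, v) / math.sqrt(dot(u, u) * dot(v, v)))

u, v = (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)
print(dot(u, v))    # 0.0
print(angle(u, v))  # pi/2: the vectors are orthogonal
```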
    <ol start="4">
    <li>The cross product of three-dimensional vectors:</li>
    </ol>
    <p>\mathbf{\vec w}=\mathbf{\vec u}\times \mathbf{\vec v}= \begin{vmatrix} \mathbf{\vec i} &amp; \mathbf{\vec j} &amp; \mathbf{\vec k}\\ u_x &amp; u_y &amp; u_z\\ v_x &amp; v_y &amp; v_z \end{vmatrix}</p>
    <p>where \mathbf{\vec i}, \mathbf{\vec j},\mathbf{\vec k} are the unit vectors of the x,y,z axes, respectively:</p>
    <p>\mathbf{\vec u}=u_x\mathbf{\vec i}+u_y\mathbf{\vec j}+u_z\mathbf{\vec k},\quad \mathbf{\vec v}=v_x\mathbf{\vec i}+v_y\mathbf{\vec j}+v_z\mathbf{\vec k}</p>
    <ul>
    <li>The cross product of \mathbf{\vec u} and \mathbf{\vec v} is perpendicular to the plane spanned by \mathbf{\vec u},\mathbf{\vec v}; its direction follows the right-hand rule.</li>
    <li>The magnitude of the cross product equals the area of the parallelogram spanned by \mathbf{\vec u},\mathbf{\vec v}.</li>
    </ul>
    <p>\mathbf{\vec u}\times \mathbf{\vec v}=-\mathbf{\vec v}\times \mathbf{\vec u}</p>
    <p>\mathbf{\vec u}\times( \mathbf{\vec v} \times \mathbf{\vec w})=(\mathbf{\vec u}\cdot \mathbf{\vec w})\mathbf{\vec v}-(\mathbf{\vec u}\cdot \mathbf{\vec v})\mathbf{\vec w}</p>
    <ol start="5">
    <li>The scalar triple product (mixed product) of three-dimensional vectors:</li>
    </ol>
    <p>[\mathbf{\vec u}\;\mathbf{\vec v}\;\mathbf{\vec w}]=(\mathbf{\vec u}\times \mathbf{\vec v})\cdot \mathbf{\vec w}= \mathbf{\vec u}\cdot (\mathbf{\vec v} \times \mathbf{\vec w}) =\begin{vmatrix} u_x &amp; u_y &amp; u_z\\ v_x &amp; v_y &amp; v_z\\ w_x &amp; w_y &amp; w_z \end{vmatrix} =\begin{vmatrix} u_x &amp; v_x &amp; w_x\\ u_y &amp; v_y &amp; w_y\\ u_z &amp; v_z &amp; w_z \end{vmatrix}</p>
    <p>Its physical meaning: the volume of the parallelepiped whose edges are \mathbf{\vec u},\mathbf{\vec v},\mathbf{\vec w}. The volume is positive when \mathbf{\vec u},\mathbf{\vec v},\mathbf{\vec w} form a right-handed system.</p>
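    <p>The cross product and the signed volume can be sketched directly from the component formulas (a minimal pure-Python illustration; helper names are chosen here):</p>

```python
def cross(u, v):
    # component form obtained by expanding the determinant above
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triple(u, v, w):
    # (u x v) . w : signed volume of the parallelepiped
    c = cross(u, v)
    return c[0] * w[0] + c[1] * w[1] + c[2] * w[2]

i, j, k = (1, 0, 0), (0, 1, 0), (0, 0, 1)
print(cross(i, j))      # (0, 0, 1): i x j = k
print(triple(i, j, k))  # 1: unit cube, right-handed system
print(triple(j, i, k))  # -1: left-handed order flips the sign
```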
    <ol start="6">
    <li>The dyadic (outer) product of two vectors: given two vectors

    \mathbf {\vec x}=(x_1,x_2,\cdots,x_n)^{T}, \mathbf {\vec y}=(y_1,y_2,\cdots,y_m)^{T}</li>
    </ol>
    <p>, their dyadic product is written as:</p>
    <p>\mathbf {\vec x}\mathbf {\vec y} =\begin{bmatrix} x_1y_1 &amp; x_1y_2 &amp; \cdots &amp; x_1y_m\\ x_2y_1 &amp; x_2y_2 &amp; \cdots &amp; x_2y_m\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ x_ny_1 &amp; x_ny_2 &amp;\cdots &amp; x_ny_m \end{bmatrix}</p>
    <p>It is also written as \mathbf {\vec x}\otimes\mathbf {\vec y} or \mathbf {\vec x} \mathbf {\vec y}^{T}.</p>
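    <p>A minimal sketch of the dyadic product (the name <code>outer</code> is illustrative):</p>

```python
def outer(x, y):
    # entry (i, j) is x_i * y_j, giving an n x m matrix
    return [[xi * yj for yj in y] for xi in x]

print(outer([1, 2, 3], [4, 5]))
# [[4, 5], [8, 10], [12, 15]]
```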
    <h2><a id="_145"></a>3. Matrix Operations</h2>
    <ol>
    <li>Given two matrices

    \mathbf A=(a_{i,j}) \in \mathbb R^{m\times n},\mathbf B=(b_{i,j}) \in \mathbb R^{m\times n}</li>
    </ol>
    <p>, define:</p>
    <ul>
    <li>The Hadamard product (also called the element-wise product):</li>
    </ul>
    <p>\mathbf A \circ \mathbf B =\begin{bmatrix} a_{1,1}b_{1,1} &amp; a_{1,2}b_{1,2} &amp; \cdots &amp; a_{1,n}b_{1,n}\\ a_{2,1}b_{2,1} &amp; a_{2,2}b_{2,2} &amp; \cdots &amp; a_{2,n}b_{2,n}\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ a_{m,1}b_{m,1} &amp; a_{m,2}b_{m,2} &amp; \cdots &amp; a_{m,n}b_{m,n} \end{bmatrix}</p>
    <ul>
    <li>The Kronecker product:</li>
    </ul>
    <p>\mathbf A \otimes \mathbf B =\begin{bmatrix} a_{1,1}\mathbf B &amp; a_{1,2}\mathbf B &amp; \cdots &amp; a_{1,n}\mathbf B\\ a_{2,1}\mathbf B &amp; a_{2,2}\mathbf B &amp; \cdots &amp; a_{2,n}\mathbf B\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ a_{m,1}\mathbf B &amp; a_{m,2}\mathbf B &amp; \cdots &amp; a_{m,n}\mathbf B \end{bmatrix}</p>
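    <p>Both products can be sketched in a few lines of pure Python (helper names are illustrative; the Kronecker version simply scales copies of \mathbf B by each entry of \mathbf A):</p>

```python
def hadamard(A, B):
    # element-wise product of two matrices of the same shape
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def kronecker(A, B):
    # each entry a_{i,j} scales a full copy of B
    return [[A[i][j] * B[p][q]
             for j in range(len(A[0])) for q in range(len(B[0]))]
            for i in range(len(A)) for p in range(len(B))]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
print(hadamard(A, B))   # [[0, 2], [3, 0]]
print(kronecker(A, B))  # 4 x 4 block matrix
```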
    <ol start="2">
    <li>Let

    \mathbf {\vec x},\mathbf {\vec a},\mathbf {\vec b},\mathbf {\vec c}</li>
    </ol>
    <p>be n-dimensional vectors and \mathbf A,\mathbf B,\mathbf C,\mathbf X be n\times n matrices. Then:</p>
    <p>\frac{\partial(\mathbf {\vec a}^{T}\mathbf {\vec x}) }{\partial \mathbf {\vec x} }=\frac{\partial(\mathbf {\vec x}^{T}\mathbf {\vec a}) }{\partial \mathbf {\vec x} } =\mathbf {\vec a}</p>
    <p>\frac{\partial(\mathbf {\vec a}^{T}\mathbf X\mathbf {\vec b}) }{\partial \mathbf X }=\mathbf {\vec a}\mathbf {\vec b}^{T}=\mathbf {\vec a}\otimes\mathbf {\vec b}\in \mathbb R^{n\times n}</p>
    <p>\frac{\partial(\mathbf {\vec a}^{T}\mathbf X^{T}\mathbf {\vec b}) }{\partial \mathbf X }=\mathbf {\vec b}\mathbf {\vec a}^{T}=\mathbf {\vec b}\otimes\mathbf {\vec a}\in \mathbb R^{n\times n}</p>
    <p>\frac{\partial(\mathbf {\vec a}^{T}\mathbf X\mathbf {\vec a}) }{\partial \mathbf X }=\frac{\partial(\mathbf {\vec a}^{T}\mathbf X^{T}\mathbf {\vec a}) }{\partial \mathbf X }=\mathbf {\vec a}\otimes\mathbf {\vec a}</p>
    <p>\frac{\partial(\mathbf {\vec a}^{T}\mathbf X^{T}\mathbf X\mathbf {\vec b}) }{\partial \mathbf X }=\mathbf X(\mathbf {\vec a}\otimes\mathbf {\vec b}+\mathbf {\vec b}\otimes\mathbf {\vec a})</p>
    <p>\frac{\partial[(\mathbf A\mathbf {\vec x}+\mathbf {\vec a})^{T}\mathbf C(\mathbf B\mathbf {\vec x}+\mathbf {\vec b})]}{\partial \mathbf {\vec x}}=\mathbf A^{T}\mathbf C(\mathbf B\mathbf {\vec x}+\mathbf {\vec b})+\mathbf B^{T}\mathbf C^{T}(\mathbf A\mathbf {\vec x}+\mathbf {\vec a})</p>
    <p>\frac{\partial (\mathbf {\vec x}^{T}\mathbf A \mathbf {\vec x})}{\partial \mathbf {\vec x}}=(\mathbf A+\mathbf A^{T})\mathbf {\vec x}</p>
    <p>\frac{\partial[(\mathbf X\mathbf {\vec b}+\mathbf {\vec c})^{T}\mathbf A(\mathbf X\mathbf {\vec b}+\mathbf {\vec c})]}{\partial \mathbf X}=(\mathbf A+\mathbf A^{T})(\mathbf X\mathbf {\vec b}+\mathbf {\vec c})\mathbf {\vec b}^{T}</p>
    <p>\frac{\partial (\mathbf {\vec b}^{T}\mathbf X^{T}\mathbf A \mathbf X\mathbf {\vec c})}{\partial \mathbf X}=\mathbf A^{T}\mathbf X\mathbf {\vec b}\mathbf {\vec c}^{T}+\mathbf A\mathbf X\mathbf {\vec c}\mathbf {\vec b}^{T}</p>
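    <p>Identities like these can be spot-checked with finite differences. A minimal sketch verifying the quadratic-form rule \frac{\partial (\mathbf {\vec x}^{T}\mathbf A \mathbf {\vec x})}{\partial \mathbf {\vec x}}=(\mathbf A+\mathbf A^{T})\mathbf {\vec x} (helper names are chosen here for illustration):</p>

```python
def quad_form(A, x):
    # x^T A x
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))

def grad_numeric(A, x, h=1e-6):
    # central finite differences, one coordinate at a time
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((quad_form(A, xp) - quad_form(A, xm)) / (2 * h))
    return g

A = [[1.0, 2.0],
     [0.0, 3.0]]
x = [1.0, -1.0]
analytic = [sum((A[i][j] + A[j][i]) * x[j] for j in range(2)) for i in range(2)]
print(analytic)            # (A + A^T) x
print(grad_numeric(A, x))  # matches to within rounding error
```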
    <ol start="3">
    <li>If f is a univariate function, then:</li>
    </ol>
    <ul>
    <li>Its element-wise vector function is:</li>
    </ul>
    <p>f(\mathbf{\vec x}) =(f(x_1),f(x_2),\cdots,f(x_n))^{T}</p>
    <ul>
    <li>Its element-wise matrix function is:</li>
    </ul>
    <p>f(\mathbf X)=\begin{bmatrix} f(x_{1,1}) &amp; f(x_{1,2}) &amp; \cdots &amp; f(x_{1,n})\\ f(x_{2,1}) &amp; f(x_{2,2}) &amp; \cdots &amp; f(x_{2,n})\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ f(x_{m,1}) &amp; f(x_{m,2}) &amp; \cdots &amp; f(x_{m,n}) \end{bmatrix}</p>
    <ul>
    <li>Their element-wise derivatives are:</li>
    </ul>
    <p>f^{\prime}(\mathbf{\vec x})=(f^{\prime}(x_1),f^{\prime}(x_2),\cdots,f^{\prime}(x_n))^{T}</p>
    <p>f^{\prime}(\mathbf X)=\begin{bmatrix} f^{\prime}(x_{1,1}) &amp; f^{\prime}(x_{1,2}) &amp; \cdots &amp; f^{\prime}(x_{1,n})\\ f^{\prime}(x_{2,1}) &amp; f^{\prime}(x_{2,2}) &amp; \cdots &amp; f^{\prime}(x_{2,n})\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ f^{\prime}(x_{m,1}) &amp; f^{\prime}(x_{m,2}) &amp; \cdots &amp; f^{\prime}(x_{m,n}) \end{bmatrix}</p>
    <ol start="4">
    <li>Various types of partial derivatives:</li>
    </ol>
    <ul>
    <li>
    <p>Partial derivative of a scalar with respect to a scalar: \frac{\partial u}{\partial v}.</p>
    </li>
    <li>
    <p>Partial derivative of a scalar with respect to a vector (n-dimensional):</p>
    </li>
    </ul>
    <p>\frac{\partial u}{\partial \mathbf {\vec v}}=(\frac{\partial u}{\partial v_1},\frac{\partial u}{\partial v_2},\cdots,\frac{\partial u}{\partial v_n})^{T}</p>
    <ul>
    <li>Partial derivative of a scalar with respect to a matrix (m\times n):</li>
    </ul>
    <p>\frac{\partial u}{\partial \mathbf V}=\begin{bmatrix} \frac{\partial u}{\partial V_{1,1}} &amp; \frac{\partial u}{\partial V_{1,2}} &amp; \cdots &amp; \frac{\partial u}{\partial V_{1,n}}\\ \frac{\partial u}{\partial V_{2,1}} &amp; \frac{\partial u}{\partial V_{2,2}} &amp; \cdots &amp; \frac{\partial u}{\partial V_{2,n}}\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ \frac{\partial u}{\partial V_{m,1}} &amp; \frac{\partial u}{\partial V_{m,2}} &amp; \cdots &amp; \frac{\partial u}{\partial V_{m,n}} \end{bmatrix}</p>
    <ul>
    <li>Partial derivative of a vector (m-dimensional) with respect to a scalar:

    \frac{\partial \mathbf {\vec u}}{\partial v}=(\frac{\partial u_1}{\partial v},\frac{\partial u_2}{\partial v},\cdots,\frac{\partial u_m}{\partial v})^{T}</li>
    </ul>
    <ul>
    <li>Partial derivative of a vector (m-dimensional) with respect to a vector (n-dimensional) — the Jacobian matrix, row-major:</li>
    </ul>
    <p>\frac{\partial \mathbf {\vec u}}{\partial \mathbf {\vec v}}=\begin{bmatrix} \frac{\partial u_1}{\partial v_1} &amp; \frac{\partial u_1}{\partial v_2} &amp; \cdots &amp; \frac{\partial u_1}{\partial v_n}\\ \frac{\partial u_2}{\partial v_1} &amp; \frac{\partial u_2}{\partial v_2} &amp; \cdots &amp; \frac{\partial u_2}{\partial v_n}\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ \frac{\partial u_m}{\partial v_1} &amp; \frac{\partial u_m}{\partial v_2} &amp; \cdots &amp; \frac{\partial u_m}{\partial v_n} \end{bmatrix}</p>
    <p>If column-major, it is the transpose of the matrix above.</p>
    <ul>
    <li>Partial derivative of a matrix (m\times n) with respect to a scalar:</li>
    </ul>
    <p>\frac{\partial \mathbf U}{\partial v}=\begin{bmatrix} \frac{\partial U_{1,1}}{\partial v} &amp; \frac{\partial U_{1,2}}{\partial v} &amp; \cdots &amp; \frac{\partial U_{1,n}}{\partial v}\\ \frac{\partial U_{2,1}}{\partial v} &amp; \frac{\partial U_{2,2}}{\partial v} &amp; \cdots &amp; \frac{\partial U_{2,n}}{\partial v}\\ \vdots &amp; \vdots &amp; \ddots &amp; \vdots\\ \frac{\partial U_{m,1}}{\partial v} &amp; \frac{\partial U_{m,2}}{\partial v} &amp; \cdots &amp; \frac{\partial U_{m,n}}{\partial v} \end{bmatrix}</p>
    <ol start="5">
    <li>For the trace of a matrix, the following partial derivatives hold:</li>
    </ol>
    <p>\frac{\partial [tr(f(\mathbf X))]}{\partial \mathbf X }=(f^{\prime}(\mathbf X))^{T}</p>
    <p>\frac{\partial [tr(\mathbf A\mathbf X\mathbf B)]}{\partial \mathbf X }=\mathbf A^{T}\mathbf B^{T}</p>
    <p>\frac{\partial [tr(\mathbf A\mathbf X^{T}\mathbf B)]}{\partial \mathbf X }=\mathbf B\mathbf A</p>
    <p>\frac{\partial [tr(\mathbf A\otimes\mathbf X )]}{\partial \mathbf X }=tr(\mathbf A)\mathbf I</p>
    <p>\frac{\partial [tr(\mathbf A\mathbf X \mathbf B\mathbf X)]}{\partial \mathbf X }=\mathbf A^{T}\mathbf X^{T}\mathbf B^{T}+\mathbf B^{T}\mathbf X^{T}\mathbf A^{T}</p>
    <p>\frac{\partial [tr(\mathbf X^{T} \mathbf B\mathbf X \mathbf C)]}{\partial \mathbf X }=\mathbf B\mathbf X \mathbf C +\mathbf B^{T}\mathbf X \mathbf C^{T}</p>
    <p>\frac{\partial [tr(\mathbf C^{T}\mathbf X^{T} \mathbf B\mathbf X \mathbf C)]}{\partial \mathbf X }=(\mathbf B^{T}+\mathbf B)\mathbf X \mathbf C \mathbf C^{T}</p>
    <p>\frac{\partial [tr(\mathbf A\mathbf X \mathbf B\mathbf X^{T} \mathbf C)]}{\partial \mathbf X }= \mathbf A^{T}\mathbf C^{T}\mathbf X\mathbf B^{T}+\mathbf C \mathbf A \mathbf X \mathbf B</p>
    <p>\frac{\partial [tr((\mathbf A\mathbf X\mathbf B+\mathbf C)(\mathbf A\mathbf X\mathbf B+\mathbf C)^{T})]}{\partial \mathbf X }= 2\mathbf A ^{T}(\mathbf A\mathbf X\mathbf B+\mathbf C)\mathbf B^{T}</p>
    <ol start="6">
    <li>Suppose \mathbf U= f(\mathbf X) is a matrix-valued function of \mathbf X (

    f:\mathbb R^{m\times n}\rightarrow \mathbb R^{m\times n}</li>
    </ol>
    <p>), and g(\mathbf U) is a real-valued function of \mathbf U (g:\mathbb R^{m\times n}\rightarrow \mathbb R). Then the following chain rule holds:</p>
    <p>\frac{\partial g(\mathbf U)}{\partial \mathbf X}= \left(\frac{\partial g(\mathbf U)}{\partial x_{i,j}}\right)_{m\times n}=\begin{bmatrix} \frac{\partial g(\mathbf U)}{\partial x_{1,1}} &amp; \frac{\partial g(\mathbf U)}{\partial x_{1,2}} &amp; \cdots &amp; \frac{\partial g(\mathbf U)}{\partial x_{1,n}}\\ \frac{\partial g(\mathbf U)}{\partial x_{2,1}} &amp; \frac{\partial g(\mathbf U)}{\partial x_{2,2}} &amp; \cdots &amp; \frac{\partial g(\mathbf U)}{\partial x_{2,n}}\\ \vdots &amp; \vdots &amp; \ddots &amp;\vdots\\ \frac{\partial g(\mathbf U)}{\partial x_{m,1}} &amp; \frac{\partial g(\mathbf U)}{\partial x_{m,2}} &amp; \cdots &amp; \frac{\partial g(\mathbf U)}{\partial x_{m,n}} \end{bmatrix}\\ =\left(\sum_{k}\sum_{l}\frac{\partial g(\mathbf U)}{\partial u_{k,l}}\frac{\partial u_{k,l}}{\partial x_{i,j}}\right)_{m\times n} =\left(tr\left[\left(\frac{\partial g(\mathbf U)}{\partial \mathbf U}\right)^{T} \frac{\partial \mathbf U}{\partial x_{i,j}}\right]\right)_{m\times n}</p>
    <h2><a id="_351"></a>4. Special Functions</h2>
    <ol>
    <li>This section presents some special functions used in machine learning.</li>
    </ol>
    <h3><a id="41_sigmoid__355"></a>4.1 The sigmoid Function</h3>
    <ol>
    <li>The sigmoid function:</li>
    </ol>
    <p>\sigma(x)=\frac{1}{1+\exp(-x)}</p>
    <ul>
    <li>This function can be used to produce the \phi parameter of a binomial distribution.</li>
    <li>When x is very large or very small, the function saturates: its curve becomes very flat, so even a large change in the input produces only a tiny change in the output, i.e. the derivative is very small.</li>
    </ul>
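    <p>The saturation behavior is easy to see from the derivative \sigma(x)(1-\sigma(x)), which shrinks rapidly for large |x|. A minimal sketch:</p>

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))  # 0.5
# saturation: the derivative sigma(x)(1 - sigma(x)) shrinks for large |x|
for x in (0.0, 5.0, 10.0):
    s = sigmoid(x)
    print(x, s * (1.0 - s))
```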
    <h3><a id="42_softplus__365"></a>4.2 The softplus Function</h3>
    <ol>
    <li>The softplus function:

    \zeta(x)=\log(1+\exp(x))</li>
    </ol>
    <ul>
    <li>This function can be used to produce the \sigma^{2} parameter of a normal distribution.</li>
    <li>It is called softplus because it is a smooth approximation of the function x^{+}=\max(0,x).</li>
    </ul>
    <ol start="2">
    <li>If we define the two functions:</li>
    </ol>
    <p>x^{+}=\max(0,x)\\ x^{-}=\max(0,-x)</p>
    <p>then they extract the positive part and the negative part of y=x, respectively.</p>
    <p>By definition, x=x^{+}-x^{-}. Since \zeta(x) approximates x^{+} and \zeta(-x) approximates x^{-}, we have:</p>
    <p>\zeta(x)-\zeta(-x)=x</p>
    <ol start="3">
    <li>Properties of the sigmoid and softplus functions:</li>
    </ol>
    <p>\sigma(x)=\frac{\exp(x)}{\exp(x)+\exp(0)} \\ \frac {d}{dx}\sigma(x)=\sigma(x)(1-\sigma(x)) \\ 1-\sigma(x)=\sigma(-x) \\ \log\sigma(x)=-\zeta(-x) \\ \frac{d}{dx}\zeta(x)=\sigma(x) \\ \forall x\in(0,1),\sigma^{-1}(x)=\log(\frac{x}{1-x}) \\ \forall x \gt 0,\zeta^{-1}(x)=\log(\exp(x)-1) \\ \zeta(x)=\int_{-\infty}^{x}\sigma(y)dy \\ \zeta(x)-\zeta(-x)=x</p>
    <p>where f^{-1}(\cdot) denotes the inverse function. \sigma^{-1}(x) is also called the logit function.</p>
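    <p>Several of these identities can be verified numerically at an arbitrary point. A minimal sketch (x=1.7 is an arbitrary test value):</p>

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    return math.log(1.0 + math.exp(x))

x = 1.7
print(abs(softplus(x) - softplus(-x) - x) < 1e-12)       # zeta(x) - zeta(-x) = x
print(abs(1.0 - sigmoid(x) - sigmoid(-x)) < 1e-12)       # 1 - sigma(x) = sigma(-x)
print(abs(math.log(sigmoid(x)) + softplus(-x)) < 1e-12)  # log sigma(x) = -zeta(-x)
```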
    <h3><a id="43__400"></a>4.3 The Gamma Function</h3>
    <ol>
    <li>The gamma function is defined as:</li>
    </ol>
    <p>\Gamma(x)=\int_0^{+\infty} t^{x-1}e^{-t}dt\quad,x\in \mathbb R\\ \text{or}\quad\Gamma(z)=\int_0^{+\infty} t^{z-1}e^{-t}dt\quad,z\in \mathbb Z</p>
    <p>Its properties are:</p>
    <ul>
    <li>
    <p>For a positive integer n: \Gamma(n)=(n-1)!.</p>
    </li>
    <li>
    <p>\Gamma(x+1)=x\Gamma(x), so the gamma function extends the factorial to the real numbers.</p>
    </li>
    <li>
    <p>Relation to the beta function:</p>
    </li>
    </ul>
    <p>B(m,n)=\frac{\Gamma(m)\Gamma(n)}{\Gamma(m+n)}</p>
    <ul>
    <li>For x \in (0,1):</li>
    </ul>
    <p>\Gamma(1-x)\Gamma(x)=\frac{\pi}{\sin\pi x}</p>
    <p>From this one can derive the important identity \Gamma(\frac 12)=\sqrt\pi.</p>
    <ul>
    <li>For x\gt 0, the gamma function is strictly convex.</li>
    </ul>
    <ol start="2">
    <li>When x is sufficiently large, Stirling's formula can be used to approximate the gamma function:

    \Gamma(x) \sim \sqrt{2\pi} e^{-x}x^{x+1/2}</li>
    </ol>
    <h3><a id="44__432"></a>4.4 The Beta Function</h3>
    <ol>
    <li>For any real numbers m,n \gt 0, the beta function is defined as:</li>
    </ol>
    <p>B(m,n)=\int_0^1 x^{m-1}(1-x)^{n-1} dx</p>
    <p>Alternative forms of the definition:</p>
    <p>B(m,n)=2\int_0^{\frac \pi 2}\sin ^{2m-1}(x) \cos ^{2n-1}(x) dx\\ B(m,n) = \int_0^{+\infty}\frac{x^{m-1}}{(1+x)^{m+n}} dx\\ B(m,n)=\int_0^1\frac{x^{m-1}+x^{n-1}}{(1+x)^{m+n}}dx</p>
    <ol start="2">
    <li>Properties:</li>
    </ol>
    <ul>
    <li>
    <p>Continuity: the beta function is continuous on its domain m\gt0,n\gt0.</p>
    </li>
    <li>
    <p>Symmetry: B(m,n)=B(n,m).</p>
    </li>
    <li>
    <p>Recurrence relations:</p>
    </li>
    </ul>
    <p>B(m,n) = \frac{n-1}{m+n-1}B(m,n-1),\quad m\gt0,n\gt1\\ B(m,n) = \frac{m-1}{m+n-1}B(m-1,n),\quad m\gt1,n\gt0\\ B(m,n) = \frac{(m-1)(n-1)}{(m+n-1)(m+n-2)}B(m-1,n-1),\quad m\gt1,n\gt1</p>
    <ul>
    <li>When m,n are large, there is the approximate formula:

    B(m,n)\approx\frac{\sqrt{2\pi}\ m^{m-1/2}n^{n-1/2}}{(m+n)^{m+n-1/2}}</li>
    <li>Relation to the gamma function:
    <ul>
    <li>For any positive real numbers m,n:

    B(m,n)=\frac{\Gamma(m)\Gamma(n)}{\Gamma(m+n)}</li>
    <li>B(m,1-m)=\Gamma(m)\Gamma(1-m).</li>
    </ul>
    </li>
    </ul>
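    <p>The gamma-function relation gives a practical way to compute beta values and to check the symmetry and recurrence properties (a minimal sketch using the standard-library <code>math.gamma</code>; the helper name <code>beta</code> is chosen here):</p>

```python
import math

def beta(m, n):
    # B(m, n) = Gamma(m) Gamma(n) / Gamma(m + n)
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)

print(beta(2.0, 3.0))  # 1!*2!/4! = 1/12 for these integer arguments
print(beta(3.0, 2.0))  # symmetry: same value
# recurrence B(m, n) = (n-1)/(m+n-1) * B(m, n-1)
print(abs(beta(2.5, 3.5) - (3.5 - 1) / (2.5 + 3.5 - 1) * beta(2.5, 2.5)) < 1e-10)
```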
    <p><a href="http://www.huaxiaozhuan.com/%E6%95%B0%E5%AD%A6%E5%9F%BA%E7%A1%80/chapters/1_algebra.html" target="_blank">Reference</a></p>