
Andrew Ng ML(1)——basic knowledge

Author: tmax | Published 2018-12-19 17:01

introduction

  • supervised learning (with labels)
    regression
    classification

  • unsupervised learning (no labels, or all examples share the same label)
    clustering


univariate (one variable) linear regression (supervised learning)

  • m: number of training examples
    x's: input variables/features
    y's: output variable/target variable
    e.g.
    (x,y): a single training example
    (x^{(i)},y^{(i)}): the i^{th} training example

  • regression

Hypothesis: h_{\Theta}(x)=\Theta_0 +\Theta_1x
Parameters: \Theta_0, \Theta_1
Cost function: J(\Theta_0,\Theta_1)=\frac {1} {2m}\sum_{i=1}^m (h_{\Theta}(x^{(i)})-y^{(i)})^2 (← this is a squared-error function, the most commonly used cost for regression problems)
Goal: \displaystyle \min_{\Theta_0,\Theta_1} \ J(\Theta_0,\Theta_1)
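A minimal NumPy sketch of this squared-error cost (the function name and toy data below are illustrative, not from the course):

```python
import numpy as np

def compute_cost(x, y, theta0, theta1):
    """Squared-error cost J(Theta0, Theta1) = (1/2m) * sum((h(x_i) - y_i)^2)."""
    m = len(y)                          # m: number of training examples
    h = theta0 + theta1 * x             # hypothesis evaluated on every example
    return np.sum((h - y) ** 2) / (2 * m)

# Toy data where y = 2x exactly, so the cost at (Theta0, Theta1) = (0, 2) is 0.
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])
print(compute_cost(x, y, 0.0, 2.0))     # 0.0
print(compute_cost(x, y, 0.0, 0.0))     # (4 + 16 + 36) / 6 ≈ 9.33
```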

Simplify the hypothesis to h_{\Theta}(x)=\Theta_1x
\Downarrow

Each value of \Theta_1 corresponds to a different hypothesis.

Now use the full hypothesis h_{\Theta}(x)=\Theta_0+\Theta_1x
\Downarrow

The cost function J now depends on two parameters, so its plot is a surface over (\Theta_0,\Theta_1); on the right, a contour plot of the cost function.

"Batch"Gradient descent("Batch"梯度下降) with one variable

Batch: each step of gradient descent uses the entire training set (J(\Theta_0,\Theta_1) contains a sum of the squared errors over all examples).
Have some function J(\Theta_0,\Theta_1,\dots,\Theta_n)
Want \min J(\Theta_0,\Theta_1,\dots,\Theta_n)
Outline: 1. start with some \Theta_0,\Theta_1,\dots,\Theta_n (commonly all zeros) 2. keep changing \Theta_0,\Theta_1,\dots,\Theta_n to reduce J(\Theta_0,\Theta_1,\dots,\Theta_n) until we hopefully end up at a minimum

Gradient descent algorithm (note: := denotes assignment while = asserts equality; the two parameters must be updated simultaneously):
repeat until convergence \{ \ \Theta_j := \Theta_j - \alpha \frac{\partial}{\partial \Theta_j} J(\Theta_0,\Theta_1) \quad (j=0,1) \ \}
The derivative term is the slope of J at the current point, so each step moves \Theta downhill. The value of \alpha controls the step size: too small makes descent slow, too large can overshoot the minimum or diverge (and if \Theta is already at a local minimum, the derivative term is 0, so the solution stays at that local minimum).

With hypothesis h_{\Theta}(x)=\Theta_0+\Theta_1x
\Downarrow

Computing the derivative terms of the two-parameter cost function:
\frac{\partial}{\partial \Theta_0}J(\Theta_0,\Theta_1)=\frac{1}{m}\sum_{i=1}^m (h_{\Theta}(x^{(i)})-y^{(i)})
\frac{\partial}{\partial \Theta_1}J(\Theta_0,\Theta_1)=\frac{1}{m}\sum_{i=1}^m (h_{\Theta}(x^{(i)})-y^{(i)})\,x^{(i)}
Substituting these derivative terms into the gradient descent algorithm above gives the update rule for linear regression. Finally, plugging the parameters \Theta_0,\Theta_1 found by gradient descent back into h_{\Theta}(x) yields the best-fit line.
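A minimal sketch of batch gradient descent for the univariate case, assuming made-up data and a hand-picked learning rate (all names and numbers here are illustrative):

```python
import numpy as np

def gradient_descent(x, y, alpha, iterations):
    """Batch gradient descent for h(x) = Theta0 + Theta1 * x.

    "Batch": every iteration sums over the whole training set, and the two
    parameters are updated simultaneously.
    """
    m = len(y)
    theta0, theta1 = 0.0, 0.0                 # common starting point: all zeros
    for _ in range(iterations):
        h = theta0 + theta1 * x
        # Derivative terms derived above:
        grad0 = np.sum(h - y) / m             # dJ/dTheta0
        grad1 = np.sum((h - y) * x) / m       # dJ/dTheta1
        # Simultaneous update: both gradients are computed before either
        # parameter is assigned (:=).
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# Toy data close to y = 1 + 2x; the result approaches (1.03, 1.98).
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.1, 4.9, 7.0])
print(gradient_descent(x, y, alpha=0.1, iterations=2000))
```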


Matrices and vectors (review)

  • Vector: An n x 1 matrix (in this course)
    e.g. y=\begin{bmatrix} 460\\ 232\\ 315\\ 178\\ \end{bmatrix},\quad y_i = i^{th}\ element \ (y_1=460)

  • matrix addition (omitted)

  • scalar multiplication
    3\begin{bmatrix} 1&0\\ 2&5\\ 3&1\\ \end{bmatrix}=\begin{bmatrix} 3&0\\ 6&15\\ 9&3\\ \end{bmatrix}
    \begin{bmatrix} 4&0\\ 6&3\\ \end{bmatrix}/4=\begin{bmatrix} 1&0\\ 3/2&3/4\\ \end{bmatrix}

  • matrix multiplication

Calculate all of the predicted prices at the same time (with a single hypothesis):
\Downarrow
House sizes:
2104
1416
1534
852

hypothesis:
h_\Theta(x)=-40+0.25x

\begin{bmatrix} 1&2104\\ 1&1416\\ 1&1534\\ 1&852\\ \end{bmatrix}* \begin{bmatrix} -40\\ 0.25 \end{bmatrix}= \begin{bmatrix} -40*1+2104*0.25\\ ...\\ ...\\ -40*1+852*0.25 \end{bmatrix}
(prediction = DataMatrix * parameters)
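The same computation sketched in NumPy, using the house sizes and parameters from the text; one matrix product yields all four predictions:

```python
import numpy as np

# Data matrix: a column of ones (multiplying Theta0) next to the house sizes.
X = np.array([[1, 2104],
              [1, 1416],
              [1, 1534],
              [1,  852]])
theta = np.array([-40, 0.25])   # h(x) = -40 + 0.25x

print(X @ theta)                # [486.  314.  343.5 173. ]
```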

Multiple hypotheses
\Downarrow

To evaluate several competing hypotheses at once, stack each hypothesis's parameters as a column of a parameter matrix; the product DataMatrix \times ParameterMatrix then yields one column of predictions per hypothesis, as sketched below.
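A sketch of that idea in NumPy; the second and third hypotheses below are illustrative parameter choices:

```python
import numpy as np

X = np.array([[1, 2104],
              [1, 1416],
              [1, 1534],
              [1,  852]])

# One hypothesis per column:
#   h1(x) = -40 + 0.25x,  h2(x) = 200 + 0.1x,  h3(x) = -150 + 0.4x
Theta = np.array([[-40.0, 200.0, -150.0],
                  [0.25,    0.1,    0.4]])

# 4x3 result: column j holds the four predictions of hypothesis j.
print(X @ Theta)
```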
  • properties of matrix multiplication
    A\times B \not= B\times A in general, except A \times I = I \times A
    A \times B \times C = (A \times B) \times C = A \times (B \times C)

  • matrix inverse

    If A is an m x m matrix and it has an inverse:
    A(A^{-1})= A^{-1}A=I
    If a matrix has no inverse, it is called singular (or degenerate).
    How is the inverse computed by hand?
    A=\begin{bmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{bmatrix},\quad A^{-1}=\frac{1}{\det(A)}A^*
    Determinant: \det(A)=a_{11}a_{22}-a_{12}a_{21}
    Adjugate matrix: A^*= \begin{bmatrix} (-1)^{1+1}a_{22} & (-1)^{1+2}a_{12}\\ (-1)^{2+1}a_{21} & (-1)^{2+2}a_{11}\\ \end{bmatrix} =\begin{bmatrix} a_{22} & -a_{12}\\ -a_{21} & a_{11}\\ \end{bmatrix}
    Substituting these gives the inverse, though in practice a library routine is used (see the sketch after this list).
  • matrix transpose (omitted)
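A minimal sketch combining both routes: NumPy's library inverse and the 2x2 hand formula above, applied to an arbitrary invertible matrix:

```python
import numpy as np

A = np.array([[4.0, 3.0],
              [1.0, 1.0]])

# Library route: raises numpy.linalg.LinAlgError if A is singular.
A_inv = np.linalg.inv(A)
print(A_inv)                    # [[ 1. -3.]  [-1.  4.]]
print(A @ A_inv)                # identity, up to floating-point error

# Hand route, for comparison: A^{-1} = (1/det(A)) * A^*
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
adj = np.array([[ A[1, 1], -A[0, 1]],
                [-A[1, 0],  A[0, 0]]])
print(adj / det)                # matches np.linalg.inv(A)
```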

