2. Cost Function (代价函数)

Author: 玄语梨落 | Published 2020-08-15 08:32

    Cost Function

    Model Representation

    m = Number of training examples
    x's = "input" variable / features
    y's = "output" variable / "target" variable

    h stands for hypothesis; h represents a function that maps from the x's to the predicted y's

    How do we represent h?

    h_\Theta (x)=\Theta_0+\Theta_1 x
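
    As a quick illustration (the function name hypothesis and the sample values are my own, not from the notes), the hypothesis is just a linear function of x:

```python
import numpy as np

# Sketch (not from the notes): the hypothesis h_Theta(x) = Theta_0 + Theta_1 * x
# written as a plain Python function.
def hypothesis(theta0, theta1, x):
    """Predict y for input x with the linear hypothesis."""
    return theta0 + theta1 * np.asarray(x)

# With Theta_0 = 1 and Theta_1 = 2, the prediction for x = 3 is 1 + 2*3 = 7.
print(hypothesis(1.0, 2.0, [0.0, 1.0, 3.0]))  # -> [1. 3. 7.]
```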

    Cost Function

    • Goal: minimize the cost function
    • The cost function (squared error function, squared error cost function) is the best tool for solving regression problems

    Hypothesis: h_\Theta(x)=\Theta_0+\Theta_1x
    Cost Function: J(\Theta_0,\Theta_1)=\frac{1}{2m}\sum_{i=1}^{m}(h_\Theta(x^{(i)})-y^{(i)})^2
    Goal:\mathop{minimize}\limits_{\Theta_0,\Theta_1}J(\Theta_0,\Theta_1)
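
    A minimal sketch of this cost function in NumPy, assuming a toy data set and the helper name compute_cost (both illustrative, not from the notes):

```python
import numpy as np

# Sketch of the squared-error cost J(Theta_0, Theta_1) defined above.
# The name compute_cost and the toy data are illustrative assumptions.
def compute_cost(theta0, theta1, x, y):
    m = len(x)                            # m = number of training examples
    errors = theta0 + theta1 * x - y      # h_Theta(x^(i)) - y^(i) for all i
    return np.sum(errors ** 2) / (2 * m)  # (1 / 2m) * sum of squared errors

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(compute_cost(0.0, 1.0, x, y))  # perfect fit: cost = 0.0
print(compute_cost(0.0, 0.5, x, y))  # worse fit: cost > 0
```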

    Cost Function

    Simplified:
    h_\Theta(x)=\Theta_1 x

    For linear regression:

    • With one parameter, the cost function is a bow-shaped (convex) curve
    • With two parameters, the 3D cost function surface is, like the one-parameter case, also bow-shaped (a bowl)

    contour plot or contour figure: a plot of the level curves of J(\Theta_0,\Theta_1)
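
    To see the bowl shape and the contour figure concretely, one rough sketch (the toy data and grid ranges are my own assumptions) is to evaluate J over a grid of (\Theta_0,\Theta_1) values; a surface plot of this grid is the bowl, and its level curves form the contour plot:

```python
import numpy as np

# Sketch (my own illustration): evaluate J on a grid of (Theta_0, Theta_1)
# values. A surface plot of this grid gives the bowl shape; its level
# curves give the contour figure mentioned above.
def compute_cost(theta0, theta1, x, y):
    m = len(x)
    errors = theta0 + theta1 * x - y
    return np.sum(errors ** 2) / (2 * m)

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])          # data generated by y = x, so the minimum is at (0, 1)

theta0_vals = np.linspace(-2.0, 2.0, 5)
theta1_vals = np.linspace(-1.0, 3.0, 5)
J_grid = np.array([[compute_cost(t0, t1, x, y) for t1 in theta1_vals]
                   for t0 in theta0_vals])
print(J_grid.round(2))                 # the smallest entry sits near Theta_0 = 0, Theta_1 = 1
```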

    Gradient descent

    Gradient descent can be applied to more general cost functions, not only those with two parameters.

    Outline:
    Start with some \Theta_0,\Theta_1
    Keep changing \Theta_0,\Theta_1 to reduce J(\Theta_0,\Theta_1)
    Until we hopefully end up at a minimum

    Gradient descent algorithm:

    repeat until convergence { \Theta_j:=\Theta_j-\alpha\frac{\partial}{\partial\Theta_j}J(\Theta_0,\Theta_1) \quad (\text{for } j=0 \text{ and } j=1) }

    \alpha is a number called the learning rate, which controls the size of the step we take in gradient descent.

    Correct: Simultaneous update

    temp0:=\Theta_0 - \alpha\frac{\partial}{\partial\Theta_0}J(\Theta_0,\Theta_1)
    temp1:=\Theta_1 - \alpha\frac{\partial}{\partial\Theta_1}J(\Theta_0,\Theta_1)
    \Theta_0:=temp0
    \Theta_1:=temp1

    Incorrect:

    temp0:=\Theta_0 - \alpha\frac{\partial}{\partial\Theta_0}J(\Theta_0,\Theta_1)
    \Theta_0:=temp0
    temp1:=\Theta_1 - \alpha\frac{\partial}{\partial\Theta_1}J(\Theta_0,\Theta_1)
    \Theta_1:=temp1

    Update the parameters simultaneously.
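
    A small sketch of the simultaneous update (the helper gradient_step and the placeholder gradients are illustrative, not from the notes); the point is that both temporary values are computed from the old parameters before either \Theta is overwritten:

```python
# Sketch of the simultaneous update rule (the helper name gradient_step and
# the placeholder gradients are illustrative). Both partial derivatives are
# evaluated with the OLD Theta_0 and Theta_1 before either one is overwritten.
def gradient_step(theta0, theta1, grad0, grad1, alpha):
    temp0 = theta0 - alpha * grad0(theta0, theta1)
    temp1 = theta1 - alpha * grad1(theta0, theta1)   # still uses the old theta0
    return temp0, temp1                              # assign only after both temps exist

# Placeholder partial derivatives, just to make the step runnable.
g0 = lambda t0, t1: t0 + t1
g1 = lambda t0, t1: t0 - t1
print(gradient_step(1.0, 2.0, g0, g1, alpha=0.1))    # -> (0.7, 2.1)
```

    The "Incorrect" version above overwrites \Theta_0 first, so the \Theta_1 update sees the new \Theta_0 and computes a different (wrong) step.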

    Gradient descent's characteristics

    \Theta_1:=\Theta_1-\alpha\frac{\partial}{\partial\Theta_1}J(\Theta_1)

    If \alpha is too small, gradient descent can be slow.
    If \alpha is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
    As we approach a local minimum, gradient descent will automatically take smaller steps. So there is no need to decrease \alpha over time.
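
    A toy demonstration of these three cases (entirely my own example, using J(\Theta_1)=\Theta_1^2, whose derivative is 2\Theta_1):

```python
# Toy example (my own): gradient descent on J(theta) = theta^2, whose
# derivative is 2 * theta, to show how the learning rate alpha behaves.
def descend(theta, alpha, steps=10):
    for _ in range(steps):
        theta = theta - alpha * 2 * theta   # theta := theta - alpha * dJ/dtheta
    return theta

print(descend(1.0, alpha=0.01))  # too small: ~0.82, still far from the minimum at 0
print(descend(1.0, alpha=0.1))   # reasonable: ~0.11, close to 0
print(descend(1.0, alpha=1.5))   # too large: 1024.0, it has diverged
```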

    Gradient Descent For Linear Regression

    For linear regression, J(\Theta_0,\Theta_1) is a convex function, so gradient descent converges to the global minimum (there are no other local optima).
    "Batch" Gradient Descent: Each step of gradient descent uses all the training examples.
