1.hypothesis
one variable: $h_\theta(x) = \theta_0 + \theta_1 x$
multivariable: $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n$
general: $h_\theta(x) = \theta^T x$, where $x_0 = 1$
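As a quick sketch of the general form, the hypothesis is just a dot product (the values of $\theta$ and $x$ below are made up for illustration):

```python
import numpy as np

theta = np.array([1.0, 2.0, 3.0])  # theta_0, theta_1, theta_2 (made-up values)
x = np.array([1.0, 5.0, 6.0])      # x_0 = 1 (bias term), then the features

h = theta @ x                      # general form: h_theta(x) = theta^T x
print(h)  # -> 29.0
```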
2.cost function
$m$: the number of training samples
$n$: the number of features
$x^{(i)}$: the $i$-th training sample ($x^{(i)} \in \mathbb{R}^{n+1}$, with $x^{(i)}_0 = 1$)
$y^{(i)}$: the output of the $i$-th training sample ($y^{(i)} \in \mathbb{R}$)
$h_\theta(x)$: hypothesis
$\theta$: parameter vector ($\theta \in \mathbb{R}^{n+1}$)
$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
We need to find a suitable $\theta$ that minimizes $J(\theta)$; there are two methods.
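As a minimal sketch of the cost function above, with a made-up toy dataset:

```python
import numpy as np

def cost(theta, X, y):
    """Mean squared error cost: J(theta) = 1/(2m) * sum((h - y)^2)."""
    m = len(y)
    residual = X @ theta - y        # h_theta(x^(i)) - y^(i) for every sample
    return residual @ residual / (2 * m)

# Toy data: x_0 = 1 bias column plus one feature (values are made up).
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0])

print(cost(np.array([0.0, 1.0]), X, y))  # perfect fit -> 0.0
```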
3.normal equation
Derivation
Write the cost in matrix form, with design matrix $X \in \mathbb{R}^{m \times (n+1)}$ and $y \in \mathbb{R}^m$:
$$J(\theta) = \frac{1}{2m}(X\theta - y)^T(X\theta - y)$$
to get the minimum, we can set the gradient to zero, which gives the formula below:
$$\nabla_\theta J(\theta) = 0$$
further, expanding the quadratic:
$$J(\theta) = \frac{1}{2m}\left(\theta^T X^T X \theta - \theta^T X^T y - y^T X \theta + y^T y\right)$$
we know:
$$\nabla_\theta\left(a^T \theta\right) = a \tag{1}$$
$$\nabla_\theta\left(\theta^T A \theta\right) = (A + A^T)\theta \tag{2}$$
$$\left(y^T X \theta\right)^T = \theta^T X^T y \tag{3}$$
the equation (2) will be $\nabla_\theta(\theta^T A \theta) = 2A\theta$ when $A$ is a symmetric matrix. the proof of equation (2): since $\theta^T A \theta = \sum_{i,j} A_{ij}\theta_i\theta_j$, differentiating with respect to $\theta_k$ gives $\sum_j A_{kj}\theta_j + \sum_i A_{ik}\theta_i = \left((A + A^T)\theta\right)_k$.
obviously, $X^T X$ is a symmetric matrix.
use the equation (3): $y^T X \theta$ is a scalar, so it equals its transpose $\theta^T X^T y$, and
$$J(\theta) = \frac{1}{2m}\left(\theta^T X^T X \theta - 2\,\theta^T X^T y + y^T y\right)$$
use the equation (1), we can get $\nabla_\theta\left(\theta^T X^T y\right) = X^T y$
use the equation (2), we can get $\nabla_\theta\left(\theta^T X^T X \theta\right) = 2\,X^T X \theta$
so $\nabla_\theta J(\theta) = \frac{1}{m}\left(X^T X \theta - X^T y\right) = 0$
in summary:
$$\theta = \left(X^T X\right)^{-1} X^T y$$
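A minimal sketch of the normal equation in NumPy, on made-up data generated from a known line:

```python
import numpy as np

# Toy data: m = 4 samples, x_0 = 1 bias column plus one feature,
# with outputs generated by y = 1 + 2x (values are made up).
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

# theta = (X^T X)^{-1} X^T y; solving the linear system avoids
# forming the inverse explicitly.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # -> approximately [1. 2.]
```

In practice `np.linalg.lstsq(X, y, rcond=None)` is the more numerically stable way to solve the same least-squares problem.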
4.gradient descent
repeat until converge {
$$\theta_j := \theta_j - \alpha \frac{\partial}{\partial \theta_j} J(\theta) \quad \text{(simultaneously for all } j\text{)}$$
}
Substituting the derivative of $J(\theta)$:
repeat until converge {
$$\theta_j := \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$
}
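The update rule above can be sketched in vectorized form (the learning rate, iteration count, and toy data are made-up choices for illustration):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent for linear regression (sketch).

    Each step applies theta_j := theta_j - alpha/m * sum_i (h(x^(i)) - y^(i)) * x_j^(i),
    simultaneously for all j, via the vectorized gradient below.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        gradient = X.T @ (X @ theta - y) / m  # partial derivatives of J(theta)
        theta -= alpha * gradient
    return theta

# Same toy data as before: y = 1 + 2x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

print(gradient_descent(X, y))  # -> close to [1. 2.]
```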
feature scaling and mean normalization
$$x_j := \frac{x_j - \mu_j}{s_j}$$
where $\mu_j$ is the mean of $x_j$ (feature $j$) and $s_j$ is the standard deviation or (max − min) of that feature.
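A minimal sketch of this normalization with NumPy, using the standard deviation as $s_j$ (the feature values are made up):

```python
import numpy as np

def scale_features(X):
    """Mean-normalize each column: x_j := (x_j - mu_j) / s_j.

    mu_j is the column mean; s_j is the standard deviation here,
    but (max - min) would also work. The x_0 = 1 bias column, if
    present, should be excluded before scaling.
    """
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

X = np.array([[2104.0, 3.0], [1600.0, 3.0], [2400.0, 4.0]])  # made-up features
X_scaled, mu, sigma = scale_features(X)
print(X_scaled.mean(axis=0))  # -> approximately [0. 0.]
```

The saved `mu` and `sigma` must be reused to scale any new input before prediction.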