Classification
Logistic Regression:
Hypothesis Representation
Want $0 \le h_\theta(x) \le 1$
Hypothesis: $h_\theta(x) = g(\theta^T x)$
Sigmoid function $g(z) = \frac{1}{1+e^{-z}}$ = Logistic function
$h_\theta(x)$ = estimated probability that $y = 1$ given $x$, parameterized by $\theta$: $h_\theta(x) = P(y = 1 \mid x; \theta)$
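A minimal Octave sketch of this hypothesis (the function name 'hypothesis' is an assumption for illustration, not code from the course):
function h = hypothesis(theta, x)
  z = theta' * x;          % linear combination theta^T x
  h = 1 ./ (1 + exp(-z));  % sigmoid / logistic function g(z), always in (0, 1)
end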
Decision Boundary
$h_\theta(x) \ge 0.5$ means $\theta^T x \ge 0$, then predict $y = 1$; otherwise predict $y = 0$.
The line $\theta^T x = 0$ is called the decision boundary.
The decision boundary is a property of the hypothesis, not a property of the data set.
We use the data set to fit the parameters $\theta$; each $\theta$ defines a decision boundary.
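As an illustrative example (the parameter values are an assumption, not taken from these notes): with $\theta = [-3,\ 1,\ 1]^T$ and features $x = [1,\ x_1,\ x_2]^T$, the hypothesis predicts $y = 1$ whenever $-3 + x_1 + x_2 \ge 0$, so the decision boundary is the line $x_1 + x_2 = 3$.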
Cost Function
How to fit the parameters $\theta$ for logistic regression.
Linear Regression: $J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\frac{1}{2}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$
When used for logistic regression (whose hypothesis is the sigmoid), this squared-error cost is a non-convex function, so gradient descent is not guaranteed to find the global minimum.
The topic of convexity analysis is beyond the scope of this course.
Simplified cost function and gradient descent
The cost function below can be derived from the principle of maximum likelihood estimation.
Cost function:
$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + (1 - y^{(i)})\log\left(1 - h_\theta(x^{(i)})\right)\right]$
Want $\min_\theta J(\theta)$:
Repeat {
$\theta_j := \theta_j - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$
} (simultaneously update all $\theta_j$)
Because the hypothesis $h_\theta(x)$ of logistic regression is not the same as that of linear regression, the two algorithms are not the same thing, even though the update rule looks identical.
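A minimal Octave sketch of this cost function and its gradient (the function name 'logisticCost' and the variables X, y are assumptions for illustration, not code from the course):
function [jVal, grad] = logisticCost(theta, X, y)
  % X: m x n design matrix, y: m x 1 labels in {0,1}, theta: n x 1
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));                            % sigmoid hypothesis for every example
  jVal = -(1/m) * sum(y .* log(h) + (1 - y) .* log(1 - h));  % cross-entropy cost J(theta)
  grad = (1/m) * (X' * (h - y));                             % partial derivatives of J w.r.t. theta
end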
Advanced Optimization
Given $\theta$, we have code that can compute $J(\theta)$ and $\frac{\partial}{\partial\theta_j}J(\theta)$ (for $j = 0, 1, \ldots, n$).
Optimization algorithms:
- Gradient descent
- Conjugate gradient
- BFGS
- L-BFGS
Advantages:
- No need to manually pick the learning rate $\alpha$.
- Often faster than gradient descent.
Disadvantages:
- More complex.
An example:
- The function to minimize: $J(\theta) = (\theta_1 - 5)^2 + (\theta_2 - 5)^2$ (minimum at $\theta_1 = 5$, $\theta_2 = 5$)
function [jVal, gradient] = costFunction(theta)
  jVal = (theta(1)-5)^2 + (theta(2)-5)^2;   % value of the cost J(theta)
  gradient = zeros(2,1);                    % 2 x 1 column vector for the gradient
  gradient(1) = 2*(theta(1)-5);             % dJ/dtheta_1
  gradient(2) = 2*(theta(2)-5);             % dJ/dtheta_2
end
code in Octave:
options = optimset('GradObj', 'on', 'MaxIter', 100);
initialTheta = zeros(2,1);
[optTheta, functionVal, exitFlag] = fminunc(@costFunction, initialTheta, options)
The function 'fminunc' is not gradient descent, but it plays a similar role: it minimizes $J(\theta)$ for us.
The parameter vector must have at least 2 dimensions ($\theta \in \mathbb{R}^d$, $d \ge 2$) when using 'fminunc'. To get more information, use 'help fminunc'.
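As a usage sketch, the logistic-regression cost from above can be plugged into 'fminunc' the same way (X, y, and the anonymous-function wrapper are assumptions for illustration):
% Assume X (m x n) and y (m x 1, labels in {0,1}) are already loaded.
initialTheta = zeros(size(X, 2), 1);
options = optimset('GradObj', 'on', 'MaxIter', 400);
% Wrap logisticCost so fminunc sees a function of theta only.
[optTheta, cost] = fminunc(@(t) logisticCost(t, X, y), initialTheta, options);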
Multiclass classification
One-versus-all classification (one-versus-rest)
Train a logistic regression classifier $h_\theta^{(i)}(x)$ for each class $i$ to predict the probability that $y = i$.
On a new input $x$, to make a prediction, pick the class $i$ that maximizes $h_\theta^{(i)}(x)$.
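A minimal Octave sketch of the prediction step (allTheta is an assumed K x n matrix holding one fitted $\theta$ per class, X is the m x n input matrix):
probs = 1 ./ (1 + exp(-(X * allTheta')));    % m x K matrix: probs(j,i) = h_theta^(i)(x^(j))
[maxProb, prediction] = max(probs, [], 2);   % for each example, pick the class with the largest probability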