12. Support Vector Machines

Author: 玄语梨落 | Published 2020-08-21 08:10

Support Vector Machines

Optimization objective

Starting point: logistic regression, whose hypothesis is

h_\theta(x) = \frac{1}{1+e^{-\theta^Tx}}

SVM cost function (the logistic losses are replaced by cost_1 and cost_0):

\min_\theta C\sum_{i=1}^m[y^{(i)}cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})]+\frac{1}{2}\sum_{j=1}^n\theta_j^2
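
Below is a minimal numpy sketch of this objective, assuming the usual hinge-style surrogates for cost_1 and cost_0; the function names are illustrative, not from any particular package.

```python
import numpy as np

def cost1(z):
    # Hinge-like surrogate for y = 1: zero once z >= 1, linear below that
    return np.maximum(0.0, 1.0 - z)

def cost0(z):
    # Hinge-like surrogate for y = 0: zero once z <= -1, linear above that
    return np.maximum(0.0, 1.0 + z)

def svm_objective(theta, X, y, C):
    """Unconstrained SVM objective from the formula above.

    X : (m, n) design matrix with a leading column of ones (bias term)
    y : (m,) labels in {0, 1}
    """
    z = X @ theta
    data_term = C * np.sum(y * cost1(z) + (1 - y) * cost0(z))
    reg_term = 0.5 * np.sum(theta[1:] ** 2)  # theta_0 is not regularized
    return data_term + reg_term
```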

Large Margin Intuition

\min_\theta C\sum_{i=1}^m[y^{(i)}cost_1(\theta^Tx^{(i)})+(1-y^{(i)})cost_0(\theta^Tx^{(i)})]+\frac{1}{2}\sum_{j=1}^n\theta_j^2

If y=1, we want \theta^Tx\ge1 (not just \ge0)
If y=0, we want \theta^Tx\le-1 (not just \le0)

If C is too large, the decision boundary will be sensitive to outliers.

The mathematics behind large margin classification (optional)

Vector Inner Product
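
The standard identity this part relies on: the inner product of two vectors u,v\in R^2 can be written componentwise or via a projection,

u^Tv = u_1v_1 + u_2v_2 = p\cdot||u||

where p is the signed length of the projection of v onto u and ||u||=\sqrt{u_1^2+u_2^2}.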

SVM Decision Boundary

\min\limits_\theta\frac{1}{2}\sum\limits_{j=1}^n\theta_j^2=\frac{1}{2}||\theta||^2
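
subject to the large-margin constraints from above:

\theta^Tx^{(i)}\ge1 \text{ if } y^{(i)}=1,\qquad \theta^Tx^{(i)}\le-1 \text{ if } y^{(i)}=0

Since \theta^Tx^{(i)} = p^{(i)}\cdot||\theta||, where p^{(i)} is the projection of x^{(i)} onto \theta, the constraints become p^{(i)}\cdot||\theta||\ge1 (or \le-1). Keeping ||\theta|| small therefore forces the projections p^{(i)} to be large, which is exactly the large-margin behavior.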

Kernels I

Non-linear decision boundary:

Given x, compute new features depending on proximity to landmarks l^{(1)}, l^{(2)}, \dots defined manually.

Kernels and Similarity (Gaussian kernel):

f_1=similarity(x,l^{(1)})=\exp (-\frac{||x-l^{(1)}||^2}{2\sigma^2})=\exp(-\frac{\sum_{j=1}^n(x_j-l_j^{(1)})^2}{2\sigma^2})

If x\approx l^{(1)}: f_1\approx 1
If x is far from l^{(1)}: f_1\approx 0
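
A minimal numpy sketch of this similarity function (the name gaussian_kernel and the example vectors are only illustrative):

```python
import numpy as np

def gaussian_kernel(x, l, sigma=1.0):
    # similarity(x, l) = exp(-||x - l||^2 / (2 * sigma^2))
    return np.exp(-np.sum((x - l) ** 2) / (2 * sigma ** 2))

x = np.array([1.0, 2.0])
print(gaussian_kernel(x, np.array([1.0, 2.0])))  # x close to landmark -> about 1
print(gaussian_kernel(x, np.array([9.0, 9.0])))  # x far from landmark -> about 0
```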


Kernels II

Choosing the landmarks:
Where do we get l ?
Given (x^{(1)},y^{(1)}),(x^{(2)},y^{(2)}),\dots,(x^{(m)},y^{(m)}),
choose l^{(1)}=x^{(1)},l^{(2)}=x^{(2)},\dots,l^{(m)}=x^{(m)}

For a training example (x^{(i)},y^{(i)}), compute the feature vector f^{(i)}:

f_j^{(i)} = sim(x^{(i)},l^{(j)}),\quad j=1,\dots,m,\qquad f_0^{(i)}=1
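
A minimal numpy sketch of this feature mapping, with every training example used as a landmark (the name feature_map is illustrative):

```python
import numpy as np

def feature_map(X, sigma=1.0):
    """Map each x^(i) to f^(i), using every training example as a landmark."""
    m = X.shape[0]
    # (m, m) matrix of squared distances ||x^(i) - l^(j)||^2 with l^(j) = x^(j)
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    F = np.exp(-sq_dists / (2 * sigma ** 2))   # f_j^(i) = sim(x^(i), l^(j))
    return np.hstack([np.ones((m, 1)), F])     # prepend f_0 = 1, giving f in R^(m+1)
```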

SVM with Kernels

Hypothesis: Given x, compute features f\in R^{m+1}
Predict 'y=1' if \theta^Tf\ge0
Training: \min\limits_\theta C\sum\limits_{i=1}^m[y^{(i)}cost_1(\theta^Tf^{(i)})+(1-y^{(i)})cost_0(\theta^Tf^{(i)})]+\frac{1}{2}\sum\limits_{j=1}^n\theta_j^2\quad (n=m)

Kernels are usually used with SVMs; they can also be used with logistic regression, but that runs much more slowly.

SVM parameters

C :

  • Large C: Lower bias, high variance.
  • Small C: Higher bias, low variance.

\sigma^2 :

  • Larger \sigma^2: Features f_i vary more smoothly. Higher bias, lower variance. (Underfitting)
  • Smaller \sigma^2: Features f_i vary less smoothly. Lower bias, higher variance. (Overfitting)

Using an SVM

Need to specify:

  • Choice of parameter C
  • Choice of kernel (similarity function)

Note: Do perform feature scaling before using the Gaussian kernel.
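
A minimal usage sketch, assuming scikit-learn as the SVM package; the toy dataset and parameter values are only illustrative. In SVC, gamma plays the role of 1/(2\sigma^2).

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy non-linear dataset
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Scale features first (as the note above says), then fit an RBF (Gaussian) kernel SVM.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma=0.5))
clf.fit(X, y)
print(clf.score(X, y))
```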

Other choices of kernel

Not all similarity functions similarity(x,l) make valid kernels. They need to satisfy a technical condition called Mercer's Theorem, so that SVM packages' optimizations run correctly and do not diverge.

Many off-the-shelf kernels are available:

  • Polynomial kernel: k(x,l) = (x^Tl+constant)^{degree} (see the sketch after this list)
  • String kernel
  • chi-square kernel
  • histogram intersection kernel
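
For concreteness, a tiny sketch of the polynomial kernel listed above (names and default values are illustrative):

```python
import numpy as np

def polynomial_kernel(x, l, constant=1.0, degree=3):
    # k(x, l) = (x^T l + constant)^degree
    return (x @ l + constant) ** degree
```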

Multi-class classification

Many SVM packages already have built-in multi-class classification functionality.

Logistic regression vs. SVM

n = number of features, m = number of training examples.

  • If n is large (relative to m):
    Use logistic regression, or an SVM without a kernel.
  • If n is small and m is intermediate:
    Use an SVM with a Gaussian kernel.
  • If n is small and m is large:
    Create/add more features, then use logistic regression or an SVM without a kernel.
