Support Vector Machines
Description
svm is used to train a support vector machine. It can be used to carry out general regression and classification (of nu and epsilon-type), as well as density-estimation. A formula interface is provided.
Usage
## S3 method for class 'formula'
svm(formula, data = NULL, ..., subset,
    na.action = na.omit, scale = TRUE)
## Default S3 method:
svm(x, y = NULL, scale = TRUE, type = NULL, kernel = "radial",
    degree = 3, gamma = if (is.vector(x)) 1 else 1 / ncol(x),
    coef0 = 0, cost = 1, nu = 0.5,
    class.weights = NULL, cachesize = 40, tolerance = 0.001, epsilon = 0.1,
    shrinking = TRUE, cross = 0, probability = FALSE, fitted = TRUE,
    ..., subset, na.action = na.omit)
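The two call forms above can be compared with a minimal sketch. It assumes that svm is the function from the e1071 package and uses the built-in iris data purely for illustration; the object names are hypothetical.

```r
library(e1071)  # assumption: svm() comes from the e1071 package

data(iris)

# Formula interface: Species is a factor, so type defaults to C-classification
model_formula <- svm(Species ~ ., data = iris)

# Default interface: predictors and response are passed separately
x <- as.matrix(iris[, 1:4])
y <- iris$Species
model_default <- svm(x, y)

# Confusion table of training-set predictions from the default-interface model
pred_default <- predict(model_default, x)
print(table(predicted = pred_default, actual = y))
```

Both calls should fit the same kind of model; the formula interface is usually more convenient when the data already live in a data frame.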
Arguments
formula
a symbolic description of the model to be fit.
data
an optional data frame containing the variables in the model. By default the variables are taken from the environment which ‘svm’ is called from.
x
a data matrix, a vector, or a sparse matrix (object of class Matrix provided by the Matrix package, or of class matrix.csr provided by the SparseM package, or of class simple_triplet_matrix provided by the slam package).
y
a response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).
scale
A logical vector indicating the variables to be scaled. If scale is of length 1, the value is recycled as many times as needed. By default, data are scaled internally (both x and y variables) to zero mean and unit variance. The center and scale values are returned and used for later predictions.
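What "scaled to zero mean and unit variance" and "used for later predictions" amount to can be sketched in base R. This only illustrates the transformation, not the internal code of svm; the train/new split of iris is an arbitrary assumption.

```r
# Illustrative split: first 100 rows as training data, the rest as new data
x_train <- as.matrix(iris[1:100, 1:4])
x_new   <- as.matrix(iris[101:150, 1:4])

# Scale the training data to zero mean and unit variance per column
x_train_scaled <- scale(x_train)

# Keep the center and scale values so new data can be transformed identically
centers <- attr(x_train_scaled, "scaled:center")
scales  <- attr(x_train_scaled, "scaled:scale")

# New data are scaled with the training statistics, not with their own
x_new_scaled <- scale(x_new, center = centers, scale = scales)

print(round(colMeans(x_train_scaled), 10))
print(apply(x_train_scaled, 2, sd))
```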
type
svm can be used as a classification machine, as a regression machine, or for novelty detection. Depending on whether y is a factor or not, the default setting for type is C-classification or eps-regression, respectively, but this may be overridden by setting an explicit value.
Valid options are:
C-classification
nu-classification
one-classification (for novelty detection)
eps-regression
nu-regression
kernel
the kernel used in training and predicting. You might consider changing some of the following parameters, depending on the kernel type.
linear:
u'*v
polynomial:
(gamma*u'*v + coef0)^degree
radial basis:
exp(-gamma*|u-v|^2)
sigmoid:
tanh(gamma*u'*v + coef0)
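The four kernel formulas above can be written out as plain R functions to make the roles of gamma, coef0 and degree concrete. This is a sketch for two numeric vectors u and v, not the code svm runs internally; the function names are hypothetical.

```r
# u'*v
linear_kernel <- function(u, v) sum(u * v)

# (gamma*u'*v + coef0)^degree
polynomial_kernel <- function(u, v, gamma, coef0 = 0, degree = 3) {
  (gamma * sum(u * v) + coef0)^degree
}

# exp(-gamma*|u-v|^2)
radial_kernel <- function(u, v, gamma) exp(-gamma * sum((u - v)^2))

# tanh(gamma*u'*v + coef0)
sigmoid_kernel <- function(u, v, gamma, coef0 = 0) {
  tanh(gamma * sum(u * v) + coef0)
}

u <- c(1, 2)
v <- c(3, 4)
gamma <- 1 / length(u)  # mirrors the documented default 1/(data dimension)

print(linear_kernel(u, v))             # 11
print(polynomial_kernel(u, v, gamma))  # 5.5^3 = 166.375
print(radial_kernel(u, v, gamma))      # exp(-4)
print(sigmoid_kernel(u, v, gamma))     # tanh(5.5)
```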
degree
parameter needed for kernel of type polynomial (default: 3)
gamma
parameter needed for all kernels except linear (default: 1/(data dimension))
coef0
parameter needed for kernels of type polynomial and sigmoid (default: 0)
cost
cost of constraint violation (default: 1); it is the ‘C’-constant of the regularization term in the Lagrange formulation.
nu
parameter needed for nu-classification, nu-regression, and one-classification
class.weights
a named vector of weights for the different classes, used for asymmetric class sizes. Not all factor levels have to be supplied (default weight: 1). All components have to be named. Specifying "inverse" will choose the weights inversely proportional to the class distribution.
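A minimal sketch of both ways of supplying class.weights, assuming the e1071 package and an artificially imbalanced two-class subset of iris (50 versicolor, 15 virginica). The specific weight value is an illustrative choice, not a recommendation.

```r
library(e1071)  # assumption: svm() comes from the e1071 package

# Imbalanced subset: rows 51-100 are versicolor, rows 101-115 are virginica
d <- droplevels(iris[51:115, ])
print(table(d$Species))

# Named vector: every component is named after a factor level
w <- c(versicolor = 1, virginica = 50 / 15)
m_weighted <- svm(Species ~ ., data = d, class.weights = w)

# "inverse": weights inversely proportional to the class distribution
m_inverse <- svm(Species ~ ., data = d, class.weights = "inverse")

print(table(predicted = predict(m_weighted, d), actual = d$Species))
```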
cachesize
cache memory in MB (default 40)
tolerance
tolerance of termination criterion (default: 0.001)
epsilon
epsilon in the insensitive-loss function (default: 0.1)
shrinking
option whether to use the shrinking-heuristics (default: TRUE)
cross
if an integer value k>0 is specified, a k-fold cross validation on the training data is performed to assess the quality of the model: the accuracy rate for classification and the Mean Squared Error for regression
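A short sketch of 5-fold cross validation for both a classification and a regression fit, assuming the e1071 package. The component names accuracies, tot.accuracy, MSE and tot.MSE are those stored on the fitted object in e1071; the seed and the choice of k = 5 are arbitrary.

```r
library(e1071)  # assumption: svm() comes from the e1071 package

set.seed(1)  # fold assignment is random, so fix the seed for repeatability

# Classification: per-fold and overall accuracy (in percent)
m_cv <- svm(Species ~ ., data = iris, cross = 5)
print(m_cv$accuracies)
print(m_cv$tot.accuracy)

# Regression: per-fold and overall mean squared error
r_cv <- svm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,
            data = iris, cross = 5)
print(r_cv$MSE)
print(r_cv$tot.MSE)
```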
fitted
logical indicating whether the fitted values should be computed and included in the model or not (default: TRUE)
probability
logical indicating whether the model should allow for probability predictions.
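Probability predictions need to be requested twice: once when fitting and once when predicting. The following sketch assumes the e1071 package and picks three iris rows arbitrarily.

```r
library(e1071)  # assumption: svm() comes from the e1071 package

# Fit with probability = TRUE so the model stores what it needs
m_prob <- svm(Species ~ ., data = iris, probability = TRUE)

# Ask predict() for probabilities as well; they come back as an attribute
new_rows <- iris[c(1, 51, 101), ]
pred <- predict(m_prob, new_rows, probability = TRUE)
probs <- attr(pred, "probabilities")

print(pred)
print(round(probs, 3))  # one row per case, one column per class
```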
...
additional parameters for the low level fitting function svm.default
subset
An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
na.action
A function to specify the action to be taken if NAs are found. The default action is na.omit, which leads to rejection of cases with missing values on any required variable. An alternative is na.fail, which causes an error if NA cases are found. (NOTE: If given, this argument must be named.)
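The effect of subset and na.action can be sketched as follows, assuming the e1071 package. Missing values are injected into a copy of iris for illustration, and both arguments are passed by name as required.

```r
library(e1071)  # assumption: svm() comes from the e1071 package

d <- iris
d[c(3, 60), "Sepal.Width"] <- NA  # inject two missing values for illustration

# Train on every other row; row 3 (which contains an NA) falls in this subset
train_idx <- seq(1, nrow(d), by = 2)

# na.omit (the default) silently drops the incomplete case
m_omit <- svm(Species ~ ., data = d, subset = train_idx, na.action = na.omit)

# na.fail raises an error instead; capture it to show the difference
fail_result <- tryCatch(
  svm(Species ~ ., data = d, subset = train_idx, na.action = na.fail),
  error = function(e) "error: missing values found"
)
print(fail_result)
```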
Examples
# train.x, train.y, gamma_list and i are assumed to be defined elsewhere
m1 <- svm(x = as.matrix(train.x), y = as.factor(train.y),
          gamma = gamma_list[i], probability = TRUE)