Fit a support vector machine regression model
fitrsvm trains or cross-validates a support vector machine (SVM) regression model on a low- through moderate-dimensional predictor data set. fitrsvm supports mapping the predictor data using kernel functions, and supports SMO, ISDA, or L1 soft-margin minimization via quadratic programming for objective-function minimization.
To train a linear SVM regression model on a high-dimensional data set, that is, a data set that includes many predictor variables, use fitrlinear instead.
To train an SVM model for binary classification, see fitcsvm for low- through moderate-dimensional predictor data sets, or fitclinear for high-dimensional data sets.
Mdl = fitrsvm(Tbl,ResponseVarName) returns a full, trained support vector machine (SVM) regression model Mdl trained using the predictor values in the table Tbl and the response values in Tbl.ResponseVarName.
Mdl = fitrsvm(___,Name,Value) returns an SVM regression model with additional options specified by one or more name-value pair arguments, using any of the previous syntaxes. For example, you can specify the kernel function or train a cross-validated model.
Train a support vector machine (SVM) regression model using sample data stored in matrices.
Load the carsmall data set.
load carsmall
rng 'default' % For reproducibility
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).
X = [Horsepower,Weight]; Y = MPG;
Train a default SVM regression model.
Mdl = fitrsvm(X,Y)
Mdl = 
  RegressionSVM
    ResponseName: 'Y'
    CategoricalPredictors: []
    ResponseTransform: 'none'
    Alpha: [75x1 double]
    Bias: 57.3958
    KernelParameters: [1x1 struct]
    NumObservations: 93
    BoxConstraints: [93x1 double]
    ConvergenceInfo: [1x1 struct]
    IsSupportVector: [93x1 logical]
    Solver: 'SMO'
  Properties, Methods
Mdl is a trained RegressionSVM model.
Check the model for convergence.
Mdl.ConvergenceInfo.Converged
ans = logical
   0
0 indicates that the model did not converge.
Retrain the model using standardized data.
MdlStd = fitrsvm(X,Y,'Standardize',true)
MdlStd = 
  RegressionSVM
    ResponseName: 'Y'
    CategoricalPredictors: []
    ResponseTransform: 'none'
    Alpha: [77x1 double]
    Bias: 22.9131
    KernelParameters: [1x1 struct]
    Mu: [109.3441 2.9625e+03]
    Sigma: [45.3545 805.9668]
    NumObservations: 93
    BoxConstraints: [93x1 double]
    ConvergenceInfo: [1x1 struct]
    IsSupportVector: [93x1 logical]
    Solver: 'SMO'
  Properties, Methods
Check the model for convergence.
MdlStd.ConvergenceInfo.Converged
ans = logical
   1
1 indicates that the model did converge.
Compute the resubstitution (in-sample) mean-squared error for the new model.
lStd = resubLoss(MdlStd)
lStd = 17.0256
Train a support vector machine regression model using the abalone data from the UCI Machine Learning Repository.
Download the data and save it in your current folder with the name 'abalone.csv'.
url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/abalone.data';
websave('abalone.csv',url);
Read the data into a table. Specify the variable names.
varnames = {'Sex';'Length';'Diameter';'Height';'Whole_weight';...
    'Shucked_weight';'Viscera_weight';'Shell_weight';'Rings'};
Tbl = readtable('abalone.csv','Filetype','text','ReadVariableNames',false);
Tbl.Properties.VariableNames = varnames;
The sample data contains 4177 observations. All the predictor variables are continuous except for Sex, which is a categorical variable with possible values 'M' (for males), 'F' (for females), and 'I' (for infants). The goal is to predict the number of rings (stored in Rings) on the abalone and determine its age using physical measurements.
Train an SVM regression model, using a Gaussian kernel function with an automatic kernel scale. Standardize the data.
rng default % For reproducibility
Mdl = fitrsvm(Tbl,'Rings','KernelFunction','gaussian','KernelScale','auto',...
    'Standardize',true)
Mdl = 
  RegressionSVM
    PredictorNames: {'Sex' 'Length' 'Diameter' 'Height' 'Whole_weight' 'Shucked_weight' 'Viscera_weight' 'Shell_weight'}
    ResponseName: 'Rings'
    CategoricalPredictors: 1
    ResponseTransform: 'none'
    Alpha: [3635×1 double]
    Bias: 10.8144
    KernelParameters: [1×1 struct]
    Mu: [0 0 0 0.5240 0.4079 0.1395 0.8287 0.3594 0.1806 0.2388]
    Sigma: [1 1 1 0.1201 0.0992 0.0418 0.4904 0.2220 0.1096 0.1392]
    NumObservations: 4177
    BoxConstraints: [4177×1 double]
    ConvergenceInfo: [1×1 struct]
    IsSupportVector: [4177×1 logical]
    Solver: 'SMO'
  Properties, Methods
The Command Window shows that Mdl is a trained RegressionSVM model and displays a property list.
Display the properties of Mdl using dot notation. For example, check whether the model converged and how many iterations it completed.
conv = Mdl.ConvergenceInfo.Converged
conv = logical
   1
iter = Mdl.NumIterations
iter = 2759
The returned results indicate that the model converged after 2759 iterations.
Load the carsmall data set.
load carsmall
rng 'default' % For reproducibility
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).
X = [Horsepower,Weight]; Y = MPG;
Cross-validate two SVM regression models using 5-fold cross-validation. For both models, specify to standardize the predictors. For one of the models, specify to train using the default linear kernel, and the Gaussian kernel for the other model.
MdlLin = fitrsvm(X,Y,'Standardize',true,'KFold',5)
MdlLin = 
  RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
    PredictorNames: {'x1' 'x2'}
    ResponseName: 'Y'
    NumObservations: 94
    KFold: 5
    Partition: [1x1 cvpartition]
    ResponseTransform: 'none'
  Properties, Methods
MdlGau = fitrsvm(X,Y,'Standardize',true,'KFold',5,'KernelFunction','gaussian')
MdlGau = 
  RegressionPartitionedSVM
    CrossValidatedModel: 'SVM'
    PredictorNames: {'x1' 'x2'}
    ResponseName: 'Y'
    NumObservations: 94
    KFold: 5
    Partition: [1x1 cvpartition]
    ResponseTransform: 'none'
  Properties, Methods
MdlLin.Trained
ans=5×1 cell array
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}
    {1x1 classreg.learning.regr.CompactRegressionSVM}
MdlLin and MdlGau are RegressionPartitionedSVM cross-validated models. The Trained property of each model is a 5-by-1 cell array of CompactRegressionSVM models. Each model in the cell array stores the result of training on four folds of observations while leaving one fold out.
Compare the generalization error of the models. In this case, the generalization error is the out-of-sample mean-squared error.
mseLin = kfoldLoss(MdlLin)
mseLin = 17.4417
mseGau = kfoldLoss(MdlGau)
mseGau = 16.7355
The SVM regression model using the Gaussian kernel performs better than the one using the linear kernel.
Create a model suitable for making predictions by passing the entire data set to fitrsvm, and specify all name-value pair arguments that yielded the better-performing model. However, do not specify any cross-validation options.
MdlGau = fitrsvm(X,Y,'Standardize',true,'KernelFunction','gaussian');
To predict the MPG of a set of cars, pass MdlGau and a matrix containing the horsepower and weight measurements of the cars, in the same column order as X, to predict.
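The prediction step can be sketched as follows. This is a minimal illustration, not part of the original example: the two rows of Xnew are hypothetical cars invented here, and the model is retrained inside the block so that the snippet runs on its own.

```matlab
% Sketch: train the Gaussian-kernel model on carsmall, then predict MPG.
load carsmall
X = [Horsepower,Weight];
Y = MPG;
rng default % For reproducibility
MdlGau = fitrsvm(X,Y,'Standardize',true,'KernelFunction','gaussian');

% Hypothetical new cars (assumption): each row is [horsepower, weight],
% in the same column order as the training matrix X.
Xnew = [100 2500; 150 3500];
mpgHat = predict(MdlGau,Xnew); % One predicted MPG value per row of Xnew
disp(mpgHat)
```

Because the model was trained with 'Standardize',true, predict applies the stored Mu and Sigma to Xnew, so the new data is passed in its original units.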
This example shows how to optimize hyperparameters automatically using fitrsvm. The example uses the carsmall data.
Load the carsmall data set.
load carsmall
Specify Horsepower and Weight as the predictor variables (X) and MPG as the response variable (Y).
X = [Horsepower,Weight]; Y = MPG;
Find hyperparameters that minimize five-fold cross-validation loss by using automatic hyperparameter optimization.
For reproducibility, set the random seed and use the 'expected-improvement-plus' acquisition function.
rng default
Mdl = fitrsvm(X,Y,'OptimizeHyperparameters','auto',...
    'HyperparameterOptimizationOptions',struct('AcquisitionFunctionName',...
    'expected-improvement-plus'))
Iter | Eval result | Objective: log(1+loss) | Objective runtime | BestSoFar (observed) | BestSoFar (estim.) | BoxConstraint | KernelScale | Epsilon |
---|---|---|---|---|---|---|---|---|
1 | Best | 6.8077 | 8.9946 | 6.8077 | 6.8077 | 0.35664 | 0.043031 | 0.30396 |
2 | Best | 2.9108 | 0.079288 | 2.9108 | 3.1259 | 70.67 | 710.65 | 1.6369 |
3 | Accept | 4.1884 | 0.061383 | 2.9108 | 3.1211 | 14.367 | 0.0059144 | 442.64 |
4 | Accept | 4.159 | 0.072431 | 2.9108 | 3.0773 | 0.0030879 | 715.31 | 2.6045 |
5 | Best | 2.902 | 0.19692 | 2.902 | 2.9015 | 969.07 | 703.1 | 0.88614 |
6 | Accept | 4.1884 | 0.063713 | 2.902 | 2.9017 | 993.93 | 919.26 | 22.16 |
7 | Accept | 2.9307 | 0.09127 | 2.902 | 2.9018 | 219.88 | 613.28 | 0.015526 |
8 | Accept | 2.9537 | 0.38273 | 2.902 | 2.9017 | 905.17 | 395.74 | 0.021914 |
9 | Accept | 2.9073 | 0.13752 | 2.902 | 2.9017 | 24.242 | 647.2 | 0.17855 |
10 | Accept | 2.9044 | 0.2345 | 2.902 | 2.9017 | 117.27 | 173.98 | 0.73387 |
11 | Accept | 2.9035 | 0.084693 | 2.902 | 2.9016 | 1.3516 | 131.19 | 0.0093404 |
12 | Accept | 4.0917 | 0.10013 | 2.902 | 2.902 | 0.012201 | 962.58 | 0.0092777 |
13 | Accept | 2.9525 | 0.88983 | 2.902 | 2.902 | 77.38 | 65.508 | 0.0093299 |
14 | Accept | 2.9352 | 0.10519 | 2.902 | 2.9019 | 21.591 | 166.43 | 0.035214 |
15 | Accept | 2.9341 | 0.12667 | 2.902 | 2.9019 | 45.286 | 207.56 | 0.009379 |
16 | Accept | 2.9104 | 0.072284 | 2.902 | 2.9018 | 0.064315 | 23.313 | 0.0093341 |
17 | Accept | 2.9056 | 0.11728 | 2.902 | 2.9018 | 0.33909 | 40.311 | 0.053394 |
18 | Accept | 2.9335 | 0.22476 | 2.902 | 2.8999 | 0.9904 | 41.169 | 0.0099688 |
19 | Accept | 2.9929 | 0.1796 | 2.902 | 2.8995 | 0.0010811 | 33.401 | 0.017694 |
20 | Accept | 4.1884 | 0.081198 | 2.902 | 2.9 | 0.0014524 | 1.9514 | 856.49 |
21 | Accept | 2.904 | 0.11233 | 2.902 | 2.8831 | 88.487 | 405.92 | 0.44372 |
22 | Accept | 2.9107 | 0.096647 | 2.902 | 2.884 | 344.34 | 992 | 0.28418 |
23 | Accept | 2.904 | 0.1051 | 2.902 | 2.8841 | 0.92028 | 70.985 | 0.52233 |
24 | Best | 2.859 | 0.93928 | 2.859 | 2.8606 | 18.319 | 27.763 | 3.008 |
25 | Accept | 2.9177 | 3.1086 | 2.859 | 2.8612 | 39.154 | 24.119 | 0.67121 |
26 | Accept | 2.9059 | 0.14666 | 2.859 | 2.8609 | 0.067541 | 15.019 | 1.192 |
27 | Accept | 4.1884 | 0.093034 | 2.859 | 2.8622 | 987.04 | 3.1666 | 70.752 |
28 | Accept | 2.8936 | 0.17454 | 2.859 | 2.8744 | 2.2395 | 36.089 | 1.6775 |
29 | Accept | 2.9156 | 0.067328 | 2.859 | 2.875 | 0.0027368 | 12.221 | 0.10637 |
30 | Accept | 2.9105 | 0.074655 | 2.859 | 2.8757 | 0.05895 | 21.326 | 0.2563 |
__________________________________________________________
Optimization completed.
MaxObjectiveEvaluations of 30 reached.
Total function evaluations: 30
Total elapsed time: 32.4033 seconds
Total objective function evaluation time: 17.2141
Best observed feasible point:
BoxConstraint | KernelScale | Epsilon |
---|---|---|
18.319 | 27.763 | 3.008 |
Observed objective function value = 2.859
Estimated objective function value = 2.8727
Function evaluation time = 0.93928
Best estimated feasible point (according to models):
BoxConstraint | KernelScale | Epsilon |
---|---|---|
2.2395 | 36.089 | 1.6775 |
Estimated objective function value = 2.8757
Estimated function evaluation time = 0.20807
Mdl = 
  RegressionSVM
    ResponseName: 'Y'
    CategoricalPredictors: []
    ResponseTransform: 'none'
    Alpha: [62x1 double]
    Bias: 45.4806
    KernelParameters: [1x1 struct]
    NumObservations: 93
    HyperparameterOptimizationResults: [1x1 BayesianOptimization]
    BoxConstraints: [93x1 double]
    ConvergenceInfo: [1x1 struct]
    IsSupportVector: [93x1 logical]
    Solver: 'SMO'
  Properties, Methods
The optimization searched over BoxConstraint, KernelScale, and Epsilon. The output is the regression model with the minimum estimated cross-validation loss.
Tbl - Predictor data
Sample data used to train the model, specified as a table. Each row of Tbl corresponds to one observation, and each column corresponds to one predictor variable. Optionally, Tbl can contain one additional column for the response variable. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.
If Tbl contains the response variable, and you want to use all remaining variables in Tbl as predictors, then specify the response variable using ResponseVarName.
If Tbl contains the response variable, and you want to use only a subset of the remaining variables in Tbl as predictors, then specify a formula using formula.
If Tbl does not contain the response variable, then specify a response variable using Y. The length of the response variable and the number of rows of Tbl must be equal.
If a row of Tbl or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.
To specify the names of the predictors in the order of their appearance in Tbl, use the PredictorNames name-value pair argument.
Data Types: table
ResponseVarName - Response variable name
name of variable in Tbl
Response variable name, specified as the name of a variable in Tbl. The response variable must be a numeric vector.
You must specify ResponseVarName as a character vector or string scalar. For example, if Tbl stores the response variable Y as Tbl.Y, then specify it as 'Y'. Otherwise, the software treats all columns of Tbl, including Y, as predictors when training the model.
Data Types: char | string
formula - Explanatory model of response variable and subset of predictor variables
Explanatory model of the response variable and a subset of the predictor variables, specified as a character vector or string scalar in the form "Y~x1+x2+x3". In this form, Y represents the response variable, and x1, x2, and x3 represent the predictor variables.
To specify a subset of variables in Tbl as predictors for training the model, use a formula. If you specify a formula, then the software does not use any variables in Tbl that do not appear in formula.
The variable names in the formula must be both variable names in Tbl (Tbl.Properties.VariableNames) and valid MATLAB® identifiers. You can verify the variable names in Tbl by using the isvarname function. If the variable names are not valid, then you can convert them by using the matlab.lang.makeValidName function.
Data Types: char | string
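As a minimal sketch of the formula syntax, the following builds a table from the carsmall variables and trains on a subset of its columns. The choice of variables and the extra Acceleration column are assumptions made only for illustration.

```matlab
% Sketch: use a formula to train on a subset of the table variables.
load carsmall
Tbl = table(Horsepower,Weight,Acceleration,MPG);

% Check that every table variable name is a valid MATLAB identifier.
namesAreValid = all(cellfun(@isvarname,Tbl.Properties.VariableNames));

% Acceleration is in Tbl but not in the formula, so it is not used.
Mdl = fitrsvm(Tbl,'MPG~Horsepower+Weight','Standardize',true);
disp(Mdl.PredictorNames)
```

The displayed predictor names should list only Horsepower and Weight, which is one quick way to confirm that the formula selected the intended columns.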
Y - Response data
Response data, specified as an n-by-1 numeric vector. The length of Y and the number of rows of Tbl or X must be equal.
If a row of Tbl or X, or an element of Y, contains at least one NaN, then fitrsvm removes those rows and elements from both arguments when training the model.
To specify the response variable name, use the ResponseName name-value pair argument.
Data Types: single | double
X - Predictor data
Predictor data to which the SVM regression model is fit, specified as an n-by-p numeric matrix. n is the number of observations and p is the number of predictor variables.
The length of Y and the number of rows of X must be equal.
If a row of X or an element of Y contains at least one NaN, then fitrsvm removes those rows and elements from both arguments.
To specify the names of the predictors in the order of their appearance in X, use the PredictorNames name-value pair argument.
Data Types: single | double
Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.
Example: 'KernelFunction','gaussian','Standardize',true,'CrossVal','on' trains a 10-fold cross-validated SVM regression model using a Gaussian kernel and standardized training data.
Note
You cannot use any cross-validation name-value argument together with the 'OptimizeHyperparameters' name-value argument. You can modify the cross-validation for 'OptimizeHyperparameters' only by using the 'HyperparameterOptimizationOptions' name-value argument.
BoxConstraint - Box constraint
Box constraint for the alpha coefficients, specified as the comma-separated pair consisting of 'BoxConstraint' and a positive scalar value.
The absolute value of the Alpha coefficients cannot exceed the value of BoxConstraint.
The default BoxConstraint value for the 'gaussian' or 'rbf' kernel function is iqr(Y)/1.349, where iqr(Y) is the interquartile range of response variable Y. For all other kernels, the default BoxConstraint value is 1.
Example: 'BoxConstraint',10
Data Types: single | double
KernelFunction - Kernel function
'linear' (default) | 'gaussian' | 'rbf' | 'polynomial' | function name
Kernel function used to compute the Gram matrix, specified as the comma-separated pair consisting of 'KernelFunction' and a value in this table.
Value | Description | Formula |
---|---|---|
'gaussian' or 'rbf' | Gaussian or Radial Basis Function (RBF) kernel | $G(x_j,x_k)=\exp\left(-\lVert x_j-x_k\rVert^2\right)$ |
'linear' | Linear kernel | $G(x_j,x_k)=x_j'x_k$ |
'polynomial' | Polynomial kernel. Use 'PolynomialOrder',q to specify a polynomial kernel of order q. | $G(x_j,x_k)=(1+x_j'x_k)^q$ |
You can set your own kernel function, for example, kernel, by setting 'KernelFunction','kernel'. kernel must have the following form:
function G = kernel(U,V)
U is an m-by-p matrix.
V is an n-by-p matrix.
G is an m-by-n Gram matrix of the rows of U and V.
The file kernel.m must be on the MATLAB path.
It is good practice to avoid using generic names for kernel functions. For example, call a sigmoid kernel function 'mysigmoid' rather than 'sigmoid'.
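A custom kernel that follows this form might look like the sketch below. The name mysigmoid comes from the naming advice above. The slope gamma and intercept c are arbitrary values chosen for illustration, not recommended settings.

```matlab
function G = mysigmoid(U,V)
% MYSIGMOID Illustrative sigmoid kernel (hypothetical example).
%   U is an m-by-p matrix, V is an n-by-p matrix, and G is the
%   m-by-n Gram matrix of the rows of U and V.
gamma = 1;  % Slope, an arbitrary choice for this sketch
c = -1;     % Intercept, an arbitrary choice for this sketch
G = tanh(gamma*(U*V') + c);
end
```

Save the code as mysigmoid.m on the MATLAB path, then train with fitrsvm(X,Y,'KernelFunction','mysigmoid','Standardize',true). Any scaling has to happen inside the function (here through gamma), because KernelScale cannot be combined with a custom kernel. A sigmoid kernel is not guaranteed to be positive semidefinite, so convergence behavior can differ from the built-in kernels.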
Example: 'KernelFunction','gaussian'
Data Types: char | string
KernelScale - Kernel scale parameter
1 (default) | 'auto' | positive scalar
Kernel scale parameter, specified as the comma-separated pair consisting of 'KernelScale' and 'auto' or a positive scalar. The software divides all elements of the predictor matrix X by the value of KernelScale. Then, the software applies the appropriate kernel norm to compute the Gram matrix.
If you specify 'auto', then the software selects an appropriate scale factor using a heuristic procedure. This heuristic procedure uses subsampling, so estimates can vary from one call to another. Therefore, to reproduce results, set a random number seed using rng before training.
If you specify KernelScale and your own kernel function, for example, 'KernelFunction','kernel', then the software throws an error. You must apply scaling within kernel.
Example: 'KernelScale','auto'
Data Types: double | single | char | string
PolynomialOrder - Polynomial kernel function order
3 (default) | positive integer
Polynomial kernel function order, specified as the comma-separated pair consisting of 'PolynomialOrder' and a positive integer.
If you set 'PolynomialOrder' and KernelFunction is not 'polynomial', then the software throws an error.
Example: 'PolynomialOrder',2
Data Types: double | single
KernelOffset - Kernel offset parameter
Kernel offset parameter, specified as the comma-separated pair consisting of 'KernelOffset' and a nonnegative scalar.
The software adds KernelOffset to each element of the Gram matrix.
The defaults are:
0 if the solver is SMO (that is, you set 'Solver','SMO')
0.1 if the solver is ISDA (that is, you set 'Solver','ISDA')
Example: 'KernelOffset',0
Data Types: double | single
Epsilon - Half the width of epsilon-insensitive band
iqr(Y)/13.49 (default) | nonnegative scalar value
Half the width of the epsilon-insensitive band, specified as the comma-separated pair consisting of 'Epsilon' and a nonnegative scalar value.
The default Epsilon value is iqr(Y)/13.49, which is an estimate of a tenth of the standard deviation using the interquartile range of the response variable Y. If iqr(Y) is equal to zero, then the default Epsilon value is 0.1.
Example: 'Epsilon',0.3
Data Types: single | double
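The default can be reproduced by hand, which also shows why the divisor is 13.49: for normally distributed data, iqr(Y)/1.349 estimates the standard deviation, and a tenth of that is iqr(Y)/13.49. The sketch below assumes the carsmall response and drops only missing responses, so the value may differ slightly from what fitrsvm computes if rows are also removed for missing predictors.

```matlab
% Sketch: compute the documented default Epsilon from the response.
load carsmall
Y = MPG(~isnan(MPG));       % Drop missing responses before computing statistics
sigmaHat = iqr(Y)/1.349;    % IQR-based estimate of the standard deviation
epsDefault = iqr(Y)/13.49;  % Documented default Epsilon
if epsDefault == 0
    epsDefault = 0.1;       % Documented fallback when iqr(Y) is zero
end
fprintf('IQR-based sigma: %.4f, sample std: %.4f, default Epsilon: %.4f\n', ...
    sigmaHat,std(Y),epsDefault)
```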
Standardize - Flag to standardize predictor data
false (default) | true
Flag to standardize the predictor data, specified as the comma-separated pair consisting of 'Standardize' and true (1) or false (0).
If you set 'Standardize',true:
The software centers and scales each column of the predictor data (X) by the weighted column mean and standard deviation, respectively (for details on weighted standardizing, see Algorithms). MATLAB does not standardize the data contained in the dummy variable columns generated for categorical predictors.
The software trains the model using the standardized predictor matrix, but stores the unstandardized data in the model property X.
Example: 'Standardize',true
Data Types: logical
Solver - Optimization routine
'ISDA' | 'L1QP' | 'SMO'
Optimization routine, specified as the comma-separated pair consisting of 'Solver' and a value in this table.
Value | Description |
---|---|
'ISDA' | Iterative Single Data Algorithm (see [30]) |
'L1QP' | Uses quadprog (Optimization Toolbox) to implement L1 soft-margin minimization by quadratic programming. This option requires an Optimization Toolbox™ license. For more details, see Quadratic Programming Definition (Optimization Toolbox). |
'SMO' | Sequential Minimal Optimization (see [17]) |
The defaults are:
'ISDA' if you set 'OutlierFraction' to a positive value
'SMO' otherwise
Example: 'Solver','ISDA'
Alpha - Initial estimates of alpha coefficients
Initial estimates of alpha coefficients, specified as the comma-separated pair consisting of 'Alpha' and a numeric vector. The length of Alpha must be equal to the number of rows of X.
Each element of Alpha corresponds to an observation in X.
Alpha cannot contain any NaNs.
If you specify Alpha and any one of the cross-validation name-value pair arguments ('CrossVal', 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'), then the software returns an error.
If Y contains any missing values, then remove all rows of Y, X, and Alpha that correspond to the missing values. That is, enter:
idx = ~isnan(Y);
Y = Y(idx);
X = X(idx,:);
alpha = alpha(idx);
Then, pass Y, X, and alpha as the response, predictors, and initial alpha estimates, respectively.
The default is zeros(size(Y,1),1).
Example: 'Alpha',0.1*ones(size(X,1),1)
Data Types: single | double
CacheSize - Cache size
1000 (default) | 'maximal' | positive scalar
Cache size, specified as the comma-separated pair consisting of 'CacheSize' and 'maximal' or a positive scalar.
If CacheSize is 'maximal', then the software reserves enough memory to hold the entire n-by-n Gram matrix.
If CacheSize is a positive scalar, then the software reserves CacheSize megabytes of memory for training the model.
Example: 'CacheSize','maximal'
Data Types: double | single | char | string
ClipAlphas - Flag to clip alpha coefficients
true (default) | false
Flag to clip alpha coefficients, specified as the comma-separated pair consisting of 'ClipAlphas' and either true or false.
Suppose that the alpha coefficient for observation j is αj and the box constraint of observation j is Cj, j = 1,...,n, where n is the training sample size.
Value | Description |
---|---|
true | At each iteration, if αj is near 0 or near Cj, then MATLAB sets αj to 0 or to Cj, respectively. |
false | MATLAB does not change the alpha coefficients during optimization. |
MATLAB stores the final values of α in the Alpha property of the trained SVM model object.
ClipAlphas can affect SMO and ISDA convergence.
Example: 'ClipAlphas',false
Data Types: logical
NumPrint - Number of iterations between optimization diagnostic message output
1000 (default) | nonnegative integer
Number of iterations between optimization diagnostic message output, specified as the comma-separated pair consisting of 'NumPrint' and a nonnegative integer.
If you specify 'Verbose',1 and 'NumPrint',numprint, then the software displays all optimization diagnostic messages from SMO and ISDA every numprint iterations in the Command Window.
Example: 'NumPrint',500
Data Types: double | single
OutlierFraction - Expected proportion of outliers in training data
Expected proportion of outliers in training data, specified as the comma-separated pair consisting of 'OutlierFraction' and a numeric scalar in the interval [0,1). fitrsvm removes observations with large gradients, ensuring that fitrsvm removes the fraction of observations specified by OutlierFraction by the time convergence is reached. This name-value pair is only valid when 'Solver' is 'ISDA'.
Example: 'OutlierFraction',0.1
Data Types: single | double
RemoveDuplicates - Flag to replace duplicate observations with single observations
false (default) | true
Flag to replace duplicate observations with single observations in the training data, specified as the comma-separated pair consisting of 'RemoveDuplicates' and true or false.
If RemoveDuplicates is true, then fitrsvm replaces duplicate observations in the training data with a single observation of the same value. The weight of the single observation is equal to the sum of the weights of the corresponding removed duplicates (see Weights).
Tip
If your data set contains many duplicate observations, then specifying 'RemoveDuplicates',true can decrease convergence time considerably.
Data Types: logical
Verbose - Verbosity level
0 (default) | 1 | 2
Verbosity level, specified as the comma-separated pair consisting of 'Verbose' and 0, 1, or 2. The value of Verbose controls the amount of optimization information that the software displays in the Command Window and saves as a structure to Mdl.ConvergenceInfo.History.
This table summarizes the available verbosity level options.
Value | Description |
---|---|
0 | The software does not display or save convergence information. |
1 | The software displays diagnostic messages and saves convergence criteria every numprint iterations, where numprint is the value of the name-value pair argument 'NumPrint'. |
2 | The software displays diagnostic messages and saves convergence criteria at every iteration. |
Example: 'Verbose',1
Data Types: double | single
CategoricalPredictors - Categorical predictors list
vector of positive integers | logical vector | character matrix | string array | cell array of character vectors | 'all'
Categorical predictors list, specified as one of the values in this table.
Value | Description |
---|---|
Vector of positive integers | Each entry in the vector is an index value indicating that the corresponding predictor is categorical. The index values are between 1 and p, where p is the number of predictors used to train the model. |
Logical vector | A true entry means that the corresponding predictor is categorical. |
Character matrix | Each row of the matrix is the name of a predictor variable. The names must match the entries in PredictorNames. Pad the names with extra blanks so each row of the character matrix has the same length. |
String array or cell array of character vectors | Each element in the array is the name of a predictor variable. The names must match the entries in PredictorNames. |
"all" | All predictors are categorical. |
By default, if the predictor data is in a table (Tbl), fitrsvm assumes that a variable is categorical if it is a logical vector, categorical vector, character array, string array, or cell array of character vectors. If the predictor data is a matrix (X), fitrsvm assumes that all predictors are continuous. To identify any other predictors as categorical predictors, specify them by using the 'CategoricalPredictors' name-value argument.
For the identified categorical predictors, fitrsvm creates dummy variables using two different schemes, depending on whether a categorical variable is unordered or ordered. For an unordered categorical variable, fitrsvm creates one dummy variable for each level of the categorical variable. For an ordered categorical variable, fitrsvm creates one less dummy variable than the number of categories. For details, see Automatic Creation of Dummy Variables.
Example: 'CategoricalPredictors','all'
Data Types: single | double | logical | char | string | cell
PredictorNames - Predictor variable names
Predictor variable names, specified as a string array of unique names or cell array of unique character vectors. The functionality of PredictorNames depends on the way you supply the training data.
If you supply X and Y, then you can use PredictorNames to assign names to the predictor variables in X.
The order of the names in PredictorNames must correspond to the column order of X. That is, PredictorNames{1} is the name of X(:,1), PredictorNames{2} is the name of X(:,2), and so on. Also, size(X,2) and numel(PredictorNames) must be equal.
By default, PredictorNames is {'x1','x2',...}.
If you supply Tbl, then you can use PredictorNames to choose which predictor variables to use in training. That is, fitrsvm uses only the predictor variables in PredictorNames and the response variable during training.
PredictorNames must be a subset of Tbl.Properties.VariableNames and cannot include the name of the response variable.
By default, PredictorNames contains the names of all predictor variables.
A good practice is to specify the predictors for training using either PredictorNames or formula, but not both.
Example: "PredictorNames",["SepalLength","SepalWidth","PetalLength","PetalWidth"]
Data Types: string | cell
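For matrix input, naming the columns makes the trained model easier to read. The following is a small sketch that assumes the carsmall data; the names are chosen here to match the columns of X and are not required values.

```matlab
% Sketch: assign predictor and response names when training on a matrix.
load carsmall
X = [Horsepower,Weight];
Y = MPG;
Mdl = fitrsvm(X,Y,'Standardize',true, ...
    'PredictorNames',{'Horsepower','Weight'},'ResponseName','MPG');
disp(Mdl.PredictorNames) % Names follow the column order of X
disp(Mdl.ResponseName)
```

Without these arguments, the model would report the default names x1 and x2 for the predictors and Y for the response.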
ResponseName - Response variable name
"Y" (default) | character vector | string scalar
Response variable name, specified as a character vector or string scalar.
If you supply Y, then you can use ResponseName to specify a name for the response variable.
If you supply ResponseVarName or formula, then you cannot use ResponseName.
Example: "ResponseName","response"
Data Types: char | string
ResponseTransform - Response transformation
'none' (default) | function handle
Response transformation, specified as either 'none' or a function handle. The default is 'none', which means @(y)y, or no transformation. For a MATLAB function or a function you define, use its function handle for the response transformation. The function handle must accept a vector (the original response values) and return a vector of the same size (the transformed response values).
Example: Suppose you create a function handle that applies an exponential transformation to an input vector by using myfunction = @(y)exp(y). Then, you can specify the response transformation as 'ResponseTransform',myfunction.
Data Types: char | string | function_handle
Weights - Observation weights
ones(size(X,1),1) (default) | vector of numeric values
Observation weights, specified as the comma-separated pair consisting of 'Weights' and a vector of numeric values. The size of Weights must equal the number of rows in X. fitrsvm normalizes the values of Weights to sum to 1.
Data Types: single | double
CrossVal - Cross-validation flag
'off' (default) | 'on'
Cross-validation flag, specified as the comma-separated pair consisting of 'CrossVal' and either 'on' or 'off'.
If you specify 'on', then the software implements 10-fold cross-validation.
To override this cross-validation setting, use one of these name-value pair arguments: CVPartition, Holdout, KFold, or Leaveout. To create a cross-validated model, you can use only one cross-validation name-value pair argument at a time.
Alternatively, you can cross-validate the model later using the crossval method.
Example: 'CrossVal','on'
CVPartition - Cross-validation partition
[] (default) | cvpartition partition object
Cross-validation partition, specified as a cvpartition partition object created by cvpartition. The partition object specifies the type of cross-validation and the indexing for the training and validation sets.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,'KFold',5). Then, you can specify the cross-validated model by using 'CVPartition',cvp.
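One reason to build the partition yourself is to compare several models on exactly the same folds. The sketch below assumes the carsmall data and removes rows with missing values first, so that the partition size matches the number of observations actually used. That preprocessing step is a choice made here, not a requirement stated above.

```matlab
% Sketch: reuse one cvpartition object to compare two kernels on the same folds.
load carsmall
X = [Horsepower,Weight];
Y = MPG;
keep = ~any(isnan([X Y]),2); % Rows with no missing predictor or response
X = X(keep,:);
Y = Y(keep);

rng default % For reproducibility of the random partition
cvp = cvpartition(numel(Y),'KFold',5);

MdlLin = fitrsvm(X,Y,'Standardize',true,'CVPartition',cvp);
MdlGau = fitrsvm(X,Y,'Standardize',true,'CVPartition',cvp, ...
    'KernelFunction','gaussian');

mseLin = kfoldLoss(MdlLin); % Out-of-sample mean-squared error, linear kernel
mseGau = kfoldLoss(MdlGau); % Out-of-sample mean-squared error, Gaussian kernel
fprintf('Linear: %.4f, Gaussian: %.4f\n',mseLin,mseGau)
```

Because both models see identical training and validation splits, any difference in loss reflects the kernel rather than the random fold assignment.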
Holdout - Fraction of data for holdout validation
Fraction of the data used for holdout validation, specified as a scalar value in the range (0,1). If you specify 'Holdout',p, then the software completes these steps:
Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.
Store the compact, trained model in the Trained property of the cross-validated model.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: 'Holdout',0.1
Data Types: double | single
KFold
— Number of folds
10 (default) | positive integer value greater than 1
Number of folds to use in a cross-validated model, specified as a positive integer value greater than 1. If you specify 'KFold',k, then the software completes these steps:
1. Randomly partition the data into k sets.
2. For each set, reserve the set as validation data, and train the model using the other k – 1 sets.
3. Store the k compact, trained models in a k-by-1 cell vector in the Trained property of the cross-validated model.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: 'KFold',5
Data Types: single | double
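A minimal sketch of the k-fold steps above, assuming the carsmall sample data and k = 5. The out-of-fold predictions come from kfoldPredict, so each prediction is made by a model that did not see that observation during training.

```matlab
load carsmall
rng('default')                       % repeatable fold assignment
X = [Horsepower,Weight];
Y = MPG;

CVMdl = fitrsvm(X,Y,'Standardize',true,'KFold',5);

numModels = numel(CVMdl.Trained);    % k compact models, one per fold
yHat = kfoldPredict(CVMdl);          % out-of-fold predicted responses
```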
Leaveout
— Leave-one-out cross-validation flag
'off' (default) | 'on'
Leave-one-out cross-validation flag, specified as 'on' or 'off'. If you specify 'Leaveout','on', then for each of the n observations (where n is the number of observations, excluding missing observations, specified in the NumObservations property of the model), the software completes these steps:
1. Reserve the one observation as validation data, and train the model using the other n – 1 observations.
2. Store the n compact, trained models in an n-by-1 cell vector in the Trained property of the cross-validated model.
To create a cross-validated model, you can specify only one of these four name-value arguments: CVPartition, Holdout, KFold, or Leaveout.
Example: 'Leaveout','on'
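The leave-one-out procedure trains one model per observation, so it is practical mainly for small data sets. The following sketch assumes the carsmall sample data, which is small enough for this to run quickly.

```matlab
load carsmall
X = [Horsepower,Weight];
Y = MPG;

CVMdl = fitrsvm(X,Y,'Standardize',true,'Leaveout','on');

n = CVMdl.NumObservations;           % observations used, excluding missing ones
numModels = numel(CVMdl.Trained);    % one compact model per observation
looMSE = kfoldLoss(CVMdl);           % leave-one-out mean squared error
```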
DeltaGradientTolerance
— Tolerance for gradient difference
Tolerance for the gradient difference between upper and lower violators obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'DeltaGradientTolerance' and a nonnegative scalar.
Example: 'DeltaGradientTolerance',1e-4
Data Types: single | double
GapTolerance
— Feasibility gap tolerance
1e-3 (default) | nonnegative scalar
Feasibility gap tolerance obtained by SMO or ISDA, specified as the comma-separated pair consisting of 'GapTolerance' and a nonnegative scalar.
If GapTolerance is 0, then fitrsvm does not use this parameter to check convergence.
Example: 'GapTolerance',1e-4
Data Types: single | double
IterationLimit
— Maximal number of numerical optimization iterations
1e6 (default) | positive integer
Maximal number of numerical optimization iterations, specified as the comma-separated pair consisting of 'IterationLimit' and a positive integer.
The software returns a trained model regardless of whether the optimization routine successfully converges. Mdl.ConvergenceInfo contains convergence information.
Example: 'IterationLimit',1e8
Data Types: double | single
KKTTolerance
— Tolerance for KKT violation
Tolerance for Karush-Kuhn-Tucker (KKT) violation, specified as the comma-separated pair consisting of 'KKTTolerance' and a nonnegative scalar value.
This name-value pair applies only if 'Solver' is 'SMO' or 'ISDA'.
If KKTTolerance is 0, then fitrsvm does not use this parameter to check convergence.
Example: 'KKTTolerance',1e-4
Data Types: single | double
ShrinkagePeriod
— Number of iterations between reductions of active set
0 (default) | nonnegative integer
Number of iterations between reductions of the active set, specified as the comma-separated pair consisting of 'ShrinkagePeriod' and a nonnegative integer.
If you set 'ShrinkagePeriod',0, then the software does not shrink the active set.
Example: 'ShrinkagePeriod',1000
Data Types: double | single
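The convergence controls above can be combined in one call. The following is a minimal sketch that assumes the carsmall sample data. The tolerance and limit values are illustrative, not recommendations. Because fitrsvm returns a model whether or not the solver converges, the sketch checks Mdl.ConvergenceInfo afterward.

```matlab
load carsmall
X = [Horsepower,Weight];
Y = MPG;

Mdl = fitrsvm(X,Y,'Standardize',true, ...
    'Solver','SMO', ...
    'IterationLimit',1e5, ...            % cap on optimization iterations
    'DeltaGradientTolerance',1e-4, ...   % gradient-difference criterion
    'GapTolerance',1e-4, ...             % feasibility-gap criterion
    'KKTTolerance',0, ...                % 0 disables the KKT check
    'ShrinkagePeriod',1000);             % shrink active set every 1000 iterations

% A model is returned either way, so inspect the convergence flag explicitly
converged = Mdl.ConvergenceInfo.Converged;
if ~converged
    warning('Solver stopped before converging; consider raising IterationLimit.')
end
```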
OptimizeHyperparameters
— Parameters to optimize
'none' (default) | 'auto' | 'all' | string array or cell array of eligible parameter names | vector of optimizableVariable objects
Parameters to optimize, specified as the comma-separated pair consisting of 'OptimizeHyperparameters' and one of the following:
- 'none' — Do not optimize.
- 'auto' — Use {'BoxConstraint','KernelScale','Epsilon'}.
- 'all' — Optimize all eligible parameters.
- String array or cell array of eligible parameter names.
- Vector of optimizableVariable objects, typically the output of hyperparameters.
The optimization attempts to minimize the cross-validation loss (error) for fitrsvm by varying the parameters. To control the cross-validation type and other aspects of the optimization, use the HyperparameterOptimizationOptions name-value pair.
Note: The values of 'OptimizeHyperparameters' override any values you specify using other name-value arguments. For example, setting 'OptimizeHyperparameters' to 'auto' causes fitrsvm to optimize hyperparameters corresponding to the 'auto' option and to ignore any specified values for the hyperparameters.
The eligible parameters for fitrsvm are:
- BoxConstraint — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].
- KernelScale — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e3].
- Epsilon — fitrsvm searches among positive values, by default log-scaled in the range [1e-3,1e2]*iqr(Y)/1.349.
- KernelFunction — fitrsvm searches among 'gaussian', 'linear', and 'polynomial'.
- PolynomialOrder — fitrsvm searches among integers in the range [2,4].
- Standardize — fitrsvm searches among 'true' and 'false'.
Set nondefault parameters by passing a vector of optimizableVariable objects that have nondefault values. For example,
    load carsmall
    params = hyperparameters('fitrsvm',[Horsepower,Weight],MPG);
    params(1).Range = [1e-4,1e6];
Pass params as the value of OptimizeHyperparameters.
By default, the iterative display appears at the command line, and plots appear according to the number of hyperparameters in the optimization. For the optimization and plots, the objective function is log(1 + cross-validation loss). To control the iterative display, set the Verbose field of the 'HyperparameterOptimizationOptions' name-value argument. To control the plots, set the ShowPlots field of the 'HyperparameterOptimizationOptions' name-value argument.
For an example, see Optimize SVM Regression.
Example: 'OptimizeHyperparameters','auto'
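As a hedged illustration of the cell-array form, the following sketch optimizes only BoxConstraint and Epsilon. It assumes the carsmall sample data. The evaluation count is kept small, and the display and plots are turned off, only to keep the example quick and quiet.

```matlab
load carsmall
rng('default')                        % repeatable optimization run
X = [Horsepower,Weight];
Y = MPG;

% Optimize two named hyperparameters instead of the 'auto' set
Mdl = fitrsvm(X,Y,'Standardize',true, ...
    'OptimizeHyperparameters',{'BoxConstraint','Epsilon'}, ...
    'HyperparameterOptimizationOptions', ...
    struct('MaxObjectiveEvaluations',10,'ShowPlots',false,'Verbose',0));

% The optimization record is stored with the returned model
results = Mdl.HyperparameterOptimizationResults;
```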
HyperparameterOptimizationOptions
— Options for optimization
Options for optimization, specified as a structure. This argument modifies the effect of the OptimizeHyperparameters name-value argument. All fields in the structure are optional.
Field Name | Values | Default |
---|---|---|
Optimizer | 'bayesopt' (Bayesian optimization; internally calls bayesopt), 'gridsearch' (grid search with NumGridDivisions values per dimension), or 'randomsearch' (random search among MaxObjectiveEvaluations points). | 'bayesopt' |
AcquisitionFunctionName | 'expected-improvement-per-second-plus', 'expected-improvement', 'expected-improvement-plus', 'expected-improvement-per-second', 'lower-confidence-bound', or 'probability-of-improvement'. Acquisition functions whose names include per-second do not yield reproducible results because the optimization depends on the runtime of the objective function. Acquisition functions whose names include plus modify their behavior when they are overexploiting an area. | 'expected-improvement-per-second-plus' |
MaxObjectiveEvaluations | Maximum number of objective function evaluations. | 30 for 'bayesopt' and 'randomsearch', and the entire grid for 'gridsearch' |
MaxTime | Time limit, specified as a positive real scalar. The time limit is in seconds, as measured by tic and toc. | Inf |
NumGridDivisions | For 'gridsearch', the number of values in each dimension. The value can be a vector of positive integers giving the number of values for each dimension, or a scalar that applies to all dimensions. This field is ignored for categorical variables. | 10 |
ShowPlots | Logical value indicating whether to show plots. If true, this field plots the best observed objective function value against the iteration number. If you use Bayesian optimization (Optimizer is 'bayesopt'), then this field also plots the best estimated objective function value. The best observed objective function values and best estimated objective function values correspond to the values in the BestSoFar (observed) and BestSoFar (estim.) columns of the iterative display, respectively. You can find these values in the properties ObjectiveMinimumTrace and EstimatedObjectiveMinimumTrace of Mdl.HyperparameterOptimizationResults. If the problem includes one or two optimization parameters for Bayesian optimization, then ShowPlots also plots a model of the objective function against the parameters. | true |
SaveIntermediateResults | Logical value indicating whether to save results when Optimizer is 'bayesopt'. If true, this field overwrites a workspace variable named 'BayesoptResults' at each iteration. The variable is a BayesianOptimization object. | false |
Verbose | Display at the command line: 0 (no iterative display), 1 (iterative display), or 2 (iterative display with extra information). For details, see the bayesopt Verbose name-value argument. | 1 |
UseParallel | Logical value indicating whether to run Bayesian optimization in parallel, which requires Parallel Computing Toolbox™. Due to the nonreproducibility of parallel timing, parallel Bayesian optimization does not necessarily yield reproducible results. For details, see Parallel Bayesian Optimization. | false |
Repartition | Logical value indicating whether to repartition the cross-validation at every iteration. If this field is false, the optimizer uses a single partition for the optimization. The setting true usually gives the most robust results because it takes partitioning noise into account. However, for good results, true requires at least twice as many function evaluations. | false |
Use no more than one of the following three options. | | |
CVPartition | A cvpartition object, as created by cvpartition | 'Kfold',5 if you do not specify a cross-validation field |
Holdout | A scalar in the range (0,1) representing the holdout fraction | |
Kfold | An integer greater than 1 | |
Example: 'HyperparameterOptimizationOptions',struct('MaxObjectiveEvaluations',60)
Data Types: struct
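Several fields from the table above can be set in one structure. The following sketch assumes the carsmall sample data and uses illustrative values: a random search, 10 evaluations, and 3-fold cross-validation inside the optimization. The name opts is chosen here for illustration.

```matlab
load carsmall
rng('default')                        % repeatable search
X = [Horsepower,Weight];
Y = MPG;

% Collect the optimization options in one structure; all fields are optional
opts = struct('Optimizer','randomsearch', ...
    'MaxObjectiveEvaluations',10, ...
    'Kfold',3, ...
    'ShowPlots',false, ...
    'Verbose',0);

Mdl = fitrsvm(X,Y,'Standardize',true, ...
    'OptimizeHyperparameters','auto', ...
    'HyperparameterOptimizationOptions',opts);
```

Only one of the CVPartition, Holdout, and Kfold fields is set here, in line with the restriction stated in the table.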
Mdl
— Trained SVM regression model
RegressionSVM model | RegressionPartitionedSVM cross-validated model
Trained SVM regression model, returned as a RegressionSVM model or RegressionPartitionedSVM cross-validated model.
If you set any of the name-value pair arguments KFold, Holdout, Leaveout, CrossVal, or CVPartition, then Mdl is a RegressionPartitionedSVM cross-validated model. Otherwise, Mdl is a RegressionSVM model.
fitrsvm supports low- through moderate-dimensional data sets. For a high-dimensional data set, use fitrlinear instead.
Unless your data set is large, always try to standardize the predictors (see Standardize). Standardization makes predictors insensitive to the scales on which they are measured.
It is good practice to cross-validate using the KFold name-value pair argument. The cross-validation results determine how well the SVM model generalizes.
Sparsity in support vectors is a desirable property of an SVM model. To decrease the number of support vectors, set the BoxConstraint name-value pair argument to a large value. This action also increases the training time.
For optimal training time, set CacheSize as high as the memory limit on your computer allows.
If you expect many fewer support vectors than observations in the training set, then you can significantly speed up convergence by shrinking the active set using the name-value pair argument 'ShrinkagePeriod'. It is good practice to use 'ShrinkagePeriod',1000.
Duplicate observations that are far from the regression line do not affect convergence. However, just a few duplicate observations that occur near the regression line can slow down convergence considerably. To speed up convergence, specify 'RemoveDuplicates',true if:
- Your data set contains many duplicate observations.
- You suspect that a few duplicate observations can fall near the regression line.
However, to maintain the original data set during training, fitrsvm must temporarily store separate data sets: the original and one without the duplicate observations. Therefore, if you specify true for data sets containing few duplicates, then fitrsvm consumes close to double the memory of the original data.
After training a model, you can generate C/C++ code that predicts responses for new data. Generating C/C++ code requires MATLAB Coder™. For details, see Introduction to Code Generation.
For the mathematical formulation of linear and nonlinear SVM regression problems and the solver algorithms, see Understanding Support Vector Machine Regression.
NaN, <undefined>, empty character vector (''), empty string (""), and <missing> values indicate missing data values. fitrsvm removes entire rows of data corresponding to a missing response. When normalizing weights, fitrsvm ignores any weight corresponding to an observation with at least one missing predictor. Consequently, observation box constraints might not equal BoxConstraint.
fitrsvm removes observations that have zero weight.
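If you set 'Standardize',true and 'Weights', then fitrsvm standardizes the predictors using their corresponding weighted means and weighted standard deviations. That is, fitrsvm standardizes predictor j (x_j) using
\[ x_j^{*} = \frac{x_j - \mu_j^{*}}{\sigma_j^{*}}, \qquad \mu_j^{*} = \frac{1}{\sum_k w_k} \sum_k w_k x_{jk}, \]
where x_{jk} is observation k (row) of predictor j (column), w_k is the weight of observation k, and \sigma_j^{*} is the corresponding weighted standard deviation of predictor j.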
If your predictor data contains categorical variables, then the software generally uses full dummy encoding for these variables. The software creates one dummy variable for each level of each categorical variable.
- The PredictorNames property stores one element for each of the original predictor variable names. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then PredictorNames is a 1-by-3 cell array of character vectors containing the original names of the predictor variables.
- The ExpandedPredictorNames property stores one element for each of the predictor variables, including the dummy variables. For example, assume that there are three predictors, one of which is a categorical variable with three levels. Then ExpandedPredictorNames is a 1-by-5 cell array of character vectors containing the names of the predictor variables and the new dummy variables.
- Similarly, the Beta property stores one beta coefficient for each predictor, including the dummy variables.
- The SupportVectors property stores the predictor values for the support vectors, including the dummy variables. For example, assume that there are m support vectors and three predictors, one of which is a categorical variable with three levels. Then SupportVectors is an m-by-5 matrix.
- The X property stores the training data as originally input. It does not include the dummy variables. When the input is a table, X contains only the columns used as predictors.
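The property sizes described above can be inspected directly. The following sketch assumes the carsmall sample data and treats Cylinders as an unordered categorical predictor. The number of dummy variables depends on how many distinct levels Cylinders has in the data, so the sketch compares sizes rather than assuming a fixed count. The names Cyl and Tbl are illustrative.

```matlab
load carsmall
Cyl = categorical(Cylinders);                 % unordered categorical predictor
Tbl = table(Horsepower,Weight,Cyl,MPG);

Mdl = fitrsvm(Tbl,'MPG','Standardize',true);

numOriginal = numel(Mdl.PredictorNames);          % original predictor count
numExpanded = numel(Mdl.ExpandedPredictorNames);  % count after dummy encoding
numLevels   = numel(categories(Cyl));             % levels of the categorical variable
svColumns   = size(Mdl.SupportVectors,2);         % columns include dummy variables
```

With full dummy encoding, numExpanded is expected to equal the two numeric predictors plus numLevels dummy variables, and svColumns is expected to match numExpanded.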
For predictors specified in a table, if any of the variables contain ordered (ordinal) categories, the software uses ordinal encoding for these variables.
- For a variable having k ordered levels, the software creates k – 1 dummy variables. The jth dummy variable is -1 for levels up to j, and +1 for levels j + 1 through k.
- The names of the dummy variables stored in the ExpandedPredictorNames property indicate the first level with the value +1. The software stores k – 1 additional predictor names for the dummy variables, including the names of levels 2, 3, ..., k.
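The ordinal encoding rule above can be reproduced by hand for a small case. The following sketch builds the k – 1 dummy variables for an illustrative variable with k = 4 ordered levels. It reimplements the stated rule for illustration and does not call the software's internal encoder.

```matlab
k = 4;                        % number of ordered levels (illustrative)
levels = (1:k)';              % one example observation per level

D = ones(k,k-1);              % start with +1 in every dummy variable
for j = 1:k-1
    D(levels <= j,j) = -1;    % dummy j is -1 for levels 1 through j
end

disp(D)
%     -1    -1    -1
%      1    -1    -1
%      1     1    -1
%      1     1     1
```

Each row is the encoding of one level, and column j switches from -1 to +1 at level j + 1, which is the level that names that dummy variable.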
All solvers implement L1 soft-margin minimization.
Let p be the proportion of outliers that you expect in the training data. If you set 'OutlierFraction',p, then the software implements robust learning. In other words, the software attempts to remove 100p% of the observations when the optimization algorithm converges. The removed observations correspond to gradients that are large in magnitude.
[1] Clark, D., Z. Schreter, and A. Adams. "A Quantitative Comparison of Dystal and Backpropagation." Submitted to the Australian Conference on Neural Networks, 1996.
[2] Fan, R.-E., P.-H. Chen, and C.-J. Lin. "Working set selection using second order information for training support vector machines." Journal of Machine Learning Research, Vol. 6, 2005, pp. 1889–1918.
[3] Kecman, V., T.-M. Huang, and M. Vogt. "Iterative Single Data Algorithm for Training Kernel Machines from Huge Data Sets: Theory and Performance." In Support Vector Machines: Theory and Applications. Edited by Lipo Wang, 255–274. Berlin: Springer-Verlag, 2005.
[4] Lichman, M. UCI Machine Learning Repository, [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
[5] Nash, W. J., T. L. Sellers, S. R. Talbot, A. J. Cawthorn, and W. B. Ford. "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait." Sea Fisheries Division, Technical Report No. 48, 1994.
[6] Waugh, S. "Extending and Benchmarking Cascade-Correlation: Extensions to the Cascade-Correlation Architecture and Benchmarking of Feed-forward Supervised Artificial Neural Networks." University of Tasmania Department of Computer Science thesis, 1995.
To perform parallel hyperparameter optimization, use the 'HyperparameterOptimizationOptions',struct('UseParallel',true) name-value argument in the call to the fitrsvm function.
For more information on parallel hyperparameter optimization, see Parallel Bayesian Optimization.
For general information about parallel computing, see Run MATLAB Functions with Automatic Parallel Support (Parallel Computing Toolbox).