crossval
Cross-validate machine learning model
Description
Examples
Cross-Validate SVM Classifier
Load the ionosphere data set. The data set has 34 predictors and 351 binary responses for radar returns, either bad ('b') or good ('g').
load ionosphere
rng(1); % For reproducibility
Train a support vector machine (SVM) classifier. Standardize the predictor data and specify the order of the classes.
SVMModel = fitcsvm(X,Y,'Standardize',true,'ClassNames',{'b','g'});
SVMModel is a trained ClassificationSVM classifier. 'b' is the negative class and 'g' is the positive class.
Cross-validate the classifier using 10-fold cross-validation.
CVSVMModel = crossval(SVMModel)
CVSVMModel = ClassificationPartitionedModel CrossValidatedModel: 'SVM' PredictorNames: {1x34 cell} ResponseName: 'Y' NumObservations: 351 KFold: 10 Partition: [1x1 cvpartition] ClassNames: {'b' 'g'} ScoreTransform: 'none'
CVSVMModel is a ClassificationPartitionedModel cross-validated classifier. During cross-validation, the software completes these steps:
Randomly partition the data into 10 sets of equal size.
Train an SVM classifier on nine of the sets.
Repeat steps 1 and 2 k = 10 times. The software leaves out one partition each time and trains on the other nine partitions.
Combine generalization statistics for each fold.
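The steps above can be sketched outside MATLAB as well. The following is a rough Python/scikit-learn analogue of 10-fold cross-validation of a standardized SVM, not the crossval implementation; the synthetic data set and all variable names are assumptions for illustration:

```python
# Sketch of 10-fold cross-validation: partition into 10 sets, train on
# nine, evaluate on the held-out set, and combine per-fold statistics.
# SVC + StandardScaler stand in for fitcsvm with 'Standardize',true.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed synthetic stand-in for the ionosphere data (351 x 34).
X, y = make_classification(n_samples=351, n_features=34, random_state=1)

scores = []
skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=1)
for train_idx, test_idx in skf.split(X, y):
    # Train on nine partitions.
    model = make_pipeline(StandardScaler(), SVC())
    model.fit(X[train_idx], y[train_idx])
    # Evaluate on the one held-out partition.
    scores.append(model.score(X[test_idx], y[test_idx]))

print(len(scores))      # one score per fold
print(np.mean(scores))  # combined generalization statistic
```

Each entry of `scores` plays the role of one fold's generalization statistic; averaging them corresponds to what kfoldLoss reports for the partitioned model.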
Display the first model in CVSVMModel.Trained.
FirstModel = CVSVMModel.Trained{1}
FirstModel= CompactClassificationSVM ResponseName: 'Y' CategoricalPredictors: [] ClassNames: {'b' 'g'} ScoreTransform: 'none' Alpha: [78x1 double] Bias: -0.2209 KernelParameters: [1x1 struct] Mu: [0.8888 0 0.6320 0.0406 0.5931 0.1205 0.5361 ... ] Sigma: [0.3149 0 0.5033 0.4441 0.5255 0.4663 0.4987 ... ] SupportVectors: [78x34 double] SupportVectorLabels: [78x1 double] Properties, Methods
FirstModel is the first of the 10 trained classifiers. It is a CompactClassificationSVM classifier.
Estimate the generalization error by passing CVSVMModel to kfoldLoss.
Specify Holdout Sample Proportion for Naive Bayes Cross-Validation
Specify a holdout sample proportion for cross-validation. By default, crossval uses 10-fold cross-validation to cross-validate a naive Bayes classifier. However, you have several other cross-validation options. For example, you can specify a different number of folds or a holdout sample proportion.
Load the ionosphere data set. The data set has 34 predictors and 351 binary responses for radar returns, either bad ('b') or good ('g').
load ionosphere
Remove the first two predictors for stability.
X = X(:,3:end);
rng('default'); % For reproducibility
Train a naive Bayes classifier using the predictors X and class labels Y. A recommended practice is to specify the class names. 'b' is the negative class and 'g' is the positive class. fitcnb assumes that each predictor is conditionally and normally distributed.
Mdl = fitcnb(X,Y,'ClassNames',{'b','g'});
Mdl is a trained ClassificationNaiveBayes classifier.
Cross-validate the classifier by specifying a 30% holdout sample.
CVMdl = crossval(Mdl,'Holdout',0.3)
CVMdl = ClassificationPartitionedModel CrossValidatedModel: 'NaiveBayes' PredictorNames: {1x32 cell} ResponseName: 'Y' NumObservations: 351 KFold: 1 Partition: [1x1 cvpartition] ClassNames: {'b' 'g'} ScoreTransform: 'none'
CVMdl is a ClassificationPartitionedModel cross-validated, naive Bayes classifier.
Display the properties of the classifier trained using 70% of the data.
TrainedModel = CVMdl.Trained{1}
TrainedModel = CompactClassificationNaiveBayes ResponseName: 'Y' CategoricalPredictors: [] ClassNames: {'b' 'g'} ScoreTransform: 'none' DistributionNames: {1x32 cell} DistributionParameters: {2x32 cell} Properties, Methods
TrainedModel is a CompactClassificationNaiveBayes classifier.
Estimate the generalization error by passing CVMdl to kfoldLoss.
kfoldLoss(CVMdl)
ans = 0.2095
The out-of-sample misclassification error is approximately 21%.
Reduce the generalization error by choosing the five most important predictors.
idx = fscmrmr(X,Y); Xnew = X(:,idx(1:5));
Train a naive Bayes classifier for the new predictors.
Mdlnew = fitcnb(Xnew,Y,'ClassNames',{'b','g'});
Cross-validate the new classifier by specifying a 30% holdout sample, and estimate the generalization error.
CVMdlnew = crossval(Mdlnew,'Holdout',0.3);
kfoldLoss(CVMdlnew)
ans = 0.1429
The out-of-sample misclassification error is reduced from approximately 21% to approximately 14%.
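The same holdout-plus-feature-selection idea can be illustrated in Python with scikit-learn. This is only a hedged sketch: GaussianNB stands in for fitcnb, mutual_info_classif stands in for the fscmrmr ranking, and the synthetic data set is an assumption:

```python
# Holdout validation of a Gaussian naive Bayes classifier, mirroring
# crossval(Mdl,'Holdout',0.3): reserve 30% for validation, train on 70%,
# then refit on the top-ranked predictors and compare errors.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Assumed synthetic stand-in for the trimmed ionosphere data (351 x 32).
X, y = make_classification(n_samples=351, n_features=32, n_informative=5,
                           random_state=0)

# 30% holdout: train on 70%, validate on the reserved 30%.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

err_all = 1 - GaussianNB().fit(X_train, y_train).score(X_val, y_val)

# Rank predictors (stand-in for fscmrmr) and keep the five best.
rank = np.argsort(mutual_info_classif(X_train, y_train, random_state=0))[::-1]
keep = rank[:5]
err_top5 = 1 - GaussianNB().fit(X_train[:, keep], y_train).score(
    X_val[:, keep], y_val)

print(err_all, err_top5)  # misclassification error before/after selection
```

On data with a few informative predictors, dropping the uninformative ones typically lowers the holdout misclassification error, which is the effect the MATLAB example demonstrates.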
Create Cross-Validated GAM Using crossval
Train a regression generalized additive model (GAM) by using fitrgam, and create a cross-validated GAM by using crossval and the holdout option. Then, use kfoldPredict to predict responses for validation-fold observations using a model trained on training-fold observations.
Load the patients data set.
load patients
Create a table that contains the predictor variables (Age, Diastolic, Smoker, Weight, Gender, SelfAssessedHealthStatus) and the response variable (Systolic).
tbl = table(Age,Diastolic,Smoker,Weight,Gender,SelfAssessedHealthStatus,Systolic);
Train a GAM that contains linear terms for predictors.
Mdl = fitrgam(tbl,"Systolic");
Mdl is a RegressionGAM model object.
Cross-validate the model by specifying a 30% holdout sample.
rng('default') % For reproducibility
CVMdl = crossval(Mdl,'Holdout',0.3)
CVMdl = RegressionPartitionedGAM CrossValidatedModel: 'GAM' PredictorNames: {1x6 cell} CategoricalPredictors: [3 5 6] ResponseName: 'Systolic' NumObservations: 100 KFold: 1 Partition: [1x1 cvpartition] NumTrainedPerFold: [1x1 struct] ResponseTransform: 'none' IsStandardDeviationFit: 0 Properties, Methods
The crossval function creates a RegressionPartitionedGAM model object CVMdl with the holdout option. During cross-validation, the software completes these steps:
Randomly select and reserve 30% of the data as validation data, and train the model using the rest of the data.
Store the compact, trained model in the Trained property of the cross-validated model object RegressionPartitionedGAM.
You can choose a different cross-validation setting by using the 'CrossVal', 'CVPartition', 'KFold', or 'Leaveout' name-value argument.
Predict responses for the validation-fold observations by using kfoldPredict. The function predicts responses for the validation-fold observations by using the model trained on the training-fold observations. The function assigns NaN to the training-fold observations.
yFit = kfoldPredict(CVMdl);
Find the validation-fold observation indexes, and create a table containing the observation index, observed response values, and predicted response values. Display the first eight rows of the table.
idx = find(~isnan(yFit));
t = table(idx,tbl.Systolic(idx),yFit(idx), ...
    'VariableNames',{'Observation Index','Observed Value','Predicted Value'});
head(t)
ans=8×3 table
    Observation Index    Observed Value    Predicted Value
    _________________    ______________    _______________
            1                 124               130.22
            6                 121               124.38
            7                 130               125.26
           12                 115               117.05
           20                 125               121.82
           22                 123               116.99
           23                 114                  107
           24                 128               122.52
Compute the regression error (mean squared error) for the validation-fold observations.
L = kfoldLoss(CVMdl)
L = 43.8715
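The holdout workflow in this example (predictions only for validation-fold rows, NaN elsewhere, loss computed on the validation fold) can be sketched in Python. This is an illustrative analogue, not the MATLAB implementation: GradientBoostingRegressor is only a stand-in for the GAM, and the synthetic data set is an assumption:

```python
# Sketch of the holdout workflow: train on 70% of the data, produce
# predictions only for the 30% validation fold (NaN for training-fold
# rows, like kfoldPredict), and report the validation-fold MSE (the
# analogue of kfoldLoss for this partition).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumed synthetic stand-in for the patients data (100 x 6).
X, y = make_regression(n_samples=100, n_features=6, noise=5.0, random_state=0)

idx = np.arange(len(y))
train_idx, val_idx = train_test_split(idx, test_size=0.3, random_state=0)

model = GradientBoostingRegressor(random_state=0).fit(X[train_idx],
                                                      y[train_idx])

y_fit = np.full(len(y), np.nan)      # NaN for training-fold observations
y_fit[val_idx] = model.predict(X[val_idx])

L = mean_squared_error(y[val_idx], y_fit[val_idx])
print(L)  # validation-fold mean squared error
```

The NaN markers make it easy to recover the validation-fold indices afterward with the equivalent of find(~isnan(yFit)), as the MATLAB example does.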
Input Arguments
Mdl
—Machine learning model
full regression model object | full classification model object
Machine learning model, specified as a full regression or classification model object, as given in the following tables of supported models.
Regression Model Object
Model | Full Regression Model Object |
---|---|
Gaussian process regression (GPR) model | RegressionGP (If you supply a custom 'ActiveSet' in the call to fitrgp, then you cannot cross-validate the GPR model.) |
Generalized additive model (GAM) | RegressionGAM |
Neural network model | RegressionNeuralNetwork |
Classification Model Object
Model | Full Classification Model Object |
---|---|
Generalized additive model | ClassificationGAM |
k-nearest neighbor model | ClassificationKNN |
Naive Bayes model | ClassificationNaiveBayes |
Neural network model | ClassificationNeuralNetwork |
Support vector machine for one-class and binary classification | ClassificationSVM |
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: crossval(Mdl,'KFold',3) specifies using three folds in a cross-validated model.
CVPartition
— Cross-validation partition
[] (default) | cvpartition partition object
Cross-validation partition, specified as a cvpartition partition object created by cvpartition. The partition object specifies the type of cross-validation and the indexing for the training and validation sets.
You can specify only one of these four name-value arguments: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.
Example: Suppose you create a random partition for 5-fold cross-validation on 500 observations by using cvp = cvpartition(500,'KFold',5). Then, you can specify the cross-validated model by using 'CVPartition',cvp.
Holdout
—Fraction of data for holdout validation
scalar value in the range (0,1)
Fraction of the data used for holdout validation, specified as a scalar value in the range (0,1). If you specify 'Holdout',p, then the software completes these steps:
Randomly select and reserve p*100% of the data as validation data, and train the model using the rest of the data.
Store the compact, trained model in the Trained property of the cross-validated model. If Mdl does not have a corresponding compact object, then Trained contains a full object.
You can specify only one of these four name-value arguments: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.
Example: 'Holdout',0.1
Data Types: double | single
KFold
— Number of folds
10 (default) | positive integer value greater than 1
Number of folds to use in a cross-validated model, specified as a positive integer value greater than 1. If you specify 'KFold',k, then the software completes these steps:
Randomly partition the data into k sets.
For each set, reserve the set as validation data, and train the model using the other k – 1 sets.
Store the k compact, trained models in a k-by-1 cell vector in the Trained property of the cross-validated model. If Mdl does not have a corresponding compact object, then Trained contains a full object.
You can specify only one of these four name-value arguments: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.
Example: 'KFold',5
Data Types: single | double
Leaveout
— Leave-one-out cross-validation flag
'off' (default) | 'on'
Leave-one-out cross-validation flag, specified as 'on' or 'off'. If you specify 'Leaveout','on', then for each of the n observations (where n is the number of observations, excluding missing observations, specified in the NumObservations property of the model), the software completes these steps:
Reserve the one observation as validation data, and train the model using the other n – 1 observations.
Store the n compact, trained models in an n-by-1 cell vector in the Trained property of the cross-validated model. If Mdl does not have a corresponding compact object, then Trained contains a full object.
You can specify only one of these four name-value arguments: 'CVPartition', 'Holdout', 'KFold', or 'Leaveout'.
Example: 'Leaveout','on'
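The leave-one-out steps above can be sketched in Python with scikit-learn as a rough analogue; the nearest-neighbor model and the tiny data set are assumptions for illustration:

```python
# Leave-one-out sketch matching the steps above: n models, each trained
# on the other n - 1 observations and evaluated on the single held-out
# observation; the n trained models are collected in a list (the
# analogue of the n-by-1 Trained cell vector).
import numpy as np
from sklearn.model_selection import LeaveOneOut
from sklearn.neighbors import KNeighborsClassifier

# Assumed tiny data set: two well-separated clusters of three points.
X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1], [1.2]])
y = np.array([0, 0, 0, 1, 1, 1])

trained, correct = [], 0
for train_idx, test_idx in LeaveOneOut().split(X):
    model = KNeighborsClassifier(n_neighbors=1).fit(X[train_idx],
                                                    y[train_idx])
    trained.append(model)  # store the n compact, trained models
    correct += model.predict(X[test_idx])[0] == y[test_idx][0]

print(len(trained), correct / len(y))  # n models, leave-one-out accuracy
```

Leave-one-out is the limiting case of k-fold cross-validation with k equal to n, so it is usually reserved for small data sets where training n models is affordable.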
Output Arguments
CVMdl
— Cross-validated machine learning model
cross-validated (partitioned) model object
Cross-validated machine learning model, returned as one of the cross-validated (partitioned) model objects in the following tables, depending on the input model Mdl.
Regression Model Object
Model | Regression Model (Mdl) | Cross-Validated Model (CVMdl) |
---|---|---|
Gaussian process regression model | RegressionGP | RegressionPartitionedModel |
Generalized additive model | RegressionGAM | RegressionPartitionedGAM |
Neural network model | RegressionNeuralNetwork | RegressionPartitionedModel |
Classification Model Object
Model | Classification Model (Mdl) | Cross-Validated Model (CVMdl) |
---|---|---|
Generalized additive model | ClassificationGAM | ClassificationPartitionedGAM |
k-nearest neighbor model | ClassificationKNN | ClassificationPartitionedModel |
Naive Bayes model | ClassificationNaiveBayes | ClassificationPartitionedModel |
Neural network model | ClassificationNeuralNetwork | ClassificationPartitionedModel |
Support vector machine for one-class and binary classification | ClassificationSVM | ClassificationPartitionedModel |
Tips
Assess the predictive performance of Mdl on cross-validated data by using the kfold functions and properties of CVMdl, such as kfoldPredict, kfoldLoss, kfoldMargin, and kfoldEdge for classification, and kfoldPredict and kfoldLoss for regression.
Return a partitioned classifier with stratified partitioning by using the name-value argument 'KFold' or 'Holdout'.
Create a cvpartition object cvp using cvp = cvpartition(n,'KFold',k). Return a partitioned classifier with nonstratified partitioning by using the name-value argument 'CVPartition',cvp.
Alternative Functionality
Instead of training a model and then cross-validating it, you can create a cross-validated model directly by using a fitting function and specifying one of these name-value arguments: 'CrossVal', 'CVPartition', 'Holdout', 'Leaveout', or 'KFold'.
Extended Capabilities
GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.
Version History