predict
Class: RegressionLinear
Predict response of linear regression model
Description
Input Arguments
Mdl
—Linear regression model
RegressionLinear
model object
Linear regression model, specified as a RegressionLinear model object. You can create a RegressionLinear model object using fitrlinear.
X
—Predictor data used to generate responses
full numeric matrix | sparse numeric matrix | table
Predictor data used to generate responses, specified as a full or sparse numeric matrix or a table.
By default, each row of X corresponds to one observation, and each column corresponds to one variable.
For a numeric matrix:
The variables in the columns of X must have the same order as the predictor variables that trained Mdl.
If you train Mdl using a table (for example, Tbl) and Tbl contains only numeric predictor variables, then X can be a numeric matrix. To treat numeric predictors in Tbl as categorical during training, identify categorical predictors by using the CategoricalPredictors name-value pair argument of fitrlinear. If Tbl contains heterogeneous predictor variables (for example, numeric and categorical data types) and X is a numeric matrix, then predict throws an error.
For a table:
predict does not support multicolumn variables or cell arrays other than cell arrays of character vectors.
If you train Mdl using a table (for example, Tbl), then all predictor variables in X must have the same variable names and data types as the variables that trained Mdl (stored in Mdl.PredictorNames). However, the column order of X does not need to correspond to the column order of Tbl. Also, Tbl and X can contain additional variables (response variables, observation weights, and so on), but predict ignores them.
If you train Mdl using a numeric matrix, then the predictor names in Mdl.PredictorNames must be the same as the corresponding predictor variable names in X. To specify predictor names during training, use the PredictorNames name-value pair argument of fitrlinear. All predictor variables in X must be numeric vectors. X can contain additional variables (response variables, observation weights, and so on), but predict ignores them.
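The table rules above can be illustrated with a small sketch. It assumes that Statistics and Machine Learning Toolbox is installed and that your release accepts table input in fitrlinear. The variable names x1, x2, and Y and the simulated data are hypothetical, chosen only for illustration.

```matlab
rng(1) % For reproducibility

% Hypothetical training table with two numeric predictors and a response
n = 100;
Tbl = table(randn(n,1),randn(n,1),'VariableNames',{'x1','x2'});
Tbl.Y = Tbl.x1 + 2*Tbl.x2 + 0.1*randn(n,1);

% Train on the table; the predictor names are stored in Mdl.PredictorNames
Mdl = fitrlinear(Tbl,'Y');

% New data: columns reordered, with the response variable still present
TblNew = Tbl(1:5,{'Y','x2','x1'});

yHatSameOrder = predict(Mdl,Tbl(1:5,:));
yHatReordered = predict(Mdl,TblNew);

% predict matches variables by name, so both calls should agree
maxAbsDiff = max(abs(yHatSameOrder - yHatReordered))
```

If the two calls disagree, check that the variable names and data types in the new table match those listed in Mdl.PredictorNames.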
Note
If you orient your predictor matrix so that observations correspond to columns and specify 'ObservationsIn','columns', then you might experience a significant reduction in optimization execution time. You cannot specify 'ObservationsIn','columns' for predictor data in a table.
Data Types: double | single | table
dimension
—Predictor data observation dimension
'rows' (default) | 'columns'
Predictor data observation dimension, specified as 'columns' or 'rows'.
Note
If you orient your predictor matrix so that observations correspond to columns and specify 'ObservationsIn','columns', then you might experience a significant reduction in optimization execution time. You cannot specify 'ObservationsIn','columns' for predictor data in a table.
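As a minimal sketch of the two orientations (assuming Statistics and Machine Learning Toolbox; the small simulated data set is hypothetical), the same observations can be passed either as rows or, after transposing, as columns:

```matlab
rng(1) % For reproducibility

% Hypothetical data: 50 observations (rows) of 3 predictors (columns)
X = randn(50,3);
Y = X(:,1) + 2*X(:,2) + 0.1*randn(50,1);
Mdl = fitrlinear(X,Y);

% Default orientation: observations in rows
yHatRows = predict(Mdl,X);

% Transposed data: observations in columns
yHatCols = predict(Mdl,X','ObservationsIn','columns');

% Both calls describe the same observations, so the predictions should agree
maxAbsDiff = max(abs(yHatRows - yHatCols))
```

In both calls the output has one row per observation, regardless of how the input is oriented.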
Output Arguments
YHat
— Predicted responses
numeric matrix
Predicted responses, returned as an n-by-L numeric matrix. n is the number of observations in X and L is the number of regularization strengths in Mdl.Lambda. YHat(i,j) is the response for observation i using the linear regression model that has regularization strength Mdl.Lambda(j).
The predicted response using the model with regularization strength j is $\hat{y}_j = x\hat{\beta}_j + \hat{b}_j$.
x is an observation from the predictor data matrix X, and is a row vector.
$\hat{\beta}_j$ is the estimated column vector of coefficients. The software stores this vector in Mdl.Beta(:,j).
$\hat{b}_j$ is the estimated, scalar bias, which the software stores in Mdl.Bias(j).
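The relationship between YHat, Mdl.Beta, and Mdl.Bias described above can be checked numerically. The following is a sketch under stated assumptions: Statistics and Machine Learning Toolbox is installed, the model uses the default response transform ('none'), and the small simulated data set and the two regularization strengths are hypothetical.

```matlab
rng(1) % For reproducibility

% Hypothetical data: 200 observations of 5 predictors
X = randn(200,5);
Y = X(:,1) + 2*X(:,2) + 0.3*randn(200,1);

% Train with two regularization strengths, so L = 2
Lambda = [1e-4 1e-2];
Mdl = fitrlinear(X,Y,'Lambda',Lambda,'Learner','leastsquares');

% Predictions from predict: one column per regularization strength
YHat = predict(Mdl,X);

% Manual evaluation of x*beta_j + b_j for every observation and every j.
% Mdl.Beta is 5-by-2 and Mdl.Bias is 1-by-2; the addition relies on
% implicit expansion (R2016b or later).
YManual = X*Mdl.Beta + Mdl.Bias;

maxAbsDiff = max(abs(YHat(:) - YManual(:)))
```

The difference should be at the level of floating-point round-off. A larger difference would suggest a non-default ResponseTransform.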
Examples
Predict Test-Sample Responses
Simulate 10000 observations from this model:
$y = x_{100} + 2x_{200} + e$
$X = \{x_1,\dots,x_{1000}\}$ is a 10000-by-1000 sparse matrix with 10% nonzero standard normal elements.
e is random normal error with mean 0 and standard deviation 0.3.
rng(1) % For reproducibility
n = 1e4;
d = 1e3;
nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);
Train a linear regression model. Reserve 30% of the observations as a holdout sample.
CVMdl = fitrlinear(X,Y,'Holdout',0.3);
Mdl = CVMdl.Trained{1}
Mdl = 
  RegressionLinear
         ResponseName: 'Y'
    ResponseTransform: 'none'
                 Beta: [1000x1 double]
                 Bias: -0.0066
               Lambda: 1.4286e-04
              Learner: 'svm'
CVMdl is a RegressionPartitionedLinear model. It contains the property Trained, which is a 1-by-1 cell array holding a RegressionLinear model that the software trained using the training set.
Extract the training and test data from the partition definition.
trainIdx = training(CVMdl.Partition);
testIdx = test(CVMdl.Partition);
Predict the training- and test-sample responses.
yHatTrain = predict(Mdl,X(trainIdx,:));
yHatTest = predict(Mdl,X(testIdx,:));
Because there is one regularization strength in Mdl, yHatTrain and yHatTest are numeric vectors.
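One common use of test-sample predictions is to estimate the test-sample mean squared error. The sketch below is not part of the original example. It uses a smaller, hypothetical version of the simulated data, assumes Statistics and Machine Learning Toolbox, and compares a manual MSE calculation with the loss function, which reports MSE by default.

```matlab
rng(1) % For reproducibility

% Smaller, hypothetical version of the simulated data above
n = 1000;
d = 50;
X = sprandn(n,d,0.1);
Y = X(:,10) + 2*X(:,20) + 0.3*randn(n,1);

% Hold out 30% of the observations and extract the trained model
CVMdl = fitrlinear(X,Y,'Holdout',0.3);
Mdl = CVMdl.Trained{1};
testIdx = test(CVMdl.Partition);

% Test-sample predictions and their mean squared error
yHatTest = predict(Mdl,X(testIdx,:));
mseManual = mean((Y(testIdx) - yHatTest).^2);

% The loss function of RegressionLinear uses MSE by default
mseLoss = loss(Mdl,X(testIdx,:),Y(testIdx));
```

The two values should agree up to floating-point round-off.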
Predict from Best-Performing Model
Predict responses from the best-performing linear regression model that uses a lasso penalty and least squares.
Simulate 10000 observations as in Predict Test-Sample Responses.
rng(1) % For reproducibility
n = 1e4;
d = 1e3;
nz = 0.1;
X = sprandn(n,d,nz);
Y = X(:,100) + 2*X(:,200) + 0.3*randn(n,1);
Create a set of 15 logarithmically spaced regularization strengths from $10^{-5}$ through $10^{-1}$.
Lambda = logspace(-5,-1,15);
Cross-validate the models. To increase execution speed, transpose the predictor data and specify that the observations are in columns. Optimize the objective function using SpaRSA.
X = X';
CVMdl = fitrlinear(X,Y,'ObservationsIn','columns','KFold',5,'Lambda',Lambda,...
    'Learner','leastsquares','Solver','sparsa','Regularization','lasso');
numCLModels = numel(CVMdl.Trained)
numCLModels = 5
CVMdl is a RegressionPartitionedLinear model. Because fitrlinear implements 5-fold cross-validation, CVMdl contains 5 RegressionLinear models that the software trains on each fold.
Display the first trained linear regression model.
Mdl1 = CVMdl.Trained{1}
Mdl1 = 
  RegressionLinear
         ResponseName: 'Y'
    ResponseTransform: 'none'
                 Beta: [1000x15 double]
                 Bias: [-0.0049 -0.0049 -0.0049 -0.0049 -0.0049 -0.0048 ... ]
               Lambda: [1.0000e-05 1.9307e-05 3.7276e-05 7.1969e-05 ... ]
              Learner: 'leastsquares'
Mdl1 is a RegressionLinear model object. fitrlinear constructed Mdl1 by training on the first four folds. Because Lambda is a sequence of regularization strengths, you can think of Mdl1 as 15 models, one for each regularization strength in Lambda.
Estimate the cross-validated MSE.
mse = kfoldLoss(CVMdl);
Higher values of Lambda lead to predictor variable sparsity, which is a good quality of a regression model. For each regularization strength, train a linear regression model using the entire data set and the same options as when you cross-validated the models. Determine the number of nonzero coefficients per model.
Mdl = fitrlinear(X,Y,'ObservationsIn','columns','Lambda',Lambda,...
    'Learner','leastsquares','Solver','sparsa','Regularization','lasso');
numNZCoeff = sum(Mdl.Beta~=0);
In the same figure, plot the cross-validated MSE and frequency of nonzero coefficients for each regularization strength. Plot all variables on the log scale.
figure;
[h,hL1,hL2] = plotyy(log10(Lambda),log10(mse),...
    log10(Lambda),log10(numNZCoeff));
hL1.Marker = 'o';
hL2.Marker = 'o';
ylabel(h(1),'log_{10} MSE')
ylabel(h(2),'log_{10} nonzero-coefficient frequency')
xlabel('log_{10} Lambda')
hold off
Choose the index of the regularization strength that balances predictor variable sparsity and low MSE (for example, Lambda(10)).
idxFinal = 10;
Extract the model that corresponds to the selected regularization strength.
MdlFinal = selectModels(Mdl,idxFinal)
MdlFinal = 
  RegressionLinear
         ResponseName: 'Y'
    ResponseTransform: 'none'
                 Beta: [1000x1 double]
                 Bias: -0.0050
               Lambda: 0.0037
              Learner: 'leastsquares'
idxNZCoeff = find(MdlFinal.Beta~=0)
idxNZCoeff = 2×1

   100
   200

EstCoeff = Mdl.Beta(idxNZCoeff)
EstCoeff = 2×1

    1.0051
    1.9965
MdlFinal is a RegressionLinear model with one regularization strength. The nonzero coefficients EstCoeff are close to the coefficients that simulated the data.
Simulate 10 new observations, and predict corresponding responses using the best-performing model.
XNew = sprandn(d,10,nz);
YHat = predict(MdlFinal,XNew,'ObservationsIn','columns');
Extended Capabilities
Tall Arrays
Calculate with arrays that have more rows than fit in memory.
Usage notes and limitations:
predict does not support tall table data.
For more information, see Tall Arrays.
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
You can generate C/C++ code for both predict and update by using a coder configurer. Or, generate code only for predict by using saveLearnerForCoder, loadLearnerForCoder, and codegen.
Code generation for predict and update: Create a coder configurer by using learnerCoderConfigurer and then generate code by using generateCode. Then you can update model parameters in the generated code without having to regenerate the code.
Code generation for predict: Save a trained model by using saveLearnerForCoder. Define an entry-point function that loads the saved model by using loadLearnerForCoder and calls the predict function. Then use codegen (MATLAB Coder) to generate code for the entry-point function.
To generate single-precision C/C++ code for predict, specify the name-value argument "DataType","single" when you call the loadLearnerForCoder function.
This table contains notes about the arguments of predict. Arguments not included in this table are fully supported.

Argument: Mdl
Notes and Limitations:
For the usage notes and limitations of the model object, see Code Generation of the RegressionLinear object.

Argument: X
Notes and Limitations:
For general code generation, X must be a single-precision or double-precision matrix or a table containing numeric variables, categorical variables, or both.
In the coder configurer workflow, X must be a single-precision or double-precision matrix.
The number of observations in X can be a variable size, but the number of variables in X must be fixed.
If you want to specify X as a table, then your model must be trained using a table, and your entry-point function for prediction must do the following:
Accept data as arrays.
Create a table from the data input arguments and specify the variable names in the table.
Pass the table to predict.
For an example of this table workflow, see Generate Code to Classify Data in Table. For more information on using tables in code generation, see Code Generation for Tables (MATLAB Coder) and Table Limitations for Code Generation (MATLAB Coder).

Argument: Name-value pair arguments
Notes and Limitations:
Names in name-value pair arguments must be compile-time constants.
The value for the 'ObservationsIn' name-value pair argument must be a compile-time constant. For example, to use the 'ObservationsIn','columns' name-value pair argument in the generated code, include {coder.Constant('ObservationsIn'),coder.Constant('columns')} in the -args value of codegen (MATLAB Coder).

For more information, see Introduction to Code Generation.
Version History
See Also