loss
Loss of linear incremental learning model on batch of data
Description
loss returns the regression or classification loss of a configured incremental learning model for linear regression (incrementalRegressionLinear object) or linear binary classification (incrementalClassificationLinear object).
To measure model performance on a data stream and store the results in the output model, call updateMetrics or updateMetricsAndFit.
Examples
Measure Model Performance During Incremental Learning
The performance of an incremental model on streaming data is measured in three ways:
Cumulative metrics measure the performance since the start of incremental learning.
Window metrics measure the performance on a specified window of observations. The metrics are updated every time the model processes the specified window.
The loss function measures the performance on a specified batch of data only.
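The difference between the three measurements can be sketched with a simulated stream of per-observation errors. This is only an illustration in base MATLAB: the error stream, chunk size, and window size are hypothetical, and the bookkeeping is a simplification of what updateMetrics does internally.

```matlab
rng(0)                                  % For reproducibility
err = double(rand(600,1) < 0.1);        % Hypothetical 0/1 error per observation
chunkSize = 50;                         % Observations per incoming chunk
windowSize = 200;                       % Observations per metrics window
nchunk = numel(err)/chunkSize;

cumulative = zeros(nchunk,1);
window = nan(nchunk,1);                 % NaN until a full window has been processed
batch = zeros(nchunk,1);

for j = 1:nchunk
    idx = (j-1)*chunkSize + (1:chunkSize);
    batch(j) = mean(err(idx));                  % Current chunk only
    cumulative(j) = mean(err(1:idx(end)));      % Everything seen so far
    if mod(idx(end),windowSize) == 0            % A full window was just completed
        window(j) = mean(err(idx(end)-windowSize+1:idx(end)));
    end
end
```

In this sketch the batch value fluctuates from chunk to chunk, the cumulative value smooths over the whole stream, and the window value is refreshed only when a full window of observations has been processed.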
Load the human activity data set. Randomly shuffle the data.
    load humanactivity
    n = numel(actid);
    rng(1) % For reproducibility
    idx = randsample(n,n);
    X = feat(idx,:);
    Y = actid(idx);

For details on the data set, enter Description at the command line.
Responses can be one of five classes: Sitting, Standing, Walking, Running, or Dancing. Dichotomize the response by identifying whether the subject is moving (actid > 2).

    Y = Y > 2;
Create an incremental linear SVM model for binary classification. Configure the model for loss by specifying the class names, prior class distribution (uniform), and arbitrary coefficient and bias values. Specify a metrics window size of 1000 observations.

    p = size(X,2);
    Beta = randn(p,1);
    Bias = randn(1);
    Mdl = incrementalClassificationLinear('Beta',Beta,'Bias',Bias,...
        'ClassNames',unique(Y),'Prior','uniform','MetricsWindowSize',1000);
Mdl is an incrementalClassificationLinear model. All its properties are read-only. Instead of specifying arbitrary values, you can take either of these actions to configure the model:
- Train an SVM model using fitcsvm or fitclinear on a subset of the data (if available), and then convert the model to an incremental learner by using incrementalLearner.
- Incrementally fit Mdl to data by using fit.
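The first alternative (train, then convert) can be sketched as follows. The data here is synthetic and hypothetical, chosen only to keep the snippet self-contained; in the example above you would use an initial subset of X and Y instead.

```matlab
rng(1)                                  % For reproducibility
Xsyn = randn(500,4);                    % Hypothetical predictor data
Ysyn = Xsyn(:,1) + 0.5*Xsyn(:,2) > 0;   % Hypothetical binary labels

% Train a traditional linear SVM on an initial subset of the data.
TTMdl = fitclinear(Xsyn(1:200,:),Ysyn(1:200));

% Convert the trained model to an incremental learner.
IncMdl = incrementalLearner(TTMdl);

% The converted model already has coefficients, so loss can run on a new chunk.
L = loss(IncMdl,Xsyn(201:250,:),Ysyn(201:250));
```

Because the converted model inherits its coefficients, class names, and prior from the trained model, no arbitrary Beta or Bias values are needed.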
Simulate a data stream, and perform the following actions on each incoming chunk of 50 observations:
- Call updateMetrics to measure the cumulative performance and the performance within a window of observations. Overwrite the previous incremental model with a new one to track performance metrics.
- Call loss to measure the model performance on the incoming chunk.
- Call fit to fit the model to the incoming chunk. Overwrite the previous incremental model with a new one fitted to the incoming observations.
- Store all performance metrics to see how they evolve during incremental learning.
    % Preallocation
    numObsPerChunk = 50;
    nchunk = floor(n/numObsPerChunk);
    ce = array2table(zeros(nchunk,3),'VariableNames',["Cumulative" "Window" "Loss"]);

    % Incremental learning
    for j = 1:nchunk
        ibegin = min(n,numObsPerChunk*(j-1) + 1);
        iend = min(n,numObsPerChunk*j);
        idx = ibegin:iend;
        Mdl = updateMetrics(Mdl,X(idx,:),Y(idx));
        ce{j,["Cumulative" "Window"]} = Mdl.Metrics{"ClassificationError",:};
        ce{j,"Loss"} = loss(Mdl,X(idx,:),Y(idx));
        Mdl = fit(Mdl,X(idx,:),Y(idx));
    end
Mdl is an incrementalClassificationLinear model object trained on all the data in the stream. During incremental learning and after the model is warmed up, updateMetrics checks the performance of the model on the incoming observations, and then the fit function fits the model to those observations. loss is agnostic of the metrics warm-up period, so it measures the classification error for all iterations.
To see how the performance metrics evolve during training, plot them.
    figure
    plot(ce.Variables)
    xlim([0 nchunk])
    ylim([0 0.05])
    ylabel('Classification Error')
    xline(Mdl.MetricsWarmupPeriod/numObsPerChunk,'r-.')
    legend(ce.Properties.VariableNames)
    xlabel('Iteration')

The yellow line represents the classification error on each incoming chunk of data. After the metrics warm-up period, Mdl tracks the cumulative and window metrics. The cumulative and batch losses converge as the fit function fits the incremental model to the incoming data.
Compute Custom Loss on Incoming Chunks of Data
Fit an incremental learning model for regression to streaming data, and compute the mean absolute deviation (MAD) on the incoming data batches.
Load the robot arm data set. Obtain the sample size n and the number of predictor variables p.

    load robotarm
    n = numel(ytrain);
    p = size(Xtrain,2);

For details on the data set, enter Description at the command line.
Create an incremental linear model for regression. Configure the model as follows:
- Specify a metrics warm-up period of 1000 observations.
- Specify a metrics window size of 500 observations.
- Track the mean absolute deviation (MAD) to measure the performance of the model. Create an anonymous function that measures the absolute error of each new observation. Create a structure array containing the name MeanAbsoluteError and its corresponding function.
- Configure the model to predict responses by specifying that all regression coefficients and the bias are 0.
    maefcn = @(z,zfit,w)(abs(z - zfit));
    maemetric = struct("MeanAbsoluteError",maefcn);
    Mdl = incrementalRegressionLinear('MetricsWarmupPeriod',1000,'MetricsWindowSize',500,...
        'Metrics',maemetric,'Beta',zeros(p,1),'Bias',0,'EstimationPeriod',0)

    Mdl =
      incrementalRegressionLinear

                   IsWarm: 0
                  Metrics: [2x2 table]
        ResponseTransform: 'none'
                     Beta: [32x1 double]
                     Bias: 0
                  Learner: 'svm'

Mdl is an incrementalRegressionLinear model object configured for incremental learning.
Perform incremental learning. At each iteration:
- Simulate a data stream by processing a chunk of 50 observations.
- Call updateMetrics to compute cumulative and window metrics on the incoming chunk of data. Overwrite the previous incremental model with the updated one so that the new metrics replace the previous metrics.
- Call loss to compute the MAD on the incoming chunk of data. Whereas the cumulative and window metrics require that custom losses return the loss for each observation, loss requires the loss on the entire chunk. Compute the mean of the absolute deviation.
- Call fit to fit the incremental model to the incoming chunk of data.
- Store the cumulative, window, and chunk metrics to see how they evolve during incremental learning.
    % Preallocation
    numObsPerChunk = 50;
    nchunk = floor(n/numObsPerChunk);
    mae = array2table(zeros(nchunk,3),'VariableNames',["Cumulative" "Window" "Chunk"]);

    % Incremental fitting
    for j = 1:nchunk
        ibegin = min(n,numObsPerChunk*(j-1) + 1);
        iend = min(n,numObsPerChunk*j);
        idx = ibegin:iend;
        Mdl = updateMetrics(Mdl,Xtrain(idx,:),ytrain(idx));
        mae{j,1:2} = Mdl.Metrics{"MeanAbsoluteError",:};
        mae{j,3} = loss(Mdl,Xtrain(idx,:),ytrain(idx),'LossFun',@(x,y,w)mean(maefcn(x,y,w)));
        Mdl = fit(Mdl,Xtrain(idx,:),ytrain(idx));
    end
Mdl is an incrementalRegressionLinear model object trained on all the data in the stream. During incremental learning and after the model is warmed up, updateMetrics checks the performance of the model on the incoming observations, and the fit function fits the model to those observations.
Plot the performance metrics to see how they evolved during incremental learning.
    figure
    h = plot(mae.Variables);
    xlim([0 nchunk])
    ylabel('Mean Absolute Deviation')
    xline(Mdl.MetricsWarmupPeriod/numObsPerChunk,'r-.')
    xlabel('Iteration')
    legend(h,mae.Properties.VariableNames)
The plot suggests the following:
- updateMetrics computes the performance metrics after the metrics warm-up period only.
- updateMetrics computes the cumulative metrics during each iteration.
- updateMetrics computes the window metrics after processing 500 observations.
- Because Mdl was configured to predict observations from the beginning of incremental learning, loss can compute the MAD on each incoming chunk of data.
Input Arguments
Mdl
—Incremental learning model
incrementalClassificationLinear model object | incrementalRegressionLinear model object
Incremental learning model, specified as an incrementalClassificationLinear or incrementalRegressionLinear model object. You can create Mdl directly or by converting a supported, traditionally trained machine learning model using the incrementalLearner function. For more details, see the corresponding reference page.
You must configure Mdl to compute its loss on a batch of observations.
- If Mdl is a converted, traditionally trained model, you can compute its loss without any modifications.
- Otherwise, Mdl must satisfy the following criteria, which you can specify directly or by fitting Mdl to data using fit or updateMetricsAndFit.
  - If Mdl is an incrementalRegressionLinear model, its model coefficients Mdl.Beta and bias Mdl.Bias must be nonempty arrays.
  - If Mdl is an incrementalClassificationLinear model, its model coefficients Mdl.Beta and bias Mdl.Bias must be nonempty arrays, the class names Mdl.ClassNames must contain two classes, and the prior class distribution Mdl.Prior must contain known values.
  - Regardless of object type, if you configure the model so that functions standardize predictor data, the predictor means Mdl.Mu and standard deviations Mdl.Sigma must be nonempty arrays.
X
—Batch of predictor data
floating-point matrix
Batch of predictor data with which to compute the loss, specified as a floating-point matrix of n observations and Mdl.NumPredictors predictor variables. The value of the ObservationsIn name-value argument determines the orientation of the variables and observations. The default ObservationsIn value is "rows", which indicates that observations in the predictor data are oriented along the rows of X.
The length of the observation labels Y and the number of observations in X must be equal; Y(j) is the label of observation j (row or column) in X.
Note
loss supports only floating-point input predictor data. If your input data includes categorical data, you must prepare an encoded version of the categorical data. Use dummyvar to convert each categorical variable to a numeric matrix of dummy variables. Then, concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.
Data Types: single | double
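The encoding step described in the note can be sketched as follows. The variable names and values are hypothetical and serve only to show the shape of the result.

```matlab
% Hypothetical data: one numeric predictor and one categorical predictor.
temperature = [21.5; 19.0; 23.2; 20.1];
color = categorical(["red"; "green"; "blue"; "red"]);

% Encode the categorical variable as dummy variables (one column per category).
D = dummyvar(color);

% Concatenate the numeric predictor and the dummy variables into one matrix.
Xenc = [temperature D];
```

The resulting matrix is entirely floating point, so it can be passed to loss as X, provided that the model was configured for the same number of predictor columns.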
Y
—Batch of responses (labels)
categorical array | character array | string array | logical vector | floating-point vector | cell array of character vectors
Batch of responses (labels) with which to compute the loss, specified as a categorical, character, or string array, logical or floating-point vector, or cell array of character vectors for classification problems; or a floating-point vector for regression problems.
The length of the observation labels Y and the number of observations in X must be equal; Y(j) is the label of observation j (row or column) in X.
For classification problems:
- loss supports binary classification only.
- If Y contains a label that is not a member of Mdl.ClassNames, loss issues an error.
- The data type of Y and Mdl.ClassNames must be the same.
Data Types: char | string | cell | categorical | logical | single | double
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Before R2021a, use commas to separate each name and value, and enclose Name in quotes.
Example: 'ObservationsIn','columns','Weights',W specifies that the columns of the predictor matrix correspond to observations, and the vector W contains observation weights to apply.
LossFun
—Loss function
string vector | function handle | cell vector | structure array | ...
Loss function, specified as the comma-separated pair consisting of 'LossFun' and a built-in loss function name or function handle.
Classification problems: The following table lists the available loss functions when Mdl is an incrementalClassificationLinear model. Specify one using its corresponding character vector or string scalar.

Name | Description |
---|---|
"binodeviance" | Binomial deviance |
"classiferror" (default) | Misclassification rate in decimal |
"exponential" | Exponential loss |
"hinge" | Hinge loss |
"logit" | Logistic loss |
"quadratic" | Quadratic loss |

For more details, see Classification Loss.
Logistic regression learners return posterior probabilities as classification scores, but SVM learners do not (see predict).
To specify a custom loss function, use function handle notation. The function must have this form:

    lossval = lossfcn(C,S,W)

- The output argument lossval is an n-by-1 floating-point vector, where lossval(j) is the classification loss of observation j.
- You specify the function name (lossfcn).
- C is an n-by-2 logical matrix with rows indicating the class to which the corresponding observation belongs. The column order corresponds to the class order in the ClassNames property. Create C by setting C(p,q) = 1, if observation p is in class q, for each observation in the specified data. Set the other element in row p to 0.
- S is an n-by-2 numeric matrix of predicted classification scores. S is similar to the score output of predict, where rows correspond to observations in the data and the column order corresponds to the class order in the ClassNames property. S(p,q) is the classification score of observation p being classified in class q.
- W is an n-by-1 numeric vector of observation weights.
Regression problems: The following table lists the available loss functions when Mdl is an incrementalRegressionLinear model. Specify one using its corresponding character vector or string scalar.

Name | Description | Learner Supporting Metric |
---|---|---|
"epsiloninsensitive" | Epsilon insensitive loss | 'svm' |
"mse" (default) | Weighted mean squared error | 'svm' and 'leastsquares' |

For more details, see Regression Loss.
To specify a custom loss function, use function handle notation. The function must have this form:

    lossval = lossfcn(Y,YFit,W)

- The output argument lossval is a floating-point scalar.
- You specify the function name (lossfcn).
- Y is a length n numeric vector of observed responses.
- YFit is a length n numeric vector of corresponding predicted responses.
- W is an n-by-1 numeric vector of observation weights.
Example: 'LossFun',"mse"
Example: 'LossFun',@lossfcn
Data Types: char | string | function_handle
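The two custom loss signatures described for LossFun can be sketched in base MATLAB as follows. All data values, and the names classLossFcn and maeLossFcn, are hypothetical. The snippet calls the functions directly to show the expected input and output shapes rather than passing them to loss.

```matlab
% Classification: lossfcn(C,S,W) returns one loss value per observation.
classNames = [false true];                      % Hypothetical class order
Ylab = [true; false; true; true];               % Hypothetical observed labels
S = [-0.8 0.8; 0.3 -0.3; 0.5 -0.5; -1.2 1.2];   % Hypothetical scores, one column per class
Wc = ones(4,1);                                 % Observation weights (unused by this loss)

% C(p,q) is 1 when observation p belongs to class q, and 0 otherwise.
C = (Ylab == classNames);

% Misclassification indicator: 1 when the true class does not have the maximal score.
classLossFcn = @(C,S,W) double(sum(C & (S == max(S,[],2)),2) == 0);
lossPerObs = classLossFcn(C,S,Wc);

% Regression: lossfcn(Y,YFit,W) returns a scalar.
Yobs = [1; 2; 3];
YFit = [1.5; 2; 2];
Wr = [1; 1; 2];

% Weighted mean absolute error over the batch.
maeLossFcn = @(Y,YFit,W) sum(W.*abs(Y - YFit))/sum(W);
maeVal = maeLossFcn(Yobs,YFit,Wr);
```

With a configured regression model, a handle such as maeLossFcn would be passed as loss(Mdl,X,Y,'LossFun',maeLossFcn).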
ObservationsIn
—Predictor data observation dimension
'rows' (default) | 'columns'
Predictor data observation dimension, specified as the comma-separated pair consisting of 'ObservationsIn' and 'columns' or 'rows'.
Data Types: char | string
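As a rough illustration of ObservationsIn, the following sketch computes the loss twice on the same synthetic batch, once with observations in rows and once with the matrix transposed. The data and coefficient values are hypothetical, and the sketch assumes that specifying Beta and Bias alone is enough to make the model usable with loss, as in the examples above.

```matlab
rng(0)                                   % For reproducibility
Xb = randn(50,3);                        % Hypothetical batch: 50 observations, 3 predictors
Yb = Xb*[1; -2; 0.5] + 0.1*randn(50,1);  % Hypothetical responses

% Configure a model with arbitrary coefficients so that loss can run.
MdlReg = incrementalRegressionLinear('Beta',[1; -2; 0.5],'Bias',0);

Lrows = loss(MdlReg,Xb,Yb);                              % Observations in rows (default)
Lcols = loss(MdlReg,Xb',Yb,'ObservationsIn','columns');  % Observations in columns
```

Both calls describe the same data, so the two loss values are expected to agree.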
Weights
—Batch of observation weights
floating-point vector of positive values
Batch of observation weights, specified as the comma-separated pair consisting of 'Weights' and a floating-point vector of positive values. loss weighs the observations in the input data with the corresponding values in Weights. The size of Weights must equal n, which is the number of observations in the input data.
By default, Weights is ones(n,1).
For more details, see Observation Weights.
Data Types: double | single
Output Arguments
More About
Classification Loss
Classification loss functions measure the predictive inaccuracy of classification models. When you compare the same type of loss among many models, a lower loss indicates a better predictive model.
Consider the following scenario.
- $L$ is the weighted average classification loss.
- $n$ is the sample size.
- $y_j$ is the observed class label. The software codes it as -1 or 1, indicating the negative or positive class (or the first or second class in the ClassNames property), respectively.
- $f(X_j)$ is the positive-class classification score for observation (row) $j$ of the predictor data $X$.
- $m_j = y_j f(X_j)$ is the classification score for classifying observation $j$ into the class corresponding to $y_j$. Positive values of $m_j$ indicate correct classification and do not contribute much to the average loss. Negative values of $m_j$ indicate incorrect classification and contribute significantly to the average loss.
- The weight for observation $j$ is $w_j$.
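Given this scenario, the following table describes the supported loss functions that you can specify by using the LossFun name-value argument.

Loss Function | Value of LossFun | Equation |
---|---|---|
Binomial deviance | "binodeviance" | $L = \sum_{j=1}^{n} w_j \log\{1 + \exp[-2 m_j]\}$ |
Exponential loss | "exponential" | $L = \sum_{j=1}^{n} w_j \exp(-m_j)$ |
Misclassification rate in decimal | "classiferror" | $L = \sum_{j=1}^{n} w_j I\{\hat{y}_j \neq y_j\}$, where $\hat{y}_j$ is the class label corresponding to the class with the maximal score, and $I\{\cdot\}$ is the indicator function. |
Hinge loss | "hinge" | $L = \sum_{j=1}^{n} w_j \max\{0, 1 - m_j\}$ |
Logit loss | "logit" | $L = \sum_{j=1}^{n} w_j \log(1 + \exp(-m_j))$ |
Quadratic loss | "quadratic" | $L = \sum_{j=1}^{n} w_j (1 - m_j)^2$ |
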
The loss function does not omit an observation with a NaN score when computing the weighted average loss. Therefore, loss can return NaN when the predictor data X contains missing values, and the name-value argument LossFun is not specified as "classiferror". In most cases, if the data set does not contain missing predictors, the loss function does not return NaN.
This figure compares the loss functions over the score m for one observation. Some functions are normalized to pass through the point (0,1).
Regression Loss
Regression loss functions measure the predictive inaccuracy of regression models. When you compare the same type of loss among many models, a lower loss indicates a better predictive model.
Consider the following scenario.
- $L$ is the weighted average regression loss.
- $n$ is the sample size.
- $y_j$ is the observed response of observation $j$.
- $f(X_j)$ is the predicted value of observation $j$ of the predictor data $X$.
- The weight for observation $j$ is $w_j$.
Given this scenario, the following table describes the supported loss functions that you can specify by using the LossFun name-value argument.
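
Loss Function | Value of LossFun | Equation |
---|---|---|
Epsilon insensitive loss | "epsiloninsensitive" | $L = \sum_{j=1}^{n} w_j \max[0, \lvert y_j - f(X_j) \rvert - \varepsilon]$, where $\varepsilon$ is half the width of the epsilon-insensitive band. |
Mean squared error | "mse" | $L = \sum_{j=1}^{n} w_j [y_j - f(X_j)]^2$ |
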
The loss function does not omit an observation with a NaN prediction when computing the weighted average loss. Therefore, loss can return NaN when the predictor data X contains missing values. In most cases, if the data set does not contain missing predictors, the loss function does not return NaN.
Algorithms
Observation Weights
For classification problems, if the prior class probability distribution is known (in other words, the prior distribution is not empirical), loss normalizes observation weights to sum to the prior class probabilities in the respective classes. This action implies that observation weights are the respective prior class probabilities by default.
For regression problems or if the prior class probability distribution is empirical, the software normalizes the specified observation weights to sum to 1 each time you call loss.
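For the regression case, the normalization can be illustrated by hand. This sketch uses hypothetical values and computes a weighted mean squared error in base MATLAB. It illustrates the arithmetic only and does not reproduce the internals of loss.

```matlab
y    = [2.0; 3.5; 1.0; 4.0];     % Hypothetical observed responses
yfit = [2.5; 3.0; 1.0; 5.0];     % Hypothetical predicted responses
w    = [1; 1; 2; 4];             % Hypothetical raw observation weights

wNorm = w/sum(w);                       % Normalize the weights to sum to 1
Lw = sum(wNorm.*(y - yfit).^2);         % Weighted mean squared error
```

Because the weights are rescaled on every call, multiplying all raw weights by a constant leaves the result unchanged.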
Extended Capabilities
C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.
Usage notes and limitations:
- Use saveLearnerForCoder, loadLearnerForCoder, and codegen (MATLAB Coder) to generate code for the loss function. Save a trained model by using saveLearnerForCoder. Define an entry-point function that loads the saved model by using loadLearnerForCoder and calls the loss function. Then use codegen to generate code for the entry-point function.
- To generate single-precision C/C++ code for loss, specify the name-value argument "DataType","single" when you call the loadLearnerForCoder function.
- This table contains notes about the arguments of loss. Arguments not included in this table are fully supported.

Argument | Notes and Limitations |
---|---|
Mdl | For usage notes and limitations of the model object, see incrementalClassificationLinear or incrementalRegressionLinear. |
X | Batch-to-batch, the number of observations can be a variable size, but must equal the number of observations in Y. The number of predictor variables must equal Mdl.NumPredictors. X must be single or double. |
Y | Batch-to-batch, the number of observations can be a variable size, but must equal the number of observations in X. For classification problems, all labels in Y must be represented in Mdl.ClassNames. Y and Mdl.ClassNames must have the same data type. |
'LossFun' | The specified function cannot be an anonymous function. |

- If you configure Mdl to shuffle data (Mdl.Shuffle is true, or Mdl.Solver is 'sgd' or 'asgd'), the loss function randomly shuffles each incoming batch of observations before it fits the model to the batch. The order of the shuffled observations might not match the order generated by MATLAB®. Therefore, if you fit Mdl before computing the loss, the loss computed in MATLAB and those computed by the generated code might not be equal.
- Use a homogeneous data type for all floating-point input arguments and object properties, specifically, either single or double.
For more information, see Introduction to Code Generation.
Version History
Introduced in R2020b
R2022a: loss can return NaN for predictor data with missing values
The loss function no longer omits an observation with a NaN prediction (score for classification and response for regression) when computing the weighted average loss. Therefore, loss can now return NaN when the predictor data X contains missing values, and the name-value argument LossFun is not specified as "classiferror" (for classification). In most cases, if the data set does not contain missing predictors, the loss function does not return NaN.
If loss in your code returns NaN, you can update your code to avoid this result. Remove or replace the missing values by using rmmissing or fillmissing, respectively.
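A minimal sketch of both options follows, using a small hypothetical batch. When rows are removed from the predictor data, the matching responses must be removed as well so that X and Y keep the same number of observations.

```matlab
Xraw = [1 2; NaN 3; 4 5];        % Hypothetical predictor batch with a missing value
Yraw = [10; 20; 30];             % Hypothetical responses

% Option 1: remove observations (rows) that contain missing predictors.
[Xclean,removed] = rmmissing(Xraw);
Yclean = Yraw(~removed);         % Keep responses aligned with the remaining rows

% Option 2: replace missing predictors with a constant (here 0, an arbitrary choice).
Xfilled = fillmissing(Xraw,'constant',0);
```

Either cleaned batch can then be passed to loss in place of the original data.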