crossentropy

Cross-entropy loss for classification tasks

Since R2019b

Description

The cross-entropy operation computes the cross-entropy loss between network predictions and target values for single-label and multi-label classification tasks.

The crossentropy function computes the cross-entropy loss between predictions and targets represented as dlarray data. Using dlarray objects makes working with high-dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.

Note

To calculate the cross-entropy loss within a layerGraph object or Layer array for use with the trainNetwork function, use classificationLayer.

loss = crossentropy(Y,targets) returns the categorical cross-entropy loss between the formatted dlarray object Y containing the predictions and the target values targets for single-label classification tasks. The output loss is an unformatted dlarray scalar.

For unformatted input data, use the'DataFormat'option.
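
As a minimal sketch of the difference between formatted and unformatted input, the following computes the same loss twice: once with the format passed through 'DataFormat', and once with a formatted dlarray. The variable names are illustrative, and the random scores stand in for real network output.

```matlab
numClasses = 10;
numObservations = 12;

% Unformatted predictions: rows are classes, columns are observations.
Y = dlarray(rand(numClasses,numObservations));
Y = softmax(Y,'DataFormat','CB');

% One-hot targets with the same layout as Y.
labels = randi(numClasses,[1 numObservations]);
targets = onehotencode(labels,1,'ClassNames',1:numClasses);

% Y carries no dimension labels, so pass the format explicitly.
lossUnformatted = crossentropy(Y,targets,'DataFormat','CB');

% Attach the labels to the data instead; no 'DataFormat' is needed.
dlY = dlarray(extractdata(Y),'CB');
lossFormatted = crossentropy(dlY,targets);
```

Both calls should return the same scalar, because only the way the dimension labels are supplied differs.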

loss = crossentropy(Y,targets,weights) applies weights to the calculated loss values. Use this syntax to weight the contributions of classes, observations, regions, or individual elements of the input to the calculated loss values.

loss = crossentropy(___,'DataFormat',FMT) also specifies the dimension format FMT when Y is not a formatted dlarray.

loss = crossentropy(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in previous syntaxes. For example, 'TargetCategories','independent' computes the cross-entropy loss for a multi-label classification task.

Examples

Create an array of prediction scores for 12 observations over 10 classes.

numClasses = 10;
numObservations = 12;
Y = rand(numClasses,numObservations);
dlY = dlarray(Y,'CB');
dlY = softmax(dlY);    % normalize the scores over the class dimension

View the size and format of the prediction scores.

size(dlY)
ans = 1×2
    10    12
dims(dlY)
ans = 'CB'

Create an array of targets encoded as one-hot vectors.

labels = randi(numClasses,[1 numObservations]);
targets = onehotencode(labels,1,'ClassNames',1:numClasses);

View the size of the targets.

size(targets)
ans = 1×2
    10    12

Compute the cross-entropy loss between the predictions and the targets.

loss = crossentropy(dlY,targets)
loss = 
  1x1 dlarray
    2.3343

Create an array of prediction scores for 12 observations over 10 classes.

numClasses = 10;
numObservations = 12;
Y = rand(numClasses,numObservations);
dlY = dlarray(Y,'CB');

View the size and format of the prediction scores.

size(dlY)
ans = 1×2
    10    12
dims(dlY)
ans = 'CB'

Create a random array of targets encoded as a numeric array of zeros and ones. Each observation can have multiple classes.

targets = rand(numClasses,numObservations) > 0.75;
targets = single(targets);

View the size of the targets.

size(targets)
ans = 1×2
    10    12

Compute the cross-entropy loss between the predictions and the targets. To specify cross-entropy loss for multi-label classification, set the 'TargetCategories' option to 'independent'.

loss = crossentropy(dlY,targets,'TargetCategories','independent')
loss = 
  1x1 single dlarray
    9.8853

Create an array of prediction scores for 12 observations over 10 classes.

numClasses = 10;
numObservations = 12;
Y = rand(numClasses,numObservations);
dlY = dlarray(Y,'CB');
dlY = softmax(dlY);    % normalize the scores over the class dimension

View the size and format of the prediction scores.

size(dlY)
ans = 1×2
    10    12
dims(dlY)
ans = 'CB'

Create an array of targets encoded as one-hot vectors.

labels = randi(numClasses,[1 numObservations]);
targets = onehotencode(labels,1,'ClassNames',1:numClasses);

View the size of the targets.

size(targets)
ans = 1×2
    10    12

Compute the weighted cross-entropy loss between the predictions and the targets using a vector of class weights. Specify a weights format of 'UC' (unspecified, channel) using the 'WeightsFormat' option.

weights = rand(1,numClasses);
loss = crossentropy(dlY,targets,weights,'WeightsFormat','UC')
loss = 
  1x1 dlarray
    1.1261

Input Arguments

Predictions, specified as a formatted dlarray, an unformatted dlarray, or a numeric array. When Y is not a formatted dlarray, you must specify the dimension format using the DataFormat option.

If Y is a numeric array, targets must be a dlarray.

Target classification labels, specified as a formatted or unformatted dlarray or a numeric array.

Specify the targets as an array containing one-hot encoded labels with the same size and format as Y. For example, if Y is a numObservations-by-numClasses array, then targets(n,i) = 1 if observation n belongs to class i, and targets(n,i) = 0 otherwise.

If targets is a formatted dlarray, then its format must be the same as the format of Y, or the same as DataFormat if Y is unformatted.

If targets is an unformatted dlarray or a numeric array, then the function applies the format of Y or the value of DataFormat to targets.

Tip

Formatted dlarray objects automatically permute the dimensions of the underlying data to have order "S" (spatial), "C" (channel), "B" (batch), "T" (time), then "U" (unspecified). To ensure that the dimensions of Y and targets are consistent, when Y is a formatted dlarray, also specify targets as a formatted dlarray.
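
The permutation described in this tip can be sketched as follows. This is an illustrative example with made-up sizes: the data is stored as observations-by-classes and labeled 'BC', so the formatted dlarray reorders it to 'CB'. Labeling the targets the same way keeps both arrays aligned.

```matlab
numClasses = 10;
numObservations = 12;

% Data stored as observations-by-classes, labeled batch then channel.
X = rand(numObservations,numClasses);
dlY = dlarray(X,'BC');

% The formatted dlarray reorders its dimensions to channel then batch.
fmt = dims(dlY);    % 'CB'
sz = size(dlY);     % [10 12]

% One-hot targets in the same observations-by-classes layout.
labels = randi(numClasses,[numObservations 1]);
T = onehotencode(labels,2,'ClassNames',1:numClasses);

% Give the targets the same labels so they are permuted the same way.
dlT = dlarray(T,'BC');
loss = crossentropy(softmax(dlY),dlT);
```

Passing the numeric 12-by-10 array T directly would not line up with the permuted 10-by-12 predictions, which is the mismatch the tip warns about.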

Weights, specified as a dlarray or a numeric array.

To specify class weights, specify a vector with a 'C' (channel) dimension with size matching the 'C' (channel) dimension of Y. Specify the 'C' (channel) dimension of the class weights by using a formatted dlarray object or by using the 'WeightsFormat' option.

To specify observation weights, specify a vector with a 'B' (batch) dimension with size matching the 'B' (batch) dimension of Y. Specify the 'B' (batch) dimension of the observation weights by using a formatted dlarray object or by using the 'WeightsFormat' option.

To specify weights for each element of the input independently, specify the weights as an array of the same size as Y. In this case, if weights is not a formatted dlarray object, then the function uses the same format as Y. Alternatively, specify the weights format using the 'WeightsFormat' option.
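
A minimal sketch of observation weights, as a counterpart to the class-weights example earlier on this page. The weight values are random placeholders, and the 'UB' format is an assumption made by analogy with the 'UC' format used for a row vector of class weights.

```matlab
numClasses = 10;
numObservations = 12;

dlY = softmax(dlarray(rand(numClasses,numObservations),'CB'));
labels = randi(numClasses,[1 numObservations]);
targets = onehotencode(labels,1,'ClassNames',1:numClasses);

% One weight per observation, stored as a row vector. 'UB' labels the
% singleton first dimension as unspecified and the second as batch
% (assumed to work like the documented 'UC' case for class weights).
obsWeights = rand(1,numObservations);
loss = crossentropy(dlY,targets,obsWeights,'WeightsFormat','UB');
```

Each observation's contribution to the loss is scaled by its own weight before the values are summed and normalized.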

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'TargetCategories','independent','DataFormat','CB' evaluates the cross-entropy loss for multi-label classification tasks and specifies the dimension order of the input data as 'CB'.

Type of classification task, specified as the comma-separated pair consisting of 'TargetCategories' and one of the following:

  • 'exclusive' — Single-label classification. Each observation in the predictions Y is exclusively assigned to one category. The function computes the loss between the target value for the single category specified by targets and the corresponding prediction in Y, averaged over the number of observations.

  • 'independent' — Multi-label classification. Each observation in the predictions Y can be assigned to one or more independent categories. The function computes the sum of the loss between each category specified by targets and the predictions in Y for those categories, averaged over the number of observations. Cross-entropy loss for this type of classification task is also known as binary cross-entropy loss.

Mask indicating which elements to include for loss computation, specified as a dlarray object, a logical array, or a numeric array with the same size as Y.

The function includes and excludes elements of the input data for loss computation when the corresponding value in the mask is 1 and 0, respectively.

If Mask is a formatted dlarray object, then its format must match that of Y. If Mask is not a formatted dlarray object, then the function uses the same format as Y.

If you specify the DataFormat option, then the function also uses the specified format for the mask.

The size of each dimension of Mask must match the size of the corresponding dimension in Y. The default value is a logical array of ones.

Tip

Formatted dlarray objects automatically permute the dimensions of the underlying data to have this order: "S" (spatial), "C" (channel), "B" (batch), "T" (time), and "U" (unspecified). For example, dlarray objects automatically permute the dimensions of data with format "TSCSBS" to have format "SSSCBT".

To ensure that the dimensions of Y and the mask are consistent, when Y is a formatted dlarray, also specify the mask as a formatted dlarray.
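
As a hedged illustration of the Mask option, the following sketch builds a mask that excludes padded time steps of sequence data. The sizes and the sequenceLengths vector are hypothetical values chosen for the example.

```matlab
numClasses = 5;
numObservations = 4;
numTimeSteps = 6;

% Predictions and one-hot targets with format 'CBT'.
dlY = softmax(dlarray(rand(numClasses,numObservations,numTimeSteps),'CBT'));
labels = randi(numClasses,[1 numObservations numTimeSteps]);
targets = onehotencode(labels,1,'ClassNames',1:numClasses);

% Hypothetical unpadded sequence lengths, one per observation.
sequenceLengths = [6 4 5 3];

% Mask with the same size as dlY: true for real time steps, false for padding.
mask = true(numClasses,numObservations,numTimeSteps);
for n = 1:numObservations
    mask(:,n,sequenceLengths(n)+1:end) = false;
end

lossMasked = crossentropy(dlY,targets,'Mask',mask);
lossUnmasked = crossentropy(dlY,targets);
```

Because the masked call drops the loss values at the padded time steps, lossMasked should be smaller than lossUnmasked for this data.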

Mode for reducing the array of loss values, specified as one of the following:

  • "sum" — Sum all of the elements in the array of loss values. In this case, the output loss is scalar.

  • "none" — Do not reduce the array of loss values. In this case, the output loss is an unformatted dlarray object with the same size as Y.
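
A short sketch contrasting the two reduction modes, using the 'Reduction' option name referred to elsewhere on this page. The data is random and only the output sizes are of interest.

```matlab
numClasses = 10;
numObservations = 12;

dlY = softmax(dlarray(rand(numClasses,numObservations),'CB'));
labels = randi(numClasses,[1 numObservations]);
targets = onehotencode(labels,1,'ClassNames',1:numClasses);

% Unreduced loss: one value per element of dlY.
lossPerElement = crossentropy(dlY,targets,'Reduction','none');
sizePerElement = size(lossPerElement);

% Default reduction: a single scalar value.
lossScalar = crossentropy(dlY,targets);
sizeScalar = size(lossScalar);
```

The unreduced form is useful when you want to inspect or post-process individual loss values before aggregating them yourself.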

Divisor for normalizing the reduced loss when Reduction is "sum", specified as one of the following:

  • "batch-size" — Normalize the loss by dividing it by the number of observations in Y.

  • "all-elements" — Normalize the loss by dividing it by the number of elements of Y.

  • "mask-included" — Normalize the loss by dividing the loss values by the number of included elements specified by the mask for each observation independently. To use this option, you must specify a mask using the Mask option.

  • "none"— Do not normalize the loss.
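
The relationship between the divisors can be sketched as follows. The option name 'NormalizationFactor' is the name this argument has in the shipping function; it is not spelled out in the text of this section, so treat it as an assumption if your release differs.

```matlab
numClasses = 10;
numObservations = 12;

dlY = softmax(dlarray(rand(numClasses,numObservations),'CB'));
labels = randi(numClasses,[1 numObservations]);
targets = onehotencode(labels,1,'ClassNames',1:numClasses);

lossNone  = crossentropy(dlY,targets,'NormalizationFactor','none');
lossBatch = crossentropy(dlY,targets,'NormalizationFactor','batch-size');
lossAll   = crossentropy(dlY,targets,'NormalizationFactor','all-elements');

% Under the definitions above, the unnormalized loss divided by each
% normalized loss should recover the corresponding divisor.
ratioBatch = extractdata(lossNone)/extractdata(lossBatch);  % number of observations
ratioAll   = extractdata(lossNone)/extractdata(lossAll);    % number of elements of dlY
```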

Dimension order of unformatted input data, specified as a character vector or string scalar FMT that provides a label for each dimension of the data.

When you specify the format of a dlarray object, each character provides a label for each dimension of the data and must be one of these options:

  • "S"— Spatial

  • "C"— Channel

  • "B"— Batch (for example, samples and observations)

  • "T"— Time (for example, time steps of sequences)

  • "U"— Unspecified

You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" at most once.

You must specify DataFormat when the input data is not a formatted dlarray.

Data Types: char | string

Dimension order of the weights, specified as a character vector or string scalar that provides a label for each dimension of the weights.

When you specify the format of a dlarray object, each character provides a label for each dimension of the data and must be one of these options:

  • "S"— Spatial

  • "C"— Channel

  • "B"— Batch (for example, samples and observations)

  • "T"— Time (for example, time steps of sequences)

  • "U"— Unspecified

You can specify multiple dimensions labeled "S" or "U". You can use the labels "C", "B", and "T" at most once.

You must specify WeightsFormat when weights is a numeric vector and Y has two or more nonsingleton dimensions.

If weights is not a vector, or both weights and Y are vectors, then the default value of WeightsFormat is the same as the format of Y.

Data Types: char | string

Output Arguments

Cross-entropy loss, returned as an unformatted dlarray with the same underlying data type as the input Y.

The size of loss depends on the 'Reduction' option.

Algorithms

Cross-Entropy Loss

For each element $Y_j$ of the input, the crossentropy function computes the corresponding element-wise cross-entropy loss values using the formula

$$\text{loss}_j = -\left(T_j \ln Y_j + (1 - T_j)\ln(1 - Y_j)\right),$$

where $T_j$ is the target value corresponding to $Y_j$.

To reduce the loss values to a scalar, the function then reduces the element-wise loss using the formula

$$\text{loss} = \frac{1}{N}\sum_j m_j w_j\,\text{loss}_j,$$

where $N$ is the normalization factor, $m_j$ is the mask value for element $j$, and $w_j$ is the weight value for element $j$.

If you do not opt to reduce the loss, then the function applies the mask and the weights to the loss values directly:

$$\text{loss}_j^{*} = m_j w_j\,\text{loss}_j$$

This table shows the loss formulations for different tasks.

| Task | Description | Loss |
| --- | --- | --- |
| Single-label classification | Cross-entropy loss for mutually exclusive classes. This is useful when observations must have a single label only. | $\text{loss} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{K} T_{ni}\ln Y_{ni}$, where $N$ and $K$ are the numbers of observations and classes, respectively. |
| Multi-label classification | Cross-entropy loss for independent classes. This is useful when observations can have multiple labels. | $\text{loss} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{K}\left(T_{ni}\ln(Y_{ni}) + (1 - T_{ni})\ln(1 - Y_{ni})\right)$, where $N$ and $K$ are the numbers of observations and classes, respectively. |
| Single-label classification with weighted classes | Cross-entropy loss with class weights. This is useful for datasets with imbalanced classes. | $\text{loss} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{i=1}^{K} w_i T_{ni}\ln Y_{ni}$, where $N$ and $K$ are the numbers of observations and classes, respectively, and $w_i$ denotes the weight for class $i$. |
| Sequence-to-sequence classification | Cross-entropy loss with masked time steps. This is useful for ignoring loss values that correspond to padded data. | $\text{loss} = -\frac{1}{N}\sum_{n=1}^{N}\sum_{t=1}^{S} m_{nt}\sum_{i=1}^{K} T_{nti}\ln Y_{nti}$, where $N$, $S$, and $K$ are the numbers of observations, time steps, and classes, respectively, and $m_{nt}$ denotes the mask value for time step $t$ of observation $n$. |
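
The loss formulations in this table can be checked numerically. The sketch below evaluates the single-label, multi-label, and weighted-class formulas by hand on random data and computes the corresponding crossentropy results for comparison. The variable names are illustrative, and the comparison assumes the default normalization by the number of observations.

```matlab
numClasses = 10;
N = 12;    % number of observations

% Single-label: softmax scores and one-hot targets.
Y = extractdata(softmax(dlarray(rand(numClasses,N),'CB')));
T = onehotencode(randi(numClasses,[1 N]),1,'ClassNames',1:numClasses);
manualSingle = -sum(T.*log(Y),'all')/N;
builtinSingle = crossentropy(dlarray(Y,'CB'),T);

% Multi-label: independent scores in (0,1) and binary targets.
Ym = rand(numClasses,N);
Tm = double(rand(numClasses,N) > 0.75);
manualMulti = -sum(Tm.*log(Ym) + (1-Tm).*log(1-Ym),'all')/N;
builtinMulti = crossentropy(dlarray(Ym,'CB'),Tm,'TargetCategories','independent');

% Weighted classes: one weight per class, as a column to match the rows of Y.
w = rand(numClasses,1);
manualWeighted = -sum(w.*T.*log(Y),'all')/N;
builtinWeighted = crossentropy(dlarray(Y,'CB'),T,w','WeightsFormat','UC');
```

Each manual value should agree with its built-in counterpart up to floating-point rounding.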

Version History

Introduced in R2019b