resnetLayers

Create 2-D residual network

Since R2021b


Syntax

lgraph = resnetLayers(inputSize,numClasses)

lgraph = resnetLayers(___,Name=Value)

Description

`lgraph = resnetLayers(inputSize,numClasses)` creates a 2-D residual network with an image input size specified by `inputSize` and a number of classes specified by `numClasses`. A residual network consists of stacks of blocks. Each block contains deep learning layers. The network includes an image classification layer, suitable for predicting the categorical label of an input image.

To create a 3-D residual network, use `resnet3dLayers`.


`lgraph = resnetLayers(___,Name=Value)` creates a residual network using one or more name-value arguments, in addition to the input arguments in the previous syntax. For example, `InitialNumFilters=32` specifies 32 filters in the initial convolutional layer.

Examples


Residual Network with Bottleneck


Create a residual network with a bottleneck architecture.

imageSize = [224 224 3];
numClasses = 10;
lgraph = resnetLayers(imageSize,numClasses)

lgraph = 
  LayerGraph with properties:

     InputNames: {'input'}
    OutputNames: {'output'}
         Layers: [177x1 nnet.cnn.layer.Layer]
    Connections: [192x2 table]

Analyze the network.

analyzeNetwork(lgraph)

This network is equivalent to a ResNet-50 residual network.

Residual Network with Custom Stack Depth


Create a ResNet-101 network using a custom stack depth.

imageSize = [224 224 3];
numClasses = 10;
stackDepth = [3 4 23 3];
numFilters = [64 128 256 512];
lgraph = resnetLayers(imageSize,numClasses, ...
    StackDepth=stackDepth, ...
    NumFilters=numFilters)

lgraph = 
  LayerGraph with properties:

     InputNames: {'input'}
    OutputNames: {'output'}
         Layers: [347x1 nnet.cnn.layer.Layer]
    Connections: [379x2 table]

Analyze the network.

analyzeNetwork(lgraph)

Train Residual Network


Create and train a residual network to classify images.

Load the digits data as in-memory numeric arrays using the `digitTrain4DArrayData` and `digitTest4DArrayData` functions.

[XTrain,YTrain] = digitTrain4DArrayData;
[XTest,YTest] = digitTest4DArrayData;

Define the residual network. The digits data contains 28-by-28 pixel images; therefore, construct a residual network with smaller filters.

imageSize = [28 28 1];
numClasses = 10;
lgraph = resnetLayers(imageSize,numClasses, ...
    InitialStride=1, ...
    InitialFilterSize=3, ...
    InitialNumFilters=16, ...
    StackDepth=[4 3 2], ...
    NumFilters=[16 32 64]);

Set the options to the default settings for stochastic gradient descent with momentum. Set the maximum number of epochs to 5, and start the training with an initial learning rate of 0.1.

options = trainingOptions("sgdm", ...
    MaxEpochs=5, ...
    InitialLearnRate=0.1, ...
    Verbose=false, ...
    Plots="training-progress");

Train the network.

net = trainNetwork(XTrain,YTrain,lgraph,options);

Test the performance of the network by evaluating the prediction accuracy of the test data. Use the `classify` function to predict the class label of each test image.

YPred = classify(net,XTest);

Calculate the accuracy. The accuracy is the fraction of labels that the network predicts correctly.

accuracy = sum(YPred == YTest)/numel(YTest)

accuracy = 0.9956

Convert Residual Network to `dlnetwork` Object

To train a residual network using a custom training loop, first convert it to a `dlnetwork` object.

Create a residual network.

lgraph = resnetLayers([224 224 3],5);

Remove the classification layer.

lgraph = removeLayers(lgraph,"output");

Replace the input layer with a new input layer that has `Normalization` set to `"none"`. To use an input layer with zero-center or z-score normalization, you must specify an `imageInputLayer` with a nonempty value for the `Mean` property. For example, `Mean=sum(XTrain,4)`, where `XTrain` is a 4-D array containing your input data.

newInputLayer = imageInputLayer([224 224 3],Normalization="none");
lgraph = replaceLayer(lgraph,"input",newInputLayer);

Convert to a `dlnetwork`.

dlnet = dlnetwork(lgraph)

dlnet = 
  dlnetwork with properties:

         Layers: [176x1 nnet.cnn.layer.Layer]
    Connections: [191x2 table]
     Learnables: [214x3 table]
          State: [106x3 table]
     InputNames: {'imageinput'}
    OutputNames: {'softmax'}
    Initialized: 1

  View summary with summary.

Input Arguments


`inputSize`—Network input image size
2-element vector|3-element vector

Network input image size, specified as one of the following:

2-element vector in the form [height,width].
3-element vector in the form [height,width,depth], where `depth` is the number of channels. Set `depth` to `3` for RGB images and to `1` for grayscale images. For multispectral and hyperspectral images, set `depth` to the number of channels.

The `height` and `width` values must be greater than or equal to `initialStride * poolingStride * 2^D`, where `D` is the number of downsampling blocks. Set the initial stride using the `InitialStride` argument. The pooling stride is `1` when the `InitialPoolingLayer` is set to `"none"`, and `2` otherwise.
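As a rough check of the size constraint above, the following sketch computes the minimum height and width for the default settings. The variable names are illustrative and are not arguments of `resnetLayers`. The count of downsampling blocks `D` is assumed here to be one per stack after the first, which this page implies but does not state as a formula.

```matlab
% Illustrative values; these variables are not arguments of resnetLayers.
initialStride = 2;              % default InitialStride
initialPoolingLayer = "max";    % default InitialPoolingLayer
stackDepth = [3 4 6 3];         % default StackDepth

% The pooling stride is 1 without an initial pooling layer, and 2 otherwise.
if initialPoolingLayer == "none"
    poolingStride = 1;
else
    poolingStride = 2;
end

% Assumption: every stack after the first starts with one downsampling block.
D = numel(stackDepth) - 1;

% Smallest height and width that satisfy the constraint.
minSize = initialStride * poolingStride * 2^D;
disp(minSize)
```

Under these assumptions the default settings require at least 32 pixels in each spatial dimension, so a 28-by-28 input needs a smaller initial stride, no initial pooling layer, or fewer stacks.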

`numClasses`—Number of classes
integer greater than 1

Number of classes in the image classification network, specified as an integer greater than 1.

Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: `InitialFilterSize=[5,5],InitialNumFilters=32,BottleneckType="none"` specifies an initial filter size of 5-by-5 pixels, 32 initial filters, and a network architecture without bottleneck components.

Initial Layers


`InitialFilterSize`—Filter size in first convolutional layer
`7`(default) |positive integer|2-element vector of positive integers

Filter size in the first convolutional layer, specified as one of the following:

Positive integer. The filter has equal height and width. For example, specifying `5` yields a filter of height 5 and width 5.
2-element vector in the form [height,width]. For example, specifying an initial filter size of `[1 5]` yields a filter of height 1 and width 5.

Example: `InitialFilterSize=[5,5]`

`InitialNumFilters`—Number of filters in first convolutional layer
`64`(default) |positive integer

Number of filters in the first convolutional layer, specified as a positive integer. The number of initial filters determines the number of channels (feature maps) in the output of the first convolutional layer in the residual network.

Example: `InitialNumFilters=32`

`InitialStride`—Stride in first convolutional layer
`2`(default) |positive integer|2-element vector of positive integers

Stride in the first convolutional layer, specified as one of the following:

Positive integer. The stride has equal height and width. For example, specifying `3` yields a stride of height 3 and width 3.
2-element vector in the form [height,width]. For example, specifying an initial stride of `[1 2]` yields a stride of height 1 and width 2.

The stride defines the step size for traversing the input vertically and horizontally.

Example: `InitialStride=[3,3]`

`InitialPoolingLayer`—First pooling layer
`"max"`(default) |`"average"`|`"none"`

First pooling layer before the initial residual block, specified as one of the following:

`"max"`: Use a max pooling layer before the initial residual block. For more information, see `maxPooling2dLayer`.
`"average"`: Use an average pooling layer before the initial residual block. For more information, see `averagePooling2dLayer`.
`"none"`: Do not use a pooling layer before the initial residual block.

Example: `InitialPoolingLayer="average"`

Data Types: `char` | `string`

Network Architecture


`ResidualBlockType`—Residual block type
`"batchnorm-before-add"`(default) |`"batchnorm-after-add"`

Residual block type, specified as one of the following:

`"batchnorm-before-add"`: Add the batch normalization layer before the addition layer in the residual blocks [1].
`"batchnorm-after-add"`: Add the batch normalization layer after the addition layer in the residual blocks [2].

The `ResidualBlockType` argument specifies the location of the batch normalization layer in the standard and downsampling residual blocks. For more information, see More About.

Example: `ResidualBlockType="batchnorm-after-add"`

Data Types: `char` | `string`

`BottleneckType`—Block bottleneck type
`"downsample-first-conv"`(default) |`"none"`

Block bottleneck type, specified as one of the following:

`"downsample-first-conv"`: Use bottleneck residual blocks that perform downsampling in the first convolutional layer of the downsampling residual blocks, using a stride of 2. A bottleneck residual block consists of three convolutional layers: a 1-by-1 layer for downsampling the channel dimension, a 3-by-3 convolutional layer, and a 1-by-1 layer for upsampling the channel dimension.
The number of filters in the final convolutional layer is four times that in the first two convolutional layers. For more information, see `NumFilters`.
`"none"`: Do not use bottleneck residual blocks. The residual blocks consist of two 3-by-3 convolutional layers.

A bottleneck block performs a 1-by-1 convolution before the 3-by-3 convolution to reduce the number of channels by a factor of four. Networks with and without bottleneck blocks have a similar level of computational complexity, but the total number of features propagating in the residual connections is four times larger when you use bottleneck units. Therefore, using a bottleneck increases the efficiency of the network [1]. For more information on the layers in each residual block, see More About.

Example: `BottleneckType="none"`

Data Types: `char` | `string`

`StackDepth`—Number of residual blocks in each stack
`[3 4 6 3]`(default) |vector of positive integers

Number of residual blocks in each stack, specified as a vector of positive integers. For example, if the stack depth is `[3 4 6 3]`, the network has four stacks, with three blocks, four blocks, six blocks, and three blocks.

Specify the number of filters in the convolutional layers of each stack using the `NumFilters` argument. The `StackDepth` value must have the same number of elements as the `NumFilters` value.

Example: `StackDepth=[9 12 69 9]`

`NumFilters`—Number of filters in convolutional layers of each stack
`[64 128 256 512]`(default) |vector of positive integers

Number of filters in the convolutional layers of each stack, specified as a vector of positive integers.

When you set `BottleneckType` to `"downsample-first-conv"`, the first two convolutional layers in each block of each stack have the same number of filters, set by the `NumFilters` value. The final convolutional layer has four times the number of filters in the first two convolutional layers.
For example, suppose you set `NumFilters` to `[4 5]` and `BottleneckType` to `"downsample-first-conv"`. In the first stack, the first two convolutional layers in each block have 4 filters and the final convolutional layer in each block has 16 filters. In the second stack, the first two convolutional layers in each block have 5 filters and the final convolutional layer has 20 filters.
When you set `BottleneckType` to `"none"`, the convolutional layers in each stack have the same number of filters, set by the `NumFilters` value.
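The filter counts in the example above can be reproduced with a few lines of arithmetic. This is only a sketch of the rule as described here, not how `resnetLayers` computes it internally. The variable names `firstConvFilters` and `finalConvFilters` are hypothetical.

```matlab
numFilters = [4 5];                          % NumFilters value from the example
bottleneckType = "downsample-first-conv";    % default BottleneckType

% Filters in the first two convolutional layers of each block, per stack.
firstConvFilters = numFilters;

% With bottleneck blocks, the final convolutional layer of each block has
% four times as many filters. Without them, all layers use the same number.
if bottleneckType == "downsample-first-conv"
    finalConvFilters = 4 * numFilters;
else
    finalConvFilters = numFilters;
end

disp(finalConvFilters)
```

For `NumFilters=[4 5]` this yields 16 and 20 filters in the final convolutional layers of the first and second stacks, matching the description.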

The `NumFilters` value must have the same number of elements as the `StackDepth` value.

The `NumFilters` value determines the layers on the residual connection in the initial residual block. There is a convolutional layer on the residual connection if one of the following conditions is met:

`BottleneckType="downsample-first-conv"` (default) and `InitialNumFilters` is not equal to four times the first element of `NumFilters`.
`BottleneckType="none"` and `InitialNumFilters` is not equal to the first element of `NumFilters`.
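The two conditions above amount to comparing the number of channels entering the initial residual block with the number of channels its main branch produces. The sketch below expresses that check for the default settings. The names `blockOutputChannels` and `hasSkipConvolution` are hypothetical and do not appear in the toolbox.

```matlab
initialNumFilters = 64;                      % default InitialNumFilters
numFilters = [64 128 256 512];               % default NumFilters
bottleneckType = "downsample-first-conv";    % default BottleneckType

% Number of channels that the main branch of the initial block outputs.
if bottleneckType == "downsample-first-conv"
    blockOutputChannels = 4 * numFilters(1);
else
    blockOutputChannels = numFilters(1);
end

% A convolutional layer sits on the residual connection when the channel
% counts on the two branches differ.
hasSkipConvolution = initialNumFilters ~= blockOutputChannels;
disp(hasSkipConvolution)
```

With the defaults, 64 input channels meet 256 output channels, so the initial residual block has a convolutional layer on its residual connection.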

For more information about the layers in each residual block, see More About.

Example: `NumFilters=[32 64 126 256]`

`Normalization`—Data normalization
`"zerocenter"`(default) |`"zscore"`

Data normalization to apply every time data is forward-propagated through the input layer, specified as one of the following:

`"zerocenter"`: Subtract the mean. The mean is calculated at training time.
`"zscore"`: Subtract the mean and divide by the standard deviation. The mean and standard deviation are calculated at training time.

Example: `Normalization="zscore"`

Data Types: `char` | `string`

Output Arguments


`lgraph`— Residual network
`layerGraph`object

Residual network, returned as a `layerGraph` object.

More About


Residual Network

Residual networks (ResNets) are a type of deep network consisting of building blocks that have residual connections (also known as skip or shortcut connections). These connections allow the input to skip the convolutional units of the main branch, thus providing a simpler path through the network. By allowing the parameter gradients to flow more easily from the output layer to the earlier layers of the network, residual connections help mitigate the problem of vanishing gradients during early training.

The structure of a residual network is flexible. The key component is the inclusion of the residual connections within residual blocks. A group of residual blocks is called a stack. A ResNet architecture consists of initial layers, followed by stacks containing residual blocks, and then the final layers. A network has three types of residual blocks:

Initial residual block — This block occurs at the start of the first stack. The layers in the residual connection of the initial residual block determine if the block preserves the activation sizes or performs downsampling.
Standard residual block — This block occurs multiple times in each stack, after the first downsampling residual block. The standard residual block preserves the activation sizes.
Downsampling residual block — This block occurs once, at the start of each stack. The first convolutional unit in the downsampling block downsamples the spatial dimensions by a factor of two.

A typical stack has a downsampling residual block, followed by `m` standard residual blocks, where `m` is greater than or equal to one. The first stack is the only stack that begins with an initial residual block.
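To make the stack layout concrete, the following sketch lists the block types in order for a given stack depth. It is an illustration only and does not call `resnetLayers`. It assumes that each stack depth value counts the first block of the stack (initial or downsampling) together with the standard blocks that follow. The variable `blockTypes` is hypothetical.

```matlab
stackDepth = [3 4 6 3];     % default StackDepth

blockTypes = strings(0,1);
for i = 1:numel(stackDepth)
    % The first stack starts with the initial residual block. Every other
    % stack starts with a downsampling residual block.
    if i == 1
        firstBlock = "initial";
    else
        firstBlock = "downsampling";
    end
    % Assumption: the stack depth counts the first block as well.
    standardBlocks = repmat("standard",stackDepth(i)-1,1);
    blockTypes = [blockTypes; firstBlock; standardBlocks]; %#ok<AGROW>
end

disp(blockTypes)
```

Under this assumption the default network has 16 residual blocks: one initial block, three downsampling blocks, and twelve standard blocks.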

The initial, standard, and downsampling residual blocks can be bottleneck or nonbottleneck blocks. Bottleneck blocks perform a 1-by-1 convolution before the 3-by-3 convolution, to reduce the number of channels by a factor of four. Networks with and without bottleneck blocks have a similar level of computational complexity, but the total number of features propagating in the residual connections is four times larger when you use the bottleneck units. Therefore, using bottleneck blocks increases the efficiency of the network.

The layers inside each block are determined by the type of block and the options you set.

Block Layers

Initial Layers

A residual network starts with the following layers, in order:

imageInputLayer
convolution2dLayer
batchNormalizationLayer
reluLayer
(Optional) Pooling layer (either max, average, or none)

Set the optional pooling layer using the `InitialPoolingLayer` argument.

Example visualization: Initial layers of a residual network.

Initial Residual Block

The main branch of the initial residual block has the same layers as a standard residual block.

The `InitialNumFilters` and `NumFilters` values determine the layers on the residual connection. The residual connection has a convolutional layer with `[1,1]` filter and `[1,1]` stride if one of the following conditions is met:

`BottleneckType="downsample-first-conv"` (default) and `InitialNumFilters` is not equal to four times the first element of `NumFilters`.
`BottleneckType="none"` and `InitialNumFilters` is not equal to the first element of `NumFilters`.

If `ResidualBlockType` is set to `"batchnorm-before-add"`, the residual connection will also have a batch normalization layer.

Example visualization: An initial residual block for a network without a bottleneck and with the batch normalization layer before the addition layer.

Standard Residual Block (`BottleneckType="downsample-first-conv"`)

The standard residual block with bottleneck units has the following layers, in order:

convolution2dLayer with `[1,1]` filter and `[1,1]` stride
batchNormalizationLayer
reluLayer
convolution2dLayer with `[3,3]` filter and `[1,1]` stride
batchNormalizationLayer
reluLayer
convolution2dLayer with `[1,1]` filter and `[1,1]` stride
batchNormalizationLayer
additionLayer
reluLayer

The standard block has a residual connection from the output of the previous block to the addition layer.

Set the position of the addition layer using the `ResidualBlockType` argument.

Example visualization: A standard residual block for a network with a bottleneck and with the batch normalization layer before the addition layer.

Standard Residual Block (`BottleneckType="none"`)

The standard residual block without bottleneck units has the following layers, in order:

convolution2dLayer with `[3,3]` filter and `[1,1]` stride
batchNormalizationLayer
reluLayer
convolution2dLayer with `[3,3]` filter and `[1,1]` stride
batchNormalizationLayer
additionLayer
reluLayer

The standard block has a residual connection from the output of the previous block to the addition layer.

Set the position of the addition layer using the `ResidualBlockType` argument.

Example visualization: A standard residual block for a network without a bottleneck and with the batch normalization layer before the addition layer.

Downsampling Residual Block

The downsampling residual block is the same as the standard block (either with or without the bottleneck) but with a stride of `[2,2]` in the first convolutional layer and additional layers on the residual connection.

The layers on the residual connection depend on the `ResidualBlockType` value.

When `ResidualBlockType` is set to `"batchnorm-before-add"`, the second branch contains a `convolution2dLayer` with `[1,1]` filter and `[2,2]` stride, and a `batchNormalizationLayer`.
When `ResidualBlockType` is set to `"batchnorm-after-add"`, the second branch contains a `convolution2dLayer` with `[1,1]` filter and `[2,2]` stride.

The downsampling block halves the height and width of the input, and increases the number of channels.

Example visualization: A downsampling residual block for a network without a bottleneck and with the batch normalization layer before the addition layer.

Final Layers

A residual network ends with the following layers, in order:

globalAveragePooling2dLayer
fullyConnectedLayer
softmaxLayer
classificationLayer

Example visualization: Final layers of a residual network.

Convolution and fully connected layer weights are initialized using the He weight initialization method [3]. For more information, see `convolution2dLayer`.

Tips

When working with small images, set the `InitialPoolingLayer` option to `"none"` to remove the initial pooling layer and reduce the amount of downsampling.
Residual networks are usually named ResNet-X, where X is the depth of the network. The depth of a network is defined as the largest number of sequential convolutional or fully connected layers on a path from the input layer to the output layer. You can use the following formula to compute the depth of your network:

$\text{depth} = \begin{cases} 1 + 2\sum_{i=1}^{N} s_i + 1 & \text{if no bottleneck} \\ 1 + 3\sum_{i=1}^{N} s_i + 1 & \text{if bottleneck} \end{cases}$

where $s_i$ is the depth of stack $i$ and $N$ is the number of stacks.
Networks with the same depth can have different network architectures. For example, you can create a ResNet-14 architecture with or without a bottleneck:
```
resnet14Bottleneck = resnetLayers([224 224 3],10, ...
    StackDepth=[2 2], ...
    NumFilters=[64 128]);

resnet14NoBottleneck = resnetLayers([224 224 3],10, ...
    BottleneckType="none", ...
    StackDepth=[2 2 2], ...
    NumFilters=[64 128 256]);
```
The relationship between bottleneck and nonbottleneck architectures also means that a network with a bottleneck will have a different depth than a network without a bottleneck.
```
resnet50Bottleneck = resnetLayers([224 224 3],10);

resnet34NoBottleneck = resnetLayers([224 224 3],10, ...
    BottleneckType="none");
```
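The depth formula in the tips above can be evaluated directly. The sketch below does so for the default settings. It uses plain arithmetic only, and the variable names `convsPerBlock` and `depth` are illustrative.

```matlab
stackDepth = [3 4 6 3];                      % default StackDepth
bottleneckType = "downsample-first-conv";    % default BottleneckType

% Each bottleneck block has three convolutional layers on its main branch.
% Each nonbottleneck block has two.
if bottleneckType == "none"
    convsPerBlock = 2;
else
    convsPerBlock = 3;
end

% One initial convolutional layer, the residual blocks, and one fully
% connected layer.
depth = 1 + convsPerBlock * sum(stackDepth) + 1;
disp(depth)
```

The default settings give a depth of 50. Changing `stackDepth` to `[3 4 23 3]` gives 101, and setting `bottleneckType` to `"none"` with the default stack depth gives 34, which agrees with the network names used in the examples on this page.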

References

[1] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Deep Residual Learning for Image Recognition.” Preprint, submitted December 10, 2015. https://arxiv.org/abs/1512.03385.

[2] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Identity Mappings in Deep Residual Networks.” Preprint, submitted July 25, 2016. https://arxiv.org/abs/1603.05027.

[3] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.” In Proceedings of the 2015 IEEE International Conference on Computer Vision, 1026-1034. Washington, DC: IEEE Computer Vision Society, 2015.

Extended Capabilities

GPU Code Generation
Generate CUDA® code for NVIDIA® GPUs using GPU Coder™.

Usage notes and limitations:

You can use the residual network for code generation. First, create the network using the `resnetLayers` function. Then, use the `trainNetwork` function to train the network. After training and evaluating the network, you can generate code for the `DAGNetwork` object by using GPU Coder™.

Version History

Introduced in R2021b
