transposedConv3dLayer

Transposed 3-D convolution layer

Description

A transposed 3-D convolution layer upsamples three-dimensional feature maps.

This layer is sometimes incorrectly known as a "deconvolution" or "deconv" layer. This layer performs the transpose of convolution and does not perform deconvolution.

layer = transposedConv3dLayer(filterSize,numFilters) returns a 3-D transposed convolution layer and sets the FilterSize and NumFilters properties.

layer = transposedConv3dLayer(filterSize,numFilters,Name,Value) returns a 3-D transposed convolutional layer and specifies additional options using one or more name-value pair arguments.

Examples

Create a transposed 3-D convolutional layer with 32 filters, each with a height, width, and depth of 11. Use a stride of 4 in the horizontal and vertical directions and 2 along the depth.

layer = transposedConv3dLayer(11,32,'Stride',[4 4 2])

layer = 
  TransposedConvolution3DLayer with properties:

            Name: ''

   Hyperparameters
      FilterSize: [11 11 11]
     NumChannels: 'auto'
      NumFilters: 32
          Stride: [4 4 2]
    CroppingMode: 'manual'
    CroppingSize: [2x3 double]

   Learnable Parameters
         Weights: []
            Bias: []

Input Arguments

Height, width, and depth of the filters, specified as a positive integer or a vector of three positive integers [h w d], where h is the height, w is the width, and d is the depth. The filter size defines the size of the local regions to which the neurons connect in the input.

If filterSize is a scalar, then the software uses the same value for all three dimensions.

Example: [5 6 7] specifies filters with a height, width, and depth of 5, 6, and 7, respectively.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Number of filters, specified as a positive integer. This number corresponds to the number of neurons in the layer that connect to the same region in the input. This parameter determines the number of channels (feature maps) in the output of the layer.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: transposedConv3dLayer(11,96,'Stride',4) creates a 3-D transposed convolutional layer with 96 filters of size 11 and a stride of 4.

Transposed Convolution

Step size for traversing the input in three dimensions, specified as a vector [a b c] of three positive integers, where a is the vertical step size, b is the horizontal step size, and c is the step size along the depth. When creating the layer, you can specify Stride as a scalar to use the same value for step sizes in all three directions.

Example: [2 3 1] specifies a vertical step size of 2, a horizontal step size of 3, and a step size along the depth of 1.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
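As a brief sketch of the scalar and vector forms of Stride described above, the following code creates two layers and compares their Stride properties. The filter size (3) and number of filters (16) are arbitrary illustrative values. The sketch assumes that a scalar stride is stored as a three-element vector, in the same way that the example output shows the scalar filter size 11 stored as [11 11 11].

```matlab
% Scalar stride: the same step size in all three directions.
layerScalar = transposedConv3dLayer(3,16,'Stride',2);

% Equivalent vector form [vertical horizontal depth].
layerVector = transposedConv3dLayer(3,16,'Stride',[2 2 2]);

% Compare the stored Stride properties of the two layers.
sameStride = isequal(layerScalar.Stride,layerVector.Stride);
```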

Output size reduction, specified as one of the following:

  • 'same' – Set the cropping so that the output size equals inputSize.*Stride, where inputSize is the height, width, and depth of the layer input. If you set the Cropping option to 'same', then the software automatically sets the CroppingMode property of the layer to 'same'.

    The software trims an equal amount from the top and bottom, the left and right, and the front and back, if possible. If the vertical crop amount has an odd value, then the software trims an extra row from the bottom. If the horizontal crop amount has an odd value, then the software trims an extra column from the right. If the depth crop amount has an odd value, then the software trims an extra plane from the back.

  • A positive integer – Crop the specified amount of data from all the edges.

  • A vector of nonnegative integers [a b c] – Crop a from the top and bottom, crop b from the left and right, and crop c from the front and back.

  • A matrix of nonnegative integers [t l f; b r bk] – Crop t, l, f, b, r, and bk from the top, left, front, bottom, right, and back of the input, respectively.

If you set the Cropping option to a numeric value, then the software automatically sets the CroppingMode property of the layer to 'manual'.

Example: [1 2 2]

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | char | string
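The cropping rules above can be illustrated with plain arithmetic. This is a minimal sketch that assumes the commonly used size relation for transposed convolution, in which the uncropped output size is (inputSize - 1).*Stride + FilterSize; that formula is not stated on this page. The input size [8 8 4] is a hypothetical value, and the filter size and stride come from the example at the top of the page. The sketch computes the per-edge crop that would make the output size equal inputSize.*Stride, placing the extra row, column, or plane at the bottom, right, and back.

```matlab
inputSize  = [8 8 4];       % hypothetical height, width, depth of the layer input
filterSize = [11 11 11];
stride     = [4 4 2];

% Assumed size relation for the uncropped transposed convolution output.
fullSize = (inputSize - 1).*stride + filterSize;

% Target output size for 'same' cropping.
targetSize = inputSize.*stride;

% Total amount to trim per dimension, split between the two edges.
totalCrop  = fullSize - targetSize;
cropBefore = floor(totalCrop/2);        % top, left, front
cropAfter  = totalCrop - cropBefore;    % bottom, right, back (takes the extra unit)

% Matrix in the [t l f; b r bk] form accepted by the Cropping option.
croppingSize = [cropBefore; cropAfter];

layer = transposedConv3dLayer(filterSize,32,'Stride',stride,'Cropping',croppingSize);
```

Passing a numeric matrix in this way sets CroppingMode to 'manual', as described above.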

Number of input channels, specified as one of the following:

  • 'auto' – Automatically determine the number of input channels at training time.

  • Positive integer – Configure the layer for the specified number of input channels. NumChannels and the number of channels in the layer input data must match. For example, if the input is an RGB image, then NumChannels must be 3. If the input is the output of a convolutional layer with 16 filters, then NumChannels must be 16.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | char | string

Parameters and Initialization

Function to initialize the weights, specified as one of the following:

  • 'glorot' – Initialize the weights with the Glorot initializer [1] (also known as the Xavier initializer). The Glorot initializer independently samples from a uniform distribution with zero mean and variance 2/(numIn + numOut), where numIn = FilterSize(1)*FilterSize(2)*FilterSize(3)*NumChannels and numOut = FilterSize(1)*FilterSize(2)*FilterSize(3)*NumFilters.

  • 'he' – Initialize the weights with the He initializer [2]. The He initializer samples from a normal distribution with zero mean and variance 2/numIn, where numIn = FilterSize(1)*FilterSize(2)*FilterSize(3)*NumChannels.

  • 'narrow-normal' – Initialize the weights by independently sampling from a normal distribution with zero mean and standard deviation 0.01.

  • 'zeros' – Initialize the weights with zeros.

  • 'ones' – Initialize the weights with ones.

  • Function handle – Initialize the weights with a custom function. If you specify a function handle, then the function must be of the form weights = func(sz), where sz is the size of the weights. For an example, see Specify Custom Weight Initialization Function.

The layer only initializes the weights when the Weights property is empty.

Data Types: char | string | function_handle
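As a rough sketch of the function handle form, the following code writes a Glorot-style initializer by hand. It is not the software's built-in implementation. It assumes that sz follows the weight layout FilterSize(1)-by-FilterSize(2)-by-FilterSize(3)-by-numFilters-by-NumChannels, and it uses the fact that a uniform distribution on [-b, b] has variance b^2/3, so b = sqrt(6/(numIn + numOut)) gives the variance 2/(numIn + numOut). The names glorotBound and manualGlorot are hypothetical.

```matlab
% Bound of the uniform distribution, computed from the weight size sz.
% numIn + numOut = prod(sz(1:3))*(NumChannels + NumFilters) = prod(sz(1:3))*(sz(5) + sz(4)).
glorotBound = @(sz) sqrt(6/(prod(sz(1:3))*(sz(4) + sz(5))));

% Custom initializer of the form weights = func(sz).
manualGlorot = @(sz) glorotBound(sz)*(2*rand(sz) - 1);

% Use the handle when creating the layer.
layer = transposedConv3dLayer(3,8,'WeightsInitializer',manualGlorot);

% Evaluate the handle directly for a hypothetical layer with 4 input channels.
sz = [3 3 3 8 4];
weights = manualGlorot(sz);
```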

Function to initialize the bias, specified as one of the following:

  • 'zeros' – Initialize the bias with zeros.

  • 'ones' – Initialize the bias with ones.

  • 'narrow-normal' – Initialize the bias by independently sampling from a normal distribution with a mean of zero and a standard deviation of 0.01.

  • Function handle – Initialize the bias with a custom function. If you specify a function handle, then the function must be of the form bias = func(sz), where sz is the size of the bias.

The layer only initializes the bias when the Bias property is empty.

Data Types: char | string | function_handle

Layer weights for the transposed convolution operation, specified as a FilterSize(1)-by-FilterSize(2)-by-FilterSize(3)-by-numFilters-by-NumChannels numeric array or [].

The layer weights are learnable parameters. You can specify the initial value for the weights directly using the Weights property of the layer. When you train a network, if the Weights property of the layer is nonempty, then trainNetwork uses the Weights property as the initial value. If the Weights property is empty, then trainNetwork uses the initializer specified by the WeightsInitializer property of the layer.

Data Types: single | double

Layer biases for the transposed convolutional operation, specified as a 1-by-1-by-1-by-numFilters numeric array or [].

The layer biases are learnable parameters. When you train a network, if Bias is nonempty, then trainNetwork uses the Bias property as the initial value. If Bias is empty, then trainNetwork uses the initializer specified by BiasInitializer.

Data Types: single | double
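The following sketch sets initial weights and biases directly through the layer properties, using the array sizes given above. The number of input channels (4) is a hypothetical value chosen for illustration, as is the 0.01 scale of the random weights.

```matlab
numFilters  = 8;
numChannels = 4;   % hypothetical number of channels in the layer input

layer = transposedConv3dLayer(3,numFilters);

% Weights: FilterSize(1)-by-FilterSize(2)-by-FilterSize(3)-by-numFilters-by-NumChannels.
layer.Weights = 0.01*randn([layer.FilterSize numFilters numChannels]);

% Bias: 1-by-1-by-1-by-numFilters.
layer.Bias = zeros(1,1,1,numFilters);
```

Because both properties are now nonempty, training starts from these values and the initializer settings are not used.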

Learning Rate and Regularization

Learning rate factor for the weights, specified as a nonnegative scalar.

The software multiplies this factor by the global learning rate to determine the learning rate for the weights in this layer. For example, if WeightLearnRateFactor is 2, then the learning rate for the weights in this layer is twice the current global learning rate. The software determines the global learning rate based on the settings you specify using the trainingOptions function.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

Learning rate factor for the biases, specified as a nonnegative scalar.

The software multiplies this factor by the global learning rate to determine the learning rate for the biases in this layer. For example, if BiasLearnRateFactor is 2, then the learning rate for the biases in the layer is twice the current global learning rate. The software determines the global learning rate based on the settings you specify using the trainingOptions function.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

L2 regularization factor for the weights, specified as a nonnegative scalar.

The software multiplies this factor by the global L2 regularization factor to determine the L2 regularization for the weights in this layer. For example, if WeightL2Factor is 2, then the L2 regularization for the weights in this layer is twice the global L2 regularization factor. You can specify the global L2 regularization factor using the trainingOptions function.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64

L2 regularization factor for the biases, specified as a nonnegative scalar.

The software multiplies this factor by the global L2 regularization factor to determine the L2 regularization for the biases in this layer. For example, if BiasL2Factor is 2, then the L2 regularization for the biases in this layer is twice the global L2 regularization factor. You can specify the global L2 regularization factor using the trainingOptions function.

Data Types: single | double | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64
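The four factors above are all multipliers on a global setting, which the following sketch illustrates. The global learning rate (0.01) and global L2 regularization factor (1e-4) are hypothetical values standing in for settings made with trainingOptions, and the variable names are for illustration only.

```matlab
layer = transposedConv3dLayer(3,8, ...
    'WeightLearnRateFactor',2, ...
    'BiasLearnRateFactor',1, ...
    'WeightL2Factor',2, ...
    'BiasL2Factor',0);

globalLearnRate = 0.01;   % hypothetical global learning rate
globalL2        = 1e-4;   % hypothetical global L2 regularization factor

% Effective per-layer values: factor times the global setting.
weightLearnRate = layer.WeightLearnRateFactor*globalLearnRate;  % twice the global rate
biasLearnRate   = layer.BiasLearnRateFactor*globalLearnRate;    % same as the global rate
weightL2        = layer.WeightL2Factor*globalL2;                % twice the global factor
biasL2          = layer.BiasL2Factor*globalL2;                  % no L2 regularization on the biases
```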

Layer

Layer name, specified as a character vector or a string scalar. For Layer array input, the trainNetwork, assembleNetwork, layerGraph, and dlnetwork functions automatically assign names to layers with the name ''.

Data Types: char | string

Output Arguments

Transposed 3-D convolution layer, returned as a TransposedConvolution3DLayer object.

Algorithms

3-D Transposed Convolutional Layer

A transposed 3-D convolution layer upsamples three-dimensional feature maps.

The standard convolution operation downsamples the input by applying sliding convolutional filters to the input. By flattening the input and output, you can express the convolution operation as Y = CX + B for the convolution matrix C and bias B that can be derived from the layer weights and biases.

Similarly, the transposed convolution operation upsamples the input by applying sliding convolutional filters to the input. To upsample the input instead of downsampling using sliding filters, the layer zero-pads each edge of the input with padding that has the size of the corresponding filter edge size minus 1.

By flattening the input and output, the transposed convolution operation is equivalent to Y = C⊤X + B, where C⊤ is the transpose of the convolution matrix C, and C and B denote the convolution and bias matrices for standard convolution derived from the layer weights and biases, respectively. This operation is equivalent to the backward function of a standard convolution layer.
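The matrix view above can be illustrated in one dimension. The following is a minimal sketch, not the layer's implementation: it builds the convolution matrix C for a hypothetical length-3 filter sliding over a length-5 input with a stride of 1 and no bias, then applies C and its transpose. The filter values and signals are arbitrary. In this setting, multiplying by the transpose gives the same result as a full convolution, which is what zero-padding each edge by the filter size minus 1 and sliding the filter produces.

```matlab
w    = [1 2 3];                 % hypothetical 1-D filter
nIn  = 5;                       % input length for standard convolution
nOut = nIn - numel(w) + 1;      % output length for stride 1, no padding

% Convolution matrix: each row holds the filter at one sliding position.
C = zeros(nOut,nIn);
for i = 1:nOut
    C(i,i:i+numel(w)-1) = w;
end

% Standard convolution downsamples: length 5 to length 3.
x = (1:nIn).';
yDown = C*x;

% Transposed convolution upsamples: length 3 to length 5.
y = [1; 0; 2];
xUp = C.'*y;

% Same result from a full convolution of y with the filter.
xUpFull = conv(y(:),w(:));
```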

References

[1] Glorot, Xavier, and Yoshua Bengio. "Understanding the Difficulty of Training Deep Feedforward Neural Networks." In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, 249–256. Sardinia, Italy: AISTATS, 2010.

[2] He, Kaiming, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. "Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification." In Proceedings of the 2015 IEEE International Conference on Computer Vision, 1026–1034. Washington, DC: IEEE Computer Vision Society, 2015.

Version History

Introduced in R2019a