
pix2pixHDGlobalGenerator

Create pix2pixHD global generator network

Description


net = pix2pixHDGlobalGenerator(inputSize) creates a pix2pixHD generator network for input of size inputSize. For more information about the network architecture, see pix2pixHD Generator Network.

This function requires Deep Learning Toolbox™.


net = pix2pixHDGlobalGenerator(inputSize,Name,Value) modifies properties of the pix2pixHD network using name-value arguments.

Examples


Specify the network input size for 32-channel data of size 512-by-1024 pixels.

inputSize = [512 1024 32];

Create a pix2pixHD global generator network.

net = pix2pixHDGlobalGenerator(inputSize)
net = 
  dlnetwork with properties:

         Layers: [84x1 nnet.cnn.layer.Layer]
    Connections: [92x2 table]
     Learnables: [110x3 table]
          State: [0x3 table]
     InputNames: {'GlobalGenerator_inputLayer'}
    OutputNames: {'GlobalGenerator_fActivation'}
    Initialized: 1

Display the network.

analyzeNetwork(net)

Specify the network input size for 32-channel data of size 512-by-1024 pixels.

inputSize = [512 1024 32];

Create a pix2pixHD generator network that performs batch normalization after each convolution.

net = pix2pixHDGlobalGenerator(inputSize,"Normalization","batch")
net = 
  dlnetwork with properties:

         Layers: [84x1 nnet.cnn.layer.Layer]
    Connections: [92x2 table]
     Learnables: [110x3 table]
          State: [54x3 table]
     InputNames: {'GlobalGenerator_inputLayer'}
    OutputNames: {'GlobalGenerator_fActivation'}
    Initialized: 1

Display the network.

analyzeNetwork(net)

Input Arguments


Network input size, specified as a 3-element vector of positive integers. inputSize has the form [H W C], where H is the height, W is the width, and C is the number of channels.

Example: [28 28 3] specifies an input size of 28-by-28 pixels for a 3-channel image.
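
For instance, a minimal sketch (the 256-by-512 size and three channels are illustrative values, not defaults):

inputSize = [256 512 3];   % [H W C]
net = pix2pixHDGlobalGenerator(inputSize);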

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: 'NumFiltersInFirstBlock',32 creates a network with 32 filters in the first convolution layer.

Number of downsampling blocks in the network encoder module, specified as a positive integer. In total, the network downsamples the input by a factor of 2^NumDownsamplingBlocks. The decoder module consists of the same number of upsampling blocks.
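
For example, this sketch requests 3 downsampling blocks (an illustrative value), so a 256-by-256 input is reduced to 256/2^3 = 32-by-32 at the bottleneck and then restored to 256-by-256 by the decoder:

% Downsample by a factor of 2^3 = 8 in the encoder.
net = pix2pixHDGlobalGenerator([256 256 3],"NumDownsamplingBlocks",3);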

Number of filters in the first convolution layer, specified as a positive even integer.

Number of output channels, specified as a positive integer.

Filter size in the first and last convolution layers of the network, specified as a positive odd integer or 2-element vector of positive odd integers of the form [height width]. When you specify the filter size as a scalar, the filter has equal height and width.

Filter size in intermediate convolution layers, specified as a positive odd integer or 2-element vector of positive odd integers of the form [height width]. Intermediate convolution layers are the convolution layers excluding the first and last convolution layers. When you specify the filter size as a scalar, the filter has identical height and width. Typical values are between 3 and 7.
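
For example, a minimal sketch that sets both filter sizes (the values 5 and 7 are illustrative, not necessarily the defaults):

% 5-by-5 filters in the intermediate layers, 7-by-7 filters in the first and last layers.
net = pix2pixHDGlobalGenerator([256 256 3], ...
    "FilterSizeInIntermediateBlocks",5, ...
    "FilterSizeInFirstAndLastBlocks",7);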

Number of residual blocks, specified as a positive integer.

Style of padding used in the network, specified as one of these values.

PaddingValue                Description
Numeric scalar              Pad with the specified numeric value
'symmetric-include-edge'    Pad using mirrored values of the input, including the edge values
'symmetric-exclude-edge'    Pad using mirrored values of the input, excluding the edge values
'replicate'                 Pad using repeated border elements of the input

For example, padding the input [3 1 4; 1 5 9; 2 6 5] by two rows and two columns with each style gives:

  • Numeric scalar (value 2): [2 2 2 2 2 2 2; 2 2 2 2 2 2 2; 2 2 3 1 4 2 2; 2 2 1 5 9 2 2; 2 2 2 6 5 2 2; 2 2 2 2 2 2 2; 2 2 2 2 2 2 2]

  • 'symmetric-include-edge': [5 1 1 5 9 9 5; 1 3 3 1 4 4 1; 1 3 3 1 4 4 1; 5 1 1 5 9 9 5; 6 2 2 6 5 5 6; 6 2 2 6 5 5 6; 5 1 1 5 9 9 5]

  • 'symmetric-exclude-edge': [5 6 2 6 5 6 2; 9 5 1 5 9 5 1; 4 1 3 1 4 1 3; 9 5 1 5 9 5 1; 5 6 2 6 5 6 2; 9 5 1 5 9 5 1; 4 1 3 1 4 1 3]

  • 'replicate': [3 3 3 1 4 4 4; 3 3 3 1 4 4 4; 3 3 3 1 4 4 4; 1 1 1 5 9 9 9; 2 2 2 6 5 5 5; 2 2 2 6 5 5 5; 2 2 2 6 5 5 5]
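
For intuition only, you can reproduce most of these padding examples with the padarray function; the generator applies its padding internally, so this snippet is just an illustration of the padding behavior (padarray has no direct equivalent of 'symmetric-exclude-edge'):

A = [3 1 4; 1 5 9; 2 6 5];
padarray(A,[2 2],2)             % pad with the numeric value 2
padarray(A,[2 2],'symmetric')   % mirrored values, including the edge
padarray(A,[2 2],'replicate')   % repeated border elements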

Method used to upsample activations, specified as one of these values:

Data Types: char | string

Weight initialization used in convolution layers, specified as "glorot", "he", "narrow-normal", or a function handle. For more information, see Specify Custom Weight Initialization Function (Deep Learning Toolbox).

Activation function to use in the network, specified as one of these values. For more information and a list of available layers, see Activation Layers (Deep Learning Toolbox). For an example, see the sketch after this list.

  • "relu"— Use areluLayer(Deep Learning Toolbox)

  • "leakyRelu"— Use aleakyReluLayer(Deep Learning Toolbox)with a scale factor of 0.2

  • "elu"— Use aneluLayer(Deep Learning Toolbox)

  • A layer object
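
For example, a minimal sketch that uses leaky ReLU activations throughout the network (the input size is illustrative):

% Use leaky ReLU activations (scale factor 0.2) after each convolution.
net = pix2pixHDGlobalGenerator([256 256 3],"ActivationLayer","leakyRelu");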

Activation function after the final convolution layer, specified as one of these values. For more information and a list of available layers, see Output Layers (Deep Learning Toolbox). For an example, see the sketch after this list.

  • "tanh"— Use atanhLayer(Deep Learning Toolbox)

  • "sigmoid"— Use asigmoidLayer(Deep Learning Toolbox)

  • "softmax"— Use asoftmaxLayer(Deep Learning Toolbox)

  • "none"— Do not use a final activation layer

  • A layer object
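
For example, a minimal sketch that ends the network with a softmax layer instead of the default tanh layer, as you might when the output represents per-pixel class scores (an assumed use case):

net = pix2pixHDGlobalGenerator([256 256 3],"FinalActivationLayer","softmax");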

Normalization operation to use after each convolution, specified as one of these values. For more information and a list of available layers, see Normalization, Dropout, and Cropping Layers (Deep Learning Toolbox).

Probability of dropout, specified as a number in the range [0, 1]. If you specify a value of 0, then the network does not include dropout layers. If you specify a value greater than 0, then the network includes a dropoutLayer (Deep Learning Toolbox) in each residual block.
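
For example, a minimal sketch (the value 0.5 and the input size are illustrative):

% Include a dropout layer with probability 0.5 in each residual block.
net = pix2pixHDGlobalGenerator([256 256 3],"Dropout",0.5);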

Prefix to all layer names in the network, specified as a string or character vector.

Data Types: char | string

Output Arguments


Pix2pixHD generator network, returned as a dlnetwork (Deep Learning Toolbox) object.
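
Because the network is returned as a dlnetwork object, you can run a forward pass on formatted dlarray data. A minimal sketch (the input size is illustrative, and the random input and 'SSCB' format are assumptions consistent with an image input layer):

inputSize = [256 256 3];
net = pix2pixHDGlobalGenerator(inputSize);
dlX = dlarray(rand([inputSize 1],'single'),'SSCB');
dlY = predict(net,dlX);
size(dlY)   % spatial size matches the input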

More About


pix2pixHD Generator Network

A pix2pixHD generator network consists of an encoder module followed by a decoder module. The default network follows the architecture proposed by Wang et al. [1].

The encoder module downsamples the input by a factor of 2^NumDownsamplingBlocks. The encoder module consists of an initial block of layers, NumDownsamplingBlocks downsampling blocks, and NumResidualBlocks residual blocks. The decoder module upsamples the input by a factor of 2^NumDownsamplingBlocks. The decoder module consists of NumDownsamplingBlocks upsampling blocks and a final block.

The table describes the blocks of layers that comprise the encoder and decoder modules.

Block Type Layers Diagram of Default Block
Initial block
  • An imageInputLayer (Deep Learning Toolbox)

  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and a filter size of FilterSizeInFirstAndLastBlocks

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

Default block: image input layer, 2-D convolution layer, instance normalization layer, ReLU layer

Downsampling block
  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [2 2] to perform downsampling. The convolution layer has a filter size of FilterSizeInIntermediateBlocks.

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

Default block: 2-D convolution layer, instance normalization layer, ReLU layer

Residual block
  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and a filter size of FilterSizeInIntermediateBlocks.

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

  • An optional dropoutLayer (Deep Learning Toolbox). By default, residual blocks omit a dropout layer. Include a dropout layer by specifying the Dropout name-value argument as a value in the range (0, 1].

  • A second convolution2dLayer (Deep Learning Toolbox).

  • An optional second normalization layer.

  • An additionLayer (Deep Learning Toolbox) that provides a skip connection around each residual block.

Default block: 2-D convolution layer, instance normalization layer, ReLU layer, 2-D convolution layer, instance normalization layer, addition layer

Upsampling block
  • An upsampling layer that upsamples by a factor of 2 according to the UpsampleMethod name-value argument. The convolution layer has a filter size of FilterSizeInIntermediateBlocks.

  • An optional normalization layer, specified by the NormalizationLayer name-value argument.

  • An activation layer specified by the ActivationLayer name-value argument.

Default block: transposed 2-D convolution layer, instance normalization layer, ReLU layer

Final block
  • A convolution2dLayer (Deep Learning Toolbox) with a stride of [1 1] and a filter size of FilterSizeInFirstAndLastBlocks.

  • An optional activation layer specified by the FinalActivationLayer name-value argument.

Default block: 2-D convolution layer, tanh layer

Tips

  • You can create the discriminator network for pix2pixHD by using the patchGANDiscriminator function.

  • Train the pix2pixHD GAN network using a custom training loop. For a starting point, see the sketch after these tips.
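
A hedged sketch of creating a generator and a matching patchGAN discriminator as a starting point for a custom training loop (the discriminator input size is an assumption: the label-map channels concatenated with a 3-channel generated or real image):

generatorInputSize = [512 1024 32];
generator = pix2pixHDGlobalGenerator(generatorInputSize);
% Assumption: the discriminator receives the label map concatenated with
% the image along the channel dimension.
discriminatorInputSize = [512 1024 32+3];
discriminator = patchGANDiscriminator(discriminatorInputSize);
% Train the generator and discriminator with a custom training loop (not shown).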

References

[1] Wang, Ting-Chun, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, and Bryan Catanzaro. "High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs." In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 8798–8807. Salt Lake City, UT, USA: IEEE, 2018. https://doi.org/10.1109/CVPR.2018.00917.

Version History

Introduced in R2021a