
Supported Networks, Layers, Boards, and Tools

Supported Pretrained Networks

Deep Learning HDL Toolbox™ supports code generation for series convolutional neural networks (CNNs or ConvNets). You can generate code for any trained CNN whose computational layers are supported for code generation. For a full list, see Supported Layers. You can use one of the pretrained networks listed in the table to generate code for your target Intel® or Xilinx® FPGA boards.
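For example, deployment to a shipping bitstream follows this pattern. This is a minimal sketch, not a definitive recipe: the choice of Darknet-19, the Ethernet interface, and the zcu102_single bitstream are assumptions that you adapt to your own network and board.

    % Deploy a supported pretrained network to a Xilinx ZCU102 board
    % (assumes the Darknet-19 support package is installed and the board
    % is reachable over Ethernet)
    net = darknet19;
    hTarget = dlhdl.Target('Xilinx', 'Interface', 'Ethernet');
    hW = dlhdl.Workflow('Network', net, ...
        'Bitstream', 'zcu102_single', ...   % shipping single-data-type bitstream
        'Target', hTarget);
    hW.compile;
    hW.deploy;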

For each network, the list below gives a description, the network type, support for the single and INT8 data types with the shipping bitstreams on the ZCU102, ZC706, and Arria 10 SoC boards, and the application area. Entries marked No* are explained in the note that follows the list.

  • AlexNet: AlexNet convolutional neural network. Series network. Single data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. INT8 data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. Application area: Classification.
  • LogoNet: Logo recognition network (LogoNet) is a MATLAB® developed logo identification network. For more information, see Logo Recognition Network. Series network. Single data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Classification.
  • DigitsNet: Digit classification network. See Create Simple Deep Learning Network for Classification. Series network. Single data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Classification.
  • Lane detection: LaneNet convolutional neural network. For more information, see Deploy Transfer Learning Network for Lane Detection. Series network. Single data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. INT8 data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. Application area: Classification.
  • VGG-16: VGG-16 convolutional neural network. For the pretrained VGG-16 model, see vgg16. Series network. Single data type: ZCU102 No (network exceeds PL DDR memory size), ZC706 No (network exceeds FC module memory size), Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 No (network exceeds FC module memory size), Arria 10 SoC Yes. Application area: Classification.
  • VGG-19: VGG-19 convolutional neural network. For the pretrained VGG-19 model, see vgg19. Series network. Single data type: ZCU102 No (network exceeds PL DDR memory size), ZC706 No (network exceeds FC module memory size), Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 No (network exceeds FC module memory size), Arria 10 SoC Yes. Application area: Classification.
  • Darknet-19: Darknet-19 convolutional neural network. For the pretrained Darknet-19 model, see darknet19. Series network. Single data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Classification.
  • Radar Classification: Convolutional neural network that uses micro-Doppler signatures to identify and classify the object. For more information, see Bicyclist and Pedestrian Classification by Using FPGA. Series network. Single data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Classification and Software Defined Radio (SDR).
  • Defect Detection snet_defnet: snet_defnet is a custom AlexNet network used to identify and classify defects. For more information, see Defect Detection. Series network. Single data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. INT8 data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. Application area: Classification.
  • Defect Detection snet_blemdetnet: snet_blemdetnet is a custom convolutional neural network used to identify and classify defects. For more information, see Defect Detection. Series network. Single data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. INT8 data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. Application area: Classification.
  • DarkNet-53: Darknet-53 convolutional neural network. For the pretrained DarkNet-53 model, see darknet53. Directed acyclic graph (DAG) network based. Single data type: ZCU102 No (network exceeds PL DDR memory size), ZC706 No (network fully connected layer exceeds memory size), Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 No (network fully connected layer exceeds memory size), Arria 10 SoC Yes. Application area: Classification.
  • ResNet-18: ResNet-18 convolutional neural network. For the pretrained ResNet-18 model, see resnet18. Directed acyclic graph (DAG) network based. Single data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Classification.
  • ResNet-50: ResNet-50 convolutional neural network. For the pretrained ResNet-50 model, see resnet50. Directed acyclic graph (DAG) network based. Single data type: ZCU102 No (network exceeds PL DDR memory size), ZC706 No (network exceeds PL DDR memory size), Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Classification.
  • ResNet-based YOLO v2: You only look once (YOLO) is an object detector that decodes the predictions from a convolutional neural network and generates bounding boxes around the objects. For more information, see Vehicle Detection Using DAG Network Based YOLO v2 Deployed to FPGA. Directed acyclic graph (DAG) network based. Single data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Object detection.
  • MobileNetV2: MobileNet-v2 convolutional neural network. For the pretrained MobileNet-v2 model, see mobilenetv2. Directed acyclic graph (DAG) network based. Single data type: ZCU102 Yes, ZC706 No (fully connected layer exceeds PL DDR memory size), Arria 10 SoC Yes. INT8 data type: ZCU102 No, ZC706 No (fully connected layer exceeds PL DDR memory size), Arria 10 SoC No. Application area: Classification.
  • GoogLeNet: GoogLeNet convolutional neural network. For the pretrained GoogLeNet model, see googlenet. Directed acyclic graph (DAG) network based. Single data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. INT8 data type: ZCU102 No*, ZC706 No*, Arria 10 SoC No*. Application area: Classification.
  • PoseNet: Human pose estimation network. Directed acyclic graph (DAG) network based. Single data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. INT8 data type: ZCU102 Yes, ZC706 Yes, Arria 10 SoC Yes. Application area: Segmentation.
  • U-Net: U-Net convolutional neural network designed for semantic image segmentation. Directed acyclic graph (DAG) network based. Single data type: ZCU102 No (network exceeds PL DDR memory size), ZC706 No (network exceeds PL DDR memory size), Arria 10 SoC No (network exceeds PL DDR memory size). INT8 data type: ZCU102 No (network exceeds PL DDR memory size), ZC706 No (network exceeds PL DDR memory size), Arria 10 SoC Yes. Application area: Segmentation.

*To use the shipping bitstream with these networks, enable the LRNBlockGeneration property of the processor configuration for the bitstream and generate the bitstream again.
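A sketch of that regeneration step, assuming the zcu102_single bitstream and a supported synthesis tool on your system path:

    % Enable LRN block generation on the conv module of the processor
    % configuration, then rebuild the processor IP core and bitstream
    hPC = dlhdl.ProcessorConfig('Bitstream', 'zcu102_single');
    hPC.setModuleProperty('conv', 'LRNBlockGeneration', 'on');
    dlhdl.buildProcessor(hPC);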

Supported Layers

Deep Learning HDL Toolbox supports the layers listed in these tables.

Input Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), a description with limitations, and INT8 compatibility.

imageInputLayer

SW

An image input layer inputs 2-D images to a network and applies data normalization. The normalization options zero-center and zscore can run on hardware if the compile method HardwareNormalization argument is enabled and the input data is of single data type. If the HardwareNormalization option is not enabled or the input data type is int8, the normalization runs in software. Normalization specified using a function handle is not supported. See Image Input Layer Normalization Hardware Implementation.

Yes. Runs as single data type in SW.

Convolution and Fully Connected Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), the layer output format, a description with limitations, and INT8 compatibility.

convolution2dLayer

HW Convolution (Conv)

A 2-D convolutional layer applies sliding convolutional filters to the input.

When generating code for a network using this layer, these limitations apply:

  • Filter size must be 1-36.

  • Stride size must be 1-15 and square.

  • Padding size must be in the range 0-8.

  • Dilation factor must be [1 1].

  • Padding value is not supported.

Yes
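A sketch of a convolution layer that satisfies these limits; the filter count of 64 is an arbitrary illustrative choice:

    % 3-by-3 filter (within 1-36), square stride of 2 (within 1-15),
    % symmetric padding of 1 (within 0-8), default dilation of [1 1]
    layer = convolution2dLayer([3 3], 64, 'Stride', [2 2], 'Padding', [1 1 1 1]);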

groupedConvolution2dLayer

HW Convolution (Conv)

A 2-D grouped convolutional layer separates the input channels into groups and applies sliding convolutional filters. Use grouped convolutional layers for channel-wise separable (also known as depth-wise separable) convolution.

Code generation is now supported for a 2-D grouped convolution layer that has the NumGroups property set as 'channel-wise'.

When generating code for a network using this layer, these limitations apply:

  • Filter size must be 1-15 and square. For example, [1 1] or [14 14]. When NumGroups is set as 'channel-wise', filter size must be 3-14.

  • Stride size must be 1-15 and square.

  • Padding size must be in the range 0-8.

  • Dilation factor must be [1 1].

  • When NumGroups is not set as 'channel-wise', the number of groups must be 1 or 2.

  • The input feature number must be greater than a single multiple of the square root of the ConvThreadNumber.

  • When NumGroups is not set as 'channel-wise', the number of filters per group must be a multiple of the square root of the ConvThreadNumber.

Yes
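A sketch of a channel-wise grouped convolution within these limits:

    % Channel-wise (depth-wise) grouping with a 3-by-3 filter (within the
    % 3-14 range for 'channel-wise') and one filter per group
    layer = groupedConvolution2dLayer([3 3], 1, 'channel-wise', 'Stride', [1 1]);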

transposedConv2dLayer

HW Convolution (Conv)

A transposed 2-D convolution layer upsamples feature maps.

When generating code for a network using this layer, these limitations apply:

  • Filter size must be 1-8 and square.

  • Stride size must be 1-36 and square.

  • Padding size must be in the range 0-8.

  • Padding value is not supported.

Yes
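A sketch of a transposed convolution within these limits; the filter count of 32 is an arbitrary illustrative choice:

    % 4-by-4 filter (within 1-8), square stride of 2 (within 1-36)
    layer = transposedConv2dLayer([4 4], 32, 'Stride', [2 2]);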

fullyConnectedLayer

HW Fully Connected (FC)

A fully connected layer multiplies the input by a weight matrix, and then adds a bias vector.

When generating code for a network using this layer, these limitations apply:

Yes

Activation Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), the layer output format, a description with limitations, and INT8 compatibility.

reluLayer

HW Layer is fused.

A ReLU layer performs a threshold operation to each element of the input where any value less than zero is set to zero.

A ReLU layer is supported only when it is preceded by any of these layers:

  • Convolution

  • Fully Connected

  • Adder

Yes
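A sketch of the supported pattern, with the ReLU placed directly after a convolution layer so that it can be fused:

    % The ReLU immediately follows a convolution layer, so it is fused
    % into that layer during compilation
    layers = [
        convolution2dLayer(3, 16, 'Padding', 1)
        reluLayer
        ];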

leakyReluLayer

HW Layer is fused.

A leaky ReLU layer performs a threshold operation where any input value less than zero is multiplied by a fixed scalar.

A leaky ReLU layer is supported only when it is preceded by any of these layers:

  • Convolution

  • Fully Connected

  • Adder

Yes

clippedReluLayer

HW Layer is fused.

A clipped ReLU layer performs a threshold operation where any input value less than zero is set to zero and any value above the clipping ceiling is set to that clipping ceiling value.

A clipped ReLU layer is supported only when it is preceded by any of these layers:

  • Convolution

  • Fully Connected

  • Adder

Yes

Normalization, Dropout, and Cropping Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), the layer output format, a description with limitations, and INT8 compatibility.

batchNormalizationLayer

HW Layer is fused.

A batch normalization layer normalizes each input channel across a mini-batch.

A batch normalization layer is supported only when it is preceded by a convolution layer.

Yes

crossChannelNormalizationLayer

HW Convolution (Conv)

A channel-wise local response (cross-channel) normalization layer carries out channel-wise normalization.

The WindowChannelSize must be in the range of 3-9 for code generation.

Yes. Runs as single data type in HW.

dropoutLayer

NoOP on inference

A dropout layer randomly sets input elements to zero with a given probability.

Yes

Pooling and Unpooling Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), the layer output format, a description with limitations, and INT8 compatibility.

maxPooling2dLayer

HW Convolution (Conv)

A max pooling layer performs downsampling by dividing the layer input into rectangular pooling regions and computing the maximum of each region.

When generating code for a network using this layer, these limitations apply:

  • Pool size must be 1-36.

  • Stride size must be 1-15 and square.

  • Padding size must be in the range 0-2.

HasUnpoolingOutputs is supported. When this parameter is enabled, these limitations apply for code generation for this layer:

  • Pool size must be 2-by-2 or 3-by-3.

  • The stride size must be the same as the filter size.

  • Padding size is not supported.

  • Filter size and stride size must be square. For example, [2 2].

Yes. No when HasUnpoolingOutputs is enabled.

maxUnpooling2dLayer

HW Convolution (Conv)

A max unpooling layer unpools the output of a max pooling layer.

No
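A sketch of a pooling and unpooling pair under these limits; note that INT8 code generation is not supported when HasUnpoolingOutputs is enabled:

    % 2-by-2 pool with stride equal to the pool size and no padding
    poolLayer = maxPooling2dLayer([2 2], 'Stride', [2 2], ...
        'HasUnpoolingOutputs', true);
    unpoolLayer = maxUnpooling2dLayer;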

averagePooling2dLayer

HW Convolution (Conv)

An average pooling layer performs downsampling by dividing the layer input into rectangular pooling regions and computing the average values of each region.

When generating code for a network using this layer, these limitations apply:

  • Pool size must be 1-36.

  • Stride size must be 1-15 and square.

  • Padding size must be in the range 0-2.

Yes

globalAveragePooling2dLayer

HW Convolution (Conv)

A global average pooling layer performs downsampling by computing the mean of the height and width dimensions of the input.

When generating code for a network using this layer, these limitations apply:

  • When the layer is implemented in the Conv module, the pool size must be 1-36 and square.

  • Can accept inputs of sizes up to 15-by-15-by-N.

Yes

Combination Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), the layer output format, a description with limitations, and INT8 compatibility.

additionLayer

HW Inherit from input.

An addition layer adds inputs from multiple neural network layers element-wise.

You can now generate code for this layer with the int8 data type when the layer is combined with a leaky ReLU or clipped ReLU layer.

When generating code for a network using this layer, these limitations apply:

  • Both input layers must have the same output layer format. For example, both layers must have conv output format or fc output format.

Yes

depthConcatenationLayer

HW Inherit from input.

A depth concatenation layer takes inputs that have the same height and width and concatenates them along the third dimension (the channel dimension).

When generating code for a network using this layer, these limitations apply:

  • The input activation feature number must be a multiple of the square root of the ConvThreadNumber.

  • No layer may drive more than one input to any depth concatenation layer.

  • Layers that have a conv output format and layers that have an FC output format cannot be concatenated together.

Yes

multiplicationLayer

HW Inherit from input A multiplication layer multiplies inputs from multiple neural network layers element-wise. No

Output Layer

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), a description with limitations, and INT8 compatibility.

softmaxLayer

SW and HW

A softmax layer applies a softmax function to the input.

If the softmax layer is implemented in hardware:

  • The inputs must be in the range -87 to 88.

  • A softmax layer followed by an adder layer or depth concatenation layer is not supported.

  • The inputs to this layer must have the format 1-by-N, N-by-1, 1-by-1-by-N, N-by-1-by-1, or 1-by-N-by-1.

  • If the convolution module of the deep learning processor is enabled, the square root of the convolution thread number must be an integral power of two. If it is not, the layer is implemented in software.

Yes. Runs as single data type in SW.

classificationLayer

SW

A classification layer computes the cross-entropy loss for multiclass classification problems that have mutually exclusive classes.

Yes

regressionLayer

SW

A regression layer computes the half mean squared error loss for regression problems.

Yes

sigmoidLayer

HW

A sigmoid layer applies a sigmoid function to the input.

When generating code for a network using this layer, these limitations apply:

  • The inputs must be in the range -87 to 88.

  • A sigmoid layer followed by an adder layer or depth concatenation layer is not supported.

  • The inputs to this layer must have the format 1-by-N, N-by-1, 1-by-1-by-N, N-by-1-by-1, or 1-by-N-by-1.

Yes

Keras and ONNX Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), the layer output format, a description with limitations, and INT8 compatibility.
nnet.keras.layer.FlattenCStyleLayer HW Layer will be fused.

Flattens activations into 1-D, assuming C-style (row-major) order.

A nnet.keras.layer.FlattenCStyleLayer is supported only when it is followed by a fully connected layer.

Yes

nnet.keras.layer.ZeroPadding2dLayer HW Layer will be fused.

Zero padding layer for 2-D input.

A nnet.keras.layer.ZeroPadding2dLayer is supported only when it is followed by a convolution layer or a maxpool layer.

Yes

Custom Layers

Each entry lists the layer, whether it runs in hardware (HW) or software (SW), the layer output format, a description with limitations, and INT8 compatibility.
Custom Layers

HW Inherit from input.

Custom layers, with or without learnable parameters, that you define for your problem. To learn how to define your custom deep learning layers, see Create Deep Learning Processor Configuration for Custom Layers.

No

Supported Boards

These boards are supported by Deep Learning HDL Toolbox:

  • Xilinx Zynq®-7000 ZC706

  • Intel Arria®10 SoC

  • Xilinx Zynq UltraScale+™ MPSoC ZCU102

Third-Party Synthesis Tools and Version Support

Deep Learning HDL Toolbox has been tested with:

  • Xilinx Vivado Design Suite 2020.1

  • Intel Quartus Prime 18.1

Image Input Layer Normalization Hardware Implementation

To enable hardware implementation of the normalization functions for the image input layer, set the HardwareNormalization argument of the compile method to auto or on. When HardwareNormalization is set to auto, the compile method looks for the presence of addition and multiplication layers to implement the normalization function on hardware. The normalization is implemented on hardware by:

  • Creating a new constant layer. This layer holds the value to be subtracted.

  • Using existing addition and multiplication layers. The layers used depend on the normalization function being implemented. A compile call that requests hardware normalization is sketched after this list.
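For instance, assuming a dlhdl.Workflow object hW like the one in the earlier deployment sketch:

    % Request hardware normalization; with 'auto', compile implements the
    % normalization on hardware when suitable layers can be created and
    % falls back to software normalization otherwise
    dn = hW.compile('HardwareNormalization', 'auto');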

Constant Layer Buffer Content

This list describes the values stored in the constant layer buffer for each normalization function.

  • zerocenter: 1 constant. The constant layer buffer value is -Mean.

  • zscore: 2 constants. The first constant value is -Mean. The second constant value is 1/StandardDeviation.
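As a numeric sketch with assumed example statistics, the stored constants implement zscore normalization as an addition followed by a multiplication:

    % zscore normalization: x_norm = (x + (-Mean)) * (1/StandardDeviation)
    Mean = 118.5;               % assumed example channel mean
    StandardDeviation = 61.2;   % assumed example channel standard deviation
    x = 200;                    % one input pixel value
    x_norm = (x + (-Mean)) * (1/StandardDeviation)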
