Quantization

Quantize the weights, biases, and activations of layers to reduced precision scaled integer data types

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Quantization Library support package to reduce the memory footprint and computational requirements of a deep neural network by quantizing the weights, biases, and activations of layers to reduced precision scaled integer data types. You can then generate C/C++, CUDA®, or HDL code from these quantized networks.
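
To make "scaled integer data type" concrete, the following is a minimal sketch in base MATLAB of how a handful of single-precision values could be mapped to 8-bit integers with a shared scale factor. It is illustrative only: the weight values are made up, the variable names (`w`, `fracBits`, `scale`, `q`, `wHat`) are hypothetical, and the power-of-two scale is an assumption chosen for simplicity, not a description of what the support package does internally.

```matlab
% Illustrative sketch: symmetric 8-bit scaled-integer quantization.
% Assumes the input contains at least one nonzero value.
w = single([-0.82 0.15 0.40 -0.03 0.67]);   % hypothetical weights (4 bytes each)

% Pick the largest number of fractional bits such that the biggest
% magnitude still fits in the int8 range [-128, 127].
maxAbs   = max(abs(w));
fracBits = floor(log2(127 / maxAbs));
scale    = 2^(-fracBits);                   % shared scale factor

% Quantize: store one int8 per value (1 byte each) plus the single scale.
q = int8(round(w / scale));                 % int8() saturates out-of-range values

% Dequantize to inspect the rounding error introduced by quantization.
wHat   = single(q) * scale;
maxErr = max(abs(w - wHat));                % bounded by scale/2 when nothing saturates
```

In this sketch each stored value shrinks from four bytes to one, at the cost of a rounding error of at most half the scale factor per value.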

Functions

`dlquantizer`	Quantize a deep neural network to 8-bit scaled integer data types
`dlquantizationOptions`	Options for quantizing a trained deep neural network
`calibrate`	Simulate and collect ranges of a deep neural network
`validate`	Quantize and validate a deep neural network

Apps

Deep Network Quantizer

Quantize a deep neural network to 8-bit scaled integer data types

Topics

Deep Learning Quantization

Quantization of Deep Neural Networks

Understand effects of quantization and how to visualize dynamic ranges of network convolution layers.

Quantization Workflow Prerequisites

Products required for the quantization of deep learning networks.

Quantization for GPU Target

Code Generation for Quantized Deep Learning Networks (GPU Coder)

Quantize and generate code for a pretrained convolutional neural network.

Quantize Residual Network Trained for Image Classification and Generate CUDA Code

This example shows how to quantize the learnable parameters in the convolution layers of a deep learning neural network that has residual connections and has been trained for image classification with CIFAR-10 data.

Quantize Object Detectors and Generate CUDA® Code

This example shows how to generate CUDA® code for an SSD vehicle detector and a YOLO v2 vehicle detector that perform inference computations in 8-bit integers.