
Quantization of Deep Neural Networks

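In digital hardware, numbers are stored in binary words. A binary word is a fixed-length sequence of bits (1's and 0's). The data type defines how hardware components or software functions interpret this sequence of 1's and 0's. Numbers are represented as either scaled integer (usually referred to as fixed-point) or floating-point data types.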
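As a rough illustration of how the data type determines the value of a binary word, the following Python sketch interprets the same bit patterns in several ways. The helper names `as_signed` and `as_scaled` and the choice of 3 fractional bits are assumptions made for this example; they are not part of any MathWorks API.

```python
import struct

def as_signed(word, bits=8):
    """Interpret an unsigned bit pattern as a two's complement integer."""
    return word - (1 << bits) if word & (1 << (bits - 1)) else word

def as_scaled(word, frac_bits, bits=8):
    """Interpret a bit pattern as a scaled integer: integer * 2^-frac_bits."""
    return as_signed(word, bits) * 2.0 ** -frac_bits

# The same 8-bit word read as a plain integer and as a scaled integer
# with 3 fractional bits (hypothetical scaling chosen for this sketch).
word = 0b00010001
print(as_signed(word))      # 17
print(as_scaled(word, 3))   # 2.125

# A word with the top bit set is negative in two's complement.
print(as_signed(0b11110000))     # -16
print(as_scaled(0b11110000, 3))  # -2.0

# The same 32-bit pattern read as an integer and as a single-precision float.
pattern = 0x40100000
as_int = pattern
as_float = struct.unpack("<f", struct.pack("<I", pattern))[0]
print(as_int)    # 1074790400
print(as_float)  # 2.25
```

The bits never change; only the interpretation rule does, which is why the choice of data type matters for both range and precision.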

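Most pretrained neural networks and neural networks trained using Deep Learning Toolbox use single-precision floating point data types. Even small trained neural networks require a considerable amount of memory, and require hardware that can perform floating-point arithmetic. These restrictions can inhibit deployment of deep learning capabilities to low-power microcontrollers and FPGAs.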

Using the Deep Learning Toolbox Model Quantization Library support package, you can quantize a network to use 8-bit scaled integer data types.

To learn about the products required to quantize and deploy the deep learning network to a GPU, FPGA, or CPU environment, see Quantization Workflow Prerequisites.

Precision and Range

Scaled 8-bit integer data types have limited precision and range when compared to single-precision floating point data types. There are several numerical considerations when casting a number from a larger floating-point data type to a smaller data type of fixed length.

  • Precision loss: Precision loss is a rounding error. When precision loss occurs, the value is rounded to the nearest number that is representable by the data type. In the case of a tie it rounds:

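    • Positive numbers to the closest representable value in the direction of positive infinity.

    • Negative numbers to the closest representable value in the direction of negative infinity.

    In MATLAB®, you can perform this type of rounding using the round function.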

  • Underflow: Underflow is a type of precision loss. Underflows occur when the value is smaller than the smallest value representable by the data type. When this occurs, the value saturates to zero.

  • Overflow: When a value is larger than the largest value that a data type can represent, an overflow occurs. When an overflow occurs, the value saturates to the largest value representable by the data type.
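The three effects above can be sketched in a few lines of Python. This is a minimal illustration, not the toolbox's implementation: the function name `quantize`, the signed 8-bit word length, and the 3 fractional bits are assumptions chosen for the example. Ties are rounded away from zero, as described above, and out-of-range values saturate.

```python
import math

def quantize(value, word_length=8, frac_bits=3):
    """Quantize to a signed scaled integer and return the represented value.

    Rounds to nearest with ties away from zero, and saturates on overflow.
    """
    scale = 2.0 ** -frac_bits                 # value of the least significant bit
    q_min = -(1 << (word_length - 1))         # most negative stored integer
    q_max = (1 << (word_length - 1)) - 1      # most positive stored integer

    scaled = value / scale
    # Python's built-in round() rounds ties to even, so round the magnitude
    # explicitly and restore the sign to get ties away from zero.
    q = int(math.copysign(math.floor(abs(scaled) + 0.5), scaled))

    q = max(q_min, min(q_max, q))             # saturate on overflow
    return q * scale

print(quantize(0.0625))    # tie, rounds up:        0.125
print(quantize(-0.0625))   # tie, rounds down:     -0.125
print(quantize(0.03125))   # underflow:             0.0
print(quantize(100.0))     # overflow, saturates:   15.875
print(quantize(-100.0))    # overflow, saturates:  -16.0
```

With 3 fractional bits the step size is 0.125, so any magnitude below half a step (0.0625) rounds to zero, and the largest representable value is 127 × 0.125 = 15.875.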

Histograms of Dynamic Ranges

Use the Deep Network Quantizer app to collect and visualize the dynamic ranges of the weights and biases of the convolution layers and fully connected layers of a network, and the activations of all layers in the network. The app assigns a scaled 8-bit integer data type for the weights, biases, and activations of the convolution layers of the network. The app displays a histogram of the dynamic range for each of these parameters. The following steps describe how these histograms are produced.

  1. Consider the following values logged for a parameter while exercising a network.

    Schematic representation of values logged for a parameter.

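  2. Find the ideal binary representation of each logged value of the parameter.

    The most significant bit (MSB) is the left-most bit of the binary word. This bit contributes most to the value of the number. The MSB for each value is highlighted in yellow.

    Table showing the ideal binary representation of each logged value, with the most significant bit highlighted in yellow.

  3. By aligning the binary words, you can see the distribution of bits used by the logged values of a parameter. The sum of MSB's in each column, highlighted in green, gives an aggregate view of the logged values.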

    Sum of MSB's in each column shown at the bottom of the table and highlighted in green.

  4. The MSB counts of each bit location are displayed as a heat map. In this heat map, darker blue regions correspond to a larger number of MSB's in the bit location.

    MSB counts shown as a heat map with darker regions corresponding to a larger number of MSB's in the bit location.

  5. The Deep Network Quantizer app assigns a data type that can avoid overflow, cover the range, and allow underflow. An additional sign bit is required to represent the signedness of the value.

    The figure below shows an example of a data type that represents bits from 2^3 to 2^-3, including the sign bit.

    Table of binary representations of the original values, with the region from 2^3 to 2^-3 and the sign bit column highlighted by a bounding box.

  6. After assigning the data type, any bits outside of that data type are removed. Due to the assignment of a smaller data type of fixed length, precision loss, overflow, and underflow can occur for values that are not representable by the data type.

    Table of binary representations of values, with non-representable bits grayed out. A table on the right displays the 8-bit binary representations and quantized values.

    In this example, the value 0.03125 suffers from an underflow, so the quantized value is 0. The value 2.1 suffers some precision loss, so the quantized value is 2.125. The value 16.250 is larger than the largest representable value of the data type, so this value overflows and the quantized value saturates to 15.875.

    The same table, with representative cases of underflow, precision loss, and overflow highlighted in the right table.

  7. The Deep Network Quantizer app displays this heat map histogram for each learnable parameter in the convolution layers and fully connected layers of the network. The gray regions of the histogram show the bits that cannot be represented by the data type.

    Schematic representation of the heat map histograms displayed by the Deep Network Quantizer app.
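The MSB counting in steps 2 through 4 can be approximated with a short Python sketch. It is only an illustration of the idea, not the app's algorithm: the helper names `msb_position` and `msb_histogram` are hypothetical, and the logged values are made up for the example apart from 0.03125, 2.1, and 16.25, which appear in the text above. The bit range 2^3 to 2^-3 is taken from the example data type in step 5.

```python
import math
from collections import Counter

def msb_position(value):
    """Return p such that 2**p is the most significant bit of |value|.

    Returns None for zero, which has no set bits.
    """
    if value == 0:
        return None
    # frexp gives |value| = m * 2**e with 0.5 <= m < 1, so the MSB is 2**(e - 1).
    _, exponent = math.frexp(abs(value))
    return exponent - 1

def msb_histogram(values):
    """Count how many logged values have their MSB at each bit position."""
    return Counter(p for p in map(msb_position, values) if p is not None)

# Hypothetical logged values for one parameter.
logged = [0.03125, 2.1, 16.25, -1.5, 0.75, 3.0]
hist = msb_histogram(logged)

# Text rendering of the histogram, from the highest bit position down.
for p in range(max(hist), min(hist) - 1, -1):
    print(f"2^{p:>2}: {'#' * hist[p]}")

# Compare against a data type covering bits 2^3 down to 2^-3.
hi_bit, lo_bit = 3, -3
above = sum(c for p, c in hist.items() if p > hi_bit)  # MSB above range: overflow
below = sum(c for p, c in hist.items() if p < lo_bit)  # MSB below range: bits lost
print(above, below)  # 1 1
```

Values counted in `above` correspond to the overflow case in step 6, and values counted in `below` have their leading bit in the region the app shows in gray, so they lose precision or underflow to zero.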

See Also

Apps

Functions