coder.TensorRTConfig

Parameters to configure deep learning code generation with the NVIDIA TensorRT library

Description

The coder.TensorRTConfig object contains parameters specific to TensorRT, the NVIDIA® high performance deep learning inference optimizer and run-time library. codegen uses these parameters to generate CUDA® code for deep neural networks.

To use a coder.TensorRTConfig object for code generation, assign it to the DeepLearningConfig property of a coder.gpuConfig object that you pass to codegen.

Creation

Create a TensorRT configuration object by using the coder.DeepLearningConfig function with the target library set to 'tensorrt'.

Properties

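The DataType property specifies the precision of the inference computations in supported layers. To perform inference in 32-bit floating point, use 'fp32'. For half-precision, use 'fp16'. For 8-bit integer, use 'int8'. The default value is 'fp32'.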

INT8 precision requires a CUDA GPU with a minimum compute capability of 6.1. Compute capability 6.2 does not support INT8 precision. FP16 precision requires a CUDA GPU with a minimum compute capability of 7.0. Use the ComputeCapability property of the GpuConfig object to set the appropriate compute capability value.

See the Deep Learning Prediction by Using NVIDIA TensorRT example for 8-bit integer prediction for a logo classification network by using TensorRT.
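The precision and compute capability settings described above can be sketched as follows. This is a minimal sketch, assuming a GPU with compute capability 7.0 and half-precision inference; the variable name cfg is only illustrative.

```matlab
% Sketch only: assumes a GPU that supports compute capability 7.0.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';

% FP16 needs compute capability 7.0 or higher, per the text above.
cfg.GpuConfig.ComputeCapability = '7.0';

% Attach a TensorRT configuration and request half-precision inference.
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'fp16';
```

Leaving DataType unset keeps the default of 'fp32'.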

The DataPath property gives the location of the image dataset used during recalibration. The default value is ''. This option is applicable only when DataType is set to 'int8'.

When you select the 'INT8' option, TensorRT™ quantizes the floating-point data to int8. The recalibration is performed with a reduced set of the calibration data. The calibration data must be present in the image data location specified by DataPath.

The NumCalibrationBatches property is a numeric value specifying the number of batches for int8 calibration. The software uses the product batchsize*NumCalibrationBatches to pick a random subset of images from the image dataset to perform calibration. The batchsize*NumCalibrationBatches value must not be greater than the number of images present in the image dataset. This option is applicable only when DataType is set to 'int8'.

According to NVIDIA, about 500 images are sufficient for calibration. Refer to the TensorRT documentation for more information.
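The int8 calibration settings above could be combined as in the following sketch. The folder name 'calibration_images', the batch size of 10, and the dataset size of 600 are hypothetical values chosen for illustration; they do not come from the original page.

```matlab
% Sketch only: 'calibration_images' is a hypothetical folder name, and the
% batch size and image count below are illustrative assumptions.
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.GpuConfig.ComputeCapability = '6.1';   % minimum for INT8, per the text above
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'int8';
cfg.DeepLearningConfig.DataPath = 'calibration_images';
cfg.DeepLearningConfig.NumCalibrationBatches = 50;

% Check the constraint batchsize*NumCalibrationBatches <= number of images.
batchSize = 10;            % assumed batch size
numImagesAvailable = 600;  % assumed size of the calibration dataset
numImagesUsed = batchSize * cfg.DeepLearningConfig.NumCalibrationBatches;
assert(numImagesUsed <= numImagesAvailable, ...
    'Calibration needs %d images but only %d are available.', ...
    numImagesUsed, numImagesAvailable);
```

With these assumed values the calibration subset is 10*50 = 500 images, which is in line with the figure of about 500 images mentioned above.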

The TargetLibrary property is a read-only value that specifies the name of the target library.

Examples

Create an entry-point function resnet_predict that uses the coder.loadDeepLearningNetwork function to load the resnet50 (Deep Learning Toolbox) DAGNetwork object.

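function out = resnet_predict(in)

% Load the network once and reuse it on later calls.
persistent mynet;
if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('resnet50','myresnet');
end

% Run prediction on the input image.
out = predict(mynet,in);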

Create a coder.gpuConfig configuration object for MEX code generation.

cfg = coder.gpuConfig('mex');

Set the target language to C++.

cfg.TargetLang = 'C++';

Create a coder.TensorRTConfig deep learning configuration object. Assign it to the DeepLearningConfig property of the cfg configuration object.

cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');

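Use the -config option of the codegen function to pass the cfg configuration object. The codegen function must determine the size, class, and complexity of MATLAB® function inputs. Use the -args option to specify the size of the input to the entry-point function.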

codegen -args {ones(224,224,3,'single')} -config cfg resnet_predict;

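The codegen command places all the generated files in the codegen folder. The folder contains the CUDA code for the entry-point function, resnet_predict.cu, the header and source files containing the C++ class definitions for the convolutional neural network (CNN), and the weight and bias files.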
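Once code generation succeeds, the generated MEX function can be called like a regular MATLAB function. The following is a minimal sketch, assuming the default MEX name resnet_predict_mex and reusing the all-ones placeholder input from the codegen command rather than a real image.

```matlab
% Sketch only: assumes codegen produced resnet_predict_mex in the current
% folder and that a supported GPU is available.
in = ones(224,224,3,'single');      % placeholder input, same size as -args

% Call the generated MEX function.
scores = resnet_predict_mex(in);

% Index of the highest score (illustrative post-processing).
[~, idx] = max(scores);
```

In practice the input would be a 224-by-224-by-3 single-precision image, preprocessed the same way as the data the network was trained on.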

Version History

Introduced in R2018b