Deep Learning Prediction by Using NVIDIA TensorRT

This example shows code generation for a deep learning application by using the NVIDIA TensorRT™ library. It uses the codegen command to generate a MEX file that performs prediction with a ResNet-50 image classification network by using TensorRT. A second example uses the codegen command to generate a MEX file that performs 8-bit integer prediction by using TensorRT for a logo classification network.

Third-Party Prerequisites

Required

This example generates CUDA® MEX and has the following third-party requirements.

  • CUDA-enabled NVIDIA® GPU and compatible driver.

Optional

For non-MEX builds such as static libraries, dynamic libraries, or executables, this example has the following additional requirements.

Verify GPU Environment

Use the coder.checkGpuInstall (GPU Coder) function to verify that the compilers and libraries necessary for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'tensorrt';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

The resnet_predict Entry-Point Function

This example uses the DAG network ResNet-50 to show image classification by using TensorRT. A pretrained ResNet-50 model for MATLAB® is available in the ResNet-50 support package of Deep Learning Toolbox. To download and install the support package, use the Add-On Explorer.

The resnet_predict.m function loads the ResNet-50 network into a persistent network object and reuses the persistent object on subsequent prediction calls.

type('resnet_predict.m')
% Copyright 2020 The MathWorks, Inc.

function out = resnet_predict(in)
%#codegen

% A persistent object mynet is used to load the series network object. At
% the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is
% reused to call predict on inputs, avoiding reconstructing and reloading
% the network object.

persistent mynet;

if isempty(mynet)
    % Call the function resnet50 that returns a DAG network
    % for ResNet-50 model.
    mynet = coder.loadDeepLearningNetwork('resnet50','resnet');
end

% pass in input
out = mynet.predict(in);

Run MEX Code Generation

To generate CUDA code for the resnet_predict entry-point function, create a GPU code configuration object for a MEX target and set the target language to C++. Use the coder.DeepLearningConfig (GPU Coder) function to create a TensorRT deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. Run the codegen command, specifying an input size of [224,224,3]. This value corresponds to the input layer size of the ResNet-50 network.

cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
codegen -config cfg resnet_predict -args {ones(224,224,3)} -report
Code generation successful: View report
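Because the -args option fixes the input to a 224-by-224-by-3 array of class double, any input passed to the generated MEX function should have the same size and class. The following minimal sketch shows one way to check an input before calling the MEX function. The random image and the variable names sizeMatches and classMatches are placeholders for illustration and are not part of the original example.

```matlab
% Example input as passed to codegen through -args: fixed size, class double.
exampleInput = ones(224,224,3);

% Stand-in for a loaded and resized RGB image (uint8, as imread typically returns).
im = uint8(255 * rand(224,224,3));

% Convert to the class used when the MEX function was generated.
in = double(im);

% Hypothetical guard: compare size and class against the codegen example input.
sizeMatches  = isequal(size(in), size(exampleInput));
classMatches = isa(in, class(exampleInput));

if sizeMatches && classMatches
    disp('Input matches the size and class given to codegen.');
else
    error('Input must be %s of size %s.', ...
        class(exampleInput), mat2str(size(exampleInput)));
end
```

A check of this kind is optional. It only makes a size or class mismatch easier to diagnose.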

Perform Prediction on Test Image

im = imread('peppers.png');
im = imresize(im, [224,224]);
predict_scores = resnet_predict_mex(double(im));

%% get top 5 probability scores and their labels
[val,indx] = sort(predict_scores,'descend');
scores = val(1:5)*100;
net = resnet50;
classnames = net.Layers(end).ClassNames;
labels = classnames(indx(1:5));
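The top-5 selection in the code above does not depend on the network and can be tried on its own. The following sketch repeats the same sort-and-index steps on a small synthetic score vector. The ten-class score vector and the class names (class01, class02, and so on) are made up for illustration. A real run uses the 1000 scores returned by the MEX function and the class names stored in the network.

```matlab
% Synthetic stand-in for predict_scores: ten distinct values that sum to 1.
numClasses = 10;
raw = mod((1:numClasses) * 7, 11);        % distinct integers between 1 and 10
predict_scores = raw / sum(raw);

% Hypothetical class names; the real ones come from net.Layers(end).ClassNames.
classnames = arrayfun(@(k) sprintf('class%02d', k), 1:numClasses, ...
    'UniformOutput', false);

% Same steps as in the example: sort descending, keep the first five.
[val, indx] = sort(predict_scores, 'descend');
scores = val(1:5) * 100;                  % convert to percentages
labels = classnames(indx(1:5));

% Print each label with its score.
for k = 1:5
    fprintf('%-8s %6.2f%%\n', labels{k}, scores(k));
end
```

The second output of sort holds the original positions of the sorted values, which is what maps each score back to its class name.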

Clear the static network object that was loaded in memory.

clear mex;

Generate TensorRT Code for INT8 Prediction

Generate TensorRT code that runs inference in int8 precision. Use a pretrained logo classification network to classify logos in images. Download the pretrained LogoNet network and save it as a logonet.mat file. The network was developed in MATLAB. This network can recognize 32 logos under various lighting conditions and camera angles. The network is pretrained in single-precision floating-point format.

net = getLogonet();

Code generation by using the NVIDIA TensorRT Library with inference computation in 8-bit integer precision supports these additional networks:

  • Object detector networks such as YOLOv2 and SSD.

  • Regression and semantic segmentation networks.

TensorRT requires a calibration data set to calibrate a network that is trained in floating-point to compute inference in 8-bit integer precision. Set the data type to int8 and the path to the calibration data set by using the DeepLearningConfig property. logos_dataset is a subfolder containing images grouped by their corresponding classification labels. For int8 support, GPU compute capability must be 6.1 or higher.

Note: For semantic segmentation networks, the calibration data images must be of a format supported by the imread function.
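To give some intuition for why representative calibration data matters, the following sketch quantizes a few single-precision values to int8 by using a simple symmetric max-abs scale. This is only an illustration of the general idea. It is not the calibration algorithm that TensorRT uses, and the sample values are made up.

```matlab
% Stand-in values in single precision, such as a layer might produce
% for a calibration image.
x = single([-2.5 -0.75 0 0.4 1.9 3.1]);

% Symmetric scale chosen from the largest magnitude observed.
scale = max(abs(x)) / 127;

% Quantize: divide by the scale, round, and saturate to [-127, 127].
q = int8(max(min(round(x / scale), 127), -127));

% Dequantize to see the approximation error introduced by 8-bit storage.
xHat = single(q) * scale;
maxErr = max(abs(x - xHat));

fprintf('scale = %.5f, max abs error = %.5f\n', scale, maxErr);
```

In this scheme the scale depends on the range of values observed, so data that does not resemble the real inputs can give a scale that either clips large values or wastes resolution on a range that is never used.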

unzip('logos_dataset.zip');
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.GpuConfig.ComputeCapability = '6.1';
cfg.DeepLearningConfig = coder.DeepLearningConfig('tensorrt');
cfg.DeepLearningConfig.DataType = 'int8';
cfg.DeepLearningConfig.DataPath = 'logos_dataset';
cfg.DeepLearningConfig.NumCalibrationBatches = 50;
codegen -config cfg logonet_predict -args {ones(227,227,3,'int8')} -report
Code generation successful: View report
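If calibration does not behave as expected, it can help to confirm that the calibration folder has one subfolder per class. The following sketch lists the class subfolders and counts the files in each. It assumes the folder name logos_dataset from this example and reports a message if the folder is not present. The variable names are illustrative only.

```matlab
% Folder name taken from the example; adjust if the data is stored elsewhere.
dataPath = 'logos_dataset';

if exist(dataPath, 'dir') ~= 7
    fprintf('Folder "%s" was not found.\n', dataPath);
    classFolders = {};
else
    entries = dir(dataPath);
    % Keep only subfolders, dropping the '.' and '..' entries.
    isClass = [entries.isdir] & ~ismember({entries.name}, {'.', '..'});
    classFolders = {entries(isClass).name};
    for k = 1:numel(classFolders)
        files = dir(fullfile(dataPath, classFolders{k}));
        nFiles = sum(~[files.isdir]);
        fprintf('%-20s %d files\n', classFolders{k}, nFiles);
    end
end
```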

Run INT8 Prediction on Test Image

im = imread('gpucoder_tensorrt_test.png');
im = imresize(im, [227,227]);
predict_scores = logonet_predict_mex(int8(im));

%% get top 5 probability scores and their labels
[val,indx] = sort(predict_scores,'descend');
scores = val(1:5)*100;
classnames = net.Layers(end).ClassNames;
labels = classnames(indx(1:5));

Clear the static network object that was loaded in memory.

clear mex;