
Lane Detection Optimized with GPU Coder

This example shows how to generate CUDA® code from a deep learning network represented by a SeriesNetwork object. In this example, the series network is a convolutional neural network that detects and outputs lane marker boundaries from an image.

Prerequisites

  • CUDA® enabled NVIDIA® GPU.

  • NVIDIA CUDA toolkit and driver.

  • NVIDIA cuDNN library.

  • OpenCV libraries for video read and image display operations.

  • Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-Party Hardware. For setting up the environment variables, see Setting Up the Prerequisite Products.

Verify GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

Get Pretrained SeriesNetwork

[laneNet, coeffMeans, coeffStds] = getLaneDetectionNetworkGPU();

This network takes an image as an input and outputs two lane boundaries that correspond to the left and right lanes of the ego vehicle. Each lane boundary is represented by the parabolic equation y = ax^2 + bx + c, where y is the lateral offset and x is the longitudinal distance from the vehicle. The network outputs the three parameters a, b, and c per lane. The network architecture is similar to AlexNet except that the last few layers are replaced by a smaller fully connected layer and a regression output layer. To view the network architecture, use the analyzeNetwork function.

analyzeNetwork(laneNet)
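For intuition about the boundary model, the short sketch below evaluates y = ax^2 + bx + c over the same longitudinal range that detect_lane.m uses. This is an illustration only; the coefficient values are made up and do not come from the network.

% Hypothetical lane-boundary coefficients [a b c]; real values come from the network output.
boundary = [-0.002 0.01 1.8];        % example values only
xWorld = 3:30;                       % longitudinal distance ahead of the sensor, in meters
yWorld = polyval(boundary, xWorld);  % lateral offset y = a*x^2 + b*x + c
plot(xWorld, yWorld), xlabel('x (m)'), ylabel('y (m)')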

Examine the Main Entry-Point Function

type detect_lane.m
function [laneFound, ltPts, rtPts] = detect_lane(frame, laneCoeffMeans, laneCoeffStds)
% From the networks output, compute left and right lane points in the image
% coordinates. The camera coordinates are described by the caltech mono
% camera model.

%#codegen

% A persistent object mynet is used to load the series network object. At
% the first call to this function, the persistent object is constructed and
% set up. When the function is called subsequent times, the same object is
% reused to call predict on inputs, thus avoiding reconstructing and
% reloading the network object.
persistent lanenet;

if isempty(lanenet)
    lanenet = coder.loadDeepLearningNetwork('lanenet.mat', 'lanenet');
end

lanecoeffsNetworkOutput = lanenet.predict(permute(frame, [2 1 3]));

% Recover original coeffs by reversing the normalization steps
params = lanecoeffsNetworkOutput .* laneCoeffStds + laneCoeffMeans;

isRightLaneFound = abs(params(6)) > 0.5; %c should be more than 0.5 for it to be a right lane
isLeftLaneFound = abs(params(3)) > 0.5;

vehicleXPoints = 3:30; %meters, ahead of the sensor
ltPts = coder.nullcopy(zeros(28,2,'single'));
rtPts = coder.nullcopy(zeros(28,2,'single'));

if isRightLaneFound && isLeftLaneFound
    rtBoundary = params(4:6);
    rt_y = computeBoundaryModel(rtBoundary, vehicleXPoints);
    ltBoundary = params(1:3);
    lt_y = computeBoundaryModel(ltBoundary, vehicleXPoints);

    % Visualize lane boundaries of the ego vehicle
    tform = get_tformToImage;
    % map vehicle to image coordinates
    ltPts = tform.transformPointsInverse([vehicleXPoints', lt_y']);
    rtPts = tform.transformPointsInverse([vehicleXPoints', rt_y']);
    laneFound = true;
else
    laneFound = false;
end

end

function yWorld = computeBoundaryModel(model, xWorld)
yWorld = polyval(model, xWorld);
end

function tform = get_tformToImage
% Compute extrinsics based on camera setup
yaw = 0;
pitch = 14; % pitch of the camera in degrees
roll = 0;

translation = translationVector(yaw, pitch, roll);
rotation = rotationMatrix(yaw, pitch, roll);

% Construct a camera matrix
focalLength = [309.4362, 344.2161];
principalPoint = [318.9034, 257.5352];
Skew = 0;

camMatrix = [rotation; translation] * intrinsicMatrix(focalLength, ...
    Skew, principalPoint);

% Turn camMatrix into 2-D homography
tform2D = [camMatrix(1,:); camMatrix(2,:); camMatrix(4,:)]; % drop Z

tform = projective2d(tform2D);
tform = tform.invert();
end

function translation = translationVector(yaw, pitch, roll)
SensorLocation = [0 0];
Height = 2.1798; % mounting height in meters from the ground
rotationMatrix = (...
    rotZ(yaw)*... % last rotation
    rotX(90-pitch)*...
    rotZ(roll)... % first rotation
    );

% Adjust for the SensorLocation by adding a translation
sl = SensorLocation;

translationInWorldUnits = [sl(2), sl(1), Height];
translation = translationInWorldUnits*rotationMatrix;
end

%------------------------------------------------------------------
% Rotation around X-axis
function R = rotX(a)
a = deg2rad(a);
R = [...
    1  0       0;
    0  cos(a) -sin(a);
    0  sin(a)  cos(a)];
end

%------------------------------------------------------------------
% Rotation around Y-axis
function R = rotY(a)
a = deg2rad(a);
R = [...
    cos(a)  0 sin(a);
    0       1 0;
   -sin(a)  0 cos(a)];
end

%------------------------------------------------------------------
% Rotation around Z-axis
function R = rotZ(a)
a = deg2rad(a);
R = [...
    cos(a) -sin(a) 0;
    sin(a)  cos(a) 0;
    0       0      1];
end

%------------------------------------------------------------------
% Given the Yaw, Pitch, and Roll, determine the appropriate Euler angles
% and the sequence in which they are applied to align the camera's
% coordinate system with the vehicle coordinate system. The resulting
% matrix is a Rotation matrix that together with the Translation vector
% defines the extrinsic parameters of the camera.
function rotation = rotationMatrix(yaw, pitch, roll)
rotation = (...
    rotY(180)*...      % last rotation: point Z up
    rotZ(-90)*...      % X-Y swap
    rotZ(yaw)*...      % point the camera forward
    rotX(90-pitch)*... % "un-pitch"
    rotZ(roll)...      % 1st rotation: "un-roll"
    );
end

function intrinsicMat = intrinsicMatrix(FocalLength, Skew, PrincipalPoint)
intrinsicMat = ...
    [FocalLength(1)  , 0                , 0; ...
     Skew            , FocalLength(2)   , 0; ...
     PrincipalPoint(1), PrincipalPoint(2), 1];
end

Generate Code for Network and Post-Processing Code

The network computes parameters a, b, and c that describe the parabolic equation for the left and right lane boundaries.

From these parameters, compute the x and y coordinates corresponding to the lane positions. The coordinates must be mapped to image coordinates. The function detect_lane.m performs all these computations. Generate CUDA code for this function by creating a GPU code configuration object for a 'lib' target and setting the target language to C++. Use the coder.DeepLearningConfig function to create a CuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. Run the codegen command.

cfg = coder.gpuConfig('lib');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.GenerateReport = true;
cfg.TargetLang = 'C++';
inputs = {ones(227,227,3,'single'), ones(1,6,'double'), ones(1,6,'double')};
codegen -args inputs -config cfg detect_lane
Code generation successful: View report

Generated Code Description

The series network is generated as a C++ class containing an array of 23 layer classes.

class c_lanenet {
 public:
  int32_T batchSize;
  int32_T numLayers;
  real32_T *inputData;
  real32_T *outputData;
  MWCNNLayer *layers[23];

 public:
  c_lanenet(void);
  void setup(void);
  void predict(void);
  void cleanup(void);
  ~c_lanenet(void);
};

The setup() method of the class sets up handles and allocates memory for each layer object. The predict() method invokes prediction for each of the 23 layers in the network.

The cnn_lanenet_conv*_w and cnn_lanenet_conv*_b files are the binary weights and bias files for the convolution layers in the network. The cnn_lanenet_fc*_w and cnn_lanenet_fc*_b files are the binary weights and bias files for the fully connected layers in the network.

codegendir = fullfile('codegen', 'lib', 'detect_lane');
dir(codegendir)
. MWMaxPoolingLayer.o .. MWNormLayer.cpp .gitignore MWNormLayer.hpp DeepLearningNetwork.cu MWNormLayer.o DeepLearningNetwork.h MWOutputLayer.cpp DeepLearningNetwork.o MWOutputLayer.hpp MWActivationFunctionType.hpp MWOutputLayer.o MWCNNLayer.cpp MWRNNParameterTypes.hpp MWCNNLayer.hpp MWReLULayer.cpp MWCNNLayer.o MWReLULayer.hpp MWCNNLayerImplBase.hpp MWReLULayer.o MWCUSOLVERUtils.cpp MWTargetNetworkImplBase.hpp MWCUSOLVERUtils.hpp MWTargetTypes.hpp MWCUSOLVERUtils.o MWTensor.hpp MWCudaDimUtility.hpp MWTensorBase.cpp MWCudaMemoryFunctions.hpp MWTensorBase.hpp MWCudnnCNNLayerImpl.cu MWTensorBase.o MWCudnnCNNLayerImpl.hpp _clang-format MWCudnnCNNLayerImpl.o buildInfo.mat MWCudnnCommonHeaders.hpp cnn_lanenet0_0_conv1_b.bin MWCudnnCustomLayerBase.cu cnn_lanenet0_0_conv1_w.bin MWCudnnCustomLayerBase.hpp cnn_lanenet0_0_conv2_b.bin MWCudnnCustomLayerBase.o cnn_lanenet0_0_conv2_w.bin MWCudnnElementwiseAffineLayerImpl.cu cnn_lanenet0_0_conv3_b.bin MWCudnnElementwiseAffineLayerImpl.hpp cnn_lanenet0_0_conv3_w.bin MWCudnnElementwiseAffineLayerImpl.o cnn_lanenet0_0_conv4_b.bin MWCudnnFCLayerImpl.cu cnn_lanenet0_0_conv4_w.bin MWCudnnFCLayerImpl.hpp cnn_lanenet0_0_conv5_b.bin MWCudnnFCLayerImpl.o cnn_lanenet0_0_conv5_w.bin MWCudnnFusedConvActivationLayerImpl.cu cnn_lanenet0_0_data_offset.bin MWCudnnFusedConvActivationLayerImpl.hpp cnn_lanenet0_0_data_scale.bin MWCudnnFusedConvActivationLayerImpl.o cnn_lanenet0_0_fc6_b.bin MWCudnnInputLayerImpl.hpp cnn_lanenet0_0_fc6_w.bin MWCudnnLayerImplFactory.cu cnn_lanenet0_0_fcLane1_b.bin MWCudnnLayerImplFactory.hpp cnn_lanenet0_0_fcLane1_w.bin MWCudnnLayerImplFactory.o cnn_lanenet0_0_fcLane2_b.bin MWCudnnMaxPoolingLayerImpl.cu cnn_lanenet0_0_fcLane2_w.bin MWCudnnMaxPoolingLayerImpl.hpp cnn_lanenet0_0_responseNames.txt MWCudnnMaxPoolingLayerImpl.o codeInfo.mat MWCudnnNormLayerImpl.cu codedescriptor.dmr MWCudnnNormLayerImpl.hpp compileInfo.mat MWCudnnNormLayerImpl.o detect_lane.a MWCudnnOutputLayerImpl.cu detect_lane.cu MWCudnnOutputLayerImpl.hpp detect_lane.h MWCudnnOutputLayerImpl.o detect_lane.o MWCudnnReLULayerImpl.cu detect_lane_data.cu MWCudnnReLULayerImpl.hpp detect_lane_data.h MWCudnnReLULayerImpl.o detect_lane_data.o MWCudnnTargetNetworkImpl.cu detect_lane_initialize.cu MWCudnnTargetNetworkImpl.hpp detect_lane_initialize.h MWCudnnTargetNetworkImpl.o detect_lane_initialize.o MWElementwiseAffineLayer.cpp detect_lane_internal_types.h MWElementwiseAffineLayer.hpp detect_lane_rtw.mk MWElementwiseAffineLayer.o detect_lane_terminate.cu MWElementwiseAffineLayerImplKernel.cu detect_lane_terminate.h MWElementwiseAffineLayerImplKernel.o detect_lane_terminate.o MWFCLayer.cpp detect_lane_types.h MWFCLayer.hpp examples MWFCLayer.o gpu_codegen_info.mat MWFusedConvActivationLayer.cpp html MWFusedConvActivationLayer.hpp interface MWFusedConvActivationLayer.o networkParamsInfo_lanenet0_0.bin MWInputLayer.cpp predict.cu MWInputLayer.hpp predict.h MWInputLayer.o predict.o MWKernelHeaders.hpp rtw_proj.tmw MWLayerImplFactory.hpp rtwtypes.h MWMaxPoolingLayer.cpp shared_layers_export_macros.hpp MWMaxPoolingLayer.hpp
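If you want to look inside one of the weight files listed above, a minimal sketch such as the following can read it back. It assumes the file stores raw 32-bit floating-point values, which is an assumption about the generated code rather than documented behavior.

% Sketch: inspect a generated weight file (assumes raw single-precision storage).
wfile = fullfile('codegen', 'lib', 'detect_lane', 'cnn_lanenet0_0_conv1_w.bin');
fid = fopen(wfile, 'r');
w = fread(fid, inf, 'single');
fclose(fid);
fprintf('conv1 weight file contains %d values\n', numel(w));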

Generate Additional Files for Post-Processing the Output

Export the mean and std values from the trained network for use during execution.

codegendir = fullfile(pwd, 'codegen', 'lib', 'detect_lane');
fid = fopen(fullfile(codegendir, 'mean.bin'), 'w');
A = [coeffMeans coeffStds];
fwrite(fid, A, 'double');
fclose(fid);
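As an optional sanity check (a sketch, not part of the example workflow), you can read the file back and confirm that it contains the 12 double-precision values in the expected order:

% Sketch: verify the contents of mean.bin.
fid = fopen(fullfile(codegendir, 'mean.bin'), 'r');
vals = fread(fid, 12, 'double');
fclose(fid);
assert(isequal(vals', [coeffMeans coeffStds]));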

Main File

Compile the network code by using a main file. The main file uses the OpenCV VideoCapture method to read frames from the input video. Each frame is processed and classified until no more frames are read. Before displaying the output for each frame, the outputs are post-processed by using the detect_lane function generated in detect_lane.cu.

type main_lanenet.cu
/* Copyright 2016 The MathWorks, Inc. */
#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/core/core.hpp>
#include <opencv2/core/types.hpp>
#include <opencv2/highgui.hpp>
#include <list>
#include <cmath>
#include "detect_lane.h"

using namespace cv;

void readData(float *input, Mat orig, Mat & im)
{
    Size size(227, 227);
    resize(orig, im, size, 0, 0, INTER_LINEAR);
    for (int j = 0; j < 227*227; j++)
    {
        // BGR to RGB
        input[2*227*227+j] = (float)(im.data[j*3+0]);
        input[1*227*227+j] = (float)(im.data[j*3+1]);
        input[0*227*227+j] = (float)(im.data[j*3+2]);
    }
}

void addLane(float pts[28][2], Mat & im, int numPts)
{
    std::vector iArray;
    for(int k=0; k> orig;
        if (orig.empty()) break;
        readData(inputBuffer, orig, im);
        writeData(inputBuffer, orig, 6, means, stds);

        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        char strbuf[50];
        float milliseconds = -1.0;
        cudaEventElapsedTime(&milliseconds, start, stop);
        fps = fps*.9+1000.0/milliseconds*.1;
        sprintf(strbuf, "%.2f FPS", fps);
        putText(orig, strbuf, Point(200,30), FONT_HERSHEY_DUPLEX, 1, CV_RGB(0,0,0), 2);

        imshow("Lane detection demo", orig);
        if( waitKey(50)%256 == 27 ) break; // stop capturing by pressing ESC
    }
    destroyWindow("Lane detection demo");
    free(inputBuffer);
    free(outputBuffer);
    return 0;
}
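Before building the standalone executable, you can prototype the same frame loop directly in MATLAB. The sketch below is an illustration only: it assumes the example video has already been downloaded (next section), that lanenet.mat loaded by detect_lane is on the path, that the lane points returned by detect_lane are in the coordinates of the original video frame, and that the Computer Vision Toolbox function insertShape is available for drawing.

% Sketch: MATLAB counterpart of the frame loop in main_lanenet.cu.
vR = VideoReader('caltech_cordova1.avi');
while hasFrame(vR)
    frame = readFrame(vR);
    inFrame = single(imresize(frame, [227 227]));   % network input size
    [laneFound, ltPts, rtPts] = detect_lane(inFrame, coeffMeans, coeffStds);
    if laneFound
        % insertShape draws a polyline given as [x1 y1 x2 y2 ...]
        frame = insertShape(frame, 'Line', reshape(ltPts', 1, []), 'LineWidth', 2);
        frame = insertShape(frame, 'Line', reshape(rtPts', 1, []), 'LineWidth', 2);
    end
    imshow(frame)
    drawnow
end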

Download Example Video

if ~exist('./caltech_cordova1.avi', 'file')
    url = 'https://www.mathworks.com/supportfiles/gpucoder/media/caltech_cordova1.avi';
    websave('caltech_cordova1.avi', url);
end

Build Executable

if ispc
    setenv('MATLAB_ROOT', matlabroot);
    vcvarsall = mex.getCompilerConfigurations('C++').Details.CommandLineShell;
    setenv('VCVARSALL', vcvarsall);
    system('make_win_lane_detection.bat');
    cd(codegendir);
    system('lanenet.exe ..\..\..\caltech_cordova1.avi');
else
    setenv('MATLAB_ROOT', matlabroot);
    system('make -f Makefile_lane_detection.mk');
    cd(codegendir);
    system('./lanenet ../../../caltech_cordova1.avi');
end

Input Screenshot

Output Screenshot

See Also

Functions

Objects

Related Topics