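This example shows code generation for an image segmentation application that uses deep learning. It uses the codegen command to generate a MEX function that performs prediction on a DAG network object for U-Net, a deep learning network for image segmentation.
For a similar example that covers image segmentation with U-Net but without the codegen command, see Semantic Segmentation of Multispectral Images Using Deep Learning (Image Processing Toolbox).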
Required
This example generates CUDA MEX and has the following third-party requirements.
CUDA® enabled NVIDIA® GPU and compatible driver.
Optional
For non-MEX builds such as static libraries, dynamic libraries, or executables, this example has the following additional requirements.
NVIDIA toolkit.
NVIDIA cuDNN library.
Environment variables for the compilers and libraries. For more information, see Third-Party Hardware (GPU Coder) and Setting Up the Prerequisite Products (GPU Coder).
Use the coder.checkGpuInstall (GPU Coder) function to verify that the compilers and libraries necessary for running this example are set up correctly.
envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);
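U-Net [1] is a type of convolutional neural network (CNN) designed for semantic image segmentation. In U-Net, the initial series of convolutional layers is interspersed with max pooling layers, successively decreasing the resolution of the input image. These layers are followed by a series of convolutional layers interspersed with upsampling operators, successively increasing the resolution of the input image. Combining these two paths forms a U-shaped graph. The network was originally trained for, and used to perform prediction on, biomedical image segmentation applications. This example demonstrates the ability of the network to track changes in forest cover over time. Environmental agencies track deforestation to assess and qualify the environmental and ecological health of a region.
Deep-learning-based semantic segmentation can yield a precise measurement of vegetation cover from high-resolution aerial photographs. One challenge is differentiating classes that have similar visual characteristics, such as trying to classify a green pixel as grass, shrubbery, or tree. To increase classification accuracy, some data sets contain multispectral images that provide additional information about each pixel. For example, the Hamlin Beach State Park data set supplements the color images with near-infrared channels that provide a clearer separation of the classes.
This example uses the Hamlin Beach State Park data [2] along with a pretrained U-Net network to classify each pixel.
The U-Net used here is trained to segment pixels belonging to 18 classes, listed below (label 0 marks pixels outside those classes):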
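0. Other Class/Image Border
1. Road Markings
2. Tree
3. Building
4. Vehicle (Car, Truck, or Bus)
5. Person
6. Lifeguard Chair
7. Picnic Table
8. Black Wood Panel
9. White Wood Panel
10. Orange Landing Pad
11. Water Buoy
12. Rocks
13. Other Vegetation
14. Grass
15. Sand
16. Water (Lake)
17. Water (Pond)
18. Asphalt (Parking Lot/Walkway)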
The segmentImageUnet Entry-Point Function
The segmentImageUnet.m entry-point function performs patchwise semantic segmentation on the input image by using the multispectralUnet network found in the multispectralUnet.mat file. The function loads the network object from the multispectralUnet.mat file into a persistent variable mynet and reuses the persistent variable on subsequent prediction calls.
type('segmentImageUnet.m')
%  OUT = segmentImageUnet(IM, PATCHSIZE) returns a semantically segmented
%  image, segmented using the network multispectralUnet. The segmentation
%  is performed over each patch of size PATCHSIZE.
%
% Copyright 2019-2021 The MathWorks, Inc.
function out = segmentImageUnet(im, patchSize)

%#codegen

persistent mynet;

if isempty(mynet)
    mynet = coder.loadDeepLearningNetwork('trainedUnet/multispectralUnet.mat');
end

[height, width, nChannel] = size(im);
patch = coder.nullcopy(zeros([patchSize, nChannel-1]));

% pad image to have dimensions as multiples of patchSize
padSize = zeros(1,2);
padSize(1) = patchSize(1) - mod(height, patchSize(1));
padSize(2) = patchSize(2) - mod(width, patchSize(2));

im_pad = padarray(im, padSize, 0, 'post');
[height_pad, width_pad, ~] = size(im_pad);

out = zeros([size(im_pad,1), size(im_pad,2)], 'uint8');

for i = 1:patchSize(1):height_pad
    for j = 1:patchSize(2):width_pad
        for p = 1:nChannel-1
            patch(:,:,p) = squeeze( im_pad( i:i+patchSize(1)-1,...
                                            j:j+patchSize(2)-1,...
                                            p));
        end

        % pass in input
        segmentedLabels = activations(mynet, patch, 'Segmentation-Layer');

        % Takes the max of each channel (6 total at this point)
        [~,L] = max(segmentedLabels,[],3);
        patch_seg = uint8(L);

        % populate section of output
        out(i:i+patchSize(1)-1, j:j+patchSize(2)-1) = patch_seg;

    end
end

% Remove the padding
out = out(1:height, 1:width);
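The padding arithmetic in segmentImageUnet.m can be checked by hand for the image used later in this example. The following is a minimal sketch in base MATLAB, assuming a 12446-by-7654 image and a patch size of [1024 1024]; the names imageSize, paddedSize, and numPatches are illustrative and do not appear in the entry-point function.

```matlab
% Illustrative check of the padding arithmetic (not part of the shipped example).
imageSize = [12446 7654];   % height and width of the test image
patchSize = [1024 1024];    % patch size passed to the entry-point function

% Same formula as in segmentImageUnet.m, applied to both dimensions at once.
padSize = patchSize - mod(imageSize, patchSize);

% The padded dimensions are exact multiples of the patch size.
paddedSize = imageSize + padSize;
numPatches = paddedSize ./ patchSize;

fprintf('padSize    = [%d %d]\n', padSize);
fprintf('paddedSize = [%d %d]\n', paddedSize);
fprintf('patch grid = %d-by-%d (%d patches)\n', numPatches, prod(numPatches));
```

Note that with this formula a dimension that is already an exact multiple of the patch size still receives one full patch of padding, because mod returns 0 in that case. The padding is cropped at the end of the function, so this appears to cost only extra computation.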
trainedUnet_url = 'https://www.mathworks.com/supportfiles/vision/data/multispectralUnet.mat';
downloadTrainedUnet(trainedUnet_url,pwd);
Downloading Pre-trained U-net for Hamlin Beach dataset...
This will take several minutes to download...
done.
ld = load("trainedUnet/multispectralUnet.mat");
net = ld.net;
The DAG network contains 58 layers, including convolution, max pooling, depth concatenation, and pixel classification output layers. To display an interactive visualization of the deep learning network architecture, use the analyzeNetwork function.
analyzeNetwork(net);
Download the Hamlin Beach State Park data.
if ~exist(fullfile(pwd,'data'),'dir')
    url = 'http://www.cis.rit.edu/~rmk6217/rit18_data.mat';
    downloadHamlinBeachMSIData(url,pwd+"/data/");
end
Downloading Hamlin Beach dataset...
This will take several minutes to download...
done.
Load and examine the data in MATLAB.
load(fullfile(pwd,'data','rit18_data','rit18_data.mat'));
% Examine data
whos test_data
  Name           Size                  Bytes         Class     Attributes
  test_data      7x12446x7654          1333663576    uint16
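The byte count reported by whos follows directly from the array dimensions and the element type, which gives a sense of why the image is later processed in patches. A minimal sketch of that arithmetic, assuming only that uint16 stores 2 bytes per element (variable names are illustrative):

```matlab
% Reproduce the byte count reported by whos from the array size.
dims = [7 12446 7654];          % size reported by whos
bytesPerElement = 2;            % uint16 uses 2 bytes per element
totalBytes = prod(dims) * bytesPerElement;
fprintf('%d bytes (about %.2f GiB)\n', totalBytes, totalBytes / 2^30);
```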
The image has seven channels. The RGB color channels are the fourth, fifth, and sixth image channels. The first three channels correspond to the near-infrared bands and highlight different components of the image based on their heat signatures. Channel 7 is a mask that indicates the valid segmentation region.
The multispectral image data is arranged as numChannels-by-width-by-height arrays. In MATLAB, multichannel images are arranged as width-by-height-by-numChannels arrays. To reshape the data so that the channels are in the third dimension, use the helper function switchChannelsToThirdPlane.
test_data = switchChannelsToThirdPlane(test_data);
% Confirm data has the correct structure (channels last).
whos test_data
  Name           Size                  Bytes         Class     Attributes
  test_data      12446x7654x7          1333663576    uint16
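The source of switchChannelsToThirdPlane is not shown here. The size change above (7x12446x7654 to 12446x7654x7) is consistent with a dimension reordering such as permute(im,[2 3 1]); the sketch below assumes that reordering and demonstrates it on a small array. The variables small and reordered are illustrative only.

```matlab
% Hypothetical stand-in for switchChannelsToThirdPlane, shown on a small array.
small = zeros(7, 5, 4, 'uint16');     % channels first, then two spatial dimensions
small(3, 2, 1) = 42;                  % mark one element to track where it moves

% Move dimension 1 (channels) to the end; the spatial dimensions keep their order.
reordered = permute(small, [2 3 1]);

disp(size(reordered))                 % 5 4 7
disp(reordered(2, 1, 3))              % 42
```

Unlike reshape, permute moves each element together with its indices, so pixel values stay attached to the correct channel.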
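To generate CUDA code for the segmentImageUnet.m entry-point function, create a GPU configuration object for a MEX target and set the target language to C++. Use the coder.DeepLearningConfig (GPU Coder) function to create a CuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. Run the codegen command, specifying an input size of [12446,7654,7] and a patch size of [1024,1024]. The input size corresponds to the entire test_data array. Smaller patch sizes speed up inference. To see how the patches are calculated, see the segmentImageUnet.m entry-point function.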
cfg = coder.gpuConfig('mex');
cfg.TargetLang = 'C++';
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
input = {ones(size(test_data),'uint16'),coder.Constant([1024 1024])};
codegen -config cfg segmentImageUnet -args input -report
Code generation successful: View report
The segmentImageUnet function takes the data to test (test_data) and a vector containing the dimensions of the patch size to use. It takes patches of the image, predicts the pixels in each patch, and then combines all the patches. Because test_data is so large (12446x7654x7), it is easier to process the image in patches.
segmentedImage = segmentImageUnet_mex(test_data,[1024 1024]);
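The patchwise pattern (cut a tile, label it, write it back into the output) can be tried without a GPU or a trained network. The following sketch replaces the network with a hypothetical per-pixel threshold, labelPatch, purely for illustration; it is not the U-Net prediction and the toy image is made up.

```matlab
% Hypothetical stand-in for the network: labels each pixel 1 or 2 by threshold.
labelPatch = @(p) uint8(p > 0.5) + 1;

im = magic(6) / 36;          % 6-by-6 toy image with values in (0, 1]
patchSize = [2 3];           % divides the toy image evenly, so no padding is needed
out = zeros(size(im), 'uint8');

for i = 1:patchSize(1):size(im, 1)
    for j = 1:patchSize(2):size(im, 2)
        rows = i:i+patchSize(1)-1;
        cols = j:j+patchSize(2)-1;
        % Label one tile and write it into the matching region of the output.
        out(rows, cols) = labelPatch(im(rows, cols));
    end
end

% The stand-in is purely per-pixel, so tiling matches a single full-image call.
disp(isequal(out, labelPatch(im)))
```

A convolutional network uses spatial context, so for a real model the labels near patch borders may differ from those of a whole-image prediction; the equality shown here holds only because the stand-in ignores neighboring pixels.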
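To extract only the valid portion of the segmentation, multiply the segmented image by the mask channel of the test data.
segmentedImage = uint8(test_data(:,:,7)~=0) .* segmentedImage;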
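A small worked example of the masking step, using made-up values: the mask channel is compared against zero, converted to uint8 so that it matches the class of the label image, and multiplied elementwise so that invalid pixels become label 0.

```matlab
% Toy label image and raw mask channel (nonzero marks valid pixels).
labels = uint8([3 5; 7 9]);
maskChannel = uint16([0 65535; 12 0]);

% Convert the mask to 0/1 in the same integer class, then multiply elementwise.
validLabels = uint8(maskChannel ~= 0) .* labels;
disp(validLabels)
```

The explicit uint8 conversion matters because MATLAB does not allow arithmetic between integer arrays of different classes, such as uint16 and uint8.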
Because the output of the semantic segmentation is noisy, remove the noise and stray pixels by using the medfilt2 function.
segmentedImage = medfilt2(segmentedImage,[5,5]);
The following line of code creates a vector of the class names.
classNames = ["RoadMarkings","Tree","Building","Vehicle","Person", ...
    "LifeguardChair","PicnicTable","BlackWoodPanel",...
    "WhiteWoodPanel","OrangeLandingPad","Buoy","Rocks",...
    "LowLevelVegetation","Grass_Lawn","Sand_Beach",...
    "Water_Lake","Water_Pond","Asphalt"];
Overlay the labels on the segmented RGB test image and add a color bar to the segmentation image.
cmap = jet(numel(classNames));
B = labeloverlay(imadjust(test_data(:,:,4:6),[0 0.6],[0.1 0.9],0.55),...
    segmentedImage,'Transparency',0.8,'Colormap',cmap);
figure
imshow(B)
N = numel(classNames);
ticks = 1/(N*2):1/N:1;
colorbar('TickLabels',cellstr(classNames),'Ticks',ticks,'TickLength',0,...
    'TickLabelInterpreter','none');
colormap(cmap)
title('Segmented Image');
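The tick expression 1/(N*2):1/N:1 places one label at the center of each color band. The sketch below checks this numerically, assuming the color bar spans the range [0, 1] and is divided into N equal bands; bandEdges and bandCenters are illustrative names.

```matlab
N = 18;                         % number of class names in this example
ticks = 1/(N*2):1/N:1;          % same expression as in the plotting code

% Centers of N equal-width color bands on the interval [0, 1].
bandEdges = (0:N) / N;
bandCenters = (bandEdges(1:end-1) + bandEdges(2:end)) / 2;

disp(numel(ticks))                                  % one tick per class
disp(max(abs(ticks - bandCenters)) < 1e-12)         % ticks coincide with band centers
```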
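[1] Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-Net: Convolutional Networks for Biomedical Image Segmentation." arXiv preprint arXiv:1505.04597, 2015.
[2] Kemker, R., C. Salvaggio, and C. Kanan. "High-Resolution Multispectral Dataset for Semantic Segmentation." CoRR, abs/1703.01918, 2017.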