
Train Network in the Cloud Using Automatic Parallel Support

This example shows how to train a convolutional neural network using MATLAB automatic support for parallel training. Deep learning training often takes hours or days. With parallel computing, you can speed up training using multiple graphics processing units (GPUs) locally or in a cluster in the cloud. If you have access to a machine with multiple GPUs, then you can complete this example on a local copy of the data. If you want to use more resources, then you can scale up deep learning training to the cloud. To learn more about your options for parallel training, see Scale Up Deep Learning in Parallel, on GPUs, and in the Cloud. This example guides you through the steps to train a deep learning network in a cluster in the cloud using MATLAB automatic parallel support.

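Requirements

Before you can run this example, you need to configure a cluster in the cloud. In MATLAB, you can create clusters in the cloud directly from the MATLAB Desktop. On the Home tab, in the Parallel menu, select Create and Manage Clusters. In the Cluster Profile Manager, click Create Cloud Cluster. Alternatively, you can use MathWorks Cloud Center to create and access compute clusters. For more information, see Getting Started with Cloud Center. After that, upload your data to an Amazon S3 bucket and access it directly from MATLAB. This example uses a copy of the CIFAR-10 data set that is already stored in Amazon S3. For instructions, see Upload Deep Learning Data to the Cloud.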

Set Up Parallel Pool

Start a parallel pool in the cluster and set the number of workers to the number of GPUs in your cluster. If you specify more workers than GPUs, then the remaining workers are idle. This example assumes that the cluster you are using is set as the default cluster profile. Check the default cluster profile on the MATLAB Home tab, in Parallel > Select a Default Cluster.

numberOfWorkers = 8;
parpool(numberOfWorkers);

Starting parallel pool (parpool) using the 'MyClusterInTheCloud' profile ...
connected to 8 workers.

Load Data Set from the Cloud

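Load the training and test data sets from the cloud using imageDatastore. In this example, you use a copy of the CIFAR-10 data set stored in Amazon S3. To ensure that the workers have access to the datastore in the cloud, make sure that the environment variables for the AWS credentials are set correctly. See Upload Deep Learning Data to the Cloud.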
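As a rough illustration of the credential check described above, the following sketch sets the AWS environment variables on the client and reports any that are still empty. The key values and the region are placeholders chosen for illustration, not values from this example, and how the variables reach the cluster workers depends on your own cluster configuration.

```matlab
% Placeholder values for illustration only; replace them with your own credentials.
setenv('AWS_ACCESS_KEY_ID','YOUR_ACCESS_KEY_ID');
setenv('AWS_SECRET_ACCESS_KEY','YOUR_SECRET_ACCESS_KEY');
setenv('AWS_DEFAULT_REGION','us-east-1');   % assumed region, adjust to your bucket

% List any required variable that is still empty in the client session.
requiredVars = {'AWS_ACCESS_KEY_ID','AWS_SECRET_ACCESS_KEY','AWS_DEFAULT_REGION'};
isMissing = cellfun(@(name) isempty(getenv(name)), requiredVars);
missingVars = requiredVars(isMissing)
```

An empty missingVars means all three variables are set in the client session. It does not confirm that the workers in the cluster see the same values.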

imdsTrain = imageDatastore('s3://cifar10cloud/cifar10/train', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

imdsTest = imageDatastore('s3://cifar10cloud/cifar10/test', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

Train the network with augmented image data by creating an augmentedImageDatastore object. Use random translations and horizontal reflections. Data augmentation helps prevent the network from overfitting and memorizing the exact details of the training images.

imageSize = [32 32 3];
pixelRange = [-4 4];
imageAugmenter = imageDataAugmenter( ...
    'RandXReflection',true, ...
    'RandXTranslation',pixelRange, ...
    'RandYTranslation',pixelRange);
augmentedImdsTrain = augmentedImageDatastore(imageSize,imdsTrain, ...
    'DataAugmentation',imageAugmenter, ...
    'OutputSizeMode','randcrop');

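Define Network Architecture and Training Options

Define a network architecture for the CIFAR-10 data set. To simplify the code, use convolutional blocks that convolve the input. The pooling layers downsample the spatial dimensions.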

blockDepth = 4; % blockDepth controls the depth of a convolutional block
netWidth = 32; % netWidth controls the number of filters in a convolutional block

layers = [
    imageInputLayer(imageSize)

    convolutionalBlock(netWidth,blockDepth)
    maxPooling2dLayer(2,'Stride',2)
    convolutionalBlock(2*netWidth,blockDepth)
    maxPooling2dLayer(2,'Stride',2)
    convolutionalBlock(4*netWidth,blockDepth)
    averagePooling2dLayer(8)

    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer
    ];

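Define the training options. Train the network in parallel using the current cluster by setting the execution environment to parallel. When you use multiple GPUs, you increase the available computational resources. Scale up the mini-batch size with the number of GPUs to keep the workload on each GPU constant. Scale the learning rate according to the mini-batch size. Use a learning rate schedule to drop the learning rate as the training progresses. Turn on the training progress plot to obtain visual feedback during training.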

miniBatchSize = 256 * numberOfWorkers;
initialLearnRate = 1e-1 * miniBatchSize/256;

options = trainingOptions('sgdm', ...
    'ExecutionEnvironment','parallel', ... % Turn on automatic parallel support.
    'InitialLearnRate',initialLearnRate, ... % Set the initial learning rate.
    'MiniBatchSize',miniBatchSize, ... % Set the MiniBatchSize.
    'Verbose',false, ... % Do not send command line output.
    'Plots','training-progress', ... % Turn on the training progress plot.
    'L2Regularization',1e-10, ...
    'MaxEpochs',50, ...
    'Shuffle','every-epoch', ...
    'ValidationData',imdsTest, ...
    'ValidationFrequency',floor(numel(imdsTrain.Files)/miniBatchSize), ...
    'LearnRateSchedule','piecewise', ...
    'LearnRateDropFactor',0.1, ...
    'LearnRateDropPeriod',45);
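To make the scaling rules concrete, the following sketch works through the arithmetic for eight workers. It assumes the standard CIFAR-10 training set size of 50,000 images in place of numel(imdsTrain.Files), and it assumes that the piecewise schedule multiplies the learning rate by the drop factor once per drop period. The variable names perWorkerBatch, numTrainingImages, validationFrequency, and learnRate are introduced here for illustration only.

```matlab
numberOfWorkers = 8;          % one worker per GPU, as in this example
numTrainingImages = 50000;    % assumption: standard CIFAR-10 training set size

miniBatchSize = 256 * numberOfWorkers;                % total images per iteration
perWorkerBatch = miniBatchSize / numberOfWorkers;     % share per GPU stays at 256
initialLearnRate = 1e-1 * miniBatchSize/256;          % learning rate scales with batch size

% Validating every floor(N/miniBatchSize) iterations is roughly once per epoch.
validationFrequency = floor(numTrainingImages/miniBatchSize);

% Assumed piecewise schedule: multiply by 0.1 after every 45 epochs.
epochs = 1:50;
learnRate = initialLearnRate * 0.1.^floor((epochs - 1)/45);

fprintf('miniBatchSize = %d, per worker = %d\n',miniBatchSize,perWorkerBatch);
fprintf('initialLearnRate = %.2f, validationFrequency = %d\n', ...
    initialLearnRate,validationFrequency);
fprintf('learn rate at epoch 45 = %.2f, at epoch 46 = %.2f\n', ...
    learnRate(45),learnRate(46));
```

Under these assumptions, the per-GPU share of each mini-batch is the same as in single-GPU training with a mini-batch size of 256, and the learning rate drops only for the last five of the 50 epochs.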

Train Network and Use for Classification

Train the network in the cluster. During training, the plot displays the progress.

net = trainNetwork(augmentedImdsTrain,layers,options)

net = 
  SeriesNetwork with properties:

    Layers: [43×1 nnet.cnn.layer.Layer]
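The layer count in this output can be cross-checked from the architecture definition. The sketch below only does the bookkeeping and does not build a network. The names layersPerUnit, numBlocks, and the other counters are introduced here for illustration.

```matlab
blockDepth = 4;       % convolutional units per block, as in this example
layersPerUnit = 3;    % convolution, batch normalization, and ReLU
numBlocks = 3;        % three calls to convolutionalBlock

blockLayers = numBlocks * blockDepth * layersPerUnit;  % layers inside the blocks
poolingLayers = 3;    % two max pooling layers and one average pooling layer
otherLayers = 4;      % image input, fully connected, softmax, classification

totalLayers = blockLayers + poolingLayers + otherLayers
```

The total matches the 43 layers reported for the trained network.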

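Determine the accuracy of the network by using the trained network to classify the test images on your local machine. Then compare the predicted labels to the actual labels.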

YPredicted = classify(net,imdsTest);
accuracy = sum(YPredicted == imdsTest.Labels)/numel(imdsTest.Labels)
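The accuracy expression is the fraction of predicted labels that equal the true labels. The toy sketch below applies the same expression to five made-up labels so that you can follow the computation by hand. The labels are hypothetical and are not results from this example.

```matlab
% Hypothetical labels for illustration only; not output of the trained network.
trueLabels = categorical(["cat";"dog";"ship";"cat";"frog"]);
predictedLabels = categorical(["cat";"dog";"ship";"dog";"frog"]);

% Element-wise comparison gives a logical vector; its sum counts the matches.
numCorrect = sum(predictedLabels == trueLabels);
accuracy = numCorrect/numel(trueLabels)
```

Four of the five toy predictions match, so the expression evaluates to 0.8.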

Define Helper Function

Define a function to create a convolutional block in the network architecture.

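function layers = convolutionalBlock(numFilters,numConvLayers)
    % One unit: 3-by-3 convolution, batch normalization, then ReLU.
    layers = [
        convolution2dLayer(3,numFilters,'Padding','same')
        batchNormalizationLayer
        reluLayer];

    % Repeat the unit numConvLayers times to form the block.
    layers = repmat(layers,numConvLayers,1);
end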
