
Transfer Learning Using Pretrained Network

This example shows how to fine-tune a pretrained GoogLeNet convolutional neural network to perform classification on a new collection of images.

GoogLeNet has been trained on over a million images and can classify images into 1000 object categories (such as keyboard, coffee mug, pencil, and many animals). The network has learned rich feature representations for a wide range of images. The network takes an image as input and outputs a label for the object in the image together with the probabilities for each of the object categories.
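For example, you can classify a single image with the pretrained network before any retraining. The following is a minimal sketch, assuming an image file such as peppers.png is available on the MATLAB path; the file name is only for illustration.

% Minimal sketch (assumes 'peppers.png' is on the MATLAB path): classify one
% image with the pretrained GoogLeNet network.
net = googlenet;
sz = net.Layers(1).InputSize;
I = imresize(imread('peppers.png'),sz(1:2));   % resize to the network input size
[label,scores] = classify(net,I);              % predicted label and class probabilities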

Transfer learning is commonly used in deep learning applications. You can take a pretrained network and use it as a starting point to learn a new task. Fine-tuning a network with transfer learning is usually much faster and easier than training a network with randomly initialized weights from scratch. You can quickly transfer learned features to a new task using a smaller number of training images.

Load Data

Unzip and load the new images as an image datastore. imageDatastore automatically labels the images based on folder names and stores the data as an ImageDatastore object. An image datastore enables you to store large image data, including data that does not fit in memory, and efficiently read batches of images during training of a convolutional neural network.

unzip('MerchData.zip');
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

Divide the data into training and validation data sets. Use 70% of the images for training and 30% for validation. splitEachLabel splits the image datastore into two new datastores.

[imdsTrain,imdsValidation] = splitEachLabel(imds,0.7,'randomized');

This very small data set now contains 55 training images and 20 validation images. Display some sample images.

numTrainImages = numel(imdsTrain.Labels);
idx = randperm(numTrainImages,16);
figure
for i = 1:16
    subplot(4,4,i)
    I = readimage(imdsTrain,idx(i));
    imshow(I)
end

Load Pretrained Network

Load the pretrained GoogLeNet neural network. If Deep Learning Toolbox™ Model for GoogLeNet Network is not installed, then the software provides a download link.

net = googlenet;

Use deepNetworkDesigner to display an interactive visualization of the network architecture and detailed information about the network layers.

deepNetworkDesigner(net)

The first layer, which is the image input layer, requires input images of size 224-by-224-by-3, where 3 is the number of color channels.

inputSize = net.Layers(1).InputSize
inputSize = 1×3

   224   224     3

Replace Final Layers

The fully connected layer and classification layer of the pretrained network net are configured for 1000 classes. These two layers, loss3-classifier and output in GoogLeNet, contain information on how to combine the features that the network extracts into class probabilities, a loss value, and predicted labels. To retrain a pretrained network to classify new images, replace these two layers with new layers adapted to the new data set.
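Before replacing them, you can optionally confirm the names of these final layers by inspecting the end of the layer array; this quick check is not part of the original example.

% Optional check: display the last three layers of GoogLeNet, which include the
% fully connected layer 'loss3-classifier' and the classification layer 'output'.
net.Layers(end-2:end)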

Extract the layer graph from the trained network.

lgraph = layerGraph(net);

Replace the fully connected layer with a new fully connected layer that has a number of outputs equal to the number of classes. To make learning faster in the new layers than in the transferred layers, increase the WeightLearnRateFactor and BiasLearnRateFactor values of the fully connected layer.

numClasses = numel(categories(imdsTrain.Labels))
numClasses = 5
newLearnableLayer = fullyConnectedLayer(numClasses, ...
    'Name','new_fc', ...
    'WeightLearnRateFactor',10, ...
    'BiasLearnRateFactor',10);
lgraph = replaceLayer(lgraph,'loss3-classifier',newLearnableLayer);

The classification layer specifies the output classes of the network. Replace the classification layer with a new one without class labels. trainNetwork automatically sets the output classes of the layer at training time.

newClassLayer = classificationLayer('Name','new_classoutput');
lgraph = replaceLayer(lgraph,'output',newClassLayer);
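To verify that the edited layer graph is still valid before training, you can optionally analyze it; analyzeNetwork reports problems such as disconnected or incompatible layers. This verification step is not part of the original example.

% Optional verification: check the modified layer graph for errors.
analyzeNetwork(lgraph)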

Train Network

The network requires input images of size 224-by-224-by-3, but the images in the image datastores have different sizes. Use an augmented image datastore to automatically resize the training images. Specify additional augmentation operations to perform on the training images: randomly flip the training images along the vertical axis, and randomly translate them up to 30 pixels horizontally and vertically. Data augmentation helps prevent the network from overfitting and memorizing the exact details of the training images.

pixelRange = [-30 30];
imageAugmenter = imageDataAugmenter( ...
    'RandXReflection',true, ...
    'RandXTranslation',pixelRange, ...
    'RandYTranslation',pixelRange);
augimdsTrain = augmentedImageDatastore(inputSize(1:2),imdsTrain, ...
    'DataAugmentation',imageAugmenter);
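As an optional sanity check (not part of the original example), you can read one mini-batch from the augmented datastore and display it to confirm the resizing and augmentation; read returns a table whose input column holds the processed images.

% Optional check: display one batch of augmented training images.
batch = read(augimdsTrain);   % table with 'input' (images) and 'response' (labels)
montage(batch.input)
reset(augimdsTrain)           % rewind the datastore before training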

To automatically resize the validation images without performing further data augmentation, use an augmented image datastore without specifying any additional preprocessing operations.

augimdsValidation = augmentedImageDatastore(inputSize(1:2),imdsValidation);

Specify the training options. For transfer learning, keep the features from the early layers of the pretrained network (the transferred layer weights). To slow down learning in the transferred layers, set the initial learning rate to a small value. In the previous step, you increased the learning rate factors for the fully connected layer to speed up learning in the new final layers. This combination of learning rate settings results in fast learning only in the new layers and slower learning in the other layers. When performing transfer learning, you do not need to train for as many epochs. An epoch is a full training cycle on the entire training data set. Specify the mini-batch size and validation data. The software validates the network every ValidationFrequency iterations during training.

options = trainingOptions('sgdm', ...
    'MiniBatchSize',10, ...
    'MaxEpochs',6, ...
    'InitialLearnRate',1e-4, ...
    'Shuffle','every-epoch', ...
    'ValidationData',augimdsValidation, ...
    'ValidationFrequency',3, ...
    'Verbose',false, ...
    'Plots','training-progress');

Train the network consisting of the transferred and new layers. By default, trainNetwork uses a GPU if one is available. This requires Parallel Computing Toolbox™ and a supported GPU device. For information on supported devices, see GPU Support by Release (Parallel Computing Toolbox). Otherwise, it uses a CPU. You can also specify the execution environment by using the 'ExecutionEnvironment' name-value pair argument of trainingOptions.

netTransfer = trainNetwork(augimdsTrain,lgraph,options);
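If you want to choose the execution environment explicitly rather than relying on the default, the following sketch shows one way to force CPU training; the remaining option values simply mirror the ones used above.

% Sketch: set 'ExecutionEnvironment' explicitly to train on the CPU.
optionsCPU = trainingOptions('sgdm', ...
    'ExecutionEnvironment','cpu', ...
    'MiniBatchSize',10, ...
    'MaxEpochs',6, ...
    'InitialLearnRate',1e-4);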

Classify Validation Images

Classify the validation images using the fine-tuned network.

[YPred,scores] = classify(netTransfer,augimdsValidation);

Display four sample validation images with their predicted labels.

idx = randperm(numel(imdsValidation.Files),4);
figure
for i = 1:4
    subplot(2,2,i)
    I = readimage(imdsValidation,idx(i));
    imshow(I)
    label = YPred(idx(i));
    title(string(label));
end

Calculate the classification accuracy on the validation set. Accuracy is the fraction of labels that the network predicts correctly.

YValidation = imdsValidation.Labels;
accuracy = mean(YPred == YValidation)
accuracy = 1
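Beyond the overall accuracy, a confusion matrix chart shows how the predictions are distributed across the classes; this visualization is an addition and is not part of the original example.

% Optional: plot a confusion matrix of true versus predicted labels.
figure
confusionchart(YValidation,YPred)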

For tips on improving classification accuracy, see Deep Learning Tips and Tricks.

