Datastores for Deep Learning

Datastores in MATLAB® are a convenient way of working with and representing collections of data that are too large to fit in memory at one time. Because deep learning often requires large amounts of data, datastores are an important part of the deep learning workflow in MATLAB.

Select Datastore

For many applications, the easiest approach is to start with a built-in datastore. For more information about the available built-in datastores, see Select Datastore for File Format or Application. However, only some types of built-in datastores can be used directly as input for network training, validation, and inference. These datastores are:

Datastore	Description	Additional Toolbox Required
`ImageDatastore`	Datastore for image data	none
`AugmentedImageDatastore`	Datastore for resizing and augmenting training images. Datastore is nondeterministic.	none
`PixelLabelDatastore` (Computer Vision Toolbox)	Datastore for pixel label data	Computer Vision Toolbox™
`PixelLabelImageDatastore` (Computer Vision Toolbox)	Datastore for training semantic segmentation networks. Datastore is nondeterministic.	Computer Vision Toolbox
`boxLabelDatastore` (Computer Vision Toolbox)	Datastore for bounding box label data	Computer Vision Toolbox
`RandomPatchExtractionDatastore` (Image Processing Toolbox)	Datastore for extracting random patches from image-based data. Datastore is nondeterministic.	Image Processing Toolbox™
`blockedImageDatastore` (Image Processing Toolbox)	Datastore for blockwise reading and processing of image data, including large images that do not fit in memory	Image Processing Toolbox
`DenoisingImageDatastore` (Image Processing Toolbox)	Datastore to train an image denoising deep neural network. Datastore is nondeterministic.	Image Processing Toolbox

Other built-in datastores can be used as input for deep learning, but the data read from these datastores must be preprocessed into a format required by a deep learning network. For more information on the required format of read data, see Input Datastore for Training, Validation, and Inference. For more information on how to preprocess data read from datastores, see Transform and Combine Datastores.

For some applications, there may not be a built-in datastore type that fits your data well. For these problems, you can create a custom datastore. For more information, see Develop Custom Datastore. All custom datastores are valid inputs to deep learning interfaces as long as the `read` function of the custom datastore returns data in the required form.

Input Datastore for Training, Validation, and Inference

Datastores are valid inputs in Deep Learning Toolbox™ for training, validation, and inference.

Training and Validation

You can use an image datastore or other types of datastore as a source of training data when training using the `trainNetwork` function. To use a datastore for validation, use the `'ValidationData'` name-value pair argument in `trainingOptions`.

To be a valid input for training or validation, the `read` function of a datastore must return data as either a cell array or a table (with the exception of `ImageDatastore` objects, which can output numeric arrays, and custom mini-batch datastores, which must output tables).

For networks with a single input, the table or cell array returned by the datastore must have two columns. The first column of data represents inputs to the network and the second column of data represents responses. Each row of data represents a separate observation. For `ImageDatastore` only, `trainNetwork` and `trainingOptions` support data returned as integer arrays and single-column cell arrays of integer arrays.
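As a minimal sketch of the two-column format described above, the following code builds the kind of table and cell array that a single call to `read` would need to return for a single-input classification network. The image sizes, label values, and the variable names `dataTable` and `dataCell` are assumptions chosen for illustration only.

```matlab
% Hypothetical data: two 224-by-224 RGB images with one class label each.
Predictors = {rand(224,224,3); rand(224,224,3)};
Response = categorical([2; 7]);

% Table form: each predictor element is a 1-by-1 cell holding a numeric array,
% and each response element is a categorical scalar.
dataTable = table(Predictors,Response);

% Cell array form: first column holds predictors, second column holds responses.
dataCell = cell(2,2);
dataCell(:,1) = Predictors;
dataCell{1,2} = Response(1);
dataCell{2,2} = Response(2);
```

Each row is one observation in both forms, so either layout has one row per image.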

To use a datastore for networks with multiple input layers, use the `combine` and `transform` functions to create a datastore that outputs a cell array with (`numInputs` + 1) columns, where `numInputs` is the number of network inputs. In this case, the first `numInputs` columns specify the predictors for each input and the last column specifies the responses. The order of inputs is given by the `InputNames` property of the layer graph `layers`.
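One possible way to produce the (`numInputs` + 1)-column layout is sketched below for a two-input network. It assumes the data fits in memory and uses `arrayDatastore` (available in recent MATLAB releases) purely for illustration; the array sizes, labels, and variable names are hypothetical.

```matlab
% Hypothetical in-memory data: 4 observations for a network with two image inputs.
X1 = rand(224,224,3,4);            % predictors for the first input
X2 = rand(128,128,3,4);            % predictors for the second input
Y  = categorical([2; 2; 9; 9]);    % one response per observation

% IterationDimension selects the dimension that indexes observations.
dsX1 = arrayDatastore(X1,'IterationDimension',4);
dsX2 = arrayDatastore(X2,'IterationDimension',4);
dsY  = arrayDatastore(Y);

% The column order follows the argument order: input 1, input 2, response.
dsTrain = combine(dsX1,dsX2,dsY);
data = read(dsTrain);              % one observation per read by default
```

The argument order passed to `combine` determines the column order, so it should match the order of the network inputs.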

The following table shows the output that the `read` function must return for a datastore `ds`. Example outputs for each case follow the table.

Network Architecture	Datastore Output
Single input layer	Table or cell array with two columns. The first and second columns specify the predictors and responses, respectively. Table elements must be scalars, row vectors, or 1-by-1 cell arrays containing a numeric array. Custom mini-batch datastores must output tables.
Multiple input layers	Cell array with (`numInputs` + 1) columns, where `numInputs` is the number of network inputs. The first `numInputs` columns specify the predictors for each input and the last column specifies the responses. The order of inputs is given by the `InputNames` property of the layer graph `layers`.

Table for a network with one input and one output:

data = read(ds)

data =
  4×2 table
        Predictors        Response
    __________________    ________
    {224×224×3 double}       2
    {224×224×3 double}       7
    {224×224×3 double}       9
    {224×224×3 double}       9

Cell array for a network with one input and one output:

data = read(ds)

data =
  4×2 cell array
    {224×224×3 double}    {[2]}
    {224×224×3 double}    {[7]}
    {224×224×3 double}    {[9]}
    {224×224×3 double}    {[9]}

Cell array for a network with two inputs and one output:

data = read(ds)

data =
  4×3 cell array
    {224×224×3 double}    {128×128×3 double}    {[2]}
    {224×224×3 double}    {128×128×3 double}    {[2]}
    {224×224×3 double}    {128×128×3 double}    {[9]}
    {224×224×3 double}    {128×128×3 double}    {[9]}

The format of the predictors depends on the type of data.

Data	Format of Predictors
2-D image	h-by-w-by-c numeric array, where h, w, and c are the height, width, and number of channels of the image, respectively.
3-D image	h-by-w-by-d-by-c numeric array, where h, w, d, and c are the height, width, depth, and number of channels of the image, respectively.
Vector sequence	c-by-s matrix, where c is the number of features of the sequence and s is the sequence length.
1-D image sequence	h-by-c-by-s array, where h and c correspond to the height and number of channels of the image, respectively, and s is the sequence length. Each sequence in the mini-batch must have the same sequence length.
2-D image sequence	h-by-w-by-c-by-s array, where h, w, and c correspond to the height, width, and number of channels of the image, respectively, and s is the sequence length. Each sequence in the mini-batch must have the same sequence length.
3-D image sequence	h-by-w-by-d-by-c-by-s array, where h, w, d, and c correspond to the height, width, depth, and number of channels of the image, respectively, and s is the sequence length. Each sequence in the mini-batch must have the same sequence length.
Features	c-by-1 column vector, where c is the number of features.

For predictors returned in tables, the elements must contain a numeric scalar, a numeric row vector, or a 1-by-1 cell array containing a numeric array.

The `trainNetwork` function does not support networks with multiple sequence input layers.

The format of the responses depends on the type of task.

Task	Format of Responses
Classification	Categorical scalar
Regression	Scalar, numeric vector, or 3-D numeric array representing an image
Sequence-to-sequence classification	1-by-s sequence of categorical labels, where s is the sequence length of the corresponding predictor sequence.
Sequence-to-sequence regression	R-by-s matrix, where R is the number of responses and s is the sequence length of the corresponding predictor sequence.

For responses returned in tables, the elements must be a categorical scalar, a numeric scalar, a numeric row vector, or a 1-by-1 cell array containing a numeric array.

Prediction

For inference using `predict`, `classify`, and `activations`, a datastore is only required to yield the columns corresponding to the predictors. The inference functions use the first `NumInputs` columns and ignore the subsequent columns, where `NumInputs` is the number of network input layers.
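As an illustration of a predictor-only datastore, the sketch below wraps hypothetical in-memory test images in an `arrayDatastore`, which yields a single column per read. The array `XTest` and the network `net` mentioned in the final comment are assumptions and are not defined by this page.

```matlab
% Hypothetical test set: eight 28-by-28 grayscale images.
XTest = rand(28,28,1,8);

% Each read returns a 1-by-1 cell containing one image (predictors only).
dsTest = arrayDatastore(XTest,'IterationDimension',4);
sample = read(dsTest);

% Return to the start of the data before handing the datastore to inference.
reset(dsTest);

% With a trained single-input network net (hypothetical, not defined here):
% YPred = predict(net,dsTest);
```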

Specify Read Size and Mini-Batch Size

A datastore may return any number of rows (observations) for each call to `read`. Functions such as `trainNetwork`, `predict`, `classify`, and `activations` that accept datastores and support specifying a `'MiniBatchSize'` call `read` as many times as is necessary to form complete mini-batches of data. As these functions form mini-batches, they use internal queues in memory to store read data. For example, if a datastore consistently returns 64 rows per call to `read` and `MiniBatchSize` is `128`, then forming each mini-batch of data requires two calls to `read`.

For best runtime performance, it is recommended to configure datastores such that the number of observations returned by `read` is equal to the `'MiniBatchSize'`. For datastores that have a `'ReadSize'` property, set the `'ReadSize'` to change the number of observations returned by the datastore for each call to `read`.
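The recommendation above can be sketched with an in-memory datastore whose `ReadSize` matches the mini-batch size. The feature matrix `X` and its dimensions are hypothetical, and `arrayDatastore` is used here only because it exposes a `ReadSize` property.

```matlab
miniBatchSize = 128;

% Hypothetical feature data: 10 features, 1000 observations (one per column).
X = rand(10,1000);

% Match the number of observations per read to the mini-batch size.
ds = arrayDatastore(X,'IterationDimension',2,'ReadSize',miniBatchSize);

batch = read(ds);                   % cell array with one row per observation
numObservations = size(batch,1);
```

With this setting, one call to `read` supplies one complete mini-batch, so no extra calls are needed to fill the internal queue.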

Transform and Combine Datastores

Deep learning frequently requires the data to be preprocessed and augmented before data is in an appropriate form to input to a network. The `transform` and `combine` functions of datastore are useful in preparing data to be fed into a network.

Transform Datastores

A transformed datastore applies a particular data transformation to an underlying datastore when reading data. To create a transformed datastore, use the `transform` function and specify the underlying datastore and the transformation.

For complex transformations involving several preprocessing operations, define the complete set of transformations in your own function. Then, specify a handle to your function as the `@fcn` argument of `transform`. For more information, see Create Functions in Files.
For simple transformations that can be expressed in one line of code, you can specify a handle to an anonymous function as the `@fcn` argument of `transform`. For more information, see Anonymous Functions.

The function handle provided to `transform` must accept input data in the same format as returned by the `read` function of the underlying datastore.
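A simple one-line transformation might look like the following sketch, which rescales `uint8` images to single precision in the range [0, 1]. The image data is hypothetical, and because the underlying `arrayDatastore` returns a cell array from `read`, the anonymous function accepts and returns a cell array.

```matlab
% Hypothetical data: sixteen 28-by-28 grayscale uint8 images.
imgs = uint8(randi([0 255],28,28,1,16));
ds = arrayDatastore(imgs,'IterationDimension',4);

% The anonymous function receives the cell array returned by read and
% rescales every image it contains, whatever the read size is.
dsScaled = transform(ds,@(data) cellfun(@(x) single(x)/255,data, ...
    'UniformOutput',false));

out = read(dsScaled);
```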

Example: Transform Image Datastore to Train Digit Classification Network

This example uses the `transform` function to create a training set in which randomized 90 degree rotation is added to each image within an image datastore. Pass the resulting `TransformedDatastore` to `trainNetwork` to train a simple digit classification network.

Create an image datastore containing digit images.

digitDatasetPath = fullfile(matlabroot,'toolbox','nnet', ...
    'nndemos','nndatasets','DigitDataset');
imds = imageDatastore(digitDatasetPath, ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');

Set the `ReadSize` of the image datastore equal to the mini-batch size.

miniBatchSize = 128;
imds.ReadSize = miniBatchSize;

Transform images in the image datastore by adding randomized 90 degree rotation. The transformation function, `preprocessForTraining`, is defined at the end of this example.

dsTrain = transform(imds,@preprocessForTraining,'IncludeInfo',true)

dsTrain =
  TransformedDatastore with properties:

    UnderlyingDatastore: [1×1 matlab.io.datastore.ImageDatastore]
             Transforms: {@preprocessForTraining}
            IncludeInfo: 1

Specify layers of the network and training options, then train the network using the transformed datastore `dsTrain` as a source of data.

layers = [
    imageInputLayer([28 28 1],'Normalization','none')
    convolution2dLayer(5,20)
    reluLayer
    maxPooling2dLayer(2,'Stride',2)
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

options = trainingOptions('adam', ...
    'Plots','training-progress', ...
    'MiniBatchSize',miniBatchSize);

net = trainNetwork(dsTrain,layers,options);

Define the transformation function, `preprocessForTraining`. The input to the function is a batch of data, `data`, read from the underlying datastore. The function in this example loops through each read image and performs randomized rotation, then returns the transformed image and corresponding label as a cell array as expected by `trainNetwork`.

function [dataOut,info] = preprocessForTraining(data,info)

numRows = size(data,1);
dataOut = cell(numRows,2);

for idx = 1:numRows

    % Randomized 90 degree rotation
    imgOut = rot90(data{idx,1},randi(4)-1);

    % Return the label from info struct as the
    % second column in dataOut.
    dataOut(idx,:) = {imgOut,info.Label(idx)};

end

end

Combine Datastores

The `combine` function associates multiple datastores. Operating on the resulting `CombinedDatastore`, such as resetting the datastore, performs the same operation on all of the underlying datastores. Calling the `read` function of a combined datastore reads one batch of data from all of the N underlying datastores, which must return the same number of observations. Reading from a combined datastore returns the horizontally concatenated results in an N-column cell array that is suitable for training and validation. Shuffling a combined datastore results in an identical randomized ordering of files in the underlying datastores.

For example, if you are training an image-to-image regression network, then you can create the training data set by combining two image datastores. This sample code demonstrates combining two image datastores named `imdsX` and `imdsY`. The combined datastore `imdsTrain` returns data as a two-column cell array.

imdsX = imageDatastore(___);
imdsY = imageDatastore(___);
imdsTrain = combine(imdsX,imdsY)

imdsTrain =
  CombinedDatastore with properties:

    UnderlyingDatastores: {1×2 cell}

If you have Image Processing Toolbox, then the `randomPatchExtractionDatastore` (Image Processing Toolbox) provides an alternate solution to associating image-based data in `ImageDatastore`s, `PixelLabelDatastore`s, and `TransformedDatastore`s. A `randomPatchExtractionDatastore` has several advantages over associating data using the `combine` function. Specifically, a random patch extraction datastore:

Provides an easy way to extract patches from both 2-D and 3-D data without requiring you to implement a custom cropping operation using `transform` and `combine`
Provides an easy way to generate multiple patches per image per mini-batch without requiring you to define a custom concatenation operation using `transform`
Supports efficient conversion between categorical and numeric data when applying image transforms to categorical data
Supports parallel training
Improves performance by caching images

Use Datastore for Parallel Training and Background Dispatching

Datastores used for parallel training or multi-GPU training must be partitionable. To determine if a datastore is partitionable, use the function `isPartitionable`. Specify parallel or multi-GPU training using the `'ExecutionEnvironment'` name-value pair argument of `trainingOptions`. Training in parallel or using single or multiple GPUs requires Parallel Computing Toolbox™.

Many built-in datastores are already partitionable because they support the `partition` function. Using the `transform` and `combine` functions with built-in datastores frequently maintains support for parallel and multi-GPU training.

If you need to create a custom datastore that supports parallel or multi-GPU training, then your datastore must implement the `matlab.io.datastore.Partitionable` class.

Partitionable datastores support reading training data using background dispatching. Background dispatching queues data in memory while the GPU is working. Specify background dispatching using the `'DispatchInBackground'` name-value pair argument of `trainingOptions`. Background dispatching requires Parallel Computing Toolbox.

When training in parallel, datastores do not support specifying the `'Shuffle'` name-value pair argument of `trainingOptions` as `'none'`.
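The partitionability check described in this section can be sketched as follows. The data and the identity transform are hypothetical stand-ins for a real preprocessing pipeline. Creating the options object assumes Deep Learning Toolbox is installed, and actually training with these options would additionally require Parallel Computing Toolbox.

```matlab
% Hypothetical feature data wrapped in a datastore, then transformed.
ds = arrayDatastore(rand(10,100),'IterationDimension',2);
dsT = transform(ds,@(data) data);   % identity transform, for illustration only

% Check whether the transformed datastore can be split across workers.
tf = isPartitionable(dsT);

if tf
    % Take the first of two partitions, as a parallel worker might.
    part = partition(dsT,2,1);

    % Request parallel training with background dispatching. 'Shuffle' is
    % left at its default because 'none' is not supported in parallel.
    options = trainingOptions('adam', ...
        'ExecutionEnvironment','parallel', ...
        'DispatchInBackground',true);
end
```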