
dlupdate

Update parameters using custom function

Since R2019b

Description


netUpdated = dlupdate(fun,net) updates the learnable parameters of the dlnetwork object net by evaluating the function fun with each learnable parameter as an input. fun is a function handle to a function that takes one parameter array as an input argument and returns an updated parameter array.

params = dlupdate(fun,params) updates the learnable parameters in params by evaluating the function fun with each learnable parameter as an input.

[___] = dlupdate(fun,___,A1,...,An) also specifies additional input arguments, in addition to the input arguments in previous syntaxes, when fun is a function handle to a function that requires n+1 input values.

[___,X1,...,Xm] = dlupdate(fun,___) returns multiple outputs X1,...,Xm when fun is a function handle to a function that returns m+1 output values.
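For instance, a function handle that returns two outputs can report a per-parameter diagnostic alongside the update. A minimal sketch, assuming gradients is a structure of dlarray gradients; the clipping function and threshold are hypothetical:

% Clip each gradient elementwise and also flag whether any element was clipped.
clipFun = @(g) deal(min(max(g,-1),1), any(abs(g) > 1,'all'));
[gradients,clipped] = dlupdate(clipFun,gradients);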

Examples


Perform L1 regularization on a structure of parameter gradients.

Create the sample input data.

dlX = dlarray(rand(100,100,3),'SSC');

Initialize the learnable parameters for the convolution operation.

params.Weights = dlarray(rand(10,10,3,50));
params.Bias = dlarray(rand(50,1));

Calculate the gradients for the convolution operation using the helper function convGradients, defined at the end of this example.

gradients = dlfeval(@convGradients,dlX,params);

Define the regularization factor.

L1Factor = 0.001;

Create an anonymous function that regularizes the gradients. By using an anonymous function to pass a scalar constant to the function, you can avoid having to expand the constant value to the same size and structure as the parameter variable.

L1Regularizer = @(grad,param) grad + L1Factor.*sign(param);

Use dlupdate to apply the regularization function to each of the gradients.

gradients = dlupdate(L1Regularizer,gradients,params);

The entries of gradients are now regularized according to the function L1Regularizer.

convGradients Function

The convGradients helper function takes the learnable parameters of the convolution operation and a mini-batch of input data dlX, and returns the gradients with respect to the learnable parameters.

function gradients = convGradients(dlX,params)
    dlY = dlconv(dlX,params.Weights,params.Bias);
    dlY = sum(dlY,'all');
    gradients = dlgradient(dlY,params);
end

Use dlupdate to train a network using a custom update function that implements the stochastic gradient descent algorithm (without momentum).

Load Training Data

Load the digits training data.

[XTrain,TTrain] = digitTrain4DArrayData;
classes = categories(TTrain);
numClasses = numel(classes);

Define the Network

Define the network architecture and specify the average image value using the Mean option in the image input layer.

layers = [
    imageInputLayer([28 28 1],'Mean',mean(XTrain,4))
    convolution2dLayer(5,20)
    reluLayer
    convolution2dLayer(3,20,'Padding',1)
    reluLayer
    convolution2dLayer(3,20,'Padding',1)
    reluLayer
    fullyConnectedLayer(numClasses)
    softmaxLayer];

Create a dlnetwork object from the layer array.

net = dlnetwork(layers);

Define Model Loss Function

Create the helper function modelLoss, listed at the end of this example. The function takes a dlnetwork object and a mini-batch of input data with corresponding labels, and returns the loss and the gradients of the loss with respect to the learnable parameters.

Define Stochastic Gradient Descent Function

Create the helper function sgdFunction, listed at the end of this example. The function takes the parameters and the gradients of the loss with respect to the parameters, and returns the updated parameters using the stochastic gradient descent algorithm, expressed as

$\theta_{l+1} = \theta_l - \alpha \nabla E(\theta_l)$

where $l$ is the iteration number, $\alpha > 0$ is the learning rate, $\theta$ is the parameter vector, and $E(\theta)$ is the loss function.

Specify Training Options

Specify the options to use during training.

miniBatchSize = 128;
numEpochs = 30;
numObservations = numel(TTrain);
numIterationsPerEpoch = floor(numObservations./miniBatchSize);

Specify the learning rate.

learnRate = 0.01;

Train Network

Calculate the total number of iterations for the training progress monitor.

numIterations = numEpochs * numIterationsPerEpoch;

Initialize the TrainingProgressMonitor object. Because the timer starts when you create the monitor object, make sure that you create the object close to the training loop.

monitor = trainingProgressMonitor(Metrics="Loss",Info="Epoch",XLabel="Iteration");

Train the model using a custom training loop. For each epoch, shuffle the data and loop over mini-batches of data. Update the network parameters by calling dlupdate with the function sgdFunction defined at the end of this example. At the end of each epoch, display the training progress.

Train on a GPU, if one is available. Using a GPU requires Parallel Computing Toolbox™ and a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox).

iteration = 0;
epoch = 0;

while epoch < numEpochs && ~monitor.Stop
    epoch = epoch + 1;

    % Shuffle data.
    idx = randperm(numel(TTrain));
    XTrain = XTrain(:,:,:,idx);
    TTrain = TTrain(idx);

    i = 0;
    while i < numIterationsPerEpoch && ~monitor.Stop
        i = i + 1;
        iteration = iteration + 1;

        % Read mini-batch of data and convert the labels to dummy
        % variables.
        idx = (i-1)*miniBatchSize+1:i*miniBatchSize;
        X = XTrain(:,:,:,idx);

        T = zeros(numClasses,miniBatchSize,"single");
        for c = 1:numClasses
            T(c,TTrain(idx)==classes(c)) = 1;
        end

        % Convert mini-batch of data to dlarray.
        X = dlarray(single(X),"SSCB");

        % If training on a GPU, then convert data to a gpuArray.
        if canUseGPU
            X = gpuArray(X);
        end

        % Evaluate the model loss and gradients using dlfeval and the
        % modelLoss function.
        [loss,gradients] = dlfeval(@modelLoss,net,X,T);

        % Update the network parameters using the SGD algorithm defined in
        % the sgdFunction helper function.
        updateFcn = @(net,gradients) sgdFunction(net,gradients,learnRate);
        net = dlupdate(updateFcn,net,gradients);

        % Update the training progress monitor.
        recordMetrics(monitor,iteration,Loss=loss);
        updateInfo(monitor,Epoch=epoch + " of " + numEpochs);
        monitor.Progress = 100 * iteration/numIterations;
    end
end

Test Network

Test the classification accuracy of the model by comparing the predictions on a test set with the true labels.

[XTest,TTest] = digitTest4DArrayData;

Convert the data to a dlarray with the dimension format "SSCB" (spatial, spatial, channel, batch). For GPU prediction, also convert the data to a gpuArray.

XTest = dlarray(XTest,"SSCB");
if canUseGPU
    XTest = gpuArray(XTest);
end

To classify images using a dlnetwork object, use the predict function and find the classes with the highest scores.

YTest = predict(net,XTest);
[~,idx] = max(extractdata(YTest),[],1);
YTest = classes(idx);

Evaluate the classification accuracy.

accuracy = mean(YTest==TTest)
accuracy = 0.9040

Model Loss Function

The helper function modelLoss takes a dlnetwork object net and a mini-batch of input data X with corresponding labels T, and returns the loss and the gradients of the loss with respect to the learnable parameters in net. To compute the gradients automatically, use the dlgradient function.

function [loss,gradients] = modelLoss(net,X,T)
    Y = forward(net,X);
    loss = crossentropy(Y,T);
    gradients = dlgradient(loss,net.Learnables);
end

Stochastic Gradient Descent Function

The helper function sgdFunction takes the learnable parameters parameters, the gradients of the loss with respect to the learnable parameters, and the learning rate learnRate, and returns the updated parameters using the stochastic gradient descent algorithm, expressed as

$\theta_{l+1} = \theta_l - \alpha \nabla E(\theta_l)$

where $l$ is the iteration number, $\alpha > 0$ is the learning rate, $\theta$ is the parameter vector, and $E(\theta)$ is the loss function.

function parameters = sgdFunction(parameters,gradients,learnRate)
    parameters = parameters - learnRate .* gradients;
end

Input Arguments


fun — Function to apply to the learnable parameters, specified as a function handle.

dlupdate evaluates fun with each network learnable parameter as an input. fun is evaluated as many times as there are arrays of learnable parameters in net or params.
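Because fun is applied to each parameter array independently, one anonymous function can transform every learnable in a network. A minimal sketch, assuming net is an existing dlnetwork object:

% Halve every learnable parameter of the network.
net = dlupdate(@(p) 0.5*p, net);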

net — Network, specified as a dlnetwork object.

The function updates the Learnables property of the dlnetwork object. net.Learnables is a table with three variables, as the sketch after this list shows:

  • Layer — Layer name, specified as a string scalar.

  • Parameter — Parameter name, specified as a string scalar.

  • Value — Value of parameter, specified as a cell array containing a dlarray.
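You can inspect this table directly. A minimal sketch using a small hypothetical network:

layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8)
    reluLayer];
net = dlnetwork(layers);
head(net.Learnables)  % displays the Layer, Parameter, and Value variables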

params — Network learnable parameters, specified as a dlarray, a numeric array, a cell array, a structure, or a table.

If you specify params as a table, it must contain the following three variables.

  • Layer — Layer name, specified as a string scalar.

  • Parameter — Parameter name, specified as a string scalar.

  • Value — Value of parameter, specified as a cell array containing a dlarray.

You can specify params as a container of learnable parameters for the network using a cell array, structure, or table, or nested cell arrays or structures. The learnable parameters inside the cell array, structure, or table must be dlarray or numeric values of data type double or single.
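For example, a nested structure of dlarray values is a valid container. A minimal sketch with hypothetical parameter names:

% Nested structure of learnable parameters.
params.conv.Weights = dlarray(rand(3,3,1,8));
params.conv.Bias = dlarray(zeros(8,1));
params.fc.Weights = dlarray(rand(10,72));

% fun is applied to every dlarray nested inside the structure.
params = dlupdate(@(p) p - 0.01*sign(p), params);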

The input arguments A1,...,An must have exactly the same data type, ordering, and fields (for structures) or variables (for tables) as params.

Data Types: single | double | struct | table | cell

A1,...,An — Additional input arguments to fun, specified as dlarray objects, numeric arrays, cell arrays, structures, or tables with a Value variable.

The exact form of A1,...,An depends on the input network or learnable parameters. The following table shows the required format of A1,...,An for possible inputs to dlupdate.

| Input | Learnable parameters | A1,...,An |
| --- | --- | --- |
| net | Table net.Learnables containing Layer, Parameter, and Value variables. The Value variable consists of cell arrays that contain each learnable parameter as a dlarray. | Table with the same data type, variables, and ordering as net.Learnables. A1,...,An must have a Value variable consisting of cell arrays that contain the additional input arguments for the function fun to apply to each learnable parameter. |
| params | dlarray | dlarray with the same data type and ordering as params. |
| params | Numeric array | Numeric array with the same data type and ordering as params. |
| params | Cell array | Cell array with the same data types, structure, and ordering as params. |
| params | Structure | Structure with the same data types, fields, and ordering as params. |
| params | Table with Layer, Parameter, and Value variables. The Value variable must consist of cell arrays that contain each learnable parameter as a dlarray. | Table with the same data types, variables, and ordering as params. A1,...,An must have a Value variable consisting of cell arrays that contain the additional input argument for the function fun to apply to each learnable parameter. |
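As a concrete illustration of the net row, a gradients table computed with dlgradient with respect to net.Learnables already has the required format and can be passed directly as an additional argument. A minimal sketch, assuming net, X, T, and the modelLoss helper from the earlier training example:

learnRate = 0.01;
[loss,gradients] = dlfeval(@modelLoss,net,X,T);  % gradients matches net.Learnables
net = dlupdate(@(p,g) p - learnRate*g, net, gradients);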

Output Arguments


netUpdated — Updated network, returned as a dlnetwork object.

The function updates the Learnables property of the dlnetwork object.

params — Updated network learnable parameters, returned as a dlarray, a numeric array, a cell array, a structure, or a table with a Value variable containing the updated learnable parameters of the network.

X1,...,Xm — Additional output arguments from the function fun, where fun is a function handle to a function that returns multiple outputs, returned as dlarray objects, numeric arrays, cell arrays, structures, or tables with a Value variable.

The exact form of X1,...,Xm depends on the input network or learnable parameters. The following table shows the returned format of X1,...,Xm for possible inputs to dlupdate.

| Input | Learnable parameters | X1,...,Xm |
| --- | --- | --- |
| net | Table net.Learnables containing Layer, Parameter, and Value variables. The Value variable consists of cell arrays that contain each learnable parameter as a dlarray. | Table with the same data type, variables, and ordering as net.Learnables. X1,...,Xm has a Value variable consisting of cell arrays that contain the additional output arguments of the function fun applied to each learnable parameter. |
| params | dlarray | dlarray with the same data type and ordering as params. |
| params | Numeric array | Numeric array with the same data type and ordering as params. |
| params | Cell array | Cell array with the same data types, structure, and ordering as params. |
| params | Structure | Structure with the same data types, fields, and ordering as params. |
| params | Table with Layer, Parameter, and Value variables. The Value variable must consist of cell arrays that contain each learnable parameter as a dlarray. | Table with the same data types, variables, and ordering as params. X1,...,Xm has a Value variable consisting of cell arrays that contain the additional output argument of the function fun applied to each learnable parameter. |
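For example, with a network input, each additional output is returned as a table shaped like net.Learnables. A minimal sketch, assuming net and gradients as in the previous sketch; returning a per-parameter gradient norm is a hypothetical choice:

% Update each parameter and also return the L2 norm of its gradient.
sgdStep = @(p,g) deal(p - 0.01*g, sqrt(sum(g.^2,'all')));
[net,gradNorms] = dlupdate(sgdStep,net,gradients);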


Version History

Introduced in R2019b