dlgradient
Compute gradients for custom training loops using automatic differentiation
Syntax
Description
Use dlgradient to compute derivatives using automatic differentiation for custom training loops.
Tip
For most deep learning tasks, you can use a pretrained network and adapt it to your own data. For an example showing how to use transfer learning to retrain a convolutional neural network to classify a new set of images, see Train Deep Learning Network to Classify New Images. Alternatively, you can create and train networks from scratch using layerGraph objects with the trainNetwork and trainingOptions functions.
If the trainingOptions function does not provide the training options that you need for your task, then you can create a custom training loop using automatic differentiation. To learn more, see Define Deep Learning Network for Custom Training Loops.
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk) returns the gradients of y with respect to the variables x1 through xk.
Call dlgradient from inside a function passed to dlfeval. See Compute Gradient Using Automatic Differentiation and Use Automatic Differentiation In Deep Learning Toolbox.
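The pattern above can be sketched as follows. The function containing the dlgradient call is evaluated through dlfeval so that automatic differentiation can trace the dlarray input; the function and variable names here are illustrative, not from this page:

```matlab
% dlgradient must be called inside a function that dlfeval evaluates.
x = dlarray([1 2 3]);                  % trace x for automatic differentiation
[y, dydx] = dlfeval(@myObjective, x);  % dydx is 2*x = [2 4 6]

function [y, dydx] = myObjective(x)
    y = sum(x.^2, 'all');              % scalar value to differentiate
    dydx = dlgradient(y, x);           % gradient of y with respect to x
end
```

Calling myObjective directly, without dlfeval, produces an error because no trace exists for dlgradient to differentiate.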
[dydx1,...,dydxk] = dlgradient(y,x1,...,xk,Name,Value) returns the gradients and specifies additional options using one or more name-value pairs. For example, dydx = dlgradient(y,x,'RetainData',true) causes the gradient to retain intermediate values for reuse in subsequent dlgradient calls. This syntax can save time, but uses more memory. For more information, see Tips.
Examples
Input Arguments
Output Arguments
Limitations
The dlgradient function does not support calculating higher-order derivatives when using dlnetwork objects containing custom layers with a custom backward function.
The dlgradient function does not support calculating higher-order derivatives when using dlnetwork objects containing the following layers:
gruLayer
lstmLayer
bilstmLayer
The dlgradient function does not support calculating higher-order derivatives that depend on the following functions:
gru
lstm
embed
prod
interp1
More About
Tips
A dlgradient call must be inside a function. To obtain a numeric value of a gradient, you must evaluate the function using dlfeval, and the argument to the function must be a dlarray. See Use Automatic Differentiation In Deep Learning Toolbox.
To enable the correct evaluation of gradients, the y argument must use only supported functions for dlarray. See List of Functions with dlarray Support.
If you set the 'RetainData' name-value pair argument to true, the software preserves tracing for the duration of the dlfeval function call instead of erasing the trace immediately after the derivative computation. This preservation can cause a subsequent dlgradient call within the same dlfeval call to be executed faster, but uses more memory. For example, in training an adversarial network, the 'RetainData' setting is useful because the two networks share data and functions during training. See Train Generative Adversarial Network (GAN).
When you need to calculate first-order derivatives only, ensure that the 'EnableHigherDerivatives' option is false, as this is usually quicker and requires less memory.
Complex gradients are calculated using the Wirtinger derivative. The gradient is defined in the direction of increase of the real part of the function to differentiate. This is because the variable to differentiate (for example, the loss) must be real, even if the function is complex.