Define Custom Recurrent Deep Learning Layer
If Deep Learning Toolbox™ does not provide the layer you require for your task, then you can define your own custom layer using this example as a guide. For a list of built-in layers, see List of Deep Learning Layers.
To define a custom deep learning layer, you can use the template provided in this example, which takes you through the following steps:
Name the layer — Give the layer a name so that you can use it in MATLAB®.
Declare the layer properties — Specify the properties of the layer, including learnable parameters and state parameters.
Create a constructor function (optional) — Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then at creation, the software initializes the `Name`, `Description`, and `Type` properties with `[]` and sets the number of layer inputs and outputs to 1.
Create forward functions — Specify how data passes forward through the layer (forward propagation) at prediction time and at training time.
Create reset state function (optional) — Specify how to reset state parameters.
Create a backward function (optional) — Specify the derivatives of the loss with respect to the input data and the learnable parameters (backward propagation). If you do not specify a backward function, then the forward functions must support `dlarray` objects.
When defining the layer functions, you can use `dlarray` objects. Using `dlarray` objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the `"S"`, `"T"`, `"C"`, and `"B"` labels, respectively. For unspecified and other dimensions, use the `"U"` label. For `dlarray` object functions that operate over particular dimensions, you can specify the dimension labels by formatting the `dlarray` object directly, or by using the `DataFormat` option.
Using formatted `dlarray` objects in custom layers also allows you to define layers where the inputs and outputs have different formats, such as layers that permute, add, or remove dimensions. For example, you can define a layer that takes as input a mini-batch of images with the format `"SSCB"` (spatial, spatial, channel, batch) and outputs a mini-batch of sequences with the format `"CBT"` (channel, batch, time). Using formatted `dlarray` objects also allows you to define layers that can operate on data with different input formats, for example, layers that support inputs with the formats `"SSCB"` (spatial, spatial, channel, batch) and `"CBT"` (channel, batch, time).
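For instance, a minimal sketch of working with a formatted `dlarray` (the data here is arbitrary and for illustration only):

```matlab
% Create a mini-batch of 16 sequences, each with 12 channels and
% 20 time steps, as a formatted dlarray.
X = dlarray(rand(12,16,20),"CBT");

% Query the dimension lengths by label rather than by position.
numChannels   = size(X,finddim(X,"C"));   % 12
miniBatchSize = size(X,finddim(X,"B"));   % 16
numTimeSteps  = size(X,finddim(X,"T"));   % 20
```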
`dlarray` objects also enable support for automatic differentiation. Consequently, if your forward functions fully support `dlarray` objects, then defining the backward function is optional.
To enable support for using formatted `dlarray` objects in custom layer forward functions, also inherit from the `nnet.layer.Formattable` class when defining the custom layer. For an example, see Define Custom Deep Learning Layer with Formatted Inputs.
This example shows how to define a peephole LSTM layer [1], which is a recurrent layer with learnable parameters, and use it in a neural network. A peephole LSTM layer is a variant of an LSTM layer, where the gate calculations use the layer cell state.
Intermediate Layer Template
Copy the intermediate layer template into a new file in MATLAB. This template gives the structure of an intermediate layer class definition. It outlines:
- The optional `properties` blocks for the layer properties, learnable parameters, and state parameters.
- The layer constructor function.
- The `predict` function and the optional `forward` function.
- The optional `resetState` function for layers with state properties.
- The optional `backward` function.
```matlab
classdef myLayer < nnet.layer.Layer % ...
        % & nnet.layer.Formattable ... % (Optional)
        % & nnet.layer.Acceleratable % (Optional)

    properties
        % (Optional) Layer properties.

        % Declare layer properties here.
    end

    properties (Learnable)
        % (Optional) Layer learnable parameters.

        % Declare learnable parameters here.
    end

    properties (State)
        % (Optional) Layer state parameters.

        % Declare state parameters here.
    end

    properties (Learnable, State)
        % (Optional) Nested dlnetwork objects with both learnable
        % parameters and state parameters.

        % Declare nested networks with learnable and state parameters here.
    end

    methods
        function layer = myLayer()
            % (Optional) Create a myLayer.
            % This function must have the same name as the class.

            % Define layer constructor function here.
        end

        function [Z,state] = predict(layer,X)
            % Forward input data through the layer at prediction time and
            % output the result and updated state.
            %
            % Inputs:
            %         layer - Layer to forward propagate through
            %         X     - Input data
            % Outputs:
            %         Z     - Output of layer forward function
            %         state - (Optional) Updated layer state
            %
            %  - For layers with multiple inputs, replace X with X1,...,XN,
            %    where N is the number of inputs.
            %  - For layers with multiple outputs, replace Z with
            %    Z1,...,ZM, where M is the number of outputs.
            %  - For layers with multiple state parameters, replace state
            %    with state1,...,stateK, where K is the number of state
            %    parameters.

            % Define layer predict function here.
        end

        function [Z,state,memory] = forward(layer,X)
            % (Optional) Forward input data through the layer at training
            % time and output the result, the updated state, and a memory
            % value.
            %
            % Inputs:
            %         layer - Layer to forward propagate through
            %         X     - Layer input data
            % Outputs:
            %         Z      - Output of layer forward function
            %         state  - (Optional) Updated layer state
            %         memory - (Optional) Memory value for custom backward
            %                  function
            %
            %  - For layers with multiple inputs, replace X with X1,...,XN,
            %    where N is the number of inputs.
            %  - For layers with multiple outputs, replace Z with
            %    Z1,...,ZM, where M is the number of outputs.
            %  - For layers with multiple state parameters, replace state
            %    with state1,...,stateK, where K is the number of state
            %    parameters.

            % Define layer forward function here.
        end

        function layer = resetState(layer)
            % (Optional) Reset layer state.

            % Define reset state function here.
        end

        function [dLdX,dLdW,dLdSin] = backward(layer,X,Z,dLdZ,dLdSout,memory)
            % (Optional) Backward propagate the derivative of the loss
            % function through the layer.
            %
            % Inputs:
            %         layer   - Layer to backward propagate through
            %         X       - Layer input data
            %         Z       - Layer output data
            %         dLdZ    - Derivative of loss with respect to layer
            %                   output
            %         dLdSout - (Optional) Derivative of loss with respect
            %                   to state output
            %         memory  - Memory value from forward function
            % Outputs:
            %         dLdX   - Derivative of loss with respect to layer input
            %         dLdW   - (Optional) Derivative of loss with respect to
            %                  learnable parameter
            %         dLdSin - (Optional) Derivative of loss with respect to
            %                  state input
            %
            %  - For layers with state parameters, the backward syntax must
            %    include both dLdSout and dLdSin, or neither.
            %  - For layers with multiple inputs, replace X and dLdX with
            %    X1,...,XN and dLdX1,...,dLdXN, respectively, where N is
            %    the number of inputs.
            %  - For layers with multiple outputs, replace Z and dLdZ with
            %    Z1,...,ZM and dLdZ1,...,dLdZM, respectively, where M is the
            %    number of outputs.
            %  - For layers with multiple learnable parameters, replace
            %    dLdW with dLdW1,...,dLdWP, where P is the number of
            %    learnable parameters.
            %  - For layers with multiple state parameters, replace dLdSin
            %    and dLdSout with dLdSin1,...,dLdSinK and
            %    dLdSout1,...,dLdSoutK, respectively, where K is the number
            %    of state parameters.

            % Define layer backward function here.
        end
    end
end
```
Name Layer
First, give the layer a name. In the first line of the class file, replace the existing name `myLayer` with `peepholeLSTMLayer`. To allow the layer to output different data formats, for example data with the format `"CBT"` (channel, batch, time) for sequence output and the format `"CB"` (channel, batch) for single time step or feature output, also include the `nnet.layer.Formattable` mixin.
```matlab
classdef peepholeLSTMLayer < nnet.layer.Layer & nnet.layer.Formattable
    ...
end
```
Next, rename the `myLayer` constructor function (the first function in the `methods` section) so that it has the same name as the layer.
```matlab
methods
    function layer = peepholeLSTMLayer()
        ...
    end

    ...
end
```
Save Layer
Save the layer class file in a new file named `peepholeLSTMLayer.m`. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.
Declare Properties, State, and Learnable Parameters
Declare the layer properties in the `properties` section, the layer states in the `properties (State)` section, and the learnable parameters in the `properties (Learnable)` section.
By default, custom intermediate layers have these properties. Do not declare these properties in the `properties` section.
Property | Description |
---|---|
Name |
Layer name, specified as a character vector or a string scalar. For `Layer` array input, the `trainNetwork`, `assembleNetwork`, `layerGraph`, and `dlnetwork` functions automatically assign names to layers with name `''`. |
Description |
One-line description of the layer, specified as a string scalar or a character vector. This description appears when the layer is displayed in a `Layer` array. If you do not specify a layer description, then the software displays the layer class name. |
Type |
Type of the layer, specified as a character vector or a string scalar. The value of `Type` appears when the layer is displayed in a `Layer` array. If you do not specify a layer type, then the software displays the layer class name. |
NumInputs |
Number of inputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets `NumInputs` to the number of names in `InputNames`. The default value is 1. |
InputNames |
Input names of the layer, specified as a cell array of character vectors. If you do not specify this value and `NumInputs` is greater than 1, then the software automatically sets `InputNames` to `{'in1',...,'inN'}`, where `N` is equal to `NumInputs`. The default value is `{'in'}`. |
NumOutputs |
Number of outputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets `NumOutputs` to the number of names in `OutputNames`. The default value is 1. |
OutputNames |
Output names of the layer, specified as a cell array of character vectors. If you do not specify this value and `NumOutputs` is greater than 1, then the software automatically sets `OutputNames` to `{'out1',...,'outM'}`, where `M` is equal to `NumOutputs`. The default value is `{'out'}`. |
If the layer has no other properties, then you can omit the `properties` section.
Tip
If you are creating a layer with multiple inputs, then you must set either the `NumInputs` or `InputNames` property in the layer constructor. If you are creating a layer with multiple outputs, then you must set either the `NumOutputs` or `OutputNames` property in the layer constructor. For an example, see Define Custom Deep Learning Layer with Multiple Inputs.
Declare the following layer properties in the `properties` section:
- `NumHiddenUnits` — Number of hidden units in the peephole LSTM operation
- `OutputMode` — Flag indicating whether the layer returns a sequence or a single time step
```matlab
properties
    % Layer properties.
    NumHiddenUnits
    OutputMode
end
```
A peephole LSTM layer has four learnable parameters: the input weights, the recurrent weights, the peephole weights, and the bias. Declare these learnable parameters in the `properties (Learnable)` section with the names `InputWeights`, `RecurrentWeights`, `PeepholeWeights`, and `Bias`, respectively.
```matlab
properties (Learnable)
    % Layer learnable parameters.
    InputWeights
    RecurrentWeights
    PeepholeWeights
    Bias
end
```
A peephole LSTM layer has two state parameters: the hidden state and the cell state. Declare these state parameters in the `properties (State)` section with the names `HiddenState` and `CellState`, respectively.
```matlab
properties (State)
    % Layer state parameters.
    HiddenState
    CellState
end
```
Parallel training of networks containing custom layers with state parameters using the `trainNetwork` function is not supported. When you train a network with custom layers with state parameters, the `ExecutionEnvironment` training option must be `"auto"`, `"gpu"`, or `"cpu"`.
Create Constructor Function
Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.
The peephole LSTM layer constructor function requires two input arguments (the number of hidden units and the number of input channels) and two optional arguments (the layer name and output mode). Specify two input arguments named `numHiddenUnits` and `inputSize` in the `peepholeLSTMLayer` function that correspond to the number of hidden units and the number of input channels, respectively. Specify the optional input arguments as a single argument with the name `args`. Add a comment to the top of the function that explains the syntaxes of the function.
```matlab
function layer = peepholeLSTMLayer(numHiddenUnits,inputSize,args)
    %PEEPHOLELSTMLAYER Peephole LSTM Layer
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize)
    %   creates a peephole LSTM layer with the specified number of
    %   hidden units and input channels.
    %
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize,Name=Value)
    %   creates a peephole LSTM layer and specifies additional
    %   options using one or more name-value arguments:
    %
    %   Name       - Name of the layer, specified as a string.
    %                The default is "".
    %
    %   OutputMode - Output mode, specified as one of the
    %                following:
    %                   "sequence" - Output the entire sequence
    %                                of data.
    %                   "last"     - Output the last time step
    %                                of the data.
    %                The default is "sequence".

    ...
end
```
Initialize Layer Properties
Initialize the layer properties, including the learnable and state parameters, in the constructor function. Replace the comment `% Layer constructor function goes here` with code that initializes the layer properties.
Parse the input arguments using an `arguments` block and set the `Name` and `OutputMode` properties.
```matlab
arguments
    numHiddenUnits
    inputSize
    args.Name = "";
    args.OutputMode = "sequence";
end

layer.NumHiddenUnits = numHiddenUnits;
layer.Name = args.Name;
layer.OutputMode = args.OutputMode;
```
Give the layer a one-line description by setting the `Description` property of the layer. Set the description to describe the type of the layer and its size.
```matlab
% Set layer description.
layer.Description = "Peephole LSTM with " + numHiddenUnits + " hidden units";
```
Initialize the learnable parameters. Initialize the input weights using Glorot initialization. Initialize the recurrent weights using orthogonal initialization. Initialize the bias using unit-forget-gate normalization. This code uses the helper functions `initializeGlorot`, `initializeOrthogonal`, and `initializeUnitForgetGate`. To access these functions, open the example as a live script. For more information about initializing weights, see Initialize Learnable Parameters for Model Function.
Note that a peephole LSTM layer has peephole weights, which standard LSTM layers do not have. The gate calculations use the peephole weights, but the cell candidate calculation does not, so `PeepholeWeights` is a `3*NumHiddenUnits`-by-1 array.
```matlab
% Initialize weights and bias.
sz = [4*numHiddenUnits inputSize];
numOut = 4*numHiddenUnits;
numIn = inputSize;
layer.InputWeights = initializeGlorot(sz,numOut,numIn);

sz = [4*numHiddenUnits numHiddenUnits];
layer.RecurrentWeights = initializeOrthogonal(sz);

sz = [3*numHiddenUnits 1];
numOut = 3*numHiddenUnits;
numIn = 1;
layer.PeepholeWeights = initializeGlorot(sz,numOut,numIn);

layer.Bias = initializeUnitForgetGate(numHiddenUnits);
```
Initialize the layer state parameters. For convenience, use the `resetState` function defined in the Create Reset State Function section.

```matlab
% Initialize layer states.
layer = resetState(layer);
```
View the completed constructor function.
```matlab
function layer = peepholeLSTMLayer(numHiddenUnits,inputSize,args)
    %PEEPHOLELSTMLAYER Peephole LSTM Layer
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize)
    %   creates a peephole LSTM layer with the specified number of
    %   hidden units and input channels.
    %
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize,Name=Value)
    %   creates a peephole LSTM layer and specifies additional
    %   options using one or more name-value arguments:
    %
    %   Name       - Name of the layer, specified as a string.
    %                The default is "".
    %
    %   OutputMode - Output mode, specified as one of the
    %                following:
    %                   "sequence" - Output the entire sequence
    %                                of data.
    %                   "last"     - Output the last time step
    %                                of the data.
    %                The default is "sequence".

    % Parse input arguments.
    arguments
        numHiddenUnits
        inputSize
        args.Name = "";
        args.OutputMode = "sequence";
    end

    layer.NumHiddenUnits = numHiddenUnits;
    layer.Name = args.Name;
    layer.OutputMode = args.OutputMode;

    % Set layer description.
    layer.Description = "Peephole LSTM with " + numHiddenUnits + " hidden units";

    % Initialize weights and bias.
    sz = [4*numHiddenUnits inputSize];
    numOut = 4*numHiddenUnits;
    numIn = inputSize;
    layer.InputWeights = initializeGlorot(sz,numOut,numIn);

    sz = [4*numHiddenUnits numHiddenUnits];
    layer.RecurrentWeights = initializeOrthogonal(sz);

    sz = [3*numHiddenUnits 1];
    numOut = 3*numHiddenUnits;
    numIn = 1;
    layer.PeepholeWeights = initializeGlorot(sz,numOut,numIn);

    layer.Bias = initializeUnitForgetGate(numHiddenUnits);

    % Initialize layer states.
    layer = resetState(layer);
end
```
With this constructor function, the command `peepholeLSTMLayer(200,12,OutputMode="last",Name="peephole")` creates a peephole LSTM layer with 200 hidden units, an input size of 12, and the name `"peephole"`, and outputs the last time step of the peephole LSTM operation.
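For example, assuming the initialization helper functions from the example are on the path, you can create layer objects like this:

```matlab
% Sequence-output peephole LSTM layer with default name.
layer1 = peepholeLSTMLayer(200,12);

% Last-step-output layer with a custom name.
layer2 = peepholeLSTMLayer(200,12,OutputMode="last",Name="peephole");
```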
Create Predict Function
Create the layer forward functions to use at prediction time and training time.
Create a function named `predict` that propagates the data forward through the layer at prediction time and outputs the result.
The `predict` function syntax depends on the type of layer.

`Z = predict(layer,X)` forwards the input data `X` through the layer and outputs the result `Z`, where `layer` has a single input and a single output. `[Z,state] = predict(layer,X)` also outputs the updated state parameter `state`, where `layer` has a single state parameter.
You can adjust the syntaxes for layers with multiple inputs, multiple outputs, or multiple state parameters:
- For layers with multiple inputs, replace `X` with `X1,...,XN`, where `N` is the number of inputs. The `NumInputs` property must match `N`.
- For layers with multiple outputs, replace `Z` with `Z1,...,ZM`, where `M` is the number of outputs. The `NumOutputs` property must match `M`.
- For layers with multiple state parameters, replace `state` with `state1,...,stateK`, where `K` is the number of state parameters.
Tip
If the number of inputs to the layer can vary, then use `varargin` instead of `X1,…,XN`. In this case, `varargin` is a cell array of the inputs, where `varargin{i}` corresponds to `Xi`.
If the number of outputs can vary, then use `varargout` instead of `Z1,…,ZN`. In this case, `varargout` is a cell array of the outputs, where `varargout{j}` corresponds to `Zj`.
Tip
If the custom layer has a `dlnetwork` object for a learnable parameter, then in the `predict` function of the custom layer, use the `predict` function for the `dlnetwork`. When you do so, the `dlnetwork` object `predict` function uses the appropriate layer operations for prediction.
Because a peephole LSTM layer has only one input, one output, and two state parameters, the syntax for `predict` for a peephole LSTM layer is `[Z,hiddenState,cellState] = predict(layer,X)`.
By default, the layer uses `predict` as the forward function at training time. To use a different forward function at training time, or retain a value required for a custom backward function, you must also create a function named `forward`.
Because the layer inherits from `nnet.layer.Formattable`, the layer inputs are formatted `dlarray` objects and the `predict` function must also output data as formatted `dlarray` objects.
The hidden state at time step t is given by

h_t = tanh(c_t) ⊙ o_t,

where ⊙ denotes the Hadamard product (element-wise multiplication of vectors).

The cell state at time step t is given by

c_t = g_t ⊙ i_t + c_{t-1} ⊙ f_t.

The following formulas describe the components at time step t.

Component | Formula |
---|---|
Input gate | i_t = σ_g(W_i x_t + R_i h_{t-1} + p_i ⊙ c_{t-1} + b_i) |
Forget gate | f_t = σ_g(W_f x_t + R_f h_{t-1} + p_f ⊙ c_{t-1} + b_f) |
Cell candidate | g_t = σ_s(W_g x_t + R_g h_{t-1} + b_g) |
Output gate | o_t = σ_g(W_o x_t + R_o h_{t-1} + p_o ⊙ c_t + b_o) |

Note that the output gate calculation requires the updated cell state c_t.

In these calculations, σ_g and σ_s denote the gate and state activation functions. For peephole LSTM layers, use the sigmoid and hyperbolic tangent functions as the gate and state activation functions, respectively.
Implement this operation in the `predict` function. Because the layer does not require a different forward function for training or a memory value for a custom backward function, you can remove the `forward` function from the class file. Add a comment to the top of the function that explains the syntaxes of the function.
Tip
If you preallocate arrays using functions such as `zeros`, then you must ensure that the data types of these arrays are consistent with the layer function inputs. To create an array of zeros of the same data type as another array, use the `"like"` option of `zeros`. For example, to initialize an array of zeros of size `sz` with the same data type as the array `X`, use `Z = zeros(sz,"like",X)`.
```matlab
function [Z,hiddenState,cellState] = predict(layer,X)
    %PREDICT Peephole LSTM predict function
    %   [Z,hiddenState,cellState] = predict(layer,X) forward
    %   propagates the data X through the layer and returns the
    %   layer output Z and the updated hidden and cell states. X
    %   is a dlarray with format "CBT" and Z is a dlarray with
    %   format "CB" or "CBT", depending on the layer OutputMode
    %   property.

    % Initialize sequence output.
    numHiddenUnits = layer.NumHiddenUnits;
    miniBatchSize = size(X,finddim(X,"B"));
    numTimeSteps = size(X,finddim(X,"T"));

    if layer.OutputMode == "sequence"
        Z = zeros(numHiddenUnits,miniBatchSize,numTimeSteps,"like",X);
        Z = dlarray(Z,"CBT");
    end

    % Calculate WX + b.
    X = stripdims(X);
    WX = pagemtimes(layer.InputWeights,X) + layer.Bias;

    % Indices of concatenated weight arrays.
    idx1 = 1:numHiddenUnits;
    idx2 = 1+numHiddenUnits:2*numHiddenUnits;
    idx3 = 1+2*numHiddenUnits:3*numHiddenUnits;
    idx4 = 1+3*numHiddenUnits:4*numHiddenUnits;

    % Initial states.
    hiddenState = layer.HiddenState;
    cellState = layer.CellState;

    % Loop over time steps.
    for t = 1:numTimeSteps
        % Calculate R*h_{t-1}.
        Rht = layer.RecurrentWeights * hiddenState;

        % Calculate p.*c_{t-1}.
        pict = layer.PeepholeWeights(idx1) .* cellState;
        pfct = layer.PeepholeWeights(idx2) .* cellState;

        % Gate calculations.
        it = sigmoid(WX(idx1,:,t) + Rht(idx1,:) + pict);
        ft = sigmoid(WX(idx2,:,t) + Rht(idx2,:) + pfct);
        gt = tanh(WX(idx3,:,t) + Rht(idx3,:));

        % Calculate ot using updated cell state.
        cellState = gt .* it + cellState .* ft;
        poct = layer.PeepholeWeights(idx3) .* cellState;
        ot = sigmoid(WX(idx4,:,t) + Rht(idx4,:) + poct);

        % Update hidden state.
        hiddenState = tanh(cellState) .* ot;

        % Update sequence output.
        if layer.OutputMode == "sequence"
            Z(:,:,t) = hiddenState;
        end
    end

    % Last time step output.
    if layer.OutputMode == "last"
        Z = dlarray(hiddenState,"CB");
    end
end
```
Because the `predict` function uses only functions that support `dlarray` objects, defining the `backward` function is optional. For a list of functions that support `dlarray` objects, see List of Functions with dlarray Support.
Create Reset State Function
When `DAGNetwork` or `SeriesNetwork` objects contain layers with state parameters, you can make predictions and update the layer states using the `predictAndUpdateState` and `classifyAndUpdateState` functions. You can reset the network state using the `resetState` function.
The `resetState` function for `DAGNetwork`, `SeriesNetwork`, and `dlnetwork` objects, by default, has no effect on custom layers with state parameters. To define the layer behavior for the `resetState` function for network objects, define the optional layer `resetState` function in the layer definition that resets the state parameters.
The `resetState` function must have the syntax `layer = resetState(layer)`, where the returned layer has the reset state properties.
Create a function named `resetState` that resets the layer state parameters to vectors of zeros.
```matlab
function layer = resetState(layer)
    %RESETSTATE Reset layer state
    %   layer = resetState(layer) resets the state properties of the
    %   layer.

    numHiddenUnits = layer.NumHiddenUnits;
    layer.HiddenState = zeros(numHiddenUnits,1);
    layer.CellState = zeros(numHiddenUnits,1);
end
```
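With this function in place, resetting a layer restores zero-valued states. A quick sketch, assuming the initialization helper functions from the example are on the path:

```matlab
layer = peepholeLSTMLayer(200,12);
layer.HiddenState = rand(200,1);   % simulate an updated state
layer = resetState(layer);
all(layer.HiddenState == 0)        % returns logical 1
```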
Completed Layer
View the completed layer class file.
```matlab
classdef peepholeLSTMLayer < nnet.layer.Layer & nnet.layer.Formattable
    %PEEPHOLELSTMLAYER Peephole LSTM Layer

    properties
        % Layer properties.
        NumHiddenUnits
        OutputMode
    end

    properties (Learnable)
        % Layer learnable parameters.
        InputWeights
        RecurrentWeights
        PeepholeWeights
        Bias
    end

    properties (State)
        % Layer state parameters.
        HiddenState
        CellState
    end

    methods
        function layer = peepholeLSTMLayer(numHiddenUnits,inputSize,args)
            %PEEPHOLELSTMLAYER Peephole LSTM Layer
            %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize)
            %   creates a peephole LSTM layer with the specified number of
            %   hidden units and input channels.
            %
            %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize,Name=Value)
            %   creates a peephole LSTM layer and specifies additional
            %   options using one or more name-value arguments:
            %
            %   Name       - Name of the layer, specified as a string.
            %                The default is "".
            %
            %   OutputMode - Output mode, specified as one of the
            %                following:
            %                   "sequence" - Output the entire sequence
            %                                of data.
            %                   "last"     - Output the last time step
            %                                of the data.
            %                The default is "sequence".

            % Parse input arguments.
            arguments
                numHiddenUnits
                inputSize
                args.Name = "";
                args.OutputMode = "sequence";
            end

            layer.NumHiddenUnits = numHiddenUnits;
            layer.Name = args.Name;
            layer.OutputMode = args.OutputMode;

            % Set layer description.
            layer.Description = "Peephole LSTM with " + numHiddenUnits + " hidden units";

            % Initialize weights and bias.
            sz = [4*numHiddenUnits inputSize];
            numOut = 4*numHiddenUnits;
            numIn = inputSize;
            layer.InputWeights = initializeGlorot(sz,numOut,numIn);

            sz = [4*numHiddenUnits numHiddenUnits];
            layer.RecurrentWeights = initializeOrthogonal(sz);

            sz = [3*numHiddenUnits 1];
            numOut = 3*numHiddenUnits;
            numIn = 1;
            layer.PeepholeWeights = initializeGlorot(sz,numOut,numIn);

            layer.Bias = initializeUnitForgetGate(numHiddenUnits);

            % Initialize layer states.
            layer = resetState(layer);
        end

        function [Z,hiddenState,cellState] = predict(layer,X)
            %PREDICT Peephole LSTM predict function
            %   [Z,hiddenState,cellState] = predict(layer,X) forward
            %   propagates the data X through the layer and returns the
            %   layer output Z and the updated hidden and cell states. X
            %   is a dlarray with format "CBT" and Z is a dlarray with
            %   format "CB" or "CBT", depending on the layer OutputMode
            %   property.

            % Initialize sequence output.
            numHiddenUnits = layer.NumHiddenUnits;
            miniBatchSize = size(X,finddim(X,"B"));
            numTimeSteps = size(X,finddim(X,"T"));

            if layer.OutputMode == "sequence"
                Z = zeros(numHiddenUnits,miniBatchSize,numTimeSteps,"like",X);
                Z = dlarray(Z,"CBT");
            end

            % Calculate WX + b.
            X = stripdims(X);
            WX = pagemtimes(layer.InputWeights,X) + layer.Bias;

            % Indices of concatenated weight arrays.
            idx1 = 1:numHiddenUnits;
            idx2 = 1+numHiddenUnits:2*numHiddenUnits;
            idx3 = 1+2*numHiddenUnits:3*numHiddenUnits;
            idx4 = 1+3*numHiddenUnits:4*numHiddenUnits;

            % Initial states.
            hiddenState = layer.HiddenState;
            cellState = layer.CellState;

            % Loop over time steps.
            for t = 1:numTimeSteps
                % Calculate R*h_{t-1}.
                Rht = layer.RecurrentWeights * hiddenState;

                % Calculate p.*c_{t-1}.
                pict = layer.PeepholeWeights(idx1) .* cellState;
                pfct = layer.PeepholeWeights(idx2) .* cellState;

                % Gate calculations.
                it = sigmoid(WX(idx1,:,t) + Rht(idx1,:) + pict);
                ft = sigmoid(WX(idx2,:,t) + Rht(idx2,:) + pfct);
                gt = tanh(WX(idx3,:,t) + Rht(idx3,:));

                % Calculate ot using updated cell state.
                cellState = gt .* it + cellState .* ft;
                poct = layer.PeepholeWeights(idx3) .* cellState;
                ot = sigmoid(WX(idx4,:,t) + Rht(idx4,:) + poct);

                % Update hidden state.
                hiddenState = tanh(cellState) .* ot;

                % Update sequence output.
                if layer.OutputMode == "sequence"
                    Z(:,:,t) = hiddenState;
                end
            end

            % Last time step output.
            if layer.OutputMode == "last"
                Z = dlarray(hiddenState,"CB");
            end
        end

        function layer = resetState(layer)
            %RESETSTATE Reset layer state
            %   layer = resetState(layer) resets the state properties of the
            %   layer.

            numHiddenUnits = layer.NumHiddenUnits;
            layer.HiddenState = zeros(numHiddenUnits,1);
            layer.CellState = zeros(numHiddenUnits,1);
        end
    end
end
```
GPU Compatibility
If the layer forward functions fully support `dlarray` objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type `gpuArray` (Parallel Computing Toolbox).
Many MATLAB built-in functions support `gpuArray` (Parallel Computing Toolbox) and `dlarray` input arguments. For a list of functions that support `dlarray` objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Support by Release (Parallel Computing Toolbox). For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).
In this example, the MATLAB functions used in `predict` all support `dlarray` objects, so the layer is GPU compatible.
Include Custom Layer in Network
You can use a custom layer in the same way as any other layer in Deep Learning Toolbox. Create and train a network for sequence classification using the peephole LSTM layer you created earlier.
Load the example training data.
[XTrain,TTrain] = japaneseVowelsTrainData;
Define the network architecture. Create a layer array containing a peephole LSTM layer.
```matlab
inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;

layers = [
    sequenceInputLayer(inputSize)
    peepholeLSTMLayer(numHiddenUnits,inputSize,OutputMode="last")
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];
```
Specify the training options and train the network. Train with a mini-batch size of 27 and left-pad the data.
```matlab
options = trainingOptions("adam", ...
    MiniBatchSize=27, ...
    SequencePaddingDirection="left");
net = trainNetwork(XTrain,TTrain,layers,options);
```
Training on single CPU. |========================================================================================| | Epoch | Iteration | Time Elapsed | Mini-batch | Mini-batch | Base Learning | | | | (hh:mm:ss) | Accuracy | Loss | Rate | |========================================================================================| | 1 | 1 | 00:00:01 | 3.70% | 2.2060 | 0.0010 | | 5 | 50 | 00:00:26 | 92.59% | 0.5917 | 0.0010 | | 10 | 100 | 00:00:45 | 92.59% | 0.2182 | 0.0010 | | 15 | 150 | 00:01:01 | 100.00% | 0.0588 | 0.0010 | | 20 | 200 | 00:01:17 | 96.30% | 0.0872 | 0.0010 | | 25 | 250 | 00:01:37 | 100.00% | 0.0329 | 0.0010 | | 30 | 300 | 00:01:54 | 100.00% | 0.0141 | 0.0010 | |========================================================================================| Training finished: Max epochs completed.
Evaluate the network performance by predicting on new data and calculating the accuracy.
```matlab
[XTest,TTest] = japaneseVowelsTestData;
YTest = classify(net,XTest,MiniBatchSize=27);
accuracy = mean(YTest==TTest)
```
accuracy = 0.8703
References
[1] Greff, Klaus, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, and Jürgen Schmidhuber. "LSTM: A Search Space Odyssey." *IEEE Transactions on Neural Networks and Learning Systems* 28, no. 10 (2017): 2222–2232.
See Also
`functionLayer` | `checkLayer` | `setLearnRateFactor` | `setL2Factor` | `getLearnRateFactor` | `getL2Factor` | `findPlaceholderLayers` | `replaceLayer` | `assembleNetwork` | `PlaceholderLayer`
Related Topics
- Define Custom Deep Learning Intermediate Layers
- Define Custom Deep Learning Output Layers
- Define Custom Deep Learning Layer with Learnable Parameters
- Define Custom Deep Learning Layer with Multiple Inputs
- Define Custom Deep Learning Layer with Formatted Inputs
- Specify Custom Layer Backward Function
- Define Custom Deep Learning Layer for Code Generation
- Define Nested Deep Learning Layer
- Check Custom Layer Validity