
Define Custom Recurrent Deep Learning Layer

If Deep Learning Toolbox™ does not provide the layer you require for your task, then you can define your own custom layer using this example as a guide. For a list of built-in layers, see List of Deep Learning Layers.

To define a custom deep learning layer, you can use the template provided in this example, which takes you through the following steps:

  1. Name the layer — Give the layer a name so that you can use it in MATLAB®.

  2. Declare the layer properties — Specify the properties of the layer, including learnable parameters and state parameters.

  3. Create a constructor function (optional) — Specify how to construct the layer and initialize its properties. If you do not specify a constructor function, then at creation, the software initializes the Name, Description, and Type properties with [] and sets the number of layer inputs and outputs to 1.

  4. Create forward functions — Specify how data passes forward through the layer (forward propagation) at prediction time and at training time.

  5. Create reset state function (optional) — Specify how to reset state parameters.

  6. Create a backward function (optional) — Specify the derivatives of the loss with respect to the input data and the learnable parameters (backward propagation). If you do not specify a backward function, then the forward functions must support dlarray objects.

When defining the layer functions, you can use dlarray objects. Using dlarray objects makes working with high dimensional data easier by allowing you to label the dimensions. For example, you can label which dimensions correspond to spatial, time, channel, and batch dimensions using the "S", "T", "C", and "B" labels, respectively. For unspecified and other dimensions, use the "U" label. For dlarray object functions that operate over particular dimensions, you can specify the dimension labels by formatting the dlarray object directly, or by using the DataFormat option.
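
For example, this short sketch (the sizes here are arbitrary) shows both ways of supplying dimension labels:

% Label a mini-batch of sequence data directly: 12 channels, 16 observations,
% 20 time steps.
X = dlarray(rand(12,16,20),"CBT");
timeDim = finddim(X,"T");    % dimension 3 carries the "T" label

% Alternatively, leave the dlarray unformatted and pass the labels using the
% DataFormat option of a dimension-aware function.
Xu = dlarray(rand(12,16,20));
Y = softmax(Xu,DataFormat="CBT");    % softmax operates over the "C" dimension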

Using formatted dlarray objects in custom layers also allows you to define layers where the inputs and outputs have different formats, such as layers that permute, add, or remove dimensions. For example, you can define a layer that takes as input a mini-batch of images with the format "SSCB" (spatial, spatial, channel, batch) and outputs a mini-batch of sequences with the format "CBT" (channel, batch, time). Using formatted dlarray objects also allows you to define layers that can operate on data with different input formats, for example, layers that support inputs with the formats "SSCB" (spatial, spatial, channel, batch) and "CBT" (channel, batch, time).

dlarray objects also enable support for automatic differentiation. Consequently, if your forward functions fully support dlarray objects, then defining the backward function is optional.

To enable support for using formatted dlarray objects in custom layer forward functions, also inherit from the nnet.layer.Formattable class when defining the custom layer. For an example, see Define Custom Deep Learning Layer with Formatted Inputs.

This example shows how to define a peephole LSTM layer [1], which is a recurrent layer with learnable parameters, and use it in a neural network. A peephole LSTM layer is a variant of an LSTM layer, where the gate calculations use the layer cell state.

Intermediate Layer Template

Copy the intermediate layer template into a new file in MATLAB. This template gives the structure of an intermediate layer class definition. It outlines:

  • The optional properties blocks for the layer properties, learnable parameters, and state parameters.

  • The layer constructor function.

  • The predict function and the optional forward function.

  • The optional resetState function for layers with state properties.

  • The optional backward function.

classdef myLayer < nnet.layer.Layer % ...
        % & nnet.layer.Formattable ... % (Optional)
        % & nnet.layer.Acceleratable % (Optional)

    properties
        % (Optional) Layer properties.

        % Declare layer properties here.
    end

    properties (Learnable)
        % (Optional) Layer learnable parameters.

        % Declare learnable parameters here.
    end

    properties (State)
        % (Optional) Layer state parameters.

        % Declare state parameters here.
    end

    properties (Learnable, State)
        % (Optional) Nested dlnetwork objects with both learnable
        % parameters and state parameters.

        % Declare nested networks with learnable and state parameters here.
    end

    methods
        function layer = myLayer()
            % (Optional) Create a myLayer.
            % This function must have the same name as the class.

            % Define layer constructor function here.
        end

        function [Z,state] = predict(layer,X)
            % Forward input data through the layer at prediction time and
            % output the result and updated state.
            %
            % Inputs:
            %         layer - Layer to forward propagate through
            %         X     - Input data
            % Outputs:
            %         Z     - Output of layer forward function
            %         state - (Optional) Updated layer state
            %
            %  - For layers with multiple inputs, replace X with X1,...,XN,
            %    where N is the number of inputs.
            %  - For layers with multiple outputs, replace Z with
            %    Z1,...,ZM, where M is the number of outputs.
            %  - For layers with multiple state parameters, replace state
            %    with state1,...,stateK, where K is the number of state
            %    parameters.

            % Define layer predict function here.
        end

        function [Z,state,memory] = forward(layer,X)
            % (Optional) Forward input data through the layer at training
            % time and output the result, the updated state, and a memory
            % value.
            %
            % Inputs:
            %         layer - Layer to forward propagate through
            %         X     - Layer input data
            % Outputs:
            %         Z      - Output of layer forward function
            %         state  - (Optional) Updated layer state
            %         memory - (Optional) Memory value for custom backward
            %                  function
            %
            %  - For layers with multiple inputs, replace X with X1,...,XN,
            %    where N is the number of inputs.
            %  - For layers with multiple outputs, replace Z with
            %    Z1,...,ZM, where M is the number of outputs.
            %  - For layers with multiple state parameters, replace state
            %    with state1,...,stateK, where K is the number of state
            %    parameters.

            % Define layer forward function here.
        end

        function layer = resetState(layer)
            % (Optional) Reset layer state.

            % Define reset state function here.
        end

        function [dLdX,dLdW,dLdSin] = backward(layer,X,Z,dLdZ,dLdSout,memory)
            % (Optional) Backward propagate the derivative of the loss
            % function through the layer.
            %
            % Inputs:
            %         layer   - Layer to backward propagate through
            %         X       - Layer input data
            %         Z       - Layer output data
            %         dLdZ    - Derivative of loss with respect to layer
            %                   output
            %         dLdSout - (Optional) Derivative of loss with respect
            %                   to state output
            %         memory  - Memory value from forward function
            % Outputs:
            %         dLdX   - Derivative of loss with respect to layer input
            %         dLdW   - (Optional) Derivative of loss with respect to
            %                  learnable parameter
            %         dLdSin - (Optional) Derivative of loss with respect to
            %                  state input
            %
            %  - For layers with state parameters, the backward syntax must
            %    include both dLdSout and dLdSin, or neither.
            %  - For layers with multiple inputs, replace X and dLdX with
            %    X1,...,XN and dLdX1,...,dLdXN, respectively, where N is
            %    the number of inputs.
            %  - For layers with multiple outputs, replace Z and dLdZ with
            %    Z1,...,ZM and dLdZ1,...,dLdZM, respectively, where M is the
            %    number of outputs.
            %  - For layers with multiple learnable parameters, replace
            %    dLdW with dLdW1,...,dLdWP, where P is the number of
            %    learnable parameters.
            %  - For layers with multiple state parameters, replace dLdSin
            %    and dLdSout with dLdSin1,...,dLdSinK and
            %    dLdSout1,...,dLdSoutK, respectively, where K is the number
            %    of state parameters.

            % Define layer backward function here.
        end
    end
end

Name Layer

First, give the layer a name. In the first line of the class file, replace the existing name myLayer with peepholeLSTMLayer. To allow the layer to output different data formats, for example data with the format "CBT" (channel, batch, time) for sequence output and the format "CB" (channel, batch) for single time step or feature output, also include the nnet.layer.Formattable mixin.

classdef peepholeLSTMLayer < nnet.layer.Layer & nnet.layer.Formattable
    ...
end

Next, rename the myLayer constructor function (the first function in the methods section) so that it has the same name as the layer.

methods
    function layer = peepholeLSTMLayer()
        ...
    end

    ...
end

Save Layer

Save the layer class file in a new file named peepholeLSTMLayer.m. The file name must match the layer name. To use the layer, you must save the file in the current folder or in a folder on the MATLAB path.

Declare Properties, State, and Learnable Parameters

Declare the layer properties in the properties section, the layer states in the properties (State) section, and the learnable parameters in the properties (Learnable) section.

By default, custom intermediate layers have these properties. Do not declare these properties in the properties section.

  • Name — Layer name, specified as a character vector or a string scalar. For Layer array input, the trainNetwork, assembleNetwork, layerGraph, and dlnetwork functions automatically assign names to layers with the name ''.

  • Description — One-line description of the layer, specified as a string scalar or a character vector. This description appears when the layer is displayed in a Layer array. If you do not specify a layer description, then the software displays the layer class name.

  • Type — Type of the layer, specified as a character vector or a string scalar. The value of Type appears when the layer is displayed in a Layer array. If you do not specify a layer type, then the software displays the layer class name.

  • NumInputs — Number of inputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets NumInputs to the number of names in InputNames. The default value is 1.

  • InputNames — Input names of the layer, specified as a cell array of character vectors. If you do not specify this value and NumInputs is greater than 1, then the software automatically sets InputNames to {'in1',...,'inN'}, where N is equal to NumInputs. The default value is {'in'}.

  • NumOutputs — Number of outputs of the layer, specified as a positive integer. If you do not specify this value, then the software automatically sets NumOutputs to the number of names in OutputNames. The default value is 1.

  • OutputNames — Output names of the layer, specified as a cell array of character vectors. If you do not specify this value and NumOutputs is greater than 1, then the software automatically sets OutputNames to {'out1',...,'outM'}, where M is equal to NumOutputs. The default value is {'out'}.

If the layer has no other properties, then you can omit the properties section.

Tip

If you are creating a layer with multiple inputs, then you must set either the NumInputs or InputNames properties in the layer constructor. If you are creating a layer with multiple outputs, then you must set either the NumOutputs or OutputNames properties in the layer constructor. For an example, see Define Custom Deep Learning Layer with Multiple Inputs.
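
For instance, a hypothetical two-input layer (not part of this example) could declare its inputs in its constructor like this:

function layer = myAdditionLayer()
    % Hypothetical constructor for a layer with two inputs. Setting
    % NumInputs (or, equivalently, InputNames) in the constructor
    % satisfies the requirement described in the tip above.
    layer.NumInputs = 2;
    layer.Name = "add";
end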

Declare the following layer properties in thepropertiessection:

  • NumHiddenUnits — Number of hidden units in the peephole LSTM operation

  • OutputMode — Flag indicating whether the layer returns a sequence or a single time step

properties
    % Layer properties.
    NumHiddenUnits
    OutputMode
end

A peephole LSTM layer has four learnable parameters: the input weights, the recurrent weights, the peephole weights, and the bias. Declare these learnable parameters in the properties (Learnable) section with the names InputWeights, RecurrentWeights, PeepholeWeights, and Bias, respectively.

properties (Learnable)
    % Layer learnable parameters.
    InputWeights
    RecurrentWeights
    PeepholeWeights
    Bias
end

A peephole LSTM layer has two state parameters: the hidden state and the cell state. Declare these state parameters in the properties (State) section with the names HiddenState and CellState, respectively.

properties (State)
    % Layer state parameters.
    HiddenState
    CellState
end

Parallel training of networks containing custom layers with state parameters using the trainNetwork function is not supported. When you train a network with custom layers with state parameters, the ExecutionEnvironment training option must be "auto", "gpu", or "cpu".
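
For example, a minimal options object that satisfies this requirement (all other training options keep their defaults) is:

% Train on the CPU or GPU, whichever is available; parallel environments
% are not supported for custom layers with state parameters.
options = trainingOptions("adam",ExecutionEnvironment="auto");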

Create Constructor Function

Create the function that constructs the layer and initializes the layer properties. Specify any variables required to create the layer as inputs to the constructor function.

The peephole LSTM layer constructor function requires two input arguments (the number of hidden units and the number of input channels) and two optional arguments (the layer name and output mode). Specify two input arguments named numHiddenUnits and inputSize in the peepholeLSTMLayer function that correspond to the number of hidden units and the number of input channels, respectively. Specify the optional input arguments as a single argument with the name args. Add a comment to the top of the function that explains the syntaxes of the function.

function layer = peepholeLSTMLayer(numHiddenUnits,inputSize,args)
    %PEEPHOLELSTMLAYER Peephole LSTM Layer
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize)
    %   creates a peephole LSTM layer with the specified number of
    %   hidden units and input channels.
    %
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize,Name=Value)
    %   creates a peephole LSTM layer and specifies additional
    %   options using one or more name-value arguments:
    %
    %      Name       - Name of the layer, specified as a string.
    %                   The default is "".
    %
    %      OutputMode - Output mode, specified as one of the
    %                   following:
    %                      "sequence" - Output the entire sequence
    %                                   of data.
    %                      "last"     - Output the last time step
    %                                   of the data.
    %                   The default is "sequence".

    ...
end

Initialize Layer Properties

Initialize the layer properties, including the learnable and state parameters, in the constructor function. Replace the comment % Layer constructor function goes here with code that initializes the layer properties.

Parse the input arguments using an arguments block, and then set the NumHiddenUnits, Name, and OutputMode properties.

arguments
    numHiddenUnits
    inputSize
    args.Name = "";
    args.OutputMode = "sequence";
end

layer.NumHiddenUnits = numHiddenUnits;
layer.Name = args.Name;
layer.OutputMode = args.OutputMode;

Give the layer a one-line description by setting the Description property of the layer. Set the description to describe the type of the layer and its size.

% Set layer description.
layer.Description = "Peephole LSTM with " + numHiddenUnits + " hidden units";

Initialize the learnable parameters. Initialize the input weights using Glorot initialization. Initialize the recurrent weights using orthogonal initialization. Initialize the bias using unit-forget-gate normalization. This code uses the helper functions initializeGlorot, initializeOrthogonal, and initializeUnitForgetGate. To access these functions, open the example as a live script. For more information about initializing weights, see Initialize Learnable Parameters for Model Function.
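
The helper functions ship with the live script rather than with this page. As a rough guide only, minimal sketches of two of them, following the standard Glorot and unit-forget-gate definitions, might look like this (the shipped versions can differ in detail):

function weights = initializeGlorot(sz,numOut,numIn)
    % Glorot (Xavier) uniform initialization: sample from
    % U(-bound,bound) with bound = sqrt(6/(numIn + numOut)).
    bound = sqrt(6/(numIn + numOut));
    weights = bound * (2*rand(sz,"single") - 1);
    weights = dlarray(weights);
end

function bias = initializeUnitForgetGate(numHiddenUnits)
    % Zero bias everywhere except the forget-gate section, which is
    % set to 1 so that the layer initially retains its cell state.
    bias = zeros(4*numHiddenUnits,1,"single");
    idx = numHiddenUnits+1:2*numHiddenUnits;    % forget-gate entries
    bias(idx) = 1;
    bias = dlarray(bias);
end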

Note that the learnable parameters of a peephole LSTM layer differ from those of a standard LSTM layer. Because the cell candidate calculation does not use a peephole connection, the layer requires peephole weights for only three of the four gate calculations, so PeepholeWeights is a 3*NumHiddenUnits-by-1 array.

% Initialize weights and bias.
sz = [4*numHiddenUnits inputSize];
numOut = 4*numHiddenUnits;
numIn = inputSize;
layer.InputWeights = initializeGlorot(sz,numOut,numIn);

sz = [4*numHiddenUnits numHiddenUnits];
layer.RecurrentWeights = initializeOrthogonal(sz);

sz = [3*numHiddenUnits 1];
numOut = 3*numHiddenUnits;
numIn = 1;
layer.PeepholeWeights = initializeGlorot(sz,numOut,numIn);

layer.Bias = initializeUnitForgetGate(numHiddenUnits);

Initialize the layer state parameters. For convenience, use the resetState function defined in the section Create Reset State Function.

% Initialize layer states.
layer = resetState(layer);

View the completed constructor function.

function layer = peepholeLSTMLayer(numHiddenUnits,inputSize,args)
    %PEEPHOLELSTMLAYER Peephole LSTM Layer
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize)
    %   creates a peephole LSTM layer with the specified number of
    %   hidden units and input channels.
    %
    %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize,Name=Value)
    %   creates a peephole LSTM layer and specifies additional
    %   options using one or more name-value arguments:
    %
    %      Name       - Name of the layer, specified as a string.
    %                   The default is "".
    %
    %      OutputMode - Output mode, specified as one of the
    %                   following:
    %                      "sequence" - Output the entire sequence
    %                                   of data.
    %                      "last"     - Output the last time step
    %                                   of the data.
    %                   The default is "sequence".

    % Parse input arguments.
    arguments
        numHiddenUnits
        inputSize
        args.Name = "";
        args.OutputMode = "sequence";
    end

    layer.NumHiddenUnits = numHiddenUnits;
    layer.Name = args.Name;
    layer.OutputMode = args.OutputMode;

    % Set layer description.
    layer.Description = "Peephole LSTM with " + numHiddenUnits + " hidden units";

    % Initialize weights and bias.
    sz = [4*numHiddenUnits inputSize];
    numOut = 4*numHiddenUnits;
    numIn = inputSize;
    layer.InputWeights = initializeGlorot(sz,numOut,numIn);

    sz = [4*numHiddenUnits numHiddenUnits];
    layer.RecurrentWeights = initializeOrthogonal(sz);

    sz = [3*numHiddenUnits 1];
    numOut = 3*numHiddenUnits;
    numIn = 1;
    layer.PeepholeWeights = initializeGlorot(sz,numOut,numIn);

    layer.Bias = initializeUnitForgetGate(numHiddenUnits);

    % Initialize layer states.
    layer = resetState(layer);
end

With this constructor function, the command peepholeLSTMLayer(200,12,OutputMode="last",Name="peephole") creates a peephole LSTM layer with 200 hidden units, an input size of 12, and the name "peephole", and outputs the last time step of the peephole LSTM operation.
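
For example:

layer = peepholeLSTMLayer(200,12,OutputMode="last",Name="peephole")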

Create Predict Function

Create the layer forward functions to use at prediction time and training time.

Create a function named predict that propagates the data forward through the layer at prediction time and outputs the result.

The predict function syntax depends on the type of layer.

  • Z = predict(layer,X) forwards the input data X through the layer and outputs the result Z, where layer has a single input and a single output.

  • [Z,state] = predict(layer,X) also outputs the updated state parameter state, where layer has a single state parameter.

You can adjust the syntaxes for layers with multiple inputs, multiple outputs, or multiple state parameters:

  • For layers with multiple inputs, replace X with X1,...,XN, where N is the number of inputs. The NumInputs property must match N.

  • For layers with multiple outputs, replace Z with Z1,...,ZM, where M is the number of outputs. The NumOutputs property must match M.

  • For layers with multiple state parameters, replace state with state1,...,stateK, where K is the number of state parameters.

Tip

If the number of inputs to the layer can vary, then use varargin instead of X1,…,XN. In this case, varargin is a cell array of the inputs, where varargin{i} corresponds to Xi.

If the number of outputs can vary, then use varargout instead of Z1,…,ZN. In this case, varargout is a cell array of the outputs, where varargout{j} corresponds to Zj.
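
As an illustration, a hypothetical layer that adds together a variable number of inputs (not part of this example) could implement predict as:

function Z = predict(layer,varargin)
    % varargin{i} corresponds to input Xi.
    Z = varargin{1};
    for i = 2:numel(varargin)
        Z = Z + varargin{i};
    end
end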

Tip

If the custom layer has a dlnetwork object for a learnable parameter, then in the predict function of the custom layer, use the predict function for the dlnetwork. When you do so, the dlnetwork object predict function uses the appropriate layer operations for prediction.

Because a peephole LSTM layer has only one input, one output, and two state parameters, the syntax for predict for a peephole LSTM layer is [Z,hiddenState,cellState] = predict(layer,X).

By default, the layer uses predict as the forward function at training time. To use a different forward function at training time, or retain a value required for a custom backward function, you must also create a function named forward.

Because the layer inherits from nnet.layer.Formattable, the layer inputs are formatted dlarray objects and the predict function must also output data as formatted dlarray objects.

The hidden state at time step $t$ is given by

$$h_t = \tanh(c_t) \odot o_t,$$

where $\odot$ denotes the Hadamard product (element-wise multiplication of vectors).

The cell state at time step $t$ is given by

$$c_t = g_t \odot i_t + c_{t-1} \odot f_t.$$

The following formulas describe the components at time step $t$.

Component         Formula
Input gate        $i_t = \sigma_g(W_i x_t + R_i h_{t-1} + p_i \odot c_{t-1} + b_i)$
Forget gate       $f_t = \sigma_g(W_f x_t + R_f h_{t-1} + p_f \odot c_{t-1} + b_f)$
Cell candidate    $g_t = \sigma_c(W_g x_t + R_g h_{t-1} + b_g)$
Output gate       $o_t = \sigma_g(W_o x_t + R_o h_{t-1} + p_o \odot c_t + b_o)$

Note that the output gate calculation requires the updated cell state $c_t$.

In these calculations, $\sigma_g$ and $\sigma_c$ denote the gate and state activation functions. For peephole LSTM layers, use the sigmoid and hyperbolic tangent functions as the gate and state activation functions, respectively.

Implement this operation in the predict function. Because the layer does not require a different forward function for training or a memory value for a custom backward function, you can remove the forward function from the class file. Add a comment to the top of the function that explains the syntaxes of the function.

Tip

If you preallocate arrays using functions such as zeros, then you must ensure that the data types of these arrays are consistent with the layer function inputs. To create an array of zeros of the same data type as another array, use the "like" option of zeros. For example, to initialize an array of zeros of size sz with the same data type as the array X, use Z = zeros(sz,"like",X).

function [Z,hiddenState,cellState] = predict(layer,X)
    %PREDICT Peephole LSTM predict function
    %   [Z,hiddenState,cellState] = predict(layer,X) forward
    %   propagates the data X through the layer and returns the
    %   layer output Z and the updated hidden and cell states. X
    %   is a dlarray with format "CBT" and Z is a dlarray with
    %   format "CB" or "CBT", depending on the layer OutputMode
    %   property.

    % Initialize sequence output.
    numHiddenUnits = layer.NumHiddenUnits;
    miniBatchSize = size(X,finddim(X,"B"));
    numTimeSteps = size(X,finddim(X,"T"));

    if layer.OutputMode == "sequence"
        Z = zeros(numHiddenUnits,miniBatchSize,numTimeSteps,"like",X);
        Z = dlarray(Z,"CBT");
    end

    % Calculate WX + b.
    X = stripdims(X);
    WX = pagemtimes(layer.InputWeights,X) + layer.Bias;

    % Indices of concatenated weight arrays.
    idx1 = 1:numHiddenUnits;
    idx2 = 1+numHiddenUnits:2*numHiddenUnits;
    idx3 = 1+2*numHiddenUnits:3*numHiddenUnits;
    idx4 = 1+3*numHiddenUnits:4*numHiddenUnits;

    % Initial states.
    hiddenState = layer.HiddenState;
    cellState = layer.CellState;

    % Loop over time steps.
    for t = 1:numTimeSteps
        % Calculate R*h_{t-1}.
        Rht = layer.RecurrentWeights * hiddenState;

        % Calculate p*c_{t-1}.
        pict = layer.PeepholeWeights(idx1) .* cellState;
        pfct = layer.PeepholeWeights(idx2) .* cellState;

        % Gate calculations.
        it = sigmoid(WX(idx1,:,t) + Rht(idx1,:) + pict);
        ft = sigmoid(WX(idx2,:,t) + Rht(idx2,:) + pfct);
        gt = tanh(WX(idx3,:,t) + Rht(idx3,:));

        % Calculate ot using updated cell state.
        cellState = gt .* it + cellState .* ft;
        poct = layer.PeepholeWeights(idx3) .* cellState;
        ot = sigmoid(WX(idx4,:,t) + Rht(idx4,:) + poct);

        % Update hidden state.
        hiddenState = tanh(cellState) .* ot;

        % Update sequence output.
        if layer.OutputMode == "sequence"
            Z(:,:,t) = hiddenState;
        end
    end

    % Last time step output.
    if layer.OutputMode == "last"
        Z = dlarray(hiddenState,"CB");
    end
end

Because the predict function uses only functions that support dlarray objects, defining the backward function is optional. For a list of functions that support dlarray objects, see List of Functions with dlarray Support.

Create Reset State Function

When DAGNetwork or SeriesNetwork objects contain layers with state parameters, you can make predictions and update the layer states using the predictAndUpdateState and classifyAndUpdateState functions. You can reset the network state using the resetState function.

The resetState function for DAGNetwork, SeriesNetwork, and dlnetwork objects, by default, has no effect on custom layers with state parameters. To define the layer behavior for the resetState function for network objects, define the optional layer resetState function in the layer definition that resets the state parameters.

The resetState function must have the syntax layer = resetState(layer), where the returned layer has the reset state properties.

Create a function named resetState that resets the layer state parameters to vectors of zeros.

function layer = resetState(layer)
    %RESETSTATE Reset layer state
    %   layer = resetState(layer) resets the state properties of
    %   the layer.

    numHiddenUnits = layer.NumHiddenUnits;
    layer.HiddenState = zeros(numHiddenUnits,1);
    layer.CellState = zeros(numHiddenUnits,1);
end

Completed Layer

View the completed layer class file.

classdef peepholeLSTMLayer < nnet.layer.Layer & nnet.layer.Formattable
    %PEEPHOLELSTMLAYER Peephole LSTM Layer

    properties
        % Layer properties.
        NumHiddenUnits
        OutputMode
    end

    properties (Learnable)
        % Layer learnable parameters.
        InputWeights
        RecurrentWeights
        PeepholeWeights
        Bias
    end

    properties (State)
        % Layer state parameters.
        HiddenState
        CellState
    end

    methods
        function layer = peepholeLSTMLayer(numHiddenUnits,inputSize,args)
            %PEEPHOLELSTMLAYER Peephole LSTM Layer
            %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize)
            %   creates a peephole LSTM layer with the specified number of
            %   hidden units and input channels.
            %
            %   layer = peepholeLSTMLayer(numHiddenUnits,inputSize,Name=Value)
            %   creates a peephole LSTM layer and specifies additional
            %   options using one or more name-value arguments:
            %
            %      Name       - Name of the layer, specified as a string.
            %                   The default is "".
            %
            %      OutputMode - Output mode, specified as one of the
            %                   following:
            %                      "sequence" - Output the entire sequence
            %                                   of data.
            %                      "last"     - Output the last time step
            %                                   of the data.
            %                   The default is "sequence".

            % Parse input arguments.
            arguments
                numHiddenUnits
                inputSize
                args.Name = "";
                args.OutputMode = "sequence";
            end

            layer.NumHiddenUnits = numHiddenUnits;
            layer.Name = args.Name;
            layer.OutputMode = args.OutputMode;

            % Set layer description.
            layer.Description = "Peephole LSTM with " + numHiddenUnits + " hidden units";

            % Initialize weights and bias.
            sz = [4*numHiddenUnits inputSize];
            numOut = 4*numHiddenUnits;
            numIn = inputSize;
            layer.InputWeights = initializeGlorot(sz,numOut,numIn);

            sz = [4*numHiddenUnits numHiddenUnits];
            layer.RecurrentWeights = initializeOrthogonal(sz);

            sz = [3*numHiddenUnits 1];
            numOut = 3*numHiddenUnits;
            numIn = 1;
            layer.PeepholeWeights = initializeGlorot(sz,numOut,numIn);

            layer.Bias = initializeUnitForgetGate(numHiddenUnits);

            % Initialize layer states.
            layer = resetState(layer);
        end

        function [Z,hiddenState,cellState] = predict(layer,X)
            %PREDICT Peephole LSTM predict function
            %   [Z,hiddenState,cellState] = predict(layer,X) forward
            %   propagates the data X through the layer and returns the
            %   layer output Z and the updated hidden and cell states. X
            %   is a dlarray with format "CBT" and Z is a dlarray with
            %   format "CB" or "CBT", depending on the layer OutputMode
            %   property.

            % Initialize sequence output.
            numHiddenUnits = layer.NumHiddenUnits;
            miniBatchSize = size(X,finddim(X,"B"));
            numTimeSteps = size(X,finddim(X,"T"));

            if layer.OutputMode == "sequence"
                Z = zeros(numHiddenUnits,miniBatchSize,numTimeSteps,"like",X);
                Z = dlarray(Z,"CBT");
            end

            % Calculate WX + b.
            X = stripdims(X);
            WX = pagemtimes(layer.InputWeights,X) + layer.Bias;

            % Indices of concatenated weight arrays.
            idx1 = 1:numHiddenUnits;
            idx2 = 1+numHiddenUnits:2*numHiddenUnits;
            idx3 = 1+2*numHiddenUnits:3*numHiddenUnits;
            idx4 = 1+3*numHiddenUnits:4*numHiddenUnits;

            % Initial states.
            hiddenState = layer.HiddenState;
            cellState = layer.CellState;

            % Loop over time steps.
            for t = 1:numTimeSteps
                % Calculate R*h_{t-1}.
                Rht = layer.RecurrentWeights * hiddenState;

                % Calculate p*c_{t-1}.
                pict = layer.PeepholeWeights(idx1) .* cellState;
                pfct = layer.PeepholeWeights(idx2) .* cellState;

                % Gate calculations.
                it = sigmoid(WX(idx1,:,t) + Rht(idx1,:) + pict);
                ft = sigmoid(WX(idx2,:,t) + Rht(idx2,:) + pfct);
                gt = tanh(WX(idx3,:,t) + Rht(idx3,:));

                % Calculate ot using updated cell state.
                cellState = gt .* it + cellState .* ft;
                poct = layer.PeepholeWeights(idx3) .* cellState;
                ot = sigmoid(WX(idx4,:,t) + Rht(idx4,:) + poct);

                % Update hidden state.
                hiddenState = tanh(cellState) .* ot;

                % Update sequence output.
                if layer.OutputMode == "sequence"
                    Z(:,:,t) = hiddenState;
                end
            end

            % Last time step output.
            if layer.OutputMode == "last"
                Z = dlarray(hiddenState,"CB");
            end
        end

        function layer = resetState(layer)
            %RESETSTATE Reset layer state
            %   layer = resetState(layer) resets the state properties of
            %   the layer.

            numHiddenUnits = layer.NumHiddenUnits;
            layer.HiddenState = zeros(numHiddenUnits,1);
            layer.CellState = zeros(numHiddenUnits,1);
        end
    end
end

GPU Compatibility

If the layer forward functions fully support dlarray objects, then the layer is GPU compatible. Otherwise, to be GPU compatible, the layer functions must support inputs and return outputs of type gpuArray (Parallel Computing Toolbox).

Many MATLAB built-in functions support gpuArray (Parallel Computing Toolbox) and dlarray input arguments. For a list of functions that support dlarray objects, see List of Functions with dlarray Support. For a list of functions that execute on a GPU, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox). To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Support by Release (Parallel Computing Toolbox). For more information on working with GPUs in MATLAB, see GPU Computing in MATLAB (Parallel Computing Toolbox).

In this example, the MATLAB functions used in predict all support dlarray objects, so the layer is GPU compatible.
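
As a quick spot check (a sketch that assumes Parallel Computing Toolbox and a supported GPU device are available), you can forward gpuArray data through the layer directly:

% Forward a random formatted mini-batch through the layer on the GPU.
layer = peepholeLSTMLayer(100,12);
X = dlarray(gpuArray(rand(12,16,20,"single")),"CBT");
Z = predict(layer,X);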

Include Custom Layer in Network

You can use a custom layer in the same way as any other layer in Deep Learning Toolbox. Create and train a network for sequence classification using the peephole LSTM layer you created earlier.

Load the example training data.

[XTrain,TTrain] = japaneseVowelsTrainData;

Define the network architecture. Create a layer array containing a peephole LSTM layer.

inputSize = 12;
numHiddenUnits = 100;
numClasses = 9;

layers = [
    sequenceInputLayer(inputSize)
    peepholeLSTMLayer(numHiddenUnits,inputSize,OutputMode="last")
    fullyConnectedLayer(numClasses)
    softmaxLayer
    classificationLayer];

Specify the training options and train the network. Train with a mini-batch size of 27 and left-pad the data.

options = trainingOptions("adam", ...
    MiniBatchSize=27, ...
    SequencePaddingDirection="left");

net = trainNetwork(XTrain,TTrain,layers,options);
Training on single CPU.
|========================================================================================|
|  Epoch  |  Iteration  |  Time Elapsed  |  Mini-batch  |  Mini-batch  |  Base Learning  |
|         |             |   (hh:mm:ss)   |   Accuracy   |     Loss     |      Rate       |
|========================================================================================|
|       1 |           1 |       00:00:01 |        3.70% |       2.2060 |          0.0010 |
|       5 |          50 |       00:00:26 |       92.59% |       0.5917 |          0.0010 |
|      10 |         100 |       00:00:45 |       92.59% |       0.2182 |          0.0010 |
|      15 |         150 |       00:01:01 |      100.00% |       0.0588 |          0.0010 |
|      20 |         200 |       00:01:17 |       96.30% |       0.0872 |          0.0010 |
|      25 |         250 |       00:01:37 |      100.00% |       0.0329 |          0.0010 |
|      30 |         300 |       00:01:54 |      100.00% |       0.0141 |          0.0010 |
|========================================================================================|
Training finished: Max epochs completed.

Evaluate the network performance by predicting on new data and calculating the accuracy.

[XTest,TTest] = japaneseVowelsTestData;
YTest = classify(net,XTest,MiniBatchSize=27);
accuracy = mean(YTest==TTest)
accuracy = 0.8703
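
Because the network contains a layer with state parameters, you can also run it statefully. For example, this sketch classifies the first test sequence while updating the network state, and then resets the state using the resetState behavior defined earlier:

% Update the network state with one sequence, then reset it.
[net,scores] = predictAndUpdateState(net,XTest(1));
net = resetState(net);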

References

[1] Greff, Klaus, Rupesh K. Srivastava, Jan Koutník, Bas R. Steunebrink, and Jürgen Schmidhuber. "LSTM: A Search Space Odyssey." IEEE Transactions on Neural Networks and Learning Systems 28, no. 10 (2016): 2222–2232.
