lstm

Long short-term memory

Since R2019b

Description

The long short-term memory (LSTM) operation allows a network to learn long-term dependencies between time steps in time series and sequence data.

Note

This function applies the deep learning LSTM operation to dlarray data. If you want to apply an LSTM operation within a layerGraph object or Layer array, use the lstmLayer layer instead.
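As a rough illustration of the layer-based route, the sketch below places an lstmLayer in a Layer array. The sizes are arbitrary illustrative values, and sequenceInputLayer is used here only as an assumed companion input layer; neither is prescribed by this page.

```matlab
% Layer-based alternative to calling lstm directly on dlarray data.
numFeatures = 10;      % illustrative number of input channels
numHiddenUnits = 3;    % illustrative number of hidden units

layers = [
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits,'OutputMode','sequence')];
```

The function form, lstm, is typically the better fit when you manage the weights and states yourself, for example in a custom training loop.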

Y = lstm(X,H0,C0,weights,recurrentWeights,bias) applies a long short-term memory (LSTM) calculation to input X using the initial hidden state H0, initial cell state C0, and parameters weights, recurrentWeights, and bias. The input X must be a formatted dlarray. The output Y is a formatted dlarray with the same dimension format as X, except for any 'S' dimensions.

The lstm function updates the cell and hidden states using the hyperbolic tangent function (tanh) as the state activation function. The lstm function uses the sigmoid function given by $\sigma(x) = (1 + e^{-x})^{-1}$ as the gate activation function.
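For orientation, one common way to write the update at time step $t$ with these two activation functions is sketched below. The per-gate symbols $W$, $R$, and $b$, the gate names ($i$ input, $f$ forget, $g$ cell candidate, $o$ output), and the elementwise product $\odot$ follow the standard LSTM formulation and are assumptions here, not notation taken from this page. The lstmLayer reference page gives the authoritative definition.

```latex
\begin{aligned}
i_t &= \sigma(W_i x_t + R_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + R_f h_{t-1} + b_f) \\
g_t &= \tanh(W_g x_t + R_g h_{t-1} + b_g) \\
o_t &= \sigma(W_o x_t + R_o h_{t-1} + b_o) \\
c_t &= f_t \odot c_{t-1} + i_t \odot g_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
```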

[Y,hiddenState,cellState] = lstm(X,H0,C0,weights,recurrentWeights,bias) also returns the hidden state and cell state after the LSTM operation.

[___] = lstm(___,'DataFormat',FMT) also specifies the dimension format FMT when X is not a formatted dlarray. The output Y is an unformatted dlarray with the same dimension order as X, except for any 'S' dimensions.
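The 'DataFormat' syntax can be sketched as follows. This is a minimal example with arbitrary sizes and random parameters, assuming an unformatted dlarray whose dimensions are ordered channel, batch, time.

```matlab
% Call lstm on an unformatted dlarray by passing the dimension labels explicitly.
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

X = dlarray(randn(numFeatures,numObservations,sequenceLength));  % no format labels
H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = randn(4*numHiddenUnits,numFeatures);
recurrentWeights = randn(4*numHiddenUnits,numHiddenUnits);
bias = randn(4*numHiddenUnits,1);

% 'CBT' labels the three dimensions of X as channel, batch, and time.
Y = lstm(X,H0,C0,weights,recurrentWeights,bias,'DataFormat','CBT');

size(Y)   % same dimension order as X, with the channel size replaced by numHiddenUnits
```

Because X carries no labels, Y is returned unformatted as well, so downstream code has to keep track of the dimension order itself.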

Examples

Perform an LSTM operation using three hidden units.

Create the input sequence data as 32 observations with 10 channels and a sequence length of 64.

numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
X = randn(numFeatures,numObservations,sequenceLength);
dlX = dlarray(X,'CBT');

Create the initial hidden and cell states with three hidden units. Use the same initial hidden state and cell state for all observations.

numHiddenUnits = 3;
H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);

Create the learnable parameters for the LSTM operation.

weights = dlarray(randn(4*numHiddenUnits,numFeatures),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

Perform the LSTM calculation.

[dlY,hiddenState,cellState] = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

View the size and dimensions of dlY.

size(dlY)
ans = 1×3
     3    32    64
dlY.dims
ans = 'CBT'

View the size of hiddenState and cellState.

size(hiddenState)
ans = 1×2
     3    32
size(cellState)
ans = 1×2
     3    32

Check that the output hiddenState is the same as the last time step of output dlY.

if extractdata(dlY(:,:,end)) == hiddenState
    disp("The hidden state and the last time step are equal.");
else
    disp("The hidden state and the last time step are not equal.")
end
The hidden state and the last time step are equal.

You can use the hidden state and cell state to keep track of the state of the LSTM operation and input further sequential data.
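One way to carry the state forward is sketched below: the sequence is split into two halves, and the states returned for the first half seed the call for the second half. The split point and the comparison against a single full-length call are illustrative choices, not part of the original example.

```matlab
% Process a sequence in two chunks, passing the returned states between calls.
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

dlX = dlarray(randn(numFeatures,numObservations,sequenceLength),'CBT');
H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(4*numHiddenUnits,numFeatures),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

% First half of the time steps, starting from the initial states.
[dlY1,h1,c1] = lstm(dlX(:,:,1:32),H0,C0,weights,recurrentWeights,bias);

% Second half, starting from the states returned by the first call.
[dlY2,h2,c2] = lstm(dlX(:,:,33:end),h1,c1,weights,recurrentWeights,bias);

% Reference: the whole sequence in a single call.
dlY = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

% Largest difference between the chunked and single-call outputs for the
% second half; this should be zero up to floating-point rounding.
maxDiff = max(abs(extractdata(dlY2) - extractdata(dlY(:,:,33:end))),[],'all');
```

The same pattern applies when data arrives in pieces, for example when streaming: keep the latest hiddenState and cellState and pass them as H0 and C0 on the next call.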

Input Arguments

Input data, specified as a formatted dlarray, an unformatted dlarray, or a numeric array. When X is not a formatted dlarray, you must specify the dimension label format using 'DataFormat',FMT. If X is a numeric array, at least one of H0, C0, weights, recurrentWeights, or bias must be a dlarray.

X must contain a sequence dimension labeled "T". If X has any spatial dimensions labeled "S", they are flattened into the "C" channel dimension. If X does not have a channel dimension, then one is added. If X has any unspecified dimensions labeled "U", they must be singleton.

Data Types: single | double

Initial hidden state vector, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

If H0 is a formatted dlarray, it must contain a channel dimension labeled 'C' and optionally a batch dimension labeled 'B' with the same size as the 'B' dimension of X. If H0 does not have a 'B' dimension, the function uses the same hidden state vector for each observation in X.

The size of the 'C' dimension determines the number of hidden units. The size of the 'C' dimension of H0 must be equal to the size of the 'C' dimension of C0.

If H0 is not a formatted dlarray, the size of the first dimension determines the number of hidden units and must be the same size as the first dimension or the 'C' dimension of C0.

Data Types: single | double

Initial cell state vector, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

If C0 is a formatted dlarray, it must contain a channel dimension labeled 'C' and optionally a batch dimension labeled 'B' with the same size as the 'B' dimension of X. If C0 does not have a 'B' dimension, the function uses the same cell state vector for each observation in X.

The size of the 'C' dimension determines the number of hidden units. The size of the 'C' dimension of C0 must be equal to the size of the 'C' dimension of H0.

If C0 is not a formatted dlarray, the size of the first dimension determines the number of hidden units and must be the same size as the first dimension or the 'C' dimension of H0.

Data Types: single | double
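To make the 'B' dimension rule for H0 and C0 concrete, the sketch below gives each observation its own initial hidden and cell state by supplying formatted 'CB' arrays. The sizes and random values are illustrative assumptions.

```matlab
% Per-observation initial states: one column of H0 and C0 for each observation in X.
numFeatures = 10;
numObservations = 32;
sequenceLength = 64;
numHiddenUnits = 3;

dlX = dlarray(randn(numFeatures,numObservations,sequenceLength),'CBT');

% The 'B' dimension of the states must match the 'B' dimension of dlX.
H0 = dlarray(randn(numHiddenUnits,numObservations),'CB');
C0 = dlarray(randn(numHiddenUnits,numObservations),'CB');

weights = dlarray(randn(4*numHiddenUnits,numFeatures),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

[dlY,hiddenState,cellState] = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

dims(hiddenState)   % format labels of the returned hidden state
size(hiddenState)   % one state vector per observation
```

Omitting the 'B' dimension, as in zeros(numHiddenUnits,1), is the shorthand for starting every observation from the same state.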

Weights, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

Specify weights as a matrix of size 4*NumHiddenUnits-by-InputSize, where NumHiddenUnits is the size of the 'C' dimension of both C0 and H0, and InputSize is the size of the 'C' dimension of X multiplied by the size of each 'S' dimension of X, where present.

If weights is a formatted dlarray, it must contain a 'C' dimension of size 4*NumHiddenUnits and a 'U' dimension of size InputSize.

Data Types: single | double
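The InputSize rule can be worked through on input that has spatial dimensions. In the sketch below, the 'SSCBT' layout and all sizes are illustrative assumptions; the point is that the weights matrix needs height*width*numChannels columns because the spatial dimensions are flattened into the channel dimension.

```matlab
% Input with two spatial dimensions: InputSize is the product of the 'S' and 'C' sizes.
height = 4;
width = 5;
numChannels = 2;
numObservations = 8;
sequenceLength = 16;
numHiddenUnits = 3;

dlX = dlarray(randn(height,width,numChannels,numObservations,sequenceLength),'SSCBT');

inputSize = height*width*numChannels;   % 4*5*2 = 40

H0 = zeros(numHiddenUnits,1);
C0 = zeros(numHiddenUnits,1);
weights = dlarray(randn(4*numHiddenUnits,inputSize),'CU');
recurrentWeights = dlarray(randn(4*numHiddenUnits,numHiddenUnits),'CU');
bias = dlarray(randn(4*numHiddenUnits,1),'C');

dlY = lstm(dlX,H0,C0,weights,recurrentWeights,bias);

dims(dlY)   % format of the output, which no longer carries the 'S' dimensions
size(dlY)
```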

Recurrent weights, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

Specify recurrentWeights as a matrix of size 4*NumHiddenUnits-by-NumHiddenUnits, where NumHiddenUnits is the size of the 'C' dimension of both C0 and H0.

If recurrentWeights is a formatted dlarray, it must contain a 'C' dimension of size 4*NumHiddenUnits and a 'U' dimension of size NumHiddenUnits.

Data Types: single | double

Bias, specified as a formatted dlarray, an unformatted dlarray, or a numeric array.

Specify bias as a vector of length 4*NumHiddenUnits, where NumHiddenUnits is the size of the 'C' dimension of both C0 and H0.

If bias is a formatted dlarray, the nonsingleton dimension must be labeled with 'C'.

Data Types: single | double
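The factor of 4 in the sizes of weights, recurrentWeights, and bias reflects four stacked blocks of NumHiddenUnits rows, one per gate. The sketch below computes a single time step by hand with plain numeric arrays to show how those blocks might be used. The block order (input gate, forget gate, cell candidate, output gate) is an assumption based on the convention described for lstmLayer; this page does not state it. The helper names sigmoid and block are hypothetical.

```matlab
% Hand-written single LSTM time step on plain numeric arrays (illustrative only).
numHiddenUnits = 3;
inputSize = 10;

W = randn(4*numHiddenUnits,inputSize);        % plays the role of weights
R = randn(4*numHiddenUnits,numHiddenUnits);   % plays the role of recurrentWeights
b = randn(4*numHiddenUnits,1);                % plays the role of bias

x = randn(inputSize,1);           % input at one time step, one observation
h = zeros(numHiddenUnits,1);      % previous hidden state
c = zeros(numHiddenUnits,1);      % previous cell state

sigmoid = @(z) 1./(1 + exp(-z));                          % gate activation
block = @(k) (k-1)*numHiddenUnits + (1:numHiddenUnits);   % rows of the k-th block

z = W*x + R*h + b;                % all four pre-activations, stacked

% Assumed block order: input, forget, cell candidate, output.
inputGate     = sigmoid(z(block(1)));
forgetGate    = sigmoid(z(block(2)));
cellCandidate = tanh(z(block(3)));
outputGate    = sigmoid(z(block(4)));

c = forgetGate.*c + inputGate.*cellCandidate;   % updated cell state
h = outputGate.*tanh(c);                        % updated hidden state
```

Whatever the actual block order, the size bookkeeping is the same: each of the three parameters contributes 4*NumHiddenUnits rows, and both updated states have NumHiddenUnits elements.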

Dimension order of unformatted input data, specified as the comma-separated pair consisting of 'DataFormat' and a character array or string FMT that provides a label for each dimension of the data. Each character in FMT must be one of the following:

  • 'S'— Spatial

  • 'C'— Channel

  • 'B'— Batch (for example, samples and observations)

  • 'T'— Time (for example, sequences)

  • 'U'— Unspecified

You can specify multiple dimensions labeled 'S' or 'U'. You can use the labels 'C', 'B', and 'T' at most once.

You must specify 'DataFormat',FMT when the input data is not a formatted dlarray.

Example: 'DataFormat','SSCB'

Data Types: char | string
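The labeling rules above can be expressed as a short check. This is a hypothetical pre-flight test you might write yourself, not the validation that lstm performs internally. It also does not cover the separate requirement that the input to lstm contain a 'T' dimension.

```matlab
% Hypothetical check of a format string against the labeling rules.
fmt = 'SSCB';   % example format from the text above

% Every character must be one of the five allowed labels.
onlyKnownLabels = all(ismember(fmt,'SCBTU'));

% 'C', 'B', and 'T' may each appear at most once; 'S' and 'U' may repeat.
noRepeatedCBT = all(arrayfun(@(label) sum(fmt == label) <= 1,'CBT'));

isValidFormat = onlyKnownLabels && noRepeatedCBT;
```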

Output Arguments

LSTM output, returned as a dlarray. The output Y has the same underlying data type as the input X.

If the input data X is a formatted dlarray, Y has the same dimension format as X, except for any 'S' dimensions. If the input data is not a formatted dlarray, Y is an unformatted dlarray with the same dimension order as the input data.

The size of the 'C' dimension of Y is the same as the number of hidden units, specified by the size of the 'C' dimension of H0 or C0.

Hidden state vector for each observation, returned as a dlarray or a numeric array with the same data type as H0.

If the input H0 is a formatted dlarray, then the output hiddenState is a formatted dlarray with the format "CB".

Cell state vector for each observation, returned as a dlarray or a numeric array. cellState is returned with the same data type as C0.

If the input C0 is a formatted dlarray, the output cellState is returned as a formatted dlarray with the format 'CB'.

More About

Long Short-Term Memory

The LSTM operation allows a network to learn long-term dependencies between time steps in time series and sequence data. For more information, see the definition of Long Short-Term Memory Layer on the lstmLayer reference page.

Version History

Introduced in R2019b