lstm_unit

api_attr

declarative programming (static graph)

paddle.fluid.layers.lstm_unit(x_t, hidden_t_prev, cell_t_prev, forget_bias=0.0, param_attr=None, bias_attr=None, name=None)

Long Short-Term Memory (LSTM) RNN cell. This operator performs the LSTM computation for a single time step. Its implementation follows the calculations described in Recurrent Neural Network Regularization.

forget_bias is added to the bias of the forget gate to reduce the amount of forgetting: a positive value pushes \(f_{t}\) toward 1, so more of the previous cell state is retained. The formula is as follows:

\[ \begin{align}\begin{aligned}i_{t} & = \sigma(W_{x_{i}}x_{t} + W_{h_{i}}h_{t-1} + b_{i})\\f_{t} & = \sigma(W_{x_{f}}x_{t} + W_{h_{f}}h_{t-1} + b_{f} + forget\_bias)\\c_{t} & = f_{t}c_{t-1} + i_{t} \tanh (W_{x_{c}}x_{t} + W_{h_{c}}h_{t-1} + b_{c})\\o_{t} & = \sigma(W_{x_{o}}x_{t} + W_{h_{o}}h_{t-1} + b_{o})\\h_{t} & = o_{t} \tanh (c_{t})\end{aligned}\end{align} \]

\(x_{t}\) stands for x_t, the input of the current time step; \(h_{t-1}\) and \(c_{t-1}\) correspond to hidden_t_prev and cell_t_prev, the hidden and cell outputs of the previous time step. \(i_{t}, f_{t}, c_{t}, o_{t}, h_{t}\) are the input gate, forget gate, cell state, output gate and hidden state, respectively.
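The formula above can be sketched in plain NumPy to make the data flow concrete. This is only an illustrative reference under stated assumptions, not the operator's actual implementation: the helper names sigmoid and lstm_step, and the per-gate dictionaries W_x, W_h and b, are hypothetical and do not reflect how Paddle stores its parameters. Because rows are batch entries here, the product \(W x_{t}\) is written as x_t @ W.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_x, W_h, b, forget_bias=0.0):
    """One LSTM time step following the formula above (hypothetical helper).

    x_t: [N, M], h_prev and c_prev: [N, D].
    W_x[g]: [M, D], W_h[g]: [D, D], b[g]: [D] for each gate g in "ifco".
    """
    # Pre-activation of every gate: x_t W_x + h_{t-1} W_h + b
    pre = {g: x_t @ W_x[g] + h_prev @ W_h[g] + b[g] for g in "ifco"}
    i_t = sigmoid(pre["i"])                        # input gate
    f_t = sigmoid(pre["f"] + forget_bias)          # forget gate with extra bias
    c_t = f_t * c_prev + i_t * np.tanh(pre["c"])   # new cell state
    o_t = sigmoid(pre["o"])                        # output gate
    h_t = o_t * np.tanh(c_t)                       # new hidden state
    return h_t, c_t

# Illustrative sizes only: batch N, input features M, hidden size D.
rng = np.random.default_rng(0)
N, M, D = 4, 3, 5
W_x = {g: rng.standard_normal((M, D)) for g in "ifco"}
W_h = {g: rng.standard_normal((D, D)) for g in "ifco"}
b = {g: np.zeros(D) for g in "ifco"}
x_t = rng.standard_normal((N, M))
h_prev = np.zeros((N, D))
c_prev = rng.standard_normal((N, D))

h_t, c_t = lstm_step(x_t, h_prev, c_prev, W_x, W_h, b, forget_bias=1.0)
print(h_t.shape, c_t.shape)  # (4, 5) (4, 5)
```

Both outputs have the shape of hidden_t_prev, and increasing forget_bias raises \(f_{t}\), so a larger share of c_prev is carried into c_t.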

Parameters
  • x_t (Variable) – A 2D Tensor representing the input of the current time step. Its shape should be \([N, M]\) , where \(N\) is the batch size and \(M\) is the feature size of the input. The data type should be float32 or float64.

  • hidden_t_prev (Variable) – A 2D Tensor representing the hidden value from the previous step. Its shape should be \([N, D]\) , where \(N\) is the batch size and \(D\) is the hidden size. The data type should be the same as that of x_t .

  • cell_t_prev (Variable) – A 2D Tensor representing the cell value from the previous step. It has the same shape and data type as hidden_t_prev .

  • forget_bias (float, optional) – The value \(forget\_bias\) added to the bias of the forget gate. Default: 0.0.

  • param_attr (ParamAttr, optional) – Specifies the properties of the weight parameter. Default: None, which means the default weight parameter properties are used. See ParamAttr for usage details.

  • bias_attr (ParamAttr, optional) – Specifies the properties of the bias parameter. Default: None, which means the default bias parameter properties are used. See ParamAttr for usage details.

  • name (str, optional) – For detailed information, please refer to Name. There is usually no need to set name; it is None by default.

Returns

A tuple of two Tensor variables with the same shape and data type as hidden_t_prev : the hidden value and the cell value, corresponding to \(h_{t}\) and \(c_{t}\) in the formula.

Return type

tuple

Raises
  • ValueError – Rank of x_t must be 2.

  • ValueError – Rank of hidden_t_prev must be 2.

  • ValueError – Rank of cell_t_prev must be 2.

  • ValueError – The 1st dimensions of x_t, hidden_t_prev and cell_t_prev must be the same.

  • ValueError – The 2nd dimensions of hidden_t_prev and cell_t_prev must be the same.
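The conditions listed above can be mirrored in a small standalone check, which may help when validating shapes before building the graph. This is a sketch only: check_lstm_unit_shapes is a hypothetical helper, not part of Paddle, and it assumes all dimensions are concrete integers (a dynamic batch dimension such as None or -1 is not handled).

```python
def check_lstm_unit_shapes(x_shape, hidden_shape, cell_shape):
    """Raise ValueError under the conditions listed above (hypothetical helper)."""
    if len(x_shape) != 2:
        raise ValueError("Rank of x_t must be 2.")
    if len(hidden_shape) != 2:
        raise ValueError("Rank of hidden_t_prev must be 2.")
    if len(cell_shape) != 2:
        raise ValueError("Rank of cell_t_prev must be 2.")
    # All three inputs must share the batch size N.
    if not (x_shape[0] == hidden_shape[0] == cell_shape[0]):
        raise ValueError("The 1st dimensions of x_t, hidden_t_prev "
                         "and cell_t_prev must be the same.")
    # Hidden and cell states must share the hidden size D.
    if hidden_shape[1] != cell_shape[1]:
        raise ValueError("The 2nd dimensions of hidden_t_prev "
                         "and cell_t_prev must be the same.")

# Valid: x_t is [4, 64], hidden and cell states are [4, 512].
check_lstm_unit_shapes((4, 64), (4, 512), (4, 512))

# Invalid: the hidden size differs between hidden_t_prev and cell_t_prev.
try:
    check_lstm_unit_shapes((4, 64), (4, 512), (4, 256))
except ValueError as err:
    print(err)
```

Note that the feature size \(M\) of x_t is not constrained by the hidden size \(D\); only the batch dimension has to agree across all three inputs.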

Examples

import paddle.fluid as fluid

dict_dim, emb_dim, hidden_dim = 128, 64, 512
# Token ids for one time step; the batch dimension is left dynamic.
data = fluid.data(name='step_data', shape=[None], dtype='int64')
# Embedding lookup produces the 2D input x_t of shape [N, emb_dim].
x = fluid.embedding(input=data, size=[dict_dim, emb_dim])
pre_hidden = fluid.data(
    name='pre_hidden', shape=[None, hidden_dim], dtype='float32')
pre_cell = fluid.data(
    name='pre_cell', shape=[None, hidden_dim], dtype='float32')
# lstm_unit returns a tuple (hidden value, cell value), so unpack both.
hidden, cell = fluid.layers.lstm_unit(
    x_t=x,
    hidden_t_prev=pre_hidden,
    cell_t_prev=pre_cell)