GRUCell

Note: This API is only available in [Static Graph] mode

class paddle.fluid.layers.GRUCell(hidden_size, param_attr=None, bias_attr=None, gate_activation=None, activation=None, dtype='float32', name='GRUCell')

Gated Recurrent Unit cell. It is a wrapper for fluid.contrib.layers.rnn_impl.BasicGRUUnit that adapts it to the RNNCell interface.

The formulas used are as follows:

\[ \begin{align}\begin{aligned}u_t & = act_g(W_{ux}x_{t} + W_{uh}h_{t-1} + b_u)\\r_t & = act_g(W_{rx}x_{t} + W_{rh}h_{t-1} + b_r)\\\tilde{h_t} & = act_c(W_{cx}x_{t} + W_{ch}(r_t \odot h_{t-1}) + b_c)\\h_t & = u_t \odot h_{t-1} + (1-u_t) \odot \tilde{h_t}\end{aligned}\end{align} \]

For more details, please refer to Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation.
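The four formulas above can be sketched as a single step in NumPy. This is only an illustration of the arithmetic, not Paddle's implementation: the function gru_step, the params dictionary and its key names are hypothetical, sigmoid and tanh are assumed for act_g and act_c, and the weights are stored so that a batch of row vectors is multiplied as x @ W rather than W x.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_step(x_t, h_prev, params, act_g=sigmoid, act_c=np.tanh):
    """One GRU step (hypothetical helper, not part of Paddle).

    x_t:    array of shape [batch_size, input_size]
    h_prev: array of shape [batch_size, hidden_size]
    params: dict of weight and bias arrays; key names are illustrative only.
    """
    # Update gate u_t and reset gate r_t.
    u_t = act_g(x_t @ params["W_ux"] + h_prev @ params["W_uh"] + params["b_u"])
    r_t = act_g(x_t @ params["W_rx"] + h_prev @ params["W_rh"] + params["b_r"])
    # Candidate state: the reset gate scales h_prev before the projection.
    h_tilde = act_c(
        x_t @ params["W_cx"] + (r_t * h_prev) @ params["W_ch"] + params["b_c"]
    )
    # Blend the previous state and the candidate with the update gate.
    h_t = u_t * h_prev + (1.0 - u_t) * h_tilde
    # The output and the new state are the same tensor.
    return h_t, h_t


rng = np.random.default_rng(0)
batch_size, input_size, hidden_size = 4, 3, 5

# Small random weights and zero biases, purely for demonstration.
params = {}
for gate in ("u", "r", "c"):
    params[f"W_{gate}x"] = 0.1 * rng.standard_normal((input_size, hidden_size))
    params[f"W_{gate}h"] = 0.1 * rng.standard_normal((hidden_size, hidden_size))
    params[f"b_{gate}"] = np.zeros(hidden_size)

x_t = rng.standard_normal((batch_size, input_size))
h_prev = np.zeros((batch_size, hidden_size))
outputs, new_states = gru_step(x_t, h_prev, params)
print(outputs.shape)  # (4, 5)
```

Because h_t is an element-wise blend of h_{t-1} and the tanh-bounded candidate, the state stays within [-1, 1] whenever the initial state does.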

Examples

import paddle.fluid.layers as layers
cell = layers.GRUCell(hidden_size=256)

call(inputs, states)

Perform calculations of GRU.

Parameters
  • inputs (Variable) – A tensor with shape [batch_size, input_size], corresponding to \(x_t\) in the formula. The data type should be float32.

  • states (Variable) – A tensor with shape [batch_size, hidden_size], corresponding to \(h_{t-1}\) in the formula. The data type should be float32.

Returns

A tuple (outputs, new_states), where outputs and new_states are the same tensor of shape [batch_size, hidden_size], corresponding to \(h_t\) in the formula. The data type of the tensor is the same as that of states.

Return type

tuple

state_shape

The state_shape of GRUCell is [hidden_size] (-1 for the batch size is inserted into the shape automatically). The shape corresponds to \(h_{t-1}\).

get_initial_states(batch_ref, shape=None, dtype=None, init_value=0)

Generate initialized states according to provided shape, data type and value.

Parameters
  • batch_ref – A (possibly nested structure of) tensor variable[s]. The first dimension of the tensor will be used as batch size to initialize states.

  • shape – A (possibly nested structure of) shape[s], where a shape is represented as a list/tuple of integers. -1 (for batch size) will be automatically inserted if the shape does not start with it. If None, the property state_shape will be used. The default value is None.

  • dtype – A (possibly nested structure of) data type[s]. The structure must be the same as that of shape, except that when all tensors in states have the same data type, a single data type can be used. If None and the property cell.state_shape is not available, float32 will be used as the data type. The default value is None.

  • init_value – A float value used to initialize states.

Returns

Tensor variable[s] packed in the same structure as shape, representing the initialized states.

Return type

Variable
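The shape and dtype rules described above can be illustrated with a short NumPy sketch. It is not Paddle's code: initial_states and is_shape are hypothetical helpers, a plain integer batch_size stands in for the first dimension of batch_ref, and a single dtype is assumed to be given as a string.

```python
import numpy as np


def is_shape(obj):
    """A shape is a flat list/tuple of integers, e.g. [256] or (-1, 256)."""
    return isinstance(obj, (list, tuple)) and all(isinstance(d, int) for d in obj)


def initial_states(batch_size, shape, dtype="float32", init_value=0.0):
    """Hypothetical helper: build states that mirror the structure of `shape`."""
    if is_shape(shape):
        dims = list(shape)
        if not dims or dims[0] != -1:
            dims = [-1] + dims  # insert the batch dimension if it is missing
        dims[0] = batch_size    # resolve -1 to the actual batch size
        return np.full(dims, init_value, dtype=dtype)
    # Nested structure: a single dtype (a string here) is shared by every
    # state; otherwise dtype must have the same structure as shape.
    dtypes = [dtype] * len(shape) if isinstance(dtype, str) else dtype
    return type(shape)(
        initial_states(batch_size, s, d, init_value) for s, d in zip(shape, dtypes)
    )


# A single state, as for GRUCell, whose state_shape is [hidden_size].
h0 = initial_states(batch_size=8, shape=[256])
print(h0.shape)  # (8, 256)

# A nested structure of two states that share one dtype.
h, c = initial_states(8, shape=([256], [-1, 128]), dtype="float32")
print(h.shape, c.shape)  # (8, 256) (8, 128)
```

In the real API the batch size is taken from the first dimension of batch_ref at graph-construction time; the integer argument here only keeps the sketch self-contained.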

state_dtype

Used to initialize states. A (possibly nested structure of) data type[s]. The structure must be the same as that of shape, except that when all tensors in states have the same data type, a single data type can be used. It need not be implemented if states are not initialized by get_initial_states or if the dtype argument is provided when using get_initial_states.