GRUCell

class paddle.nn.GRUCell(input_size, hidden_size, weight_ih_attr=None, weight_hh_attr=None, bias_ih_attr=None, bias_hh_attr=None, name=None)

Gated Recurrent Unit (GRU) RNN cell. Given the inputs and previous states, it computes the outputs and updates states.

The formula for GRU used is as follows:

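$$
\begin{aligned}
r_t &= \sigma(W_{ir} x_t + b_{ir} + W_{hr} h_{t-1} + b_{hr}) \\
z_t &= \sigma(W_{iz} x_t + b_{iz} + W_{hz} h_{t-1} + b_{hz}) \\
\tilde{h}_t &= \tanh(W_{ic} x_t + b_{ic} + r_t * (W_{hc} h_{t-1} + b_{hc})) \\
h_t &= z_t * h_{t-1} + (1 - z_t) * \tilde{h}_t \\
y_t &= h_t
\end{aligned}
$$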

where σ is the sigmoid function, and * is the elementwise multiplication operator.

Please refer to An Empirical Exploration of Recurrent Network Architectures for more details.
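The update equations can be sketched in plain Python for a single sample, which may help when checking them by hand. This is an illustrative sketch only, not Paddle's implementation: the helper names `sigmoid`, `matvec`, and `gru_step` are hypothetical, there is no batch dimension, and the rows of the concatenated weights are assumed to be ordered as the r, z, and c blocks.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gru_step(x, h_prev, W_ih, W_hh, b_ih, b_hh):
    """One GRU step for a single sample (hypothetical helper, no batch dimension).

    W_ih has 3*H rows of length I, W_hh has 3*H rows of length H, and
    b_ih, b_hh have length 3*H. Assumed row order: r block, z block, c block.
    """
    H = len(h_prev)
    gi = [a + b for a, b in zip(matvec(W_ih, x), b_ih)]       # W_i* x_t + b_i*
    gh = [a + b for a, b in zip(matvec(W_hh, h_prev), b_hh)]  # W_h* h_{t-1} + b_h*
    r = [sigmoid(gi[k] + gh[k]) for k in range(H)]
    z = [sigmoid(gi[H + k] + gh[H + k]) for k in range(H)]
    # The reset gate scales only the hidden-side term of the candidate state.
    c = [math.tanh(gi[2 * H + k] + r[k] * gh[2 * H + k]) for k in range(H)]
    h = [z[k] * h_prev[k] + (1.0 - z[k]) * c[k] for k in range(H)]
    return h, h  # y_t and h_t hold the same values


# With all-zero weights and biases, r_t = z_t = 0.5 and the candidate is 0,
# so the new state is half of the previous state.
input_size, hidden_size = 3, 2
W_ih = [[0.0] * input_size for _ in range(3 * hidden_size)]
W_hh = [[0.0] * hidden_size for _ in range(3 * hidden_size)]
b_ih = [0.0] * (3 * hidden_size)
b_hh = [0.0] * (3 * hidden_size)
y, h = gru_step([1.0, 2.0, 3.0], [1.0, -2.0], W_ih, W_hh, b_ih, b_hh)
print(h)  # [0.5, -1.0]
```

Note that in this formulation the reset gate is applied after the hidden-to-hidden product (including its bias), not to the previous state before the product.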

Parameters
  • input_size (int) – The input size.

  • hidden_size (int) – The hidden size.

  • weight_ih_attr (ParamAttr, optional) – The parameter attribute for weight_ih. Default: None.

  • weight_hh_attr (ParamAttr, optional) – The parameter attribute for weight_hh. Default: None.

  • bias_ih_attr (ParamAttr, optional) – The parameter attribute for the bias_ih. Default: None.

  • bias_hh_attr (ParamAttr, optional) – The parameter attribute for the bias_hh. Default: None.

  • name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.

Variables:
  • weight_ih (Parameter): shape (3 * hidden_size, input_size), input to hidden weight, which corresponds to the concatenation of $W_{ir}, W_{iz}, W_{ic}$ in the formula.

  • weight_hh (Parameter): shape (3 * hidden_size, hidden_size), hidden to hidden weight, which corresponds to the concatenation of $W_{hr}, W_{hz}, W_{hc}$ in the formula.

  • bias_ih (Parameter): shape (3 * hidden_size, ), input to hidden bias, which corresponds to the concatenation of $b_{ir}, b_{iz}, b_{ic}$ in the formula.

  • bias_hh (Parameter): shape (3 * hidden_size, ), hidden to hidden bias, which corresponds to the concatenation of $b_{hr}, b_{hz}, b_{hc}$ in the formula.
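As a rough illustration of the concatenated layout, the sketch below splits a stand-in for weight_ih into its three gate blocks. It assumes the blocks are stacked along the first dimension in the order listed above (r, z, c); `split_gates` is a hypothetical helper, and nested Python lists stand in for the actual parameter tensors.

```python
def split_gates(param, hidden_size):
    """Split a parameter whose first dimension is 3 * hidden_size into r, z, c blocks."""
    if len(param) != 3 * hidden_size:
        raise ValueError("first dimension must be 3 * hidden_size")
    return (
        param[:hidden_size],                 # reset gate block
        param[hidden_size:2 * hidden_size],  # update gate block
        param[2 * hidden_size:],             # candidate block
    )


hidden_size, input_size = 2, 3
# Stand-in for weight_ih with shape (3 * hidden_size, input_size); row k is filled with k.
weight_ih = [[float(k)] * input_size for k in range(3 * hidden_size)]
W_ir, W_iz, W_ic = split_gates(weight_ih, hidden_size)
print(len(W_ir), len(W_ir[0]))  # 2 3
print(W_iz[0][0], W_ic[0][0])   # 2.0 4.0
```

The same slicing applies to the bias vectors, whose single dimension has length 3 * hidden_size.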

Inputs:
  • inputs (Tensor): A tensor with shape [batch_size, input_size], corresponding to $x_t$ in the formula.

  • states (Tensor): A tensor with shape [batch_size, hidden_size], corresponding to $h_{t-1}$ in the formula.

Returns
  • outputs (Tensor): shape [batch_size, hidden_size], the output, corresponding to $y_t$ (equal to $h_t$) in the formula.

  • states (Tensor): shape [batch_size, hidden_size], the new hidden state, corresponding to $h_t$ in the formula.

Notes

All the weights and biases are initialized with Uniform(-std, std) by default, where std = 1 / sqrt(hidden_size). For more information about parameter initialization, please refer to ParamAttr.
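The default initialization range can be worked out with a few lines of standard-library Python. This sketch assumes std = 1 / sqrt(hidden_size) and uses Python's `random` module rather than Paddle's initializers; `uniform_init` is a hypothetical helper, and nested lists stand in for the parameter tensors.

```python
import math
import random


def uniform_init(shape, hidden_size, rng):
    """Fill a 1-D or 2-D shape with samples from Uniform(-std, std)."""
    std = 1.0 / math.sqrt(hidden_size)
    if len(shape) == 1:
        return [rng.uniform(-std, std) for _ in range(shape[0])]
    return [[rng.uniform(-std, std) for _ in range(shape[1])]
            for _ in range(shape[0])]


hidden_size, input_size = 32, 16
rng = random.Random(0)  # seeded only so that repeated runs draw the same values
weight_ih = uniform_init((3 * hidden_size, input_size), hidden_size, rng)
bias_ih = uniform_init((3 * hidden_size,), hidden_size, rng)

std = 1.0 / math.sqrt(hidden_size)
print(round(std, 4))  # 0.1768
```

For hidden_size = 32, every initial weight and bias therefore lies in roughly [-0.1768, 0.1768].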

Examples

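import paddle

# A batch of 4 inputs with input_size=16 and previous hidden states with hidden_size=32.
x = paddle.randn((4, 16))
prev_h = paddle.randn((4, 32))

cell = paddle.nn.GRUCell(16, 32)
# y is the output and h is the new hidden state.
y, h = cell(x, prev_h)

print(y.shape)  # [4, 32]
print(h.shape)  # [4, 32]

forward(inputs, states=None)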

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments

property state_shape

The state_shape of GRUCell is [hidden_size] (-1 for the batch size is inserted into the shape automatically). It corresponds to the shape of $h_{t-1}$.

extra_repr()

Extra representation of this layer. You can provide a custom implementation for your own layer.