GRUCell

class paddle.fluid.dygraph.rnn.GRUCell ( hidden_size, input_size, param_attr=None, bias_attr=None, gate_activation=None, activation=None, use_cudnn_impl=True, dtype='float64' ) [source]

GRU implementation using basic operators. There are two GRUCell versions; the default one is compatible with the CUDNN GRU implementation. The algorithm can be described by the equations below.

\[
\begin{aligned}
u_t &= \mathrm{sigmoid}(W_{ux} x_t + b_{ux} + W_{uh} h_{t-1} + b_{uh}) \\
r_t &= \mathrm{sigmoid}(W_{rx} x_t + b_{rx} + W_{rh} h_{t-1} + b_{rh}) \\
\tilde{h}_t &= \tanh(W_{cx} x_t + b_{cx} + r_t \odot (W_{ch} h_{t-1} + b_{ch})) \\
h_t &= u_t h_{t-1} + (1 - u_t) \tilde{h}_t
\end{aligned}
\]
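As a rough illustration, the CUDNN-compatible equations above can be sketched in NumPy. The weight and bias names below are hypothetical, chosen only to mirror the symbols in the equations; the real layer creates and manages its parameters internally.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step_cudnn(x, h_prev, W, b):
    """One GRU step following the CUDNN-compatible equations.

    W / b are dicts of hypothetical weight matrices and bias vectors,
    keyed to match the subscripts in the equations above.
    """
    # Update and reset gates, each with separate input and hidden biases.
    u = sigmoid(x @ W["ux"] + b["ux"] + h_prev @ W["uh"] + b["uh"])
    r = sigmoid(x @ W["rx"] + b["rx"] + h_prev @ W["rh"] + b["rh"])
    # Candidate state: the reset gate scales the transformed hidden state.
    h_tilde = np.tanh(x @ W["cx"] + b["cx"] + r * (h_prev @ W["ch"] + b["ch"]))
    # Interpolate between the previous and candidate hidden states.
    return u * h_prev + (1.0 - u) * h_tilde
```

Note that in this variant every gate carries two bias terms (one for the input projection, one for the hidden projection), matching CUDNN's parameter layout.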

The other GRUCell version is compatible with the BasicGRUUnit used in static graph mode. The algorithm can be described by the equations below.

\[
\begin{aligned}
u_t &= \mathrm{sigmoid}(W_{ux} x_t + W_{uh} h_{t-1} + b_u) \\
r_t &= \mathrm{sigmoid}(W_{rx} x_t + W_{rh} h_{t-1} + b_r) \\
\tilde{h}_t &= \tanh(W_{cx} x_t + W_{ch} (r_t \odot h_{t-1}) + b_m) \\
h_t &= u_t h_{t-1} + (1 - u_t) \tilde{h}_t
\end{aligned}
\]
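The BasicGRUUnit-compatible variant differs in two ways: each gate has a single combined bias, and the reset gate is applied to the previous hidden state before the hidden-to-candidate projection. A hedged NumPy sketch (hypothetical parameter names, mirroring the symbols above):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step_basic(x, h_prev, W, b):
    """One GRU step following the BasicGRUUnit-style equations.

    W / b are dicts of hypothetical weight matrices and bias vectors,
    keyed to match the subscripts in the equations above.
    """
    # Gates with a single combined bias each.
    u = sigmoid(x @ W["ux"] + h_prev @ W["uh"] + b["u"])
    r = sigmoid(x @ W["rx"] + h_prev @ W["rh"] + b["r"])
    # Reset gate is applied to h_{t-1} *before* the matmul with W_ch.
    h_tilde = np.tanh(x @ W["cx"] + (r * h_prev) @ W["ch"] + b["m"])
    return u * h_prev + (1.0 - u) * h_tilde
```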
Parameters
  • hidden_size (integer) – The hidden size used in the Cell.

  • input_size (integer) – The input size used in the Cell.

  • param_attr (ParamAttr|None) – The parameter attribute for the learnable weight matrix. Note: If it is set to None or one attribute of ParamAttr, GRUCell will create ParamAttr as param_attr. If the Initializer of the param_attr is not set, the parameter is initialized with Xavier. Default: None.

  • bias_attr (ParamAttr|None) – The parameter attribute for the bias of GRUCell. If it is set to None or one attribute of ParamAttr, GRUCell will create ParamAttr as bias_attr. If the Initializer of the bias_attr is not set, the bias is initialized zero. Default: None.

  • gate_activation (function|None) – The activation function for gates (actGate). Default: ‘fluid.layers.sigmoid’

  • activation (function|None) – The activation function for cell (actNode). Default: ‘fluid.layers.tanh’

  • use_cudnn_impl (bool) – Whether to use the CUDNN-compatible GRUCell. Default: True.

  • dtype (string) – The data type used in this cell. Default: ‘float64’.

Returns

None

Examples

from paddle import fluid
import paddle.fluid.core as core
from paddle.fluid.dygraph import GRUCell
import numpy as np
batch_size = 64
input_size = 128
hidden_size = 256
step_input_np = np.random.uniform(-0.1, 0.1, (
    batch_size, input_size)).astype('float64')
pre_hidden_np = np.random.uniform(-0.1, 0.1, (
    batch_size, hidden_size)).astype('float64')
if core.is_compiled_with_cuda():
    place = core.CUDAPlace(0)
else:
    place = core.CPUPlace()
with fluid.dygraph.guard(place):
    cudnn_gru = GRUCell(hidden_size, input_size)
    step_input_var = fluid.dygraph.to_variable(step_input_np)
    pre_hidden_var = fluid.dygraph.to_variable(pre_hidden_np)
    new_hidden = cudnn_gru(step_input_var, pre_hidden_var)
forward ( input, pre_hidden )

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • input (Variable) – The input of the GRU cell at the current time step, with shape [batch_size, input_size].

  • pre_hidden (Variable) – The hidden state of the GRU cell from the previous time step, with shape [batch_size, hidden_size].

Returns

The hidden state of the GRU cell at the current time step.