GRUCell
- class paddle.nn.GRUCell(input_size, hidden_size, weight_ih_attr=None, weight_hh_attr=None, bias_ih_attr=None, bias_hh_attr=None, name=None)
- 
         Gated Recurrent Unit (GRU) RNN cell. Given the inputs and the previous states, it computes the outputs and updates the states.
         The GRU formula used is as follows:
         \[
         \begin{aligned}
         r_{t} & = \sigma(W_{ir}x_{t} + b_{ir} + W_{hr}h_{t-1} + b_{hr})\\
         z_{t} & = \sigma(W_{iz}x_{t} + b_{iz} + W_{hz}h_{t-1} + b_{hz})\\
         \widetilde{h}_{t} & = \tanh(W_{ic}x_{t} + b_{ic} + r_{t} * (W_{hc}h_{t-1} + b_{hc}))\\
         h_{t} & = z_{t} * h_{t-1} + (1 - z_{t}) * \widetilde{h}_{t}\\
         y_{t} & = h_{t}
         \end{aligned}
         \]
         where \(\sigma\) is the sigmoid function and * is the element-wise multiplication operator.
         Please refer to An Empirical Exploration of Recurrent Network Architectures for more details.
- Parameters
- 
           - input_size (int) – The input size. 
- hidden_size (int) – The hidden size. 
- weight_ih_attr (ParamAttr, optional) – The parameter attribute for weight_ih. Default: None. 
- weight_hh_attr (ParamAttr, optional) – The parameter attribute for weight_hh. Default: None. 
- bias_ih_attr (ParamAttr, optional) – The parameter attribute for the bias_ih. Default: None. 
- bias_hh_attr (ParamAttr, optional) – The parameter attribute for the bias_hh. Default: None. 
- name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name. 
 
 - Variables:
- 
           - weight_ih (Parameter): shape (3 * hidden_size, input_size), input to hidden weight, which corresponds to the concatenation of \(W_{ir}, W_{iz}, W_{ic}\) in the formula. 
- weight_hh (Parameter): shape (3 * hidden_size, hidden_size), hidden to hidden weight, which corresponds to the concatenation of \(W_{hr}, W_{hz}, W_{hc}\) in the formula. 
- bias_ih (Parameter): shape (3 * hidden_size, ), input to hidden bias, which corresponds to the concatenation of \(b_{ir}, b_{iz}, b_{ic}\) in the formula. 
- bias_hh (Parameter): shape (3 * hidden_size, ), hidden to hidden bias, which corresponds to the concatenation of \(b_{hr}, b_{hz}, b_{hc}\) in the formula. 
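
The mapping between the concatenated parameters and the formula can be sketched in plain NumPy. This is an illustrative reference, not Paddle's implementation: the function name `gru_cell_step` is hypothetical, and the chunk order (reset, update, candidate) is taken from the order in which the Variables list names the sub-matrices.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def gru_cell_step(x, h_prev, weight_ih, weight_hh, bias_ih, bias_hh):
    """One GRU step following the formula above (hypothetical helper)."""
    # Project the input and the previous state: [batch_size, 3 * hidden_size].
    x_proj = x @ weight_ih.T + bias_ih
    h_proj = h_prev @ weight_hh.T + bias_hh
    # Split into reset (r), update (z) and candidate (c) chunks.
    x_r, x_z, x_c = np.split(x_proj, 3, axis=1)
    h_r, h_z, h_c = np.split(h_proj, 3, axis=1)
    r = sigmoid(x_r + h_r)
    z = sigmoid(x_z + h_z)
    c = np.tanh(x_c + r * h_c)  # the reset gate scales the hidden projection
    h = z * h_prev + (1.0 - z) * c
    return h, h  # y_t and h_t are the same tensor


batch_size, input_size, hidden_size = 4, 16, 32
rng = np.random.default_rng(0)
x = rng.standard_normal((batch_size, input_size))
h_prev = rng.standard_normal((batch_size, hidden_size))
weight_ih = 0.1 * rng.standard_normal((3 * hidden_size, input_size))
weight_hh = 0.1 * rng.standard_normal((3 * hidden_size, hidden_size))
bias_ih = np.zeros(3 * hidden_size)
bias_hh = np.zeros(3 * hidden_size)

y, h = gru_cell_step(x, h_prev, weight_ih, weight_hh, bias_ih, bias_hh)
print(y.shape, h.shape)
```

With all weights and biases set to zero, both gates equal 0.5 and the candidate is 0, so the new state is half the previous state. That makes a convenient sanity check for the sketch.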
 
- Inputs:
- 
           - inputs (Tensor): A tensor with shape [batch_size, input_size], corresponding to \(x_t\) in the formula. 
- states (Tensor): A tensor with shape [batch_size, hidden_size], corresponding to \(h_{t-1}\) in the formula. 
 
 - Returns
- 
           - outputs (Tensor): shape [batch_size, hidden_size], the output, corresponding to \(h_{t}\) in the formula. 
- states (Tensor): shape [batch_size, hidden_size], the new hidden state, corresponding to \(h_{t}\) in the formula. 
 
 Notes
           All the weights and biases are initialized with Uniform(-std, std) by default, where std = \(\frac{1}{\sqrt{hidden\_size}}\). For more information about parameter initialization, please refer to ParamAttr.
 Examples
           import paddle

           x = paddle.randn((4, 16))
           prev_h = paddle.randn((4, 32))

           cell = paddle.nn.GRUCell(16, 32)
           y, h = cell(x, prev_h)

           print(y.shape)  # [4, 32]
           print(h.shape)  # [4, 32]
 - 
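
The default initialization described in the Notes can be illustrated with NumPy. This is a sketch of the stated distribution only, not Paddle's initializer code; `init_gru_params` and its `seed` argument are hypothetical names introduced for the example.

```python
import numpy as np


def init_gru_params(input_size, hidden_size, seed=0):
    """Sample GRUCell-shaped parameters from Uniform(-std, std) (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    std = 1.0 / np.sqrt(hidden_size)
    # Shapes as listed under Variables.
    shapes = {
        "weight_ih": (3 * hidden_size, input_size),
        "weight_hh": (3 * hidden_size, hidden_size),
        "bias_ih": (3 * hidden_size,),
        "bias_hh": (3 * hidden_size,),
    }
    params = {name: rng.uniform(-std, std, shape) for name, shape in shapes.items()}
    return std, params


std, params = init_gru_params(input_size=16, hidden_size=32)
for name, value in params.items():
    # Every sampled value lies within [-std, std].
    print(name, value.shape, bool(np.abs(value).max() <= std))
```

For hidden_size = 32 the bound is 1 / sqrt(32), roughly 0.177, so larger hidden sizes start from smaller weights.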
            
           forward(inputs, states=None)
- 
           Defines the computation performed at every call. Should be overridden by all subclasses.
- Parameters
- 
             - *inputs (tuple) – unpacked tuple arguments 
- **kwargs (dict) – unpacked dict arguments 
 
 
 - property state_shape
- 
           The state_shape of GRUCell is [hidden_size]; -1 for the batch size is automatically inserted into the shape. The shape corresponds to the shape of \(h_{t-1}\). 
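
How the batch dimension is combined with state_shape can be sketched as follows. The helper `full_state_shape` is hypothetical and only mirrors the description above; it is not part of the Paddle API.

```python
import numpy as np


def full_state_shape(state_shape, batch_size=-1):
    """Prepend the batch dimension to a per-sample state shape (hypothetical helper)."""
    return [batch_size, *state_shape]


hidden_size = 32
state_shape = (hidden_size,)

print(full_state_shape(state_shape))     # batch size left as the -1 placeholder
print(full_state_shape(state_shape, 4))  # batch size fixed to 4

# A zero tensor with the concrete shape matches the layout of h_{t-1}.
h0 = np.zeros(full_state_shape(state_shape, 4))
print(h0.shape)
```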
 - 
            
           extra_repr()
- 
           Extra representation of this layer. You can provide a custom implementation in your own layer. 
 
