basic_gru
paddle.fluid.contrib.layers.rnn_impl.basic_gru(input, init_hidden, hidden_size, num_layers=1, sequence_length=None, dropout_prob=0.0, bidirectional=False, batch_first=True, param_attr=None, bias_attr=None, gate_activation=None, activation=None, dtype='float32', name='basic_gru') [source]
GRU implementation using basic operators; supports multiple layers and bidirectional GRU.

\[
\begin{aligned}
u_t &= \mathrm{actGate}(W_{ux} x_t + W_{uh} h_{t-1} + b_u)\\
r_t &= \mathrm{actGate}(W_{rx} x_t + W_{rh} h_{t-1} + b_r)\\
m_t &= \mathrm{actNode}(W_{cx} x_t + W_{ch} (r_t \odot h_{t-1}) + b_m)\\
h_t &= u_t \odot h_{t-1} + (1 - u_t) \odot m_t
\end{aligned}
\]

where \(\odot\) denotes element-wise multiplication.
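For intuition, here is a minimal NumPy sketch of a single GRU step following these equations with the default activations. The function and weight names are illustrative only, not part of the Paddle API:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # x_t: (batch, input_size); h_prev: (batch, hidden_size).
    # Weights are stored transposed relative to the equations,
    # so the products are written x_t @ W rather than W @ x_t.
    def gru_step(x_t, h_prev,
                 W_ux, W_uh, b_u,    # update gate parameters
                 W_rx, W_rh, b_r,    # reset gate parameters
                 W_cx, W_ch, b_m):   # candidate state parameters
        u_t = sigmoid(x_t @ W_ux + h_prev @ W_uh + b_u)          # update gate
        r_t = sigmoid(x_t @ W_rx + h_prev @ W_rh + b_r)          # reset gate
        # reset gate scales the previous hidden state element-wise
        m_t = np.tanh(x_t @ W_cx + (r_t * h_prev) @ W_ch + b_m)  # candidate
        # interpolate between the previous state and the candidate
        return u_t * h_prev + (1.0 - u_t) * m_t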
- Parameters
- input (Variable) – GRU input tensor. If batch_first = False, the shape should be (seq_len x batch_size x input_size); if batch_first = True, the shape should be (batch_size x seq_len x input_size).
- init_hidden (Variable|None) – The initial hidden state of the GRU. This is a tensor with shape (num_layers x batch_size x hidden_size); if bidirectional = True, the shape should be (num_layers*2 x batch_size x hidden_size), which can be reshaped to a tensor of shape (num_layers x 2 x batch_size x hidden_size) for use. If it is None, the initial state is set to all zeros.
- hidden_size (int) – Hidden size of the GRU 
- num_layers (int) – The total number of layers of the GRU 
- sequence_length (Variable|None) – A tensor of shape [batch_size] that stores the real length of each instance. It is converted to a mask that masks out the padding ids. If it is None, the input is assumed to contain no padding.
- dropout_prob (float|0.0) – Dropout probability. Dropout is applied ONLY to the RNN output of each layer, NOT between time steps.
- bidirectional (bool|False) – Whether the GRU is bidirectional.
- batch_first (bool|True) – The shape format of the input and output tensors. If True, the shape format should be [batch_size, seq_len, hidden_size]; if False, it should be [seq_len, batch_size, hidden_size]. By default this function accepts input and emits output in batch-major form, to be consistent with most data formats, though it is a bit less efficient because of the extra transposes.
- param_attr (ParamAttr|None) – The parameter attribute for the learnable weight matrix. Note: if it is set to None or one attribute of ParamAttr, gru_unit will create a ParamAttr as param_attr. If the initializer of param_attr is not set, the parameter is initialized with Xavier. Default: None. See the sketch after this parameter list.
- bias_attr (ParamAttr|None) – The parameter attribute for the bias of the GRU unit. If it is set to None or one attribute of ParamAttr, gru_unit will create a ParamAttr as bias_attr. If the initializer of bias_attr is not set, the bias is initialized to zero. Default: None.
- gate_activation (function|None) – The activation function for gates (actGate). Default: ‘fluid.layers.sigmoid’ 
- activation (function|None) – The activation function for cell (actNode). Default: ‘fluid.layers.tanh’ 
- dtype (string) – The data type used in this unit.
- name (string) – The name used to identify parameters and biases.
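A hedged sketch of passing explicit parameter attributes and activation callables (the values shown simply restate the documented defaults; the variable names and sizes are illustrative):

    import paddle.fluid as fluid
    from paddle.fluid.contrib.layers import basic_gru

    # batch-major input: (batch_size, seq_len=32, input_size=16)
    x = fluid.layers.data(name="x", shape=[-1, 32, 16], dtype='float32')
    rnn_out, last_hidden = basic_gru(
        x, None, 64,
        param_attr=fluid.ParamAttr(initializer=fluid.initializer.Xavier()),
        bias_attr=fluid.ParamAttr(initializer=fluid.initializer.Constant(0.0)),
        gate_activation=fluid.layers.sigmoid,
        activation=fluid.layers.tanh)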
 
- Returns

  A tuple (rnn_out, last_hidden):

  - rnn_out (Tensor) – the GRU hidden output, with shape (seq_len x batch_size x hidden_size); if bidirectional is set to True, the shape is (seq_len x batch_size x hidden_size*2).
  - last_hidden (Tensor) – the hidden state of the last step of the GRU, with shape (num_layers x batch_size x hidden_size); if bidirectional is set to True, the shape is (num_layers*2 x batch_size x hidden_size), which can be reshaped to a tensor of shape (num_layers x 2 x batch_size x hidden_size); see the sketch below.
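In the bidirectional case, the returned last_hidden can be split into per-direction states as described above. A minimal sketch, assuming the variables from the example below:

    import paddle.fluid as fluid

    # last_hidden: (num_layers * 2, batch_size, hidden_size)
    per_direction = fluid.layers.reshape(
        last_hidden, shape=[num_layers, 2, -1, hidden_size])
    # per_direction: (num_layers, 2, batch_size, hidden_size)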
 
 
Examples

    import paddle.fluid.layers as layers
    from paddle.fluid.contrib.layers import basic_gru

    batch_size = 20
    input_size = 128
    hidden_size = 256
    num_layers = 2
    dropout = 0.5
    bidirectional = True
    batch_first = False

    input = layers.data(
        name="input", shape=[-1, batch_size, input_size], dtype='float32')
    pre_hidden = layers.data(
        name="pre_hidden", shape=[-1, hidden_size], dtype='float32')
    sequence_length = layers.data(
        name="sequence_length", shape=[-1], dtype='int32')

    rnn_out, last_hidden = basic_gru(
        input, pre_hidden, hidden_size,
        num_layers=num_layers,
        sequence_length=sequence_length,
        dropout_prob=dropout,
        bidirectional=bidirectional,
        batch_first=batch_first)
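The example above only builds the program. A hedged sketch of actually executing such a graph with fluid's static-graph Executor, using random data and init_hidden=None to keep the feed simple (an assumption for illustration, not part of the example above):

    import numpy as np
    import paddle.fluid as fluid
    import paddle.fluid.layers as layers
    from paddle.fluid.contrib.layers import basic_gru

    seq_len, batch_size, input_size, hidden_size = 10, 20, 128, 256
    x = layers.data(
        name="x", shape=[-1, batch_size, input_size], dtype='float32')
    # init_hidden=None: the initial state defaults to all zeros
    rnn_out, last_hidden = basic_gru(
        x, None, hidden_size, batch_first=False)

    exe = fluid.Executor(fluid.CPUPlace())
    exe.run(fluid.default_startup_program())
    out, last = exe.run(
        feed={"x": np.random.rand(
            seq_len, batch_size, input_size).astype('float32')},
        fetch_list=[rnn_out, last_hidden])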
