row_conv

paddle.static.nn.row_conv(input, future_context_size, param_attr=None, act=None)
Api_attr

Static Graph

The row convolution is also called lookahead convolution. It was introduced in the following paper for DeepSpeech2: http://www.cs.cmu.edu/~dyogatam/papers/wang+etal.iclrworkshop2016.pdf

The main motivation is that a bidirectional RNN, which is useful in DeepSpeech-like speech models, learns a representation of a sequence by performing a forward and a backward pass through the entire sequence. However, unlike unidirectional RNNs, bidirectional RNNs are challenging to deploy in an online, low-latency setting, because the backward pass cannot start until the whole sequence is available. The lookahead convolution incorporates information from a fixed number of future time steps in a computationally efficient manner to improve unidirectional recurrent neural networks. The row convolution is different from the 1D sequence convolution, and is computed as follows:

Given an input sequence \(X\) of length \(t\) and input dimension \(D\), and a filter \(W\) of size \(context \times D\), the output sequence is computed as:

\[Out_{i} = \sum_{j=i}^{i + context - 1} X_{j} \cdot W_{j-i}\]

In the above equation:

  • \(Out_{i}\): The i-th row of the output, with shape [1, D].

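  • \(context\): The number of rows in the filter \(W\), that is, the current time step plus the future time steps it looks ahead to. Given the kernel shape stated under Parameters, this corresponds to future_context_size + 1.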

  • \(X_{j}\): The j-th row of the input, with shape [1, D].

  • \(W_{j-i}\): The (j-i)-th row of the filter parameters, with shape [1, D].
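
The formula above can be sketched in NumPy for a single sequence. This is only an illustrative reference, not Paddle's implementation: the helper name `row_conv_reference` is hypothetical, the product \(X_{j} \cdot W_{j-i}\) is read as element-wise (which the [1, D] output shape implies), and rows past the end of the sequence are assumed to contribute nothing.

```python
import numpy as np

def row_conv_reference(x, w):
    """Lookahead (row) convolution for one sequence (hypothetical helper).

    x: array of shape [T, D], the input sequence.
    w: array of shape [context, D], the filter.
    Returns an array of shape [T, D].
    """
    t = x.shape[0]
    context = w.shape[0]
    out = np.zeros_like(x)
    for i in range(t):
        # Assumption: time steps beyond the end of the sequence are skipped,
        # which is equivalent to zero-padding the input at the end.
        last = min(i + context, t)
        for j in range(i, last):
            # Element-wise product of the j-th input row and the (j-i)-th filter row.
            out[i] += x[j] * w[j - i]
    return out

x = np.arange(8, dtype=np.float32).reshape(4, 2)  # T = 4, D = 2
w = np.ones((3, 2), dtype=np.float32)             # context = 3
out = row_conv_reference(x, w)
print(out.shape)  # same shape as x
```

With an all-ones filter, each output row is simply the sum of the current input row and the rows that follow it within the context window, which makes the lookahead behaviour easy to check by hand.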

For more details about row_conv, please refer to the design document: https://github.com/PaddlePaddle/Paddle/issues/2228#issuecomment-303903645

Parameters
  • input (Tensor) – The input Tensor, with shape (B x T x N), where B is the batch size, T is the number of time steps, and N is the feature dimension (\(D\) in the formula above).

  • future_context_size (int) – Future context size. Note that the convolution kernel has shape [future_context_size + 1, D].

  • param_attr (ParamAttr) – Attributes of the parameters, including name, initializer, etc. Default: None.

  • act (str) – Non-linear activation to be applied to the output Tensor. Default: None.

Returns

The output is a Tensor with the same data type and shape as the input.

Return type

Tensor

Examples

>>> # for LoDTensor inputs
>>> import paddle
>>> paddle.enable_static()
>>> x = paddle.static.data(name='x', shape=[9, 16],
...                     dtype='float32', lod_level=1)
>>> out_x = paddle.static.nn.row_conv(input=x, future_context_size=2)

>>> # for Tensor inputs
>>> y = paddle.static.data(name='y', shape=[9, 4, 16], dtype='float32')
>>> out_y = paddle.static.nn.row_conv(input=y, future_context_size=2)