rnnt_loss
- paddle.nn.functional.rnnt_loss(input, label, input_lengths, label_lengths, blank=0, fastemit_lambda=0.001, reduction='mean', name=None)
-
An operator that integrates the open-source Warp-Transducer library (https://github.com/b-flo/warp-transducer.git) to compute the Sequence Transduction with Recurrent Neural Networks (RNN-T) loss.
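For reference, the quantity being computed is the standard RNN-T negative log-likelihood introduced by Graves (2012); the formula below restates that definition and is not a claim about this operator's internal algorithm:

    \mathcal{L}_{\mathrm{RNNT}}(x, y^{*}) = -\ln P(y^{*} \mid x) = -\ln \sum_{a \in \mathcal{B}^{-1}(y^{*})} P(a \mid x)

where a ranges over the monotonic alignments (of length T + U, blanks included) that collapse to the target label sequence y^{*} under the blank-removal map \mathcal{B}.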
- Parameters
-
input (Tensor) – The logprobs sequence with padding, which is a 4-D Tensor. The tensor shape is [B, Tmax, Umax, D], where B is the batch size, Tmax is the longest input (logit) sequence length, Umax is the longest label sequence length plus one, and D is the number of classes including the blank. The data type should be float32 or float64.
label (Tensor) – The ground truth sequence with padding, which must be a 2-D Tensor. The tensor shape is [B, Umax - 1], holding one padded label sequence per batch element. The data type must be int32.
input_lengths (Tensor) – The length of each input sequence, with shape [batch_size] and dtype int64.
label_lengths (Tensor) – The length of each label sequence, with shape [batch_size] and dtype int64.
blank (int, optional) – The blank label index of the RNN-T loss, which must lie in the half-open interval [0, D). Default is 0.
fastemit_lambda (float, optional) – Regularization parameter for FastEmit (https://arxiv.org/pdf/2010.11148.pdf). Default is 0.001.
reduction (str, optional) – Indicates how to average the loss; the candidates are 'none' | 'mean' | 'sum'. If reduction is 'mean', the summed loss is divided by the batch size; if reduction is 'sum', the summed loss is returned; if reduction is 'none', no reduction is applied. Default is 'mean'.
name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
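To make the shape contract concrete, here is a minimal, hypothetical sketch (the variable names and the random data are illustrative only; the int32 dtype for the length tensors follows the Examples section below rather than the int64 stated above):

import paddle
import paddle.nn.functional as F

B, Tmax, D = 2, 10, 6                      # batch size, max input length, classes incl. blank
label_lengths = paddle.to_tensor([4, 3], dtype='int32')
Umax = int(label_lengths.max()) + 1        # joint output carries one extra step for the blank

# Hypothetical joint-network output, normalized to log-probabilities over D classes.
logprobs = F.log_softmax(paddle.randn([B, Tmax, Umax, D]), axis=-1)

# Padded labels drawn from [1, D); index 0 is reserved for the blank here.
labels = paddle.randint(low=1, high=D, shape=[B, Umax - 1]).astype('int32')
input_lengths = paddle.full([B], Tmax, dtype='int32')

loss = F.rnnt_loss(logprobs, labels, input_lengths, label_lengths,
                   blank=0, reduction='mean')
print(loss.shape)  # [], a scalar under 'mean' reduction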
- Returns
-
Tensor, the RNN-T loss between logprobs and labels. If reduction is 'none', the shape of the loss is [batch_size]; otherwise, the shape of the loss is []. The data type is the same as the input logprobs.
- Return type
-
Tensor
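For instance, continuing the hypothetical sketch under Parameters, passing reduction='none' returns one loss value per batch element instead of a scalar:

per_item = F.rnnt_loss(logprobs, labels, input_lengths, label_lengths,
                       blank=0, reduction='none')
print(per_item.shape)  # [B], one loss per sequence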
Examples
>>> # declarative mode
>>> import paddle.nn.functional as F
>>> import numpy as np
>>> import paddle
>>> import functools
>>> fn = functools.partial(F.rnnt_loss, reduction='sum', fastemit_lambda=0.0, blank=0)
>>> acts = np.array([[
...     [[0.1, 0.6, 0.1, 0.1, 0.1],
...      [0.1, 0.1, 0.6, 0.1, 0.1],
...      [0.1, 0.1, 0.2, 0.8, 0.1]],
...     [[0.1, 0.6, 0.1, 0.1, 0.1],
...      [0.1, 0.1, 0.2, 0.1, 0.1],
...      [0.7, 0.1, 0.2, 0.1, 0.1]]
... ]])
>>> labels = [[1, 2]]
>>> acts = paddle.to_tensor(acts, stop_gradient=False)
>>> lengths = [acts.shape[1]] * acts.shape[0]
>>> label_lengths = [len(l) for l in labels]
>>> labels = paddle.to_tensor(labels, paddle.int32)
>>> lengths = paddle.to_tensor(lengths, paddle.int32)
>>> label_lengths = paddle.to_tensor(label_lengths, paddle.int32)
>>> costs = fn(acts, labels, lengths, label_lengths)
>>> print(costs)
Tensor(shape=[], dtype=float64, place=Place(cpu), stop_gradient=False,
       -2.85042444)
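Because acts was created with stop_gradient=False, the scalar loss can be backpropagated directly; a minimal continuation of the example above:

>>> costs.backward()
>>> print(acts.grad.shape)
[1, 2, 3, 5]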