- paddle.nn.functional.rnnt_loss(input, label, input_lengths, label_lengths, blank=0, fastemit_lambda=0.001, reduction='mean', name=None)
An operator that integrates the open-source Warp-Transducer library (https://github.com/b-flo/warp-transducer.git) to compute the Sequence Transduction with Recurrent Neural Networks (RNN-T) loss.
input (Tensor) – The padded sequence of log-probabilities, which is a 4-D Tensor of shape [B, Tmax, Umax, D], where Tmax is the length of the longest input logit sequence. The data type should be float32 or float64.
label (Tensor) – The padded ground-truth sequence, which must be a 2-D Tensor of shape [B, Umax], where Umax is the length of the longest label sequence. The data type must be int32.
input_lengths (Tensor) – The length of each input sequence. It should have shape [batch_size] and dtype int64.
label_lengths (Tensor) – The length of each label sequence. It should have shape [batch_size] and dtype int64.
blank (int, optional) – The blank label index of the RNN-T loss, which must lie in the half-open interval [0, D), where D is the size of the last dimension of input. The data type must be int32. Default is 0.
fastemit_lambda (float, optional) – The regularization parameter for FastEmit (https://arxiv.org/pdf/2010.11148.pdf). Default is 0.001.
reduction (str, optional) – Specifies how to reduce the loss. The candidates are 'mean', 'sum', and 'none'. If 'mean', the sum of the losses is divided by batch_size; if 'sum', the sum of the losses is returned; if 'none', no reduction is applied. Default is 'mean'.
name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
- Return type
Tensor. The RNN-T loss between input and labels. If reduction is 'none', the shape of the loss is [batch_size]; otherwise the loss is reduced to a single value. The data type is the same as that of input.
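The three reduction modes described above can be illustrated with a small pure-Python sketch. The helper name `reduce_losses` is hypothetical and is not part of Paddle; it only mirrors the documented behaviour on a plain list of per-sample losses.

```python
def reduce_losses(per_sample_losses, reduction="mean"):
    """Hypothetical helper mirroring the documented reduction modes."""
    if reduction == "none":
        # No reduction: one loss value per batch element.
        return list(per_sample_losses)
    total = sum(per_sample_losses)
    if reduction == "sum":
        return total
    if reduction == "mean":
        # Sum of the losses divided by the batch size.
        return total / len(per_sample_losses)
    raise ValueError(f"unsupported reduction: {reduction!r}")


batch_losses = [2.0, 4.0]
print(reduce_losses(batch_losses, "none"))
print(reduce_losses(batch_losses, "sum"))
print(reduce_losses(batch_losses, "mean"))
```

With 'none' the result keeps one entry per sample, which corresponds to the [batch_size] shape mentioned above; the other two modes collapse the batch to a single number.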
    # declarative mode
    import functools

    import numpy as np
    import paddle
    import paddle.nn.functional as F

    # Fix the loss options once; FastEmit regularization is disabled here.
    fn = functools.partial(F.rnnt_loss, reduction='sum', fastemit_lambda=0.0, blank=0)

    # Activations with shape [B=1, T=2, U+1=3, D=5].
    acts = np.array([[[[0.1, 0.6, 0.1, 0.1, 0.1],
                       [0.1, 0.1, 0.6, 0.1, 0.1],
                       [0.1, 0.1, 0.2, 0.8, 0.1]],
                      [[0.1, 0.6, 0.1, 0.1, 0.1],
                       [0.1, 0.1, 0.2, 0.1, 0.1],
                       [0.7, 0.1, 0.2, 0.1, 0.1]]]])
    labels = [[1, 2]]

    acts = paddle.to_tensor(acts, stop_gradient=False)

    # Every sample uses the full time dimension: one entry of T per batch element.
    lengths = [acts.shape[1]] * acts.shape[0]
    label_lengths = [len(l) for l in labels]

    labels = paddle.to_tensor(labels, paddle.int32)
    lengths = paddle.to_tensor(lengths, paddle.int32)
    label_lengths = paddle.to_tensor(label_lengths, paddle.int32)

    costs = fn(acts, labels, lengths, label_lengths)
    print(costs)
    # Prints a float64 Tensor holding the summed loss, 4.49566677
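To make the quantity in the example above more concrete, the following is a minimal pure-Python sketch of the standard RNN-T forward recursion for a single sample. It is not Paddle's or Warp-Transducer's implementation. The function names `log_softmax`, `log_add`, and `rnnt_nll` are hypothetical, and the sketch assumes that the raw activations are first normalized with a log-softmax over the last dimension and that FastEmit is disabled (fastemit_lambda=0.0).

```python
import math


def log_softmax(scores):
    """Normalize one vector of raw scores into log-probabilities."""
    peak = max(scores)
    log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_norm for s in scores]


def log_add(a, b):
    """Stable log(exp(a) + exp(b)); -inf stands for probability zero."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    hi = max(a, b)
    return hi + math.log(math.exp(a - hi) + math.exp(b - hi))


def rnnt_nll(log_probs, labels, blank=0):
    """Negative log-likelihood of `labels` for one sample.

    log_probs[t][u][k] has shape T x (U + 1) x D, with U = len(labels).
    """
    T, U = len(log_probs), len(labels)
    # alpha[t][u]: log-probability of emitting labels[:u] by time step t.
    alpha = [[-math.inf] * (U + 1) for _ in range(T)]
    alpha[0][0] = 0.0
    for t in range(T):
        for u in range(U + 1):
            if t == 0 and u == 0:
                continue
            via_blank = -math.inf
            via_label = -math.inf
            if t > 0:  # a blank at (t - 1, u) advances time
                via_blank = alpha[t - 1][u] + log_probs[t - 1][u][blank]
            if u > 0:  # emitting labels[u - 1] at (t, u - 1) advances the label index
                via_label = alpha[t][u - 1] + log_probs[t][u - 1][labels[u - 1]]
            alpha[t][u] = log_add(via_blank, via_label)
    # Every complete alignment ends with a final blank at (T - 1, U).
    return -(alpha[T - 1][U] + log_probs[T - 1][U][blank])


# The single sample from the example above (batch dimension dropped).
acts = [[[0.1, 0.6, 0.1, 0.1, 0.1],
         [0.1, 0.1, 0.6, 0.1, 0.1],
         [0.1, 0.1, 0.2, 0.8, 0.1]],
        [[0.1, 0.6, 0.1, 0.1, 0.1],
         [0.1, 0.1, 0.2, 0.1, 0.1],
         [0.7, 0.1, 0.2, 0.1, 0.1]]]
log_probs = [[log_softmax(row) for row in frame] for frame in acts]
loss = rnnt_nll(log_probs, labels=[1, 2], blank=0)
print(loss)
```

Under these assumptions the sketch should land close to the 4.49566677 reported for the Paddle example, since with a batch of one the 'sum' reduction leaves the per-sample loss unchanged.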