FusedBiasDropoutResidualLayerNorm

class paddle.incubate.nn.FusedBiasDropoutResidualLayerNorm(embed_dim, dropout_rate=0.5, weight_attr=None, bias_attr=None, epsilon=1e-05, name=None)

Applies the fused_bias_dropout_residual_layer_norm operation, which computes layer_norm(residual + dropout(x + bias)) as a single fused operation.
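To make the order of operations concrete, here is a minimal unfused sketch in plain Python for a single feature vector. It is not Paddle's implementation: the function name and the explicit bias, gamma, and beta arguments are hypothetical stand-ins for the layer's parameters, and dropout is treated as the identity, as it would be at inference time.

```python
import math

def bias_dropout_residual_layer_norm(x, residual, bias, gamma, beta, epsilon=1e-5):
    """Unfused reference for one feature vector, dropout disabled (inference).

    Hypothetical helper for illustration only; not part of Paddle.
    """
    # Step 1: add the bias. Dropout is the identity at inference time.
    h = [xi + bi for xi, bi in zip(x, bias)]
    # Step 2: add the residual connection.
    h = [hi + ri for hi, ri in zip(h, residual)]
    # Step 3: layer-normalize over the feature dimension.
    mean = sum(h) / len(h)
    var = sum((hi - mean) ** 2 for hi in h) / len(h)
    # epsilon keeps the denominator non-zero when the variance is 0.
    inv_std = 1.0 / math.sqrt(var + epsilon)
    return [(hi - mean) * inv_std * g + b for hi, g, b in zip(h, gamma, beta)]

x = [1.0, 2.0, 3.0, 4.0]
residual = [0.5, 0.5, 0.5, 0.5]
out = bias_dropout_residual_layer_norm(
    x, residual, bias=[0.0] * 4, gamma=[1.0] * 4, beta=[0.0] * 4
)
print([round(v, 4) for v in out])
```

With unit gamma and zero beta, the output has approximately zero mean and unit variance across the feature dimension. A constant input vector yields all zeros rather than a division error, because epsilon is added to the variance.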

Parameters

embed_dim (int) – The expected feature size in the input and output.
dropout_rate (float, optional) – The dropout probability applied to the biased input before the residual is added. 0 for no dropout. Default: 0.5.
bias_attr (ParamAttr|bool, optional) – The attributes of the bias parameter. Default: None, which means the default bias parameter attributes are used. If set to False, this layer will not have a trainable bias parameter. See ParamAttr for usage details.
epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-05.
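The effect of dropout_rate can be sketched in plain Python as follows. This assumes the layer uses inverted ("upscale_in_train") dropout, where surviving elements are scaled by 1 / (1 - dropout_rate) during training and left untouched at inference. The page above does not state the mode, so treat that as an assumption. The helper name inverted_dropout is hypothetical.

```python
import random

def inverted_dropout(values, dropout_rate, training=True, rng=None):
    """Inverted dropout on a list of floats (hypothetical illustration)."""
    if not training or dropout_rate == 0.0:
        # Identity at inference time or when dropout is disabled.
        return list(values)
    if dropout_rate >= 1.0:
        # Everything is dropped.
        return [0.0] * len(values)
    rng = rng or random.Random()
    keep_prob = 1.0 - dropout_rate
    # Keep each element with probability keep_prob and rescale the
    # survivors so the expected value matches the input.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]

values = [1.0] * 8
dropped = inverted_dropout(values, dropout_rate=0.5, rng=random.Random(0))
print(dropped)  # each entry is either 0.0 or 2.0
print(inverted_dropout(values, dropout_rate=0.5, training=False))  # unchanged
```

Under this assumption, a layer in training mode produces stochastic outputs, so switching it to evaluation mode or setting dropout_rate=0 is what makes results reproducible.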

Examples

>>> import paddle
>>> paddle.device.set_device('gpu')
>>> # input: [batch_size, seq_len, embed_dim]
>>> x = paddle.rand((2, 4, 128))
>>> # residual: [batch_size, seq_len, embed_dim]
>>> residual = paddle.rand((2, 4, 128))
>>> fused_bias_dropout_residual_ln = paddle.incubate.nn.FusedBiasDropoutResidualLayerNorm(128)
>>> output = fused_bias_dropout_residual_ln(x, residual)
>>> print(output.shape)
[2, 4, 128]

          

forward(x, residual)

Applies fused_bias_dropout_residual_layer_norm operation.

Parameters

x (Tensor) – The input tensor. It is a tensor with shape [batch_size, seq_len, embed_dim]. The data type should be float32 or float64.
residual (Tensor) – The residual tensor. It is a tensor with the same shape as x, [batch_size, seq_len, embed_dim]. The data type should be float32 or float64.

Returns

It is a tensor that has the same shape and data type as x.

Return type

Tensor

extra_repr()

Returns the extra representation of this layer. You can override it to provide a custom representation for your own layer.
