class paddle.incubate.nn.FusedBiasDropoutResidualLayerNorm(embed_dim, dropout_rate=0.5, weight_attr=None, bias_attr=None, epsilon=1e-05, name=None)

Applies the fused_bias_dropout_residual_layer_norm operation, which computes layer_norm(residual + dropout(x + bias)) as one fused operation.

Parameters

  • embed_dim (int) – The expected feature size in the input and output.

  • dropout_rate (float, optional) – The dropout probability applied to the sum of the input and the bias, before the residual is added. 0 for no dropout. Default: 0.5.

  • bias_attr (ParamAttr|bool, optional) – Specifies the properties of the bias parameter. Default: None, which means the default bias parameter properties are used. If set to False, this layer has no trainable bias parameter. See ParamAttr for usage details.

  • epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-05.
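
To make the roles of dropout_rate and epsilon concrete, the computation layer_norm(residual + dropout(x + bias)) can be written out step by step. The following is a minimal, unfused pure-Python sketch for a single feature vector, not Paddle's implementation. The function name and the arguments keep_mask, gamma and beta are hypothetical names chosen for illustration, and the rescaling of kept elements by 1 / (1 - dropout_rate) is an assumed convention.

```python
import math

def bias_dropout_residual_layer_norm(x, residual, bias, gamma, beta,
                                     keep_mask=None, dropout_rate=0.0,
                                     epsilon=1e-5):
    """Unfused reference for one feature vector (illustrative only)."""
    # Step 1: add the bias to the input.
    h = [xi + bi for xi, bi in zip(x, bias)]
    # Step 2: dropout. keep_mask=None means no dropout (identity).
    # Otherwise dropped elements become 0 and kept elements are rescaled
    # by 1 / (1 - dropout_rate) (assumed convention; requires dropout_rate < 1).
    if keep_mask is not None:
        scale = 1.0 / (1.0 - dropout_rate)
        h = [hi * scale if keep else 0.0 for hi, keep in zip(h, keep_mask)]
    # Step 3: add the residual.
    h = [hi + ri for hi, ri in zip(h, residual)]
    # Step 4: layer normalization over the feature dimension.
    # epsilon is added to the variance to avoid division by zero.
    mean = sum(h) / len(h)
    var = sum((hi - mean) ** 2 for hi in h) / len(h)
    inv_std = 1.0 / math.sqrt(var + epsilon)
    return [(hi - mean) * inv_std * g + b for hi, g, b in zip(h, gamma, beta)]

# No dropout: embed_dim = 4, zero bias, unit scale, zero shift.
out = bias_dropout_residual_layer_norm(
    x=[1.0, 2.0, 3.0, 4.0], residual=[0.5, 0.5, 0.5, 0.5],
    bias=[0.0] * 4, gamma=[1.0] * 4, beta=[0.0] * 4)

# With dropout: an explicit mask stands in for the random draw.
out_dropped = bias_dropout_residual_layer_norm(
    x=[1.0, 2.0, 3.0, 4.0], residual=[0.5, 0.5, 0.5, 0.5],
    bias=[0.0] * 4, gamma=[1.0] * 4, beta=[0.0] * 4,
    keep_mask=[True, False, True, False], dropout_rate=0.5)
```

Passing the mask explicitly keeps the sketch deterministic; a real layer draws the mask at random during training and skips dropout in evaluation mode.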


>>> import paddle
>>> paddle.device.set_device('gpu')
>>> # input: [batch_size, seq_len, embed_dim]
>>> x = paddle.rand((2, 4, 128))
>>> # residual: [batch_size, seq_len, embed_dim]
>>> residual = paddle.rand((2, 4, 128))
>>> fused_bias_dropout_residual_ln = paddle.incubate.nn.FusedBiasDropoutResidualLayerNorm(128)
>>> output = fused_bias_dropout_residual_ln(x, residual)
>>> print(output.shape)
[2, 4, 128]
forward(x, residual)


Applies the fused_bias_dropout_residual_layer_norm operation to x and residual.

Parameters

  • x (Tensor) – The input tensor. It is a tensor with shape [batch_size, seq_len, embed_dim]. The data type should be float32 or float64.

  • residual (Tensor) – The residual tensor. It is a tensor with shape [batch_size, seq_len, embed_dim], the same shape as x. The data type should be float32 or float64.


Returns

A tensor that has the same shape and data type as x.

Return type

Tensor
extra_repr()


Extra representation of this layer. You can override this method to provide a custom representation for your own layer.