LayerNorm

class paddle.nn.LayerNorm(normalized_shape, epsilon=1e-05, weight_attr=None, bias_attr=None, name=None)
Aliases: paddle.nn.LayerNorm, paddle.nn.layer.LayerNorm, paddle.nn.layer.norm.LayerNorm

Old API: paddle.fluid.dygraph.LayerNorm

This interface constructs a callable object of the LayerNorm class. It implements Layer Normalization and can be applied to mini-batch input data. See the code examples for usage, and refer to Layer Normalization for background.

The formula is as follows:

\[ \begin{aligned} \mu &= \frac{1}{H}\sum_{i=1}^{H} x_i \\ \sigma &= \sqrt{\frac{1}{H}\sum_{i=1}^{H}(x_i - \mu)^2 + \epsilon} \\ y &= f\left(\frac{g}{\sigma}(x - \mu) + b\right) \end{aligned} \]
  • \(x\): the vector representation of the summed inputs to the neurons in that layer.

  • \(H\): the number of hidden units in a layer.

  • \(\epsilon\): the small value added to the variance to prevent division by zero.

  • \(g\): the trainable scale parameter.

  • \(b\): the trainable bias parameter.
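
The formula above can be sketched in plain NumPy for a single vector of \(H\) hidden units. This is an illustrative reference only, not Paddle's implementation: the helper name layer_norm_reference is hypothetical, and \(f\) is assumed to be the identity function.

```python
import numpy as np

def layer_norm_reference(x, g=None, b=None, epsilon=1e-5):
    """Hypothetical helper: apply the formula to a 1-D vector x of H hidden units.

    Assumes f is the identity; g and b are optional scale and bias.
    """
    x = np.asarray(x, dtype=np.float64)
    mu = x.mean()                                       # mean over the H units
    sigma = np.sqrt(((x - mu) ** 2).mean() + epsilon)   # sqrt(variance + epsilon)
    y = (x - mu) / sigma
    if g is not None:
        y = g * y                                       # trainable scale
    if b is not None:
        y = y + b                                       # trainable bias
    return y

x = np.array([1.0, 2.0, 3.0, 4.0])
y = layer_norm_reference(x)
print(y)  # values centered on zero with roughly unit standard deviation
```

Because \(\epsilon\) is added inside the square root, the output standard deviation is slightly below 1 rather than exactly 1.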

Parameters
  • normalized_shape (int|list|tuple) – The shape of the trailing dimensions to normalize over, for an expected input of size \([*, normalized_shape[0], normalized_shape[1], ..., normalized_shape[-1]]\). If it is a single integer, this module normalizes over the last dimension, which is expected to be of that size.

  • epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-05.

  • weight_attr (ParamAttr|bool, optional) – The parameter attribute for the learnable gain \(g\). If False, weight is None. If None, a default ParamAttr is added as the scale. When the weight is added, it is initialized to 1. Default: None.

  • bias_attr (ParamAttr|bool, optional) – The parameter attribute for the learnable bias \(b\). If False, bias is None. If None, a default ParamAttr is added as the bias. When the bias is added, it is initialized to 0. Default: None.

  • name (str, optional) – Name for the LayerNorm. Default: None. For more information, please refer to Name.

Shape:
  • x: 2-D, 3-D, 4-D or 5-D tensor.

  • output: same shape as input x.
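
To illustrate how normalized_shape relates to the input shape, the sketch below normalizes over the trailing dimensions that match normalized_shape and leaves the leading dimensions untouched. It is a NumPy approximation under stated assumptions, not Paddle's code: the function name layer_norm_nd is hypothetical, and scale and bias are omitted.

```python
import numpy as np

def layer_norm_nd(x, normalized_shape, epsilon=1e-5):
    """Hypothetical helper: normalize x over its trailing dims given by normalized_shape."""
    if isinstance(normalized_shape, int):
        normalized_shape = (normalized_shape,)   # a single int means the last dimension
    normalized_shape = tuple(normalized_shape)
    k = len(normalized_shape)
    if x.shape[-k:] != normalized_shape:
        raise ValueError("trailing dimensions of x must equal normalized_shape")
    axes = tuple(range(x.ndim - k, x.ndim))      # the last k axes
    mu = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mu) / np.sqrt(var + epsilon)

np.random.seed(123)
x = np.random.random(size=(2, 2, 2, 3)).astype('float32')

out_all = layer_norm_nd(x, x.shape[1:])   # statistics per sample, over shape (2, 2, 3)
out_last = layer_norm_nd(x, 3)            # statistics over the last dimension only
print(out_all.shape, out_last.shape)      # both match the input shape
```

In both calls the output has the same shape as the input; only the set of axes used to compute the mean and variance changes.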

Returns

None

Examples

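import paddle
import numpy as np

# Build a reproducible 4-D input of shape [2, 2, 2, 3].
np.random.seed(123)
x_data = np.random.random(size=(2, 2, 2, 3)).astype('float32')
x = paddle.to_tensor(x_data)

# Normalize over every dimension except the first (batch) dimension.
layer_norm = paddle.nn.LayerNorm(x_data.shape[1:])
layer_norm_out = layer_norm(x)

print(layer_norm_out)
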
forward(input)

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments

extra_repr()

Extra representation of this layer. You can provide a custom implementation for your own layer.