LayerNorm
- class paddle.nn.LayerNorm(normalized_shape, epsilon=1e-05, weight_attr=None, bias_attr=None, name=None) [source]
Construct a callable object of the LayerNorm class. For more details, refer to the code examples. It implements the function of the Layer Normalization Layer and can be applied to mini-batch input data. Refer to Layer Normalization.

The formula is as follows:

\[ \begin{align}\begin{aligned}\mu & = \frac{1}{H}\sum_{i=1}^{H} x_i\\\sigma & = \sqrt{\frac{1}{H}\sum_{i=1}^{H}{(x_i - \mu)^2} + \epsilon}\\y & = f\left(\frac{g}{\sigma}(x - \mu) + b\right)\end{aligned}\end{align} \]

- \(x\): the vector representation of the summed inputs to the neurons in that layer.
- \(H\): the number of hidden units in a layer.
- \(\epsilon\): the small value added to the variance to prevent division by zero. 
- \(g\): the trainable scale parameter. 
- \(b\): the trainable bias parameter. 
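The formula above can be sketched in plain NumPy (a minimal illustration of the math, not Paddle's actual implementation; `g` and `b` here play the role of the trainable scale and bias):

```python
import numpy as np

def layer_norm(x, g, b, eps=1e-5):
    # Normalize over the last axis (the H hidden units in the formula):
    #   mu    = mean of x over the hidden units
    #   sigma = sqrt(variance + eps)
    #   y     = g * (x - mu) / sigma + b
    mu = x.mean(axis=-1, keepdims=True)
    sigma = np.sqrt(((x - mu) ** 2).mean(axis=-1, keepdims=True) + eps)
    return g * (x - mu) / sigma + b

x = np.array([[1.0, 2.0, 3.0]])
y = layer_norm(x, g=np.ones(3), b=np.zeros(3))
# Each row of the output has (approximately) zero mean and unit variance.
```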
- Parameters
  - normalized_shape (int|list|tuple) – Input shape from an expected input of size \([*, normalized\_shape[0], normalized\_shape[1], ..., normalized\_shape[-1]]\). If it is a single integer, this module will normalize over the last dimension, which is expected to be of that specific size.
- epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-05. 
- weight_attr (ParamAttr|bool, optional) – The parameter attribute for the learnable gain \(g\). If False, the weight is None. If None, a default ParamAttr is created as the scale parameter, initialized as 1. Default: None.
- bias_attr (ParamAttr|bool, optional) – The parameter attribute for the learnable bias \(b\). If False, the bias is None. If None, a default ParamAttr is created as the bias parameter, initialized as 0. Default: None.
- name (str, optional) – Name for the LayerNorm. Default: None. For more information, please refer to Name.
 
- Shape:
  - x: 2-D, 3-D, 4-D or 5-D tensor.
  - output: same shape as input x.
 
- Returns
  None.
Examples

```python
import paddle

x = paddle.rand((2, 2, 2, 3))
layer_norm = paddle.nn.LayerNorm(x.shape[1:])
layer_norm_out = layer_norm(x)
print(layer_norm_out)
```
            
forward(input)
- 
Defines the computation performed at every call. Should be overridden by all subclasses.

- Parameters
  - *inputs (tuple) – unpacked tuple arguments
  - **kwargs (dict) – unpacked dict arguments
 
 
extra_repr()
- 
Extra representation of this layer. You can override this method to provide a custom representation for your own layer.
 
