class paddle.nn.BatchNorm(num_channels, act=None, is_test=False, momentum=0.9, epsilon=1e-05, param_attr=None, bias_attr=None, dtype='float32', data_layout='NCHW', in_place=False, moving_mean_name=None, moving_variance_name=None, do_model_average_for_mean_and_var=True, use_global_stats=False, trainable_statistics=False) [source]


This interface is used to construct a callable object of the BatchNorm class. For more details, refer to the code examples. It implements the Batch Normalization layer and can be used as a normalizer function for conv2d and fully connected operations. The data is normalized by the per-channel mean and variance computed from the current batch. Refer to Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift for more details.

When use_global_stats = False, \(\mu_{\beta}\) and \(\sigma_{\beta}^{2}\) are the statistics of one mini-batch, calculated as follows:

\[\begin{split}\mu_{\beta} &\gets \frac{1}{m} \sum_{i=1}^{m} x_i &\quad // \text{mini-batch mean} \\
\sigma_{\beta}^{2} &\gets \frac{1}{m} \sum_{i=1}^{m}(x_i - \mu_{\beta})^2 &\quad // \text{mini-batch variance}\end{split}\]
  • \(x\) : mini-batch data

  • \(m\) : the size of the mini-batch data
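For an NCHW input, these per-channel mini-batch statistics can be sketched with NumPy as follows. This is a minimal illustration of the formula, not Paddle's implementation, and the input shape is chosen arbitrarily:

```python
import numpy as np

# A mini-batch in NCHW layout: each channel sees m = N * H * W values.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 10, 3, 7)).astype('float32')

# Mini-batch mean and variance, reduced over every axis except the channel axis.
mu = x.mean(axis=(0, 2, 3))       # shape (10,), one mean per channel
sigma2 = x.var(axis=(0, 2, 3))    # shape (10,), one variance per channel
```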

When use_global_stats = True, \(\mu_{\beta}\) and \(\sigma_{\beta}^{2}\) are not the statistics of one mini-batch. Instead, they are global or running statistics (moving_mean and moving_variance), usually obtained from a pre-trained model. The running statistics are updated as follows:

\[\begin{split}moving\_mean &= moving\_mean * momentum + \mu_{\beta} * (1 - momentum) &\quad // \text{global mean} \\
moving\_variance &= moving\_variance * momentum + \sigma_{\beta}^{2} * (1 - momentum) &\quad // \text{global variance}\end{split}\]
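The running-statistics update can be sketched directly. This is a toy illustration of one update step with hypothetical per-channel batch statistics, not Paddle code:

```python
import numpy as np

momentum = 0.9
moving_mean = np.zeros(10, dtype='float32')  # running mean, typically initialized to 0
moving_var = np.ones(10, dtype='float32')    # running variance, typically initialized to 1

# Hypothetical statistics of the current mini-batch.
mu = np.full(10, 0.5, dtype='float32')
sigma2 = np.full(10, 2.0, dtype='float32')

# One update step, exactly as in the formula above.
moving_mean = moving_mean * momentum + mu * (1.0 - momentum)
moving_var = moving_var * momentum + sigma2 * (1.0 - momentum)
# moving_mean -> 0.05 per channel, moving_var -> 1.1 per channel
```

A larger momentum means the running statistics change more slowly and weight history more heavily.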

The normalization function formula is as follows:

\[\begin{split}\hat{x_i} &\gets \frac{x_i - \mu_\beta}{\sqrt{\sigma_{\beta}^{2} + \epsilon}} &\quad // \text{normalize} \\
y_i &\gets \gamma \hat{x_i} + \beta &\quad // \text{scale and shift}\end{split}\]
  • \(\epsilon\) : a small value added to the variance to prevent division by zero

  • \(\gamma\) : trainable scale parameter

  • \(\beta\) : trainable shift parameter
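Putting the statistics and normalization formulas together, the whole transform can be sketched in NumPy with freshly initialized \(\gamma\) and \(\beta\). This mirrors the math above, not Paddle's kernel:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((3, 10, 3, 7)).astype('float32')  # NCHW input
epsilon = 1e-5

# Mini-batch statistics per channel, kept broadcastable against x.
mu = x.mean(axis=(0, 2, 3), keepdims=True)
sigma2 = x.var(axis=(0, 2, 3), keepdims=True)

x_hat = (x - mu) / np.sqrt(sigma2 + epsilon)      # normalize

gamma = np.ones((1, 10, 1, 1), dtype='float32')   # trainable scale, initialized to 1
beta = np.zeros((1, 10, 1, 1), dtype='float32')   # trainable shift, initialized to 0
y = gamma * x_hat + beta                          # scale and shift
```

With this initialization, each channel of y has approximately zero mean and unit variance.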

Parameters:
  • num_channels (int) – Indicate the number of channels of the input Tensor.

  • act (str, optional) – Activation to be applied to the output of batch normalization. Default: None.

  • is_test (bool, optional) – A flag indicating whether it is in the test phase. This flag only has effect in static graph mode. For dygraph mode, please use eval(). Default: False.

  • momentum (float, optional) – The value used for the moving_mean and moving_var computation. Default: 0.9.

  • epsilon (float, optional) – The small value added to the variance to prevent division by zero. Default: 1e-5.

  • param_attr (ParamAttr, optional) – The parameter attribute for Parameter scale of batch_norm. If it is set to None or one attribute of ParamAttr, batch_norm will create ParamAttr as param_attr. If the Initializer of the param_attr is not set, the parameter is initialized with Xavier. Default: None.

  • bias_attr (ParamAttr, optional) – The parameter attribute for the bias of batch_norm. If it is set to None or one attribute of ParamAttr, batch_norm will create ParamAttr as bias_attr. If the Initializer of the bias_attr is not set, the bias is initialized zero. Default: None.

  • dtype (str, optional) – Indicate the data type of the input Tensor, which can be float32 or float64. Default: float32.

  • data_layout (str, optional) – Specify the input data format, the data format can be “NCHW” or “NHWC”. Default: NCHW.

  • in_place (bool, optional) – Make the input and output of batch norm reuse memory. Default: False.

  • moving_mean_name (str, optional) – The name of the moving_mean which stores the global mean. Default: None.

  • moving_variance_name (str, optional) – The name of the moving_variance which stores the global variance. Default: None.

  • do_model_average_for_mean_and_var (bool, optional) – Whether parameter mean and variance should do model average when model average is enabled. Default: True.

  • use_global_stats (bool, optional) – Whether to use the global mean and variance. In inference or test mode, setting use_global_stats to True or is_test to True is equivalent. In training mode, when use_global_stats is True, the global mean and variance are also used during training. Default: False.

  • trainable_statistics (bool, optional) – Whether to calculate the mean and variance in eval mode. In eval mode, when trainable_statistics is True, the mean and variance are calculated from the current batch's statistics. Default: False.
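The effect of use_global_stats can be sketched as follows: with global statistics (the inference path), the layer normalizes with the stored running mean and variance rather than the current batch's. The helper below is hypothetical, illustrating the semantics rather than Paddle's implementation:

```python
import numpy as np

def batch_norm_infer(x, moving_mean, moving_var, gamma, beta, epsilon=1e-5):
    """Normalize an NCHW tensor using running (global) per-channel statistics."""
    shape = (1, -1, 1, 1)  # broadcast per-channel vectors over N, H, W
    x_hat = (x - moving_mean.reshape(shape)) / np.sqrt(
        moving_var.reshape(shape) + epsilon)
    return gamma.reshape(shape) * x_hat + beta.reshape(shape)

# An all-zero input shifted by the stored means: channel c comes out near -moving_mean[c].
x = np.zeros((2, 3, 4, 4), dtype='float32')
y = batch_norm_infer(x,
                     moving_mean=np.array([1., 2., 3.], dtype='float32'),
                     moving_var=np.ones(3, dtype='float32'),
                     gamma=np.ones(3, dtype='float32'),
                     beta=np.zeros(3, dtype='float32'))
```

Because the output depends only on the stored statistics, it is deterministic per sample and independent of the rest of the batch, which is the desired inference behavior.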




Examples:

import paddle
import numpy as np

x = np.random.random(size=(3, 10, 3, 7)).astype('float32')
x = paddle.to_tensor(x)
batch_norm = paddle.nn.BatchNorm(10)
hidden1 = batch_norm(x)
forward(input)

Defines the computation performed at every call. Should be overridden by all subclasses.

  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments