paddle.static.nn.data_norm(input, act=None, epsilon=1e-05, param_attr=None, data_layout='NCHW', in_place=False, name=None, moving_mean_name=None, moving_variance_name=None, do_model_average_for_mean_and_var=True, slot_dim=-1, sync_stats=False, summary_decay_rate=0.9999999, enable_scale_and_shift=False)

Data Normalization Layer

This op can be used as a normalizer function for conv2d and fully_connected operations. The required data format for this layer is one of the following:

  1. NHWC [batch, in_height, in_width, in_channels]

  2. NCHW [batch, in_channels, in_height, in_width]

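\(input\) holds the input features over a mini-batch. In the equations below, \(x_i\) is the \(i\)-th sample, \(m\) is the mini-batch size, and \(\gamma\) and \(\beta\) are the scale and shift parameters.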

\[\begin{split}\mu_{\beta} &\gets \frac{1}{m} \sum_{i=1}^{m} x_i \qquad &// \ mini-batch\ mean \\ \sigma_{\beta}^{2} &\gets \frac{1}{m} \sum_{i=1}^{m}(x_i - \mu_{\beta})^2 \qquad &//\ mini-batch\ variance \\ \hat{x_i} &\gets \frac{x_i - \mu_\beta} {\sqrt{ \sigma_{\beta}^{2} + \epsilon}} \qquad &//\ normalize \\ y_i &\gets \gamma \hat{x_i} + \beta \qquad &//\ scale\ and\ shift\end{split}\]
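
The four steps above can be sketched in plain Python. This is only an illustration of the equations as written, applied independently to each feature column. It is not the operator's actual implementation, which keeps summary statistics that are updated over time (see summary_decay_rate). The function name `normalize_batch` and its `gamma` and `beta` arguments are hypothetical.

```python
import math


def normalize_batch(batch, gamma=1.0, beta=0.0, epsilon=1e-5):
    """Apply mean/variance normalization to each feature column.

    batch: list of m rows, each a list with the same number of features.
    gamma, beta: hypothetical scalar scale and shift values.
    """
    m = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(m)]
    for j in range(num_features):
        column = [row[j] for row in batch]
        mean = sum(column) / m                         # mini-batch mean
        var = sum((x - mean) ** 2 for x in column) / m  # mini-batch variance
        denom = math.sqrt(var + epsilon)
        for i, x in enumerate(column):
            x_hat = (x - mean) / denom                 # normalize
            out[i][j] = gamma * x_hat + beta           # scale and shift
    return out


batch = [[1.0, 10.0], [3.0, 30.0]]
result = normalize_batch(batch)
```

With the default `gamma=1.0` and `beta=0.0`, each column of `result` has a mean of approximately zero and values close to -1 and 1 for this two-row batch.
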

Parameters

  • input (Tensor) – The input Tensor.

  • act (str, optional) – Activation type, linear|relu|prelu|… Default: None.

  • epsilon (float, optional) – A small value added to the variance during calculation to prevent division by zero. Default: 1e-05.

  • param_attr (ParamAttr, optional) – The parameter attribute for Parameter scale. Default: None.

  • data_layout (str, optional) – Specifies the data format of the input; the output uses the same format. One of “NCHW” or “NHWC”. With “NCHW”, the data is stored in the order [batch_size, input_channels, input_height, input_width]. Default: “NCHW”.

  • in_place (bool, optional) – Whether the input and output of this layer reuse the same memory. Default: False.

  • name (str, optional) – A name for this layer. If set to None, the layer is named automatically. Default: None.

  • moving_mean_name (str, optional) – The name of the moving_mean, which stores the global mean. Default: None.

  • moving_variance_name (str, optional) – The name of the moving_variance, which stores the global variance. Default: None.

  • do_model_average_for_mean_and_var (bool, optional) – Whether the mean and variance parameters take part in model averaging when model averaging is enabled. Default: True.

  • slot_dim (int, optional) – The embedding dimension of one slot. A slot is the set of values for one specific feature. In pslib mode, feature ids are distinguished by slot and their embeddings are pulled from the parameter server (pslib). The first element of each embedding is the historical show number (the number of occurrences of this feature id with label 0). If the input of this op is a concatenation of slot-wise embeddings, and the show number is zero because a slot is new or empty, the normalization result may be impractical. To avoid this, slot_dim is used to locate the show number and check whether it is zero. If it is, normalization is skipped for that embedding. Default: -1.

  • sync_stats (bool, optional) – Whether to use allreduce to synchronize the summary messages when running on multiple GPU cards. Default: False.

  • summary_decay_rate (float, optional) – The decay rate used when updating the summary. Default: 0.9999999.

  • enable_scale_and_shift (bool, optional) – Whether to apply scale and shift after normalization. Default: False.
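
The slot_dim behavior described in the parameter list can be pictured with a small conceptual sketch. It assumes a row made of concatenated slot embeddings of length `slot_dim`, each starting with its show number, and hypothetical per-position `means` and `scales`. The function name and the exact normalization arithmetic are illustrative assumptions, not Paddle's implementation.

```python
def normalize_with_slots(row, means, scales, slot_dim):
    """Normalize one concatenated row, skipping slots whose show number is zero.

    row: flat list of values, a concatenation of slot embeddings.
    means, scales: hypothetical per-position statistics (same length as row).
    slot_dim: embedding length of one slot; values <= 0 disable slot handling.
    """
    if slot_dim <= 0:
        # No slot structure assumed: normalize every position.
        return [(x - m) * s for x, m, s in zip(row, means, scales)]
    out = list(row)
    for start in range(0, len(row), slot_dim):
        show = row[start]  # first element of the slot is the show number
        if show == 0:
            continue       # new or empty slot: leave this embedding as-is
        for k in range(start, min(start + slot_dim, len(row))):
            out[k] = (row[k] - means[k]) * scales[k]
    return out


row = [0.0, 0.0, 0.0, 5.0, 2.0, 4.0]   # two slots of dimension 3
means = [1.0] * 6
scales = [0.5] * 6
print(normalize_with_slots(row, means, scales, slot_dim=3))
# [0.0, 0.0, 0.0, 2.0, 0.5, 1.5]
```

The first slot has a show number of zero, so it passes through unchanged; only the second slot is normalized.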


Returns

A tensor which is the result after applying data normalization on the input.

Return type

Tensor

Examples

>>> import paddle
>>> paddle.enable_static()

>>> x = paddle.randn(shape=[32, 100])
>>> hidden2 = paddle.static.nn.data_norm(input=x)