XavierUniform

class paddle.nn.initializer.XavierUniform(fan_in: Optional[float] = None, fan_out: Optional[float] = None, gain: float = 1.0, name: Optional[str] = None)

This class implements the Xavier weight initializer from the paper "Understanding the difficulty of training deep feedforward neural networks" by Xavier Glorot and Yoshua Bengio.

This initializer is designed to keep the scale of the gradients approximately the same across all layers. For the uniform distribution, values are drawn from the range \([-x, x]\), where

\[x = gain \times \sqrt{\frac{6.0}{fan\_in + fan\_out}}.\]
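As a rough illustration of the formula above, the following pure-Python sketch computes the bound \(x\) and draws values from \([-x, x]\). The helper names `xavier_uniform_bound` and `sample_xavier_uniform` are hypothetical and are not part of the Paddle API; the sketch only mirrors the formula and does not reproduce Paddle's random number generator or its internal implementation.

```python
import math
import random


def xavier_uniform_bound(fan_in, fan_out, gain=1.0):
    """Return x = gain * sqrt(6 / (fan_in + fan_out)). Hypothetical helper."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def sample_xavier_uniform(fan_in, fan_out, gain=1.0, seed=None):
    """Draw a fan_in x fan_out nested list uniformly from [-x, x]. Hypothetical helper."""
    rng = random.Random(seed)
    bound = xavier_uniform_bound(fan_in, fan_out, gain)
    return [
        [rng.uniform(-bound, bound) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


# For a layer with fan_in = 2 and fan_out = 2, x = sqrt(6 / 4) = sqrt(1.5).
bound = xavier_uniform_bound(2, 2)
weights = sample_xavier_uniform(2, 2, seed=0)
```

With `gain` left at 1.0 and `fan_in = fan_out = 2`, the bound is about 1.2247, so every sampled value lies in roughly \([-1.2247, 1.2247]\).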

Parameters

fan_in (float|None, optional) – fan_in for Xavier initialization. If None, it is inferred from the Tensor. Default is None.
fan_out (float|None, optional) – fan_out for Xavier initialization. If None, it is inferred from the Tensor. Default is None.
gain (float, optional) – Scaling factor applied to the range. Default is 1.0.
name (str|None, optional) – For details, please refer to api_guide_Name. Generally, no setting is required. Default: None.

Returns

A parameter initialized with Xavier initialization, using a uniform distribution.

Examples

>>> import paddle
>>> paddle.seed(1)
>>> data = paddle.ones(shape=[3, 1, 2], dtype='float32')
>>> weight_attr = paddle.framework.ParamAttr(
...     name="linear_weight",
...     initializer=paddle.nn.initializer.XavierUniform())
>>> bias_attr = paddle.framework.ParamAttr(
...     name="linear_bias",
...     initializer=paddle.nn.initializer.XavierUniform())
>>> linear = paddle.nn.Linear(2, 2, weight_attr=weight_attr, bias_attr=bias_attr)
>>> print(linear.weight)
Parameter containing:
Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False,
[[-1.18095720,  0.64892638],
 [ 0.43125069, -1.11156428]])
>>> print(linear.bias)
Parameter containing:
Tensor(shape=[2], dtype=float32, place=Place(cpu), stop_gradient=False,
[-0.27524316,  1.13808715])

>>> res = linear(data)
>>> print(res)
Tensor(shape=[3, 1, 2], dtype=float32, place=Place(cpu), stop_gradient=False,
[[[-1.02494967,  0.67544925]],
 [[-1.02494967,  0.67544925]],
 [[-1.02494967,  0.67544925]]])
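
The printed result can be checked by hand. Because the input is all ones, each output element should equal the sum of one weight column plus the matching bias entry, assuming `paddle.nn.Linear` computes `x @ weight + bias` with a weight of shape `[in_features, out_features]`. The sketch below redoes that arithmetic in plain Python using the values printed above; the variable names are illustrative only.

```python
# Values copied from the printed weight and bias in the example above.
weight = [[-1.18095720, 0.64892638],
          [0.43125069, -1.11156428]]
bias = [-0.27524316, 1.13808715]

# One input row of ones, matching paddle.ones(shape=[3, 1, 2]).
x = [1.0, 1.0]

# out[j] = sum_i x[i] * weight[i][j] + bias[j]
out = [
    sum(x[i] * weight[i][j] for i in range(2)) + bias[j]
    for j in range(2)
]
```

Up to float32 rounding, `out` matches the row `[-1.02494967, 0.67544925]` that appears three times in `res`, once per input row.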