Model Parameters¶

Model parameters are the weights and biases of a model. In Fluid, they are instances of the fluid.Parameter class, which inherits from fluid.Variable, and they are all persistable variables. Model training is the process of learning and updating these parameters. Parameter-related attributes are configured through fluid.ParamAttr. The following can be configured:

Initialization method
Regularization
Gradient clipping
Model averaging

Initialization method¶

Fluid initializes a single parameter through the initializer attribute of ParamAttr.

Example:

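    import paddle.fluid as fluid

    # x is assumed here to be an input variable with 13 float32 features.
    x = fluid.layers.data(name="x", shape=[13], dtype="float32")
    param_attrs = fluid.ParamAttr(name="fc_weight",
                                  initializer=fluid.initializer.ConstantInitializer(1.0))
    y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)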

Fluid supports the following initialization methods:

1. BilinearInitializer¶

Bilinear initialization. A deconvolution (transposed convolution) operation initialized with this method can be used as a bilinear interpolation operation.

Alias: Bilinear

API reference: fluid.initializer.BilinearInitializer
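
To illustrate why such weights interpolate, the sketch below builds the bilinear kernel that is commonly used to initialize transposed convolutions for upsampling. The helper name bilinear_kernel and the exact formula are assumptions chosen for illustration; Fluid's own implementation may differ in detail.

```python
def bilinear_kernel(size):
    """Return a size x size bilinear interpolation kernel as nested lists.

    Hypothetical helper, not part of Fluid: it follows the formula widely
    used for upsampling layers in fully convolutional networks.
    """
    factor = (size + 1) // 2
    # The kernel center sits on a cell for odd sizes, between cells for even.
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    weights_1d = [1 - abs(i - center) / factor for i in range(size)]
    # The 2-D kernel is the outer product of the 1-D triangle filter.
    return [[wy * wx for wx in weights_1d] for wy in weights_1d]


kernel = bilinear_kernel(4)
print(kernel[0])  # [0.0625, 0.1875, 0.1875, 0.0625]
```

A 4 x 4 kernel of this kind is the usual choice for 2x upsampling: its weights sum to 4, so each output pixel is a weighted mix of its nearest input pixels.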

2. ConstantInitializer¶

Constant initialization. Initializes the parameter to a specified constant value.

Alias: Constant

API reference: fluid.initializer.ConstantInitializer

3. MSRAInitializer¶

The initialization method proposed in https://arxiv.org/abs/1502.01852 (also known as He initialization), designed for networks with rectifier activations.

Alias: MSRA

API reference: fluid.initializer.MSRAInitializer
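
As a minimal sketch of the idea in the paper, the Gaussian variant draws weights with zero mean and standard deviation sqrt(2 / fan_in). The function msra_normal below is a hypothetical pure-Python helper, not Fluid's implementation, and Fluid's initializer may default to a different variant.

```python
import math
import random


def msra_normal(fan_in, fan_out, seed=0):
    """Sample a fan_in x fan_out weight matrix with std = sqrt(2 / fan_in).

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)]
            for _ in range(fan_in)]


weights = msra_normal(fan_in=256, fan_out=64)
```

Scaling by fan_in keeps the variance of activations roughly constant from layer to layer when ReLU is used.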

4. NormalInitializer¶

Random initialization from a Gaussian (normal) distribution.

Alias: Normal

API reference: fluid.initializer.NormalInitializer

5. TruncatedNormalInitializer¶

Random initialization from a truncated Gaussian distribution.

Alias: TruncatedNormal

API reference: fluid.initializer.TruncatedNormalInitializer
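
One common way to sample a truncated Gaussian is to redraw any value that falls outside the allowed range. The sketch below assumes a cutoff of two standard deviations around the mean; both the cutoff and the helper name truncated_normal are assumptions for illustration, not Fluid's implementation.

```python
import random


def truncated_normal(mean, std, count, seed=0, num_std=2.0):
    """Draw `count` Gaussian samples, redrawing any outside the bounds.

    Hypothetical helper; the bounds are mean +/- num_std * std.
    """
    rng = random.Random(seed)
    low, high = mean - num_std * std, mean + num_std * std
    samples = []
    while len(samples) < count:
        value = rng.gauss(mean, std)
        if low <= value <= high:
            samples.append(value)
    return samples


samples = truncated_normal(mean=0.0, std=0.02, count=1000)
```

Truncation avoids the rare large initial weights that an unbounded Gaussian can produce.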

6. UniformInitializer¶

Random initialization from a uniform distribution.

Alias: Uniform

API reference: fluid.initializer.UniformInitializer

7. XavierInitializer¶

The initialization method proposed in http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf (also known as Glorot initialization).

Alias: Xavier

API reference: fluid.initializer.XavierInitializer
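
A minimal sketch of the uniform variant from the paper: weights are drawn from [-limit, limit] with limit = sqrt(6 / (fan_in + fan_out)). The helper xavier_uniform is hypothetical and written in pure Python for illustration; it is not Fluid's implementation.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, seed=0):
    """Sample a fan_in x fan_out matrix uniformly from [-limit, limit].

    Hypothetical helper; limit = sqrt(6 / (fan_in + fan_out)).
    """
    rng = random.Random(seed)
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]


xavier_weights = xavier_uniform(fan_in=100, fan_out=50)
```

The limit depends on both fan_in and fan_out, which balances the variance of the forward activations and the backward gradients.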

Regularization¶

Fluid regularizes a single parameter through the regularizer attribute of ParamAttr.

    # Apply L1 weight decay with coefficient 0.1 to the fc weight.
    param_attrs = fluid.ParamAttr(name="fc_weight",
                                  regularizer=fluid.regularizer.L1DecayRegularizer(0.1))
    y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)

Fluid supports the following regularization methods:

fluid.regularizer.L1DecayRegularizer (alias: L1Decay)
fluid.regularizer.L2DecayRegularizer (alias: L2Decay)
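
To make the effect of the two regularizers concrete, the sketch below computes the extra gradient term that each kind of weight decay contributes. The helper names are hypothetical, and the 0.5 factor in the L2 penalty is an assumed convention; this is plain Python, not Fluid code.

```python
def l1_decay_grad(weights, coeff):
    """Gradient of coeff * sum(|w|), which is coeff * sign(w)."""
    return [coeff * ((w > 0) - (w < 0)) for w in weights]


def l2_decay_grad(weights, coeff):
    """Gradient of 0.5 * coeff * sum(w ** 2), which is coeff * w."""
    return [coeff * w for w in weights]


weights = [0.5, -2.0, 0.0]
loss_grad = [0.1, 0.1, 0.1]  # illustrative gradient of the training loss

# The regularization term is added to the loss gradient before the update.
l1_term = l1_decay_grad(weights, coeff=0.1)
l2_term = l2_decay_grad(weights, coeff=0.1)
total_grad = [g + r for g, r in zip(loss_grad, l1_term)]
```

L1 decay pushes every nonzero weight toward zero by a constant amount, which encourages sparsity, while L2 decay shrinks each weight in proportion to its size.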

Clipping¶

Fluid sets the clipping method for a single parameter through the gradient_clip attribute of ParamAttr.

    # Clip each gradient element of the fc weight to the range [-1.0, 1.0].
    param_attrs = fluid.ParamAttr(name="fc_weight",
                                  gradient_clip=fluid.clip.GradientClipByValue(max=1.0, min=-1.0))
    y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)

Fluid supports the following clipping methods:

1. ErrorClipByValue¶

Clips the values of a tensor to a specified range.

API reference: fluid.clip.ErrorClipByValue

2. GradientClipByGlobalNorm¶

Limits the global norm of multiple tensors to clip_norm.

API reference: fluid.clip.GradientClipByGlobalNorm
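
A minimal sketch of global-norm clipping, assuming the usual definition: the global norm is the L2 norm of all gradients taken together, and every gradient is scaled by the same factor. The helper clip_by_global_norm is hypothetical pure Python, not Fluid's implementation.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Scale all gradients so their joint L2 norm is at most clip_norm.

    `grads` is a list of gradient lists, one per parameter.
    """
    global_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # The scale is 1.0 when the global norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    return [[g * scale for g in grad] for grad in grads]


grads = [[3.0, 0.0], [0.0, 4.0]]  # joint L2 norm is 5.0
clipped = clip_by_global_norm(grads, clip_norm=1.0)
```

Because one scale is shared by all tensors, the relative direction of the overall update is preserved.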

3. GradientClipByNorm¶

Limits the L2 norm of a tensor to max_norm. If the tensor's L2 norm exceeds max_norm, a scale factor is computed and every value of the tensor is multiplied by it.

API reference: fluid.clip.GradientClipByNorm
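
The scaling step can be sketched as follows, assuming the scale is max_norm divided by the tensor's L2 norm. The helper clip_by_norm is hypothetical and operates on a flat Python list rather than a Fluid tensor.

```python
import math


def clip_by_norm(values, max_norm):
    """Rescale `values` so that their L2 norm does not exceed max_norm."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm <= max_norm:
        return list(values)  # already within the limit, left unchanged
    scale = max_norm / norm
    return [v * scale for v in values]


clipped_vec = clip_by_norm([3.0, 4.0], max_norm=1.0)  # input norm is 5.0
```

Unlike global-norm clipping, this is applied to each tensor independently.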

4. GradientClipByValue¶

Limits the gradient values of a parameter to the range [min, max].

API reference: fluid.clip.GradientClipByValue
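
Element-wise value clipping is simple enough to sketch in one line. The helper clip_by_value below is hypothetical pure Python that mirrors the behavior described above; it is not Fluid's implementation.

```python
def clip_by_value(values, min_value, max_value):
    """Clamp every element of `values` into [min_value, max_value]."""
    return [min(max(v, min_value), max_value) for v in values]


print(clip_by_value([-3.0, 0.5, 2.0], -1.0, 1.0))  # [-1.0, 0.5, 1.0]
```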

Model Averaging¶

Fluid determines whether to average a single parameter through the do_model_average attribute of ParamAttr. Example:

    # Include the fc weight in model averaging.
    param_attrs = fluid.ParamAttr(name="fc_weight",
                                  do_model_average=True)
    y_predict = fluid.layers.fc(input=x, size=10, param_attr=param_attrs)

In mini-batch training, parameters are updated once after each batch. Model averaging takes the average of the parameters produced by the most recent K updates.

The averaged parameters are only used for testing and prediction, and they do not get involved in the actual training process.

API reference: fluid.optimizer.ModelAverage
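
The idea of averaging the latest K updates can be sketched with a fixed-size window of parameter snapshots. The class SlidingAverage is a hypothetical illustration in pure Python; Fluid's ModelAverage may track the window differently, for example with running sums instead of stored snapshots.

```python
from collections import deque


class SlidingAverage:
    """Hypothetical helper: average of the last `window` parameter snapshots."""

    def __init__(self, window):
        # A deque with maxlen drops the oldest snapshot automatically.
        self.snapshots = deque(maxlen=window)

    def update(self, params):
        """Record the parameters produced by one training step."""
        self.snapshots.append(list(params))

    def average(self):
        """Return the element-wise mean over the stored snapshots."""
        count = len(self.snapshots)
        return [sum(column) / count for column in zip(*self.snapshots)]


avg = SlidingAverage(window=3)
for step_params in ([1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]):
    avg.update(step_params)

print(avg.average())  # [3.0, 30.0], the mean of the last three updates
```

The training loop keeps updating the raw parameters; only evaluation and prediction would read the averaged values.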