NoamDecay
- class paddle.fluid.dygraph.learning_rate_scheduler.NoamDecay(d_model, warmup_steps, begin=1, step=1, dtype='float32', learning_rate=1.0) [source]
- Api_attr: imperative
Applies Noam decay to the initial learning rate. The algorithm can be described as follows:

\[decayed\_learning\_rate = learning\_rate * d_{model}^{-0.5} * min(global\_step^{-0.5}, global\_step * warmup\_steps^{-1.5})\]

Please reference Attention Is All You Need.

- Parameters
- d_model (Variable|int) – The dimensionality of the input and output feature vectors of the model. If the type is Variable, it is a tensor with shape [1] and the data type can be int32 or int64. The type can also be a Python int.
- warmup_steps (Variable|int) – The number of warmup steps, a hyperparameter. If the type is Variable, it is a tensor with shape [1] and the data type can be int32 or int64. The type can also be a Python int.
- begin (int, optional) – The begin step, i.e. the initial value of global_step described above. The default value is 1.
- step (int, optional) – The step size used to calculate the new global_step in the description above. The default value is 1. 
- dtype (str, optional) – The data type used to create the learning rate variable. The data type can be set as ‘float32’, ‘float64’. The default value is ‘float32’. 
- learning_rate (Variable|float|int) – The initial learning rate. If the type is Variable, it is a tensor with shape [1] and the data type can be float32 or float64. It can also be set to a Python int or float. The default value is 1.0.
 
- Returns
- 
           None. 
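For intuition, the decay formula above can be evaluated in plain Python. This is a minimal sketch, not Paddle code; the values d_model=64 and warmup_steps=100 are illustrative assumptions, not defaults of the class.

```python
# Plain-Python sketch of the Noam decay formula described above.
# d_model=64 and warmup_steps=100 are illustrative assumptions.
def noam_decay(global_step, d_model=64, warmup_steps=100, learning_rate=1.0):
    return learning_rate * d_model ** -0.5 * min(
        global_step ** -0.5, global_step * warmup_steps ** -1.5)

# The rate grows linearly during warmup and decays as step**-0.5 afterwards,
# so it peaks exactly at global_step == warmup_steps.
```

Note that both branches of the `min` are equal at `global_step == warmup_steps`, which is where the schedule reaches its maximum.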
Examples

```python
import paddle.fluid as fluid

warmup_steps = 100
learning_rate = 0.01
with fluid.dygraph.guard():
    emb = fluid.dygraph.Embedding([10, 10])
    optimizer = fluid.optimizer.SGD(
        learning_rate=fluid.dygraph.NoamDecay(
            1 / (warmup_steps * (learning_rate ** 2)),
            warmup_steps),
        parameter_list=emb.parameters())
```
create_lr_var(lr)

Convert lr from a float to a Variable.

- Parameters

  - lr – learning rate

- Returns

  learning rate variable
 
set_dict(state_dict)

Loads the scheduler's state.
set_state_dict(state_dict)

Loads the scheduler's state.
state_dict()

Returns the state of the scheduler as a dict. It is a subset of self.__dict__.
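The state_dict / set_state_dict round trip described above can be illustrated with a plain-Python sketch. TinyScheduler and its `keys` attribute are hypothetical, shown only to demonstrate the checkpointing pattern; they are not part of Paddle's API.

```python
# Hypothetical minimal scheduler illustrating the state_dict /
# set_state_dict round trip described above (not Paddle code).
class TinyScheduler:
    def __init__(self, step_num=0):
        self.step_num = step_num      # state worth checkpointing
        self.keys = ['step_num']      # the subset of __dict__ to expose

    def state_dict(self):
        # Returns a subset of self.__dict__, as the docstring above states.
        return {k: self.__dict__[k] for k in self.keys}

    def set_state_dict(self, state_dict):
        # Loads the scheduler's state back in.
        self.__dict__.update(state_dict)

saved = TinyScheduler(step_num=42).state_dict()
restored = TinyScheduler()
restored.set_state_dict(saved)
```

The same pattern lets a training job resume a learning-rate schedule from a checkpoint: save `state_dict()` alongside the model parameters, then call `set_state_dict` on a freshly constructed scheduler.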
 
