MultiplicativeDecay
- class paddle.optimizer.lr.MultiplicativeDecay(learning_rate: float, lr_lambda: Callable[[int], float], last_epoch: int = -1, verbose: bool = False) [source]
- 
         Multiply the learning rate of optimizer by the factor given in function lr_lambda. The algorithm can be described by the code below.

             learning_rate = 0.5        # init learning_rate
             lr_lambda = lambda epoch: 0.95

             learning_rate = 0.5        # epoch 0
             learning_rate = 0.475      # epoch 1, 0.5 * 0.95
             learning_rate = 0.45125    # epoch 2, 0.475 * 0.95

 - Parameters
- 
            - learning_rate (float) – The initial learning rate. It is a Python float number.
 - lr_lambda (function) – A function which computes a multiplicative factor for a given epoch; the last learning rate is then multiplied by this factor.
 - last_epoch (int, optional) – The index of the last epoch. Can be set to restart training. Default: -1, which means the initial learning rate.
 - verbose (bool, optional) – If True, prints a message to stdout for each update. Default: False.
 
- Returns
- 
           A MultiplicativeDecay instance to schedule the learning rate.
 Examples

     >>> import paddle

     >>> # train on default dynamic graph mode
     >>> linear = paddle.nn.Linear(10, 10)
     >>> scheduler = paddle.optimizer.lr.MultiplicativeDecay(learning_rate=0.5, lr_lambda=lambda x: 0.95, verbose=True)
     >>> sgd = paddle.optimizer.SGD(learning_rate=scheduler, parameters=linear.parameters())
     >>> for epoch in range(20):
     ...     for batch_id in range(5):
     ...         x = paddle.uniform([10, 10])
     ...         out = linear(x)
     ...         loss = paddle.mean(out)
     ...         loss.backward()
     ...         sgd.step()
     ...         sgd.clear_gradients()
     ...         scheduler.step()    # If you update learning rate each step
     ...     # scheduler.step()      # If you update learning rate each epoch
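
 The factor returned by lr_lambda is applied cumulatively, so a constant factor of 0.95 reproduces the sequence from the algorithm sketch above (0.5, 0.475, 0.45125, ...). Below is a minimal, illustrative check of that sequence; it assumes last_lr holds the current learning rate, as described under state_keys further down.

     import paddle

     scheduler = paddle.optimizer.lr.MultiplicativeDecay(learning_rate=0.5,
                                                         lr_lambda=lambda epoch: 0.95)
     print(scheduler.last_lr)   # epoch 0: 0.5
     scheduler.step()
     print(scheduler.last_lr)   # epoch 1: 0.5 * 0.95 = 0.475
     scheduler.step()
     print(scheduler.last_lr)   # epoch 2: 0.475 * 0.95 = 0.45125 (up to float rounding)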
            
           get_lr() → float

           For subclasses that override LRScheduler (the base class), the user should provide a custom implementation of get_lr(). Otherwise, a NotImplementedError exception will be thrown.
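
           For context, here is a hedged sketch of such a subclass. The class name, the halving rule, and the use of the base_lr and last_epoch attributes are illustrative assumptions based on the LRScheduler base class, not part of MultiplicativeDecay.

               import paddle
               from paddle.optimizer.lr import LRScheduler

               class HalveEveryTenEpochs(LRScheduler):
                   """Hypothetical scheduler: halves the learning rate every 10 epochs."""

                   def __init__(self, learning_rate=0.1, last_epoch=-1, verbose=False):
                       super().__init__(learning_rate, last_epoch, verbose)

                   def get_lr(self):
                       # required override; without it the base class raises NotImplementedError
                       return self.base_lr * (0.5 ** (self.last_epoch // 10))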
 - 
            
           set_dict(state_dict: _LRStateDict) → None

           Loads the scheduler's state.
 - 
            
           set_state_dict(state_dict: _LRStateDict) → None

           Loads the scheduler's state.
 - 
            
           state_dict() → _LRStateDict

           Returns the state of the scheduler as a dict. It is a subset of self.__dict__.
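
           A common use of state_dict() together with set_state_dict() is checkpointing a training run. The sketch below is illustrative: the file name is arbitrary, and paddle.save / paddle.load are just one way to persist the plain, picklable dict.

               import paddle

               scheduler = paddle.optimizer.lr.MultiplicativeDecay(learning_rate=0.5,
                                                                   lr_lambda=lambda epoch: 0.95)
               scheduler.step()

               # persist the scheduler state, e.g. alongside model/optimizer checkpoints
               paddle.save(scheduler.state_dict(), "scheduler.pdparams")

               # later: rebuild an identical scheduler and restore its state
               resumed = paddle.optimizer.lr.MultiplicativeDecay(learning_rate=0.5,
                                                                 lr_lambda=lambda epoch: 0.95)
               resumed.set_state_dict(paddle.load("scheduler.pdparams"))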
 - 
            
           state_keys() → None

           For subclasses that override LRScheduler (the base class). By default, "last_epoch" and "last_lr" are saved via self.keys = ['last_epoch', 'last_lr']. last_epoch is the current epoch number, and last_lr is the current learning rate. To change the default behavior, provide a custom implementation of _state_keys() to redefine self.keys.
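
           As a hedged sketch of redefining self.keys: the scheduler below and its warmup_steps attribute are hypothetical, and it overrides the state_keys() method whose signature is documented in this section (the prose above refers to the hook as _state_keys(); the exact name may depend on the Paddle version).

               import paddle
               from paddle.optimizer.lr import LRScheduler

               class LinearWarmupConstant(LRScheduler):
                   """Hypothetical scheduler whose warmup_steps should survive checkpointing."""

                   def __init__(self, learning_rate=0.1, warmup_steps=100, last_epoch=-1, verbose=False):
                       self.warmup_steps = warmup_steps    # set before the base __init__ calls step()
                       super().__init__(learning_rate, last_epoch, verbose)

                   def get_lr(self):
                       if self.last_epoch < self.warmup_steps:
                           return self.base_lr * (self.last_epoch + 1) / self.warmup_steps
                       return self.base_lr

                   def state_keys(self):
                       # redefine self.keys so warmup_steps is saved and restored as well
                       self.keys = ['last_epoch', 'last_lr', 'warmup_steps']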
 - 
            
           step(epoch: Optional[int] = None) → None

           step should be called after optimizer.step. It will update the learning rate in the optimizer according to the current epoch. The new learning rate will take effect on the next optimizer.step.

 - Parameters
- 
             epoch (int, None) – Specify the current epoch. Default: None, in which case the epoch auto-increments from last_epoch=-1.
- Returns
- 
             None 
 Examples

     >>> import paddle

     >>> value = paddle.arange(26, dtype='float32')
     >>> a = paddle.reshape(value, [2, 13])
     >>> linear = paddle.nn.Linear(13, 5)
     >>> adadelta = paddle.optimizer.Adadelta(learning_rate=0.0003, epsilon=1e-06, rho=0.95,
     ...                                      parameters=linear.parameters())
     >>> out = linear(a)
     >>> out.backward()
     >>> adadelta.step()
     >>> adadelta.clear_grad()
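
 The epoch argument can also be given explicitly, for example when resuming from a known epoch, instead of relying on auto-increment. A minimal sketch with an illustrative resume_epoch value:

     import paddle

     scheduler = paddle.optimizer.lr.MultiplicativeDecay(learning_rate=0.5,
                                                         lr_lambda=lambda epoch: 0.95)

     resume_epoch = 5                      # hypothetical epoch to resume from
     scheduler.step(epoch=resume_epoch)    # sets the scheduler's epoch counter explicitly
     print(scheduler.last_lr)              # learning rate updated for the given epoch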
 
