CosineAnnealingWarmRestarts
- class paddle.optimizer.lr.CosineAnnealingWarmRestarts(learning_rate: float, T_0: int, T_mult: int = 1, eta_min: float = 0, last_epoch: int = -1, verbose: bool = False)
- 
         Sets the learning rate of each parameter group using a cosine annealing schedule, where \(\eta_{max}\) is the initial learning rate, \(T_{cur}\) is the number of epochs since the last restart, and \(T_{i}\) is the number of epochs between two warm restarts in SGDR:
         \[\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{i}}\pi\right)\right)\]
         When \(T_{cur}=T_{i}\), \(\eta_t = \eta_{min}\). When \(T_{cur}=0\) after a restart, \(\eta_t=\eta_{max}\). The schedule was proposed in SGDR: Stochastic Gradient Descent with Warm Restarts.
- Parameters
- 
           - learning_rate (float) – Initial learning rate. 
- T_0 (int) – Number of iterations for the first restart. 
- T_mult (int, optional) – Factor by which \(T_{i}\) is multiplied after each restart. Default: 1. 
- eta_min (float, optional) – Minimum learning rate. Default: 0. 
- last_epoch (int, optional) – The index of the last epoch. Default: -1, which means the schedule starts from the initial learning rate. 
- verbose (bool, optional) – If True, prints a message to stdout for each update. Default: False.
 
- Returns
- 
           CosineAnnealingWarmRestarts instance to schedule learning rate.
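The formula in the description can be checked by hand with a few lines of plain Python. The following is a minimal sketch, not the library's implementation: the helper name `sgdr_lr` is hypothetical, and it assumes the SGDR convention that a restart resets \(T_{cur}\) to 0 once it reaches \(T_{i}\) and then multiplies \(T_{i}\) by `T_mult`.

```python
import math


def sgdr_lr(epoch, base_lr, T_0, T_mult=1, eta_min=0.0):
    """Learning rate at an integer epoch (hypothetical helper, not a Paddle API)."""
    T_i, T_cur = T_0, epoch
    # Skip over completed cycles; each restart stretches the next cycle by T_mult.
    # Assumption: a restart happens as soon as T_cur reaches T_i.
    while T_cur >= T_i:
        T_cur -= T_i
        T_i *= T_mult
    # eta_t = eta_min + 1/2 (eta_max - eta_min) (1 + cos(pi * T_cur / T_i))
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * T_cur / T_i))


# Same settings as the examples on this page: learning_rate=0.5, T_0=1, T_mult=2.
schedule = [round(sgdr_lr(e, base_lr=0.5, T_0=1, T_mult=2), 4) for e in range(7)]
print(schedule)
```

Under these assumptions the cycle lengths are 1, 2, 4, ..., so the learning rate returns to 0.5 at epochs 0, 1, 3 and 7 and decays along the cosine curve in between.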
 Examples
>>> import paddle
>>> import numpy as np
>>> # train on default dynamic graph mode
>>> linear = paddle.nn.Linear(10, 10)
>>> scheduler = paddle.optimizer.lr.CosineAnnealingWarmRestarts(learning_rate=0.5, T_0=1, T_mult=2, verbose=True)
>>> adam = paddle.optimizer.Adam(learning_rate=scheduler, parameters=linear.parameters())
>>> for epoch in range(10):
...     for batch_id in range(10):
...         x = paddle.uniform([10, 10])
...         out = linear(x)
...         loss = paddle.mean(out)
...         loss.backward()
...         adam.step()
...         adam.clear_grad()
...     scheduler.step(epoch)        # You should update learning rate each step

>>> import paddle
>>> import numpy as np
>>> # train on static graph mode
>>> paddle.enable_static()
>>> main_prog = paddle.static.Program()
>>> start_prog = paddle.static.Program()
>>> with paddle.static.program_guard(main_prog, start_prog):
...     x = paddle.static.data(name='x', shape=[None, 4, 5])
...     y = paddle.static.data(name='y', shape=[None, 4, 5])
...     z = paddle.static.nn.fc(x, 100)
...     loss = paddle.mean(z)
...     scheduler = paddle.optimizer.lr.CosineAnnealingWarmRestarts(learning_rate=0.5, T_0=1, T_mult=2, verbose=True)
...     sgd = paddle.optimizer.SGD(learning_rate=scheduler)
...     sgd.minimize(loss)
>>> exe = paddle.static.Executor()
>>> exe.run(start_prog)
>>> for epoch in range(10):
...     for batch_id in range(10):
...         out = exe.run(
...             main_prog,
...             feed={
...                 'x': np.random.randn(3, 4, 5).astype('float32'),
...                 'y': np.random.randn(3, 4, 5).astype('float32')
...             },
...             fetch_list=loss.name)
...     scheduler.step(epoch)        # You should update learning rate each step
 - 
            
           set_dict(state_dict: _LRStateDict) → None
- 
           Loads the scheduler's state. 
 - 
            
           set_state_dict(state_dict: _LRStateDict) → None
- 
           Loads the scheduler's state. 
 - 
            
           state_dict() → _LRStateDict
- 
           Returns the state of the scheduler as a dict. It is a subset of self.__dict__.
 - 
            
           state_keys() → None
- 
           For subclasses that override LRScheduler (the base class). By default, last_epoch and last_lr are saved, via self.keys = ['last_epoch', 'last_lr']. last_epoch is the current epoch number, and last_lr is the current learning rate. To change the default behavior, provide a custom implementation of _state_keys() that redefines self.keys.
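The relationship between state_keys, state_dict and set_state_dict described above can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: `ToyScheduler` is hypothetical and is not Paddle's LRScheduler; it simply mirrors the documented behavior that the saved state is the subset of `self.__dict__` named by `self.keys`.

```python
class ToyScheduler:
    """Hypothetical stand-in used for illustration; not a Paddle class."""

    def __init__(self, learning_rate):
        self.base_lr = learning_rate
        self.last_epoch = 0
        self.last_lr = learning_rate
        self.state_keys()

    def state_keys(self):
        # Documented default: only these two attributes are saved.
        self.keys = ["last_epoch", "last_lr"]

    def state_dict(self):
        # A subset of self.__dict__, restricted to the names in self.keys.
        return {k: self.__dict__[k] for k in self.keys if k in self.__dict__}

    def set_state_dict(self, state_dict):
        # Restore only the attributes that were saved.
        for k in self.keys:
            if k in state_dict:
                self.__dict__[k] = state_dict[k]


src = ToyScheduler(0.5)
src.last_epoch, src.last_lr = 7, 0.125   # pretend some training has happened
saved = src.state_dict()

dst = ToyScheduler(0.5)
dst.set_state_dict(saved)                # dst resumes where src left off
```

In this sketch `base_lr` is not part of the saved state, so a restored scheduler has to be constructed with the same arguments as the original one.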
 - 
            
           get_lr() → float
- 
           Subclasses that override LRScheduler (the base class) must provide a custom implementation of get_lr(). Otherwise, a NotImplementedError exception is raised.
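The get_lr() contract can be sketched with a toy base class. The names `ToyBase` and `HalveEveryEpoch` are hypothetical, and the assumption that construction performs one initial step (moving last_epoch from -1 to 0) is made for illustration only; it is not taken from this page.

```python
class ToyBase:
    """Hypothetical base class for illustration; not Paddle's LRScheduler."""

    def __init__(self, learning_rate, last_epoch=-1):
        self.base_lr = float(learning_rate)
        self.last_epoch = last_epoch
        self.step()  # assumption: the first step moves last_epoch from -1 to 0

    def step(self, epoch=None):
        # Auto-increment when no epoch is given, otherwise jump to that epoch.
        self.last_epoch = self.last_epoch + 1 if epoch is None else epoch
        self.last_lr = self.get_lr()

    def get_lr(self):
        # Subclasses must override this method.
        raise NotImplementedError


class HalveEveryEpoch(ToyBase):
    def get_lr(self):
        # Custom rule: halve the base learning rate once per epoch.
        return self.base_lr * 0.5 ** self.last_epoch


sched = HalveEveryEpoch(1.0)
first = sched.last_lr       # epoch 0
sched.step()
second = sched.last_lr      # epoch 1
sched.step(3)
jumped = sched.last_lr      # epoch 3, set explicitly
```

Instantiating `ToyBase` directly fails with NotImplementedError in this sketch, which is the behavior the description above refers to.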
 - 
            
           step(epoch: Optional[int] = None) → None
- 
           step should be called after optimizer.step(). It updates the learning rate in the optimizer. The new learning rate takes effect in the next epoch.
- Parameters
- 
             epoch (int|None, optional) – The current epoch. Default: None, in which case the epoch auto-increments starting from last_epoch=-1. 
- Returns
- 
             None 
 Examples
           Please refer to the example for the current LRScheduler. 
 
