CosineAnnealingDecay

class paddle.optimizer.lr.CosineAnnealingDecay(learning_rate, T_max, eta_min=0, last_epoch=-1, verbose=False) [source]
    Set the learning rate using a cosine annealing schedule, where \(\eta_{max}\) is set to the initial learning_rate and \(T_{cur}\) is the number of epochs since the last restart in SGDR. The update can be described as follows:

    \[
    \begin{aligned}
    \eta_t & = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{max}}\pi\right)\right), & T_{cur} \neq (2k+1)T_{max}; \\
    \eta_{t+1} & = \eta_{t} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 - \cos\left(\frac{1}{T_{max}}\pi\right)\right), & T_{cur} = (2k+1)T_{max}.
    \end{aligned}
    \]

    This schedule was proposed in SGDR: Stochastic Gradient Descent with Warm Restarts. Note that this class only implements the cosine annealing part of SGDR, not the restarts. A plain-Python sketch of the resulting schedule values is given after the parameter list below.

    Parameters
        - learning_rate (float) – The initial learning rate, that is, \(\eta_{max}\). It can be set to a Python float or int number.
        - T_max (int) – Maximum number of iterations. It is half of the decay cycle of the learning rate. It must be a positive integer.
        - eta_min (float|int, optional) – Minimum learning rate, that is, \(\eta_{min}\). Default: 0.
        - last_epoch (int, optional) – The index of the last epoch. It can be set to resume training. Default: -1, which means the initial learning rate.
        - verbose (bool, optional) – If True, prints a message to stdout for each update. Default: False.
 
    Returns

        A CosineAnnealingDecay instance to schedule the learning rate.
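    When the learning rate is driven only by this scheduler (no restarts), the recursive update above reduces to the usual closed form \(\eta_t = \eta_{min} + \frac{1}{2}(\eta_{max} - \eta_{min})\left(1 + \cos\left(\frac{T_{cur}}{T_{max}}\pi\right)\right)\). The following is a minimal plain-Python sketch of that closed form, useful for sanity-checking the values the scheduler should produce; the helper cosine_annealing_lr is hypothetical and not part of the Paddle API.

        import math

        def cosine_annealing_lr(epoch, eta_max=0.5, eta_min=0.0, T_max=10):
            # Closed-form value of the schedule at a given epoch (no restarts)
            return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * epoch / T_max))

        for epoch in range(11):
            print(epoch, round(cosine_annealing_lr(epoch), 6))
        # epoch 0 -> 0.5 (eta_max); epoch 5 -> 0.25; epoch 10 -> 0.0 (eta_min)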
    Examples

        import paddle
        import numpy as np

        # train on default dynamic graph mode
        linear = paddle.nn.Linear(10, 10)
        scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=0.5, T_max=10, verbose=True)
        sgd = paddle.optimizer.SGD(learning_rate=scheduler, parameters=linear.parameters())
        for epoch in range(20):
            for batch_id in range(5):
                x = paddle.uniform([10, 10])
                out = linear(x)
                loss = paddle.mean(out)
                loss.backward()
                sgd.step()
                sgd.clear_gradients()
                scheduler.step()    # If you update learning rate each step
            # scheduler.step()      # If you update learning rate each epoch

        # train on static graph mode
        paddle.enable_static()
        main_prog = paddle.static.Program()
        start_prog = paddle.static.Program()
        with paddle.static.program_guard(main_prog, start_prog):
            x = paddle.static.data(name='x', shape=[None, 4, 5])
            y = paddle.static.data(name='y', shape=[None, 4, 5])
            z = paddle.static.nn.fc(x, 100)
            loss = paddle.mean(z)
            scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=0.5, T_max=10, verbose=True)
            sgd = paddle.optimizer.SGD(learning_rate=scheduler)
            sgd.minimize(loss)

        exe = paddle.static.Executor()
        exe.run(start_prog)
        for epoch in range(20):
            for batch_id in range(5):
                out = exe.run(
                    main_prog,
                    feed={
                        'x': np.random.randn(3, 4, 5).astype('float32'),
                        'y': np.random.randn(3, 4, 5).astype('float32')
                    },
                    fetch_list=loss.name)
                scheduler.step()    # If you update learning rate each step
            # scheduler.step()      # If you update learning rate each epoch
            
get_lr()

    Subclasses that override LRScheduler (the base class) should provide a custom implementation of get_lr(); otherwise, a NotImplementedError exception will be raised.
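    As an illustration, here is a minimal sketch of a custom subclass that overrides get_lr(). The class name and decay rule are invented for the example, and self.base_lr and self.last_epoch are assumed to be attributes maintained by the LRScheduler base class.

        import paddle

        class MyExponentialDecay(paddle.optimizer.lr.LRScheduler):
            # Hypothetical scheduler: multiply the base rate by gamma each epoch
            def __init__(self, learning_rate, gamma=0.9, last_epoch=-1, verbose=False):
                self.gamma = gamma  # set before the base __init__, which computes the first lr
                super().__init__(learning_rate, last_epoch, verbose)

            def get_lr(self):
                # base_lr and last_epoch are assumed base-class attributes
                return self.base_lr * self.gamma ** self.last_epoch

        scheduler = MyExponentialDecay(learning_rate=0.5, gamma=0.9)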
set_dict(state_dict)

    Loads the scheduler's state.
set_state_dict(state_dict)

    Loads the scheduler's state.
state_dict()

    Returns the state of the scheduler as a dict. It is a subset of self.__dict__.
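    A small sketch of how state_dict() and set_state_dict() can be used together to checkpoint and resume a schedule; the file name is arbitrary, and paddle.save/paddle.load are used here only as an assumed serialization path.

        import paddle

        scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=0.5, T_max=10)
        for _ in range(3):
            scheduler.step()

        # Save the (small) scheduler state, e.g. last_epoch and last_lr
        paddle.save(scheduler.state_dict(), "scheduler.pdparams")

        # Later: rebuild an identical scheduler and restore its state before resuming
        resumed = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=0.5, T_max=10)
        resumed.set_state_dict(paddle.load("scheduler.pdparams"))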
state_keys()

    For subclasses that override LRScheduler (the base class). By default, "last_epoch" and "last_lr" are saved via self.keys = ['last_epoch', 'last_lr']; last_epoch is the current epoch number, and last_lr is the current learning rate. To change the default behavior, provide a custom implementation of _state_keys() to redefine self.keys.
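    A hedged sketch of redefining self.keys through _state_keys() in a custom subclass, so that an extra attribute is included in the checkpointed state; the class, the warmup rule, and the warmup_steps key are invented for this example.

        import paddle

        class WarmupConstant(paddle.optimizer.lr.LRScheduler):
            # Hypothetical scheduler: linear warmup, then a constant rate
            def __init__(self, learning_rate, warmup_steps=100, last_epoch=-1, verbose=False):
                self.warmup_steps = warmup_steps
                super().__init__(learning_rate, last_epoch, verbose)

            def get_lr(self):
                if self.last_epoch < self.warmup_steps:
                    return self.base_lr * (self.last_epoch + 1) / self.warmup_steps
                return self.base_lr

            def _state_keys(self):
                # Persist warmup_steps alongside the default keys
                self.keys = ['last_epoch', 'last_lr', 'warmup_steps']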
step(epoch=None)

    step() should be called after optimizer.step(). It updates the learning rate in the optimizer according to the current epoch, and the new learning rate takes effect on the next call to optimizer.step() (see the sketch below).

    Parameters

        - epoch (int, None) – Specify the current epoch. Default: None, which auto-increments from last_epoch=-1.

    Returns

        None
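    A brief sketch of both ways of calling step(): letting the epoch auto-increment, or passing an explicit epoch (for example, when resuming training at a known point).

        import paddle

        scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=0.5, T_max=10)

        scheduler.step()           # auto-increment the epoch counter
        scheduler.step(epoch=5)    # or jump directly to a specific epoch
        print(scheduler.last_lr)   # learning rate that the next optimizer.step() will use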
 
 
