inverse_time_decay(learning_rate, decay_steps, decay_rate, staircase=False)
Applies inverse time decay to the initial learning rate.
When training a model, it is often recommended to lower the learning rate as training progresses. This function applies an inverse time decay to the initial learning rate.
The decayed learning rate is calculated as follows:
>>> if staircase == True:
>>>     decayed_learning_rate = learning_rate / (1 + decay_rate * floor(global_step / decay_steps))
>>> else:
>>>     decayed_learning_rate = learning_rate / (1 + decay_rate * global_step / decay_steps)
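The formula can be checked outside the framework with a minimal pure-Python sketch. The helper name `inverse_time_decay_value` and the explicit `global_step` argument are assumptions made for illustration; the Paddle function does not take the step as an argument.

```python
import math


def inverse_time_decay_value(learning_rate, global_step, decay_steps,
                             decay_rate, staircase=False):
    """Hypothetical reference helper mirroring the formula above."""
    ratio = global_step / decay_steps
    if staircase:
        # Hold the rate constant within each window of decay_steps steps.
        ratio = math.floor(ratio)
    return learning_rate / (1 + decay_rate * ratio)


# Compare both modes halfway through the first decay window.
continuous = inverse_time_decay_value(0.1, 5000, 10000, 0.5)
stepped = inverse_time_decay_value(0.1, 5000, 10000, 0.5, staircase=True)
print(continuous, stepped)
```

With these values the continuous schedule has already lowered the rate to about 0.08 at step 5000, while the staircase schedule still returns the initial 0.1.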
learning_rate (Variable|float) – The initial learning rate, given as a Variable or a float.
decay_steps (int) – The learning rate decay steps. See the decay computation above.
decay_rate (float) – The learning rate decay rate. See the decay computation above.
staircase (bool) – If True, the learning rate is decayed at discrete intervals: the formula uses floor(global_step / decay_steps), so the rate stays constant within each window of decay_steps steps and drops only at window boundaries. If False, the learning rate decays continuously, following the formula above. Default: False
The decayed learning rate. The data type is float32.
- Return type
import paddle.fluid as fluid

base_lr = 0.1
sgd_optimizer = fluid.optimizer.SGD(
    learning_rate=fluid.layers.inverse_time_decay(
        learning_rate=base_lr,
        decay_steps=10000,
        decay_rate=0.5,
        staircase=True))
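For the settings used in the example (base rate 0.1, decay_steps=10000, decay_rate=0.5, staircase=True), the resulting schedule can be tabulated by hand. The sketch below is plain Python that applies the staircase formula directly; `staircase_lr` is a hypothetical name and nothing here calls Paddle.

```python
base_lr, decay_steps, decay_rate = 0.1, 10000, 0.5


def staircase_lr(global_step):
    # Integer division plays the role of floor(global_step / decay_steps).
    return base_lr / (1 + decay_rate * (global_step // decay_steps))


# Sample the schedule just before and at several window boundaries.
schedule = {step: staircase_lr(step) for step in (0, 9999, 10000, 20000, 40000)}
for step, lr in schedule.items():
    print(step, round(lr, 6))
```

The rate stays at 0.1 through step 9999, drops to roughly 0.0667 at step 10000, and reaches 0.05 at step 20000. Each later window lowers it by a smaller amount.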