class paddle.fluid.dygraph.learning_rate_scheduler.LinearLrWarmup(learning_rate, warmup_steps, start_lr, end_lr, begin=1, step=1, dtype='float32')


This operator uses the linear learning rate warm-up strategy to adjust the learning rate before the normal learning rate schedule takes over. For more information, refer to "Bag of Tricks for Image Classification with Convolutional Neural Networks".

When global_step < warmup_steps, the learning rate is updated as:

linear_step = end_lr - start_lr
lr = start_lr + linear_step * (global_step / warmup_steps)

where start_lr is the learning rate at the start of warm-up and end_lr is the learning rate at the end of warm-up.

When global_step >= warmup_steps, the learning rate is updated as:

lr = learning_rate

where learning_rate is the learning rate used after warm-up.
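The two cases above can be sketched in plain Python. This is only an illustration of the formula as written on this page, not Paddle's implementation; the function name linear_warmup_lr is hypothetical, and the sample values mirror the example at the end of this page.

```python
def linear_warmup_lr(global_step, learning_rate, warmup_steps, start_lr, end_lr):
    """Hypothetical helper: learning rate at global_step under linear warm-up."""
    if global_step < warmup_steps:
        # Warm-up phase: interpolate linearly from start_lr towards end_lr.
        linear_step = end_lr - start_lr
        return start_lr + linear_step * (global_step / warmup_steps)
    # After warm-up: use the regular learning rate.
    return learning_rate


# Print the rate at a few steps for learning_rate=0.1, warmup_steps=50,
# start_lr=0.0, end_lr=0.1.
for s in (0, 25, 49, 50, 100):
    print(s, linear_warmup_lr(s, 0.1, 50, 0.0, 0.1))
```

With these values the rate starts at start_lr, reaches half of end_lr at step 25, and switches to learning_rate from step 50 onward.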

Parameters

  • learning_rate (Variable|float) – Learning rate after warm-up. It can be a 1-D Tensor or a single value with the data type float32.

  • warmup_steps (int) – Steps for warm up.

  • start_lr (float) – Initial learning rate of warm up.

  • end_lr (float) – Final learning rate of warm up.

  • begin (int, optional) – The begin step, that is, the initial value of global_step described above. The default value is 1.

  • step (int, optional) – The step size used to calculate the new global_step in the description above. The default value is 1.

  • dtype (str, optional) – The data type used to create the learning rate variable. It can be 'float32' or 'float64'. The default value is 'float32'.
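To illustrate how begin and step interact with the formula, the sketch below simulates a sequence of scheduler calls in plain Python. It assumes that global_step starts at begin and is advanced by step after each call; that update order is an assumption made for illustration, and the function name warmup_sequence is hypothetical rather than part of Paddle.

```python
def warmup_sequence(num_calls, learning_rate, warmup_steps, start_lr, end_lr,
                    begin=1, step=1):
    """Hypothetical helper: learning rates produced by num_calls successive calls."""
    global_step = begin  # assumed: the counter starts at `begin`
    rates = []
    for _ in range(num_calls):
        if global_step < warmup_steps:
            # Warm-up phase: linear interpolation between start_lr and end_lr.
            lr = start_lr + (end_lr - start_lr) * (global_step / warmup_steps)
        else:
            # Warm-up finished: use the regular learning rate.
            lr = learning_rate
        rates.append(lr)
        global_step += step  # assumed: the counter advances by `step` per call
    return rates


# With begin=1 and step=2 the counter takes the values 1, 3, 5, 7.
rates = warmup_sequence(4, 0.1, 4, 0.0, 0.1, begin=1, step=2)
print(rates)
```

A larger step shortens the warm-up in terms of calls, because global_step reaches warmup_steps sooner.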


Returns

Warm-up learning rate with the same data type as learning_rate.

Return type



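import paddle.fluid as fluid

learning_rate = 0.1  # learning rate used after warm-up
warmup_steps = 50    # number of warm-up steps
start_lr = 0.0       # learning rate at the start of warm-up
end_lr = 0.1         # learning rate at the end of warm-up

with fluid.dygraph.guard():
    # Create the warm-up schedule in dygraph (imperative) mode.
    lr_decay = fluid.dygraph.LinearLrWarmup(learning_rate, warmup_steps, start_lr, end_lr)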