class paddle.fluid.dygraph.learning_rate_scheduler.PolynomialDecay(learning_rate, decay_steps, end_learning_rate=0.0001, power=1.0, cycle=False, begin=0, step=1, dtype='float32')


Applies polynomial decay to the initial learning rate.

The algorithm can be described as follows.

If cycle is set to True, then:

\[ \begin{aligned} decay\_steps &= decay\_steps \times \left\lceil \frac{global\_step}{decay\_steps} \right\rceil \\ decayed\_learning\_rate &= (learning\_rate - end\_learning\_rate) \times \left(1 - \frac{global\_step}{decay\_steps}\right)^{power} + end\_learning\_rate \end{aligned} \]

If cycle is set to False, then:

\[ \begin{aligned} global\_step &= \min(global\_step, decay\_steps) \\ decayed\_learning\_rate &= (learning\_rate - end\_learning\_rate) \times \left(1 - \frac{global\_step}{decay\_steps}\right)^{power} + end\_learning\_rate \end{aligned} \]
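The cycle=False branch above can be sketched in plain Python. This is an illustrative helper (the name poly_decay is hypothetical, not part of the Paddle API), showing how the rate decays linearly when power=1.0 and then holds at end_learning_rate:

```python
def poly_decay(learning_rate, decay_steps, global_step,
               end_learning_rate=0.0001, power=1.0):
    # cycle=False: clamp global_step so the rate bottoms out at end_learning_rate
    global_step = min(global_step, decay_steps)
    frac = 1.0 - global_step / decay_steps
    return (learning_rate - end_learning_rate) * frac ** power + end_learning_rate

# With power=1.0 the decay is linear from 0.01 down to 0.0:
print(poly_decay(0.01, 5000, 0, end_learning_rate=0.0))      # 0.01
print(poly_decay(0.01, 5000, 2500, end_learning_rate=0.0))   # 0.005
print(poly_decay(0.01, 5000, 99999, end_learning_rate=0.0))  # 0.0
```

A larger power front-loads the decay; power=1.0 reduces the schedule to plain linear decay.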
Parameters:

  • learning_rate (Variable|float) – The initial learning rate. If the type is Variable, it is a tensor with shape [1]; the data type can be float32 or float64. It can also be set to a Python int.

  • decay_steps (int) – The decay step size. It determines the decay cycle.

  • end_learning_rate (float, optional) – The minimum final learning rate. The default value is 0.0001.

  • power (float, optional) – Power of polynomial. The default value is 1.0.

  • cycle (bool, optional) – If set to True, the learning rate decays toward end_learning_rate and then restarts every decay_steps steps. The default value is False.

  • begin (int, optional) – The begin step. The initial value of global_step described above. The default value is 0.

  • step (int, optional) – The step size used to calculate the new global_step in the description above. The default value is 1.

  • dtype (str, optional) – The data type used to create the learning rate variable. The data type can be set as ‘float32’, ‘float64’. The default value is ‘float32’.
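When cycle=True, decay_steps is first rounded up to the next multiple that covers global_step, so the rate falls toward end_learning_rate and then jumps back up at each cycle boundary. A minimal sketch under that reading of the formula (the helper name poly_decay_cycle is hypothetical, and the guard for global_step=0 is an assumption to avoid dividing by zero):

```python
import math

def poly_decay_cycle(learning_rate, decay_steps, global_step,
                     end_learning_rate=0.0001, power=1.0):
    # cycle=True: stretch decay_steps to the next multiple covering global_step
    if global_step > 0:  # assumed guard: ceil(0 / decay_steps) would give 0
        decay_steps = decay_steps * math.ceil(global_step / decay_steps)
    frac = 1.0 - global_step / decay_steps
    return (learning_rate - end_learning_rate) * frac ** power + end_learning_rate

# Within the first cycle the schedule matches cycle=False ...
print(poly_decay_cycle(0.01, 1000, 500, end_learning_rate=0.0))   # 0.005
# ... but past decay_steps the denominator grows, so the rate climbs back up
print(poly_decay_cycle(0.01, 1000, 1500, end_learning_rate=0.0))  # 0.0025
```

At global_step=1500 the effective decay_steps becomes 2000, so the fraction is 1 - 1500/2000 = 0.25 rather than clamping to zero as in the cycle=False case.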




Examples:

import paddle
import paddle.fluid as fluid

start_lr = 0.01
total_step = 5000
end_lr = 0
with fluid.dygraph.guard():
    emb = paddle.nn.Embedding(10, 10)
    optimizer = fluid.optimizer.SGD(
        learning_rate=fluid.dygraph.PolynomialDecay(
            start_lr, total_step, end_lr, power=1.0),
        parameter_list=emb.parameters())
create_lr_var(lr)

Convert lr from a Python float to a Variable.

Parameters: lr (float) – the learning rate.

Returns: the learning rate as a Variable.

set_dict(state_dict)


Loads the scheduler's state.

set_state_dict(state_dict)


Loads the scheduler's state.

state_dict()


Returns the state of the scheduler as a dict.

It is a subset of self.__dict__.