class paddle.fluid.dygraph.learning_rate_scheduler.PolynomialDecay(learning_rate, decay_steps, end_learning_rate=0.0001, power=1.0, cycle=False, begin=0, step=1, dtype='float32')


Applies polynomial decay to the initial learning rate.

The algorithm can be described as follows.

If cycle is set to True, then:

\[ \begin{aligned} decay\_steps &= decay\_steps * math.ceil\left(\frac{global\_step}{decay\_steps}\right) \\ decayed\_learning\_rate &= (learning\_rate - end\_learning\_rate) * \left(1 - \frac{global\_step}{decay\_steps}\right)^{power} + end\_learning\_rate \end{aligned} \]

If cycle is set to False, then:

\[ \begin{aligned} global\_step &= \min(global\_step, decay\_steps) \\ decayed\_learning\_rate &= (learning\_rate - end\_learning\_rate) * \left(1 - \frac{global\_step}{decay\_steps}\right)^{power} + end\_learning\_rate \end{aligned} \]

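
The two cases above can be checked by hand with a small pure-Python function. This is only a sketch of the formulas, not the Paddle implementation: the name polynomial_decay_value is hypothetical, and treating global_step = 0 as belonging to the first cycle (to avoid dividing by zero when cycle is True) is an assumption.

```python
import math


def polynomial_decay_value(global_step, learning_rate, decay_steps,
                           end_learning_rate=0.0001, power=1.0, cycle=False):
    """Evaluate the polynomial decay formula for one value of global_step."""
    if cycle:
        # Extend decay_steps to the next multiple that covers global_step.
        # Assumption: step 0 counts as the first cycle, so the multiplier is 1.
        multiplier = math.ceil(global_step / decay_steps) if global_step > 0 else 1
        decay_steps = decay_steps * multiplier
    else:
        # Without cycling, the step is clipped so the rate stays at the end value.
        global_step = min(global_step, decay_steps)
    fraction = 1 - global_step / decay_steps
    return (learning_rate - end_learning_rate) * fraction ** power + end_learning_rate


# Halfway through a linear decay from 0.01 to 0.0 over 5000 steps.
halfway = polynomial_decay_value(2500, 0.01, 5000, end_learning_rate=0.0)
# Past decay_steps with cycle=True, decay_steps is doubled to 10000.
cycled = polynomial_decay_value(7500, 0.01, 5000, end_learning_rate=0.0, cycle=True)
```

With power=1.0 the decay is linear, so halfway is about 0.005. In the cycled case the effective decay_steps becomes 10000, so the rate at step 7500 is about 0.0025 rather than the end value.
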
  • learning_rate (Variable|float) – The initial learning rate. If it is a Variable, it must be a tensor with shape [1] and data type float32 or float64. It can also be set to a Python int.

  • decay_steps (int) – The number of steps over which the learning rate decays from learning_rate to end_learning_rate. It determines the length of one decay cycle.

  • end_learning_rate (float, optional) – The minimum final learning rate. The default value is 0.0001.

  • power (float, optional) – Power of polynomial. The default value is 1.0.

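  • cycle (bool, optional) – If True, decay_steps is extended to the next multiple of itself whenever global_step passes it, so the learning rate rises and decays again instead of staying at end_learning_rate. The default value is False.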

  • begin (int, optional) – The starting step, that is, the initial value of global_step in the formulas above. The default value is 0.

  • step (int, optional) – The amount added to global_step on each update in the formulas above. The default value is 1.

  • dtype (str, optional) – The data type used to create the learning rate variable. It can be ‘float32’ or ‘float64’. The default value is ‘float32’.
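
To make the roles of begin and step concrete, the following sketch tracks global_step by hand. The class PolynomialDecaySketch is hypothetical, covers only the cycle=False case, and assumes the scheduler evaluates the formula at the current global_step and then advances it by step; the real class may order these operations differently.

```python
class PolynomialDecaySketch:
    """Hypothetical stand-in for the step bookkeeping (cycle=False only)."""

    def __init__(self, learning_rate, decay_steps, end_learning_rate=0.0001,
                 power=1.0, begin=0, step=1):
        self.learning_rate = learning_rate
        self.decay_steps = decay_steps
        self.end_learning_rate = end_learning_rate
        self.power = power
        self.global_step = begin  # begin is the initial value of global_step
        self.step_size = step     # step is added to global_step after each call

    def __call__(self):
        # Evaluate the cycle=False formula at the current global_step.
        clipped = min(self.global_step, self.decay_steps)
        fraction = 1 - clipped / self.decay_steps
        lr = ((self.learning_rate - self.end_learning_rate) * fraction ** self.power
              + self.end_learning_rate)
        # Assumption: the step counter advances after the rate is computed.
        self.global_step += self.step_size
        return lr


# With step=2 the schedule covers decay_steps=4 in two updates instead of four.
sched = PolynomialDecaySketch(0.01, decay_steps=4, end_learning_rate=0.0,
                              begin=0, step=2)
rates = [sched() for _ in range(4)]
```

Under these assumptions global_step takes the values 0, 2, 4, 6 across the four calls, so the rate reaches end_learning_rate on the third call and stays there. A nonzero begin would simply start the same sequence further along the decay.
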




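import paddle.fluid as fluid

start_lr = 0.01     # initial learning rate
total_step = 5000   # decay_steps: steps to go from start_lr to end_lr
end_lr = 0          # final learning rate

with fluid.dygraph.guard():
    emb = fluid.dygraph.Embedding([10, 10])
    # Pass the scheduler to the optimizer in place of a fixed learning rate.
    optimizer = fluid.optimizer.SGD(
        learning_rate=fluid.dygraph.PolynomialDecay(
            start_lr, total_step, end_lr, power=1.0),
        parameter_list=emb.parameters())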