noam_decay

paddle.fluid.layers.noam_decay(d_model, warmup_steps)

Noam decay method. The NumPy implementation of Noam decay is as follows.

import paddle.fluid as fluid
import numpy as np
# set hyper parameters
d_model = 2
current_steps = 20
warmup_steps = 200
# compute
lr_value = np.power(d_model, -0.5) * np.min([
                        np.power(current_steps, -0.5),
                        np.power(warmup_steps, -1.5) * current_steps])

For details, see the paper "Attention Is All You Need".
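As a rough illustration of the formula above, the following NumPy sketch evaluates the schedule at several steps. The helper name `noam_lr` is hypothetical and is not part of Paddle; it only mirrors the NumPy expression shown earlier, using the same hyperparameters (d_model = 2, warmup_steps = 200).

```python
import numpy as np

def noam_lr(step, d_model, warmup_steps):
    # Hypothetical helper mirroring the NumPy formula above; not a Paddle API.
    step = np.asarray(step, dtype=np.float64)
    return np.power(float(d_model), -0.5) * np.minimum(
        np.power(step, -0.5),                        # decay branch
        np.power(float(warmup_steps), -1.5) * step)  # linear warmup branch

steps = np.array([1, 50, 100, 200, 400, 800])
lrs = noam_lr(steps, d_model=2, warmup_steps=200)
for s, lr in zip(steps, lrs):
    print(s, lr)
```

Under this formula the rate grows linearly while the step is below warmup_steps, peaks when the step equals warmup_steps, and afterwards decays in proportion to the inverse square root of the step.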

Parameters
  • d_model (Variable) – The dimensionality of the model's input and output.

  • warmup_steps (Variable) – A hyperparameter: the number of warmup steps, during which the learning rate increases linearly.

Returns

The decayed learning rate.

Examples

import paddle.fluid as fluid
warmup_steps = 100
learning_rate = 0.01
# d_model is chosen so that the peak rate (at step == warmup_steps)
# equals learning_rate
lr = fluid.layers.learning_rate_scheduler.noam_decay(
               1 / (warmup_steps * (learning_rate ** 2)),
               warmup_steps)
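The first argument in the example is not arbitrary. Under the NumPy formula given earlier, both branches of the minimum coincide at step == warmup_steps, so the peak rate is d_model ** -0.5 * warmup_steps ** -0.5. The sketch below checks in plain NumPy, without calling Paddle, that the chosen d_model makes this peak equal to learning_rate; the variable name `peak` is introduced here for illustration only.

```python
import numpy as np

warmup_steps = 100
learning_rate = 0.01
# Same expression that the example passes as d_model.
d_model = 1 / (warmup_steps * (learning_rate ** 2))

# At step == warmup_steps both branches of the minimum are equal,
# so the peak is d_model^-0.5 * warmup_steps^-0.5.
peak = np.power(d_model, -0.5) * np.power(float(warmup_steps), -0.5)
print(peak)
```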