decorate
paddle.fluid.contrib.mixed_precision.decorator.decorate(optimizer, amp_lists=None, init_loss_scaling=32768, incr_every_n_steps=1000, decr_every_n_nan_or_inf=2, incr_ratio=2.0, decr_ratio=0.8, use_dynamic_loss_scaling=True, use_pure_fp16=False, use_fp16_guard=None)
Decorate the given optimizer to adapt it to mixed-precision training.
Parameters
optimizer (Optimizer) – A common Optimizer.
amp_lists (CustomOpLists) – A CustomOpLists object.
init_loss_scaling (float) – The initial loss scaling factor.
incr_every_n_steps (int) – Increases loss scaling every n consecutive steps with finite gradients.
decr_every_n_nan_or_inf (int) – Decreases loss scaling every n accumulated steps with nan or inf gradients.
incr_ratio (float) – The multiplier to use when increasing the loss scaling.
decr_ratio (float) – The multiplier (less than one) to use when decreasing the loss scaling.
use_dynamic_loss_scaling (bool) – Whether to use dynamic loss scaling. The update rule is sketched after this parameter list.
use_pure_fp16 (bool) – Whether to use pure fp16 training. Default False.
use_fp16_guard (bool) – Whether to use fp16_guard when constructing the program. Default None, which means its value equals use_pure_fp16.
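The loss-scaling parameters above interact as a simple counter-based update rule. Below is a minimal sketch of that rule, assuming hypothetical names (update_loss_scaling, good_steps, bad_steps); it illustrates the parameter semantics and is not Paddle's internal implementation.

    # Illustrative sketch (not Paddle's code) of dynamic loss scaling.
    def update_loss_scaling(loss_scaling, found_inf, good_steps, bad_steps,
                            incr_every_n_steps=1000, decr_every_n_nan_or_inf=2,
                            incr_ratio=2.0, decr_ratio=0.8):
        """Return (loss_scaling, good_steps, bad_steps) after one training step."""
        if found_inf:
            # Step produced nan/inf gradients: reset the consecutive-good-step
            # counter and, after `decr_every_n_nan_or_inf` accumulated bad
            # steps, shrink the scale by `decr_ratio` (< 1).
            good_steps = 0
            bad_steps += 1
            if bad_steps >= decr_every_n_nan_or_inf:
                loss_scaling *= decr_ratio
                bad_steps = 0
        else:
            # Step produced finite gradients: after `incr_every_n_steps`
            # consecutive good steps, grow the scale by `incr_ratio` (> 1).
            good_steps += 1
            if good_steps >= incr_every_n_steps:
                loss_scaling *= incr_ratio
                good_steps = 0
        return loss_scaling, good_steps, bad_steps

    # For example, starting from init_loss_scaling=32768 with the defaults
    # above, 1000 consecutive finite-gradient steps double the scale.
    scale, good, bad = 32768.0, 0, 0
    for _ in range(1000):
        scale, good, bad = update_loss_scaling(scale, found_inf=False,
                                               good_steps=good, bad_steps=bad)
    print(scale)  # 65536.0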
Returns
An optimizer acting like a normal one but with mixed-precision training enabled.
Example 1:
# black&white list based strategy example import paddle import paddle.static as static
paddle.enable_static()
data = static.data(name=’X’, shape=[None, 1], dtype=’float32’) hidden = static.nn.fc(x=data, size=10) loss = paddle.mean(hidden) optimizer = paddle.optimizer.Adam(learning_rate=0.001)
- mp_optimizer = static.amp.decorate(
-
optimizer=optimizer, init_loss_scaling=8.0)
ops, param_grads = mp_optimizer.minimize(loss) scaled_loss = mp_optimizer.get_scaled_loss()
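As a follow-up to Example 1 (an assumed continuation, not part of the official snippet), the decorated program can be run with a paddle.static.Executor to fetch both the loss and the scaled loss; the feed data below is made up for illustration.

    # Assumed continuation of Example 1: run the program and fetch results.
    import numpy as np

    if paddle.is_compiled_with_cuda():
        place = paddle.CUDAPlace(0)
        exe = static.Executor(place)
        exe.run(static.default_startup_program())

        # Dummy batch matching the 'X' placeholder defined above.
        x = np.random.random(size=(4, 1)).astype('float32')
        loss_val, scaled_loss_val = exe.run(
            static.default_main_program(),
            feed={'X': x},
            fetch_list=[loss, scaled_loss])
        print(loss_val, scaled_loss_val)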
Example 2:
    # pure fp16 training example
    import numpy as np
    import paddle
    import paddle.nn.functional as F

    paddle.enable_static()

    def run_example_code():
        place = paddle.CUDAPlace(0)
        exe = paddle.static.Executor(place)
        data = paddle.static.data(name='X', shape=[None, 1, 28, 28], dtype='float32')
        conv2d = paddle.static.nn.conv2d(input=data, num_filters=6, filter_size=3)
        # 1) Use fp16_guard to control the range of fp16 kernels used.
        with paddle.static.amp.fp16_guard():
            bn = paddle.static.nn.batch_norm(input=conv2d, act="relu")
            pool = F.max_pool2d(bn, kernel_size=2, stride=2)
            hidden = paddle.static.nn.fc(pool, size=10)
            loss = paddle.mean(hidden)
        # 2) Create the optimizer and set `multi_precision` to True.
        # Setting `multi_precision` to True can help avoid poor accuracy
        # or slow convergence.
        optimizer = paddle.optimizer.Momentum(learning_rate=0.01, multi_precision=True)
        # 3) Ops in `custom_black_list` are kept in the float32 computation type.
        amp_list = paddle.static.amp.CustomOpLists(
            custom_black_list=['pool2d'])
        # 4) The entry point of Paddle AMP.
        # Enable pure fp16 training by setting `use_pure_fp16` to True.
        optimizer = paddle.static.amp.decorate(
            optimizer,
            amp_list,
            init_loss_scaling=128.0,
            use_dynamic_loss_scaling=True,
            use_pure_fp16=True)
        # If you don't use the default_startup_program(), you should pass
        # your defined `startup_program` into `minimize`.
        optimizer.minimize(loss)
        exe.run(paddle.static.default_startup_program())
        # 5) Use `amp_init` after FP32 parameter initialization (such as
        # `exe.run(startup_program)`). If you want to run the testing process,
        # pass `test_program` into `amp_init` as well.
        optimizer.amp_init(place, scope=paddle.static.global_scope())

    if paddle.is_compiled_with_cuda() and len(paddle.static.cuda_places()) > 0:
        run_example_code()