append_backward(loss, parameter_list=None, no_grad_set=None, callbacks=None, checkpoints=None)
This function appends the backward part of the network to main_program.
Training a neural network consists of forward and backward propagation. However, when we configure a network, we only need to specify its forward part. This function applies the chain rule to generate the backward part automatically from the forward part.
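The idea of generating a backward part from a forward part can be illustrated without Paddle. The following is a minimal sketch, not Paddle's implementation: the op names (`scale2`, `square`), the helper `append_backward_toy`, and the tuple layout are all hypothetical, and the `@GRAD` suffix is used only as a naming convention for gradient variables in this toy.

```python
# Toy illustration only: a "program" is a list of forward ops, and the
# backward part is generated by walking that list in reverse.
# Each op is (name, input variable, output variable).
forward_ops = [
    ("scale2", "x", "h"),     # h = 2 * x
    ("square", "h", "loss"),  # loss = h ** 2
]

def append_backward_toy(ops):
    """Return one gradient op per forward op, in the order they must run."""
    grad_ops = []
    for name, inp, out in reversed(ops):
        # Chain rule: d loss / d inp = (d loss / d out) * (d out / d inp)
        grad_ops.append((name + "_grad", out + "@GRAD", inp + "@GRAD"))
    return grad_ops

def run(x):
    """Execute the forward ops, then the generated backward ops."""
    values = {"x": x}
    for name, inp, out in forward_ops:
        values[out] = 2 * values[inp] if name == "scale2" else values[inp] ** 2
    grads = {"loss@GRAD": 1.0}  # d loss / d loss
    for name, out_grad, in_grad in append_backward_toy(forward_ops):
        fwd_in = in_grad[:-len("@GRAD")]  # forward variable this gradient belongs to
        # Local derivative of the forward op with respect to its input.
        local = 2.0 if name == "scale2_grad" else 2 * values[fwd_in]
        grads[in_grad] = grads[out_grad] * local
    return values["loss"], grads["x@GRAD"]
```

Only `forward_ops` is written by hand; the gradient ops are derived from it, which mirrors how the forward configuration is the only part a user specifies.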
In most cases, users do not need to invoke this function manually. It will be automatically invoked by the optimizer’s minimize function.
loss (Variable) – The loss variable of the network.
parameter_list (list of str, optional) – Names of parameters that need to be updated by optimizers. If it is None, all parameters will be updated. Default: None.
no_grad_set (set of str, optional) – Names of variables in Block 0 whose gradients should be ignored. All variables with stop_gradient=True from all blocks are automatically added to this set. If this parameter is not None, the names in it are added to that default set. Default: None.
callbacks (list of callable object, optional) – List of callback functions used to run custom jobs while the backward part is being built. Every callable in the list is invoked once each time a new gradient operator is added to the program. Each callable must take two input parameters: ‘block’ and ‘context’. The ‘block’ is the Block to which the new gradient operator will be added. The ‘context’ is a map whose keys are gradient variable names and whose values are the corresponding original Variable. In addition, the ‘context’ has one special key-value pair: the key is the string ‘__current_op_desc__’ and the value is the op_desc of the gradient operator that has just triggered the callable. Default: None.
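As an illustration of the callback contract described above, here is a minimal sketch of a callable with the (block, context) signature. The names `record_grad_op`, `seen_grad_ops`, `fake_block`, and `fake_context` are hypothetical, and the stand-in objects exist only so the sketch runs without Paddle; a real op_desc is an object, not a string.

```python
# Gradient operators seen so far, in the order they were added.
seen_grad_ops = []

def record_grad_op(block, context):
    """Callback with the documented (block, context) signature."""
    # '__current_op_desc__' is the special key described above; every other
    # key in `context` is a gradient variable name.
    op_desc = context["__current_op_desc__"]
    grad_var_names = sorted(k for k in context if k != "__current_op_desc__")
    seen_grad_ops.append((op_desc, grad_var_names))

# Hypothetical stand-ins that simulate one invocation of the callback.
fake_block = object()
fake_context = {
    "__current_op_desc__": "mean_grad",  # placeholder for a real op_desc
    "loss@GRAD": "loss",                 # gradient name -> original variable
}
record_grad_op(fake_block, fake_context)
```

With Paddle installed, such a function would be passed as `callbacks=[record_grad_op]` to append_backward, which would then call it once per gradient operator it adds.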
Returns: Pairs of parameters and their corresponding gradients. In each pair, the first element is the parameter and the second is its gradient variable.
Return type: list of tuple(Variable, Variable)
Raises: AssertionError – If loss is not an instance of Variable.
import paddle.fluid as fluid

# Forward part: one fully connected layer with a squared-error loss.
x = fluid.data(name='x', shape=[None, 13], dtype='float32')
y = fluid.data(name='y', shape=[None, 1], dtype='float32')
y_predict = fluid.layers.fc(input=x, size=1, act=None)
loss = fluid.layers.square_error_cost(input=y_predict, label=y)
avg_loss = fluid.layers.mean(loss)

# Each call returns a list of (parameter, gradient) pairs.
param_grad_list = fluid.backward.append_backward(loss=avg_loss)
p_g_list1 = fluid.backward.append_backward(loss=avg_loss)  # len(p_g_list1) == 2

# Name of the first parameter returned above.
param_name = p_g_list1[0][0].name

# Update only that parameter.
p_g_list2 = fluid.backward.append_backward(loss=avg_loss, parameter_list=[param_name])  # len(p_g_list2) == 1
# Ignore the gradient of that parameter.
p_g_list3 = fluid.backward.append_backward(loss=avg_loss, no_grad_set=set([param_name]))  # len(p_g_list3) == 1
# Select and ignore the same parameter, so nothing is left.
p_g_list4 = fluid.backward.append_backward(loss=avg_loss, parameter_list=[param_name], no_grad_set=set([param_name]))  # len(p_g_list4) == 0
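The list lengths noted in the example above follow from how parameter_list and no_grad_set combine. The sketch below is a toy model of that selection under the semantics described in the parameter list, not Paddle's code: the function `select_params` and the parameter names `"w"` and `"b"` are hypothetical.

```python
def select_params(all_params, stop_gradient_vars, parameter_list=None, no_grad_set=None):
    """Toy model of which parameters end up with a (parameter, gradient) pair."""
    # Default ignore set: variables flagged stop_gradient=True.
    ignored = set(stop_gradient_vars)
    # A user-supplied no_grad_set is added to the default set.
    if no_grad_set is not None:
        ignored |= set(no_grad_set)
    # parameter_list=None means every parameter is a candidate.
    candidates = all_params if parameter_list is None else parameter_list
    return [p for p in candidates if p not in ignored]

# Stand-ins for the weight and bias of the fc layer.
params = ["w", "b"]
```

In this model, naming the same parameter in both arguments leaves nothing to update, which matches the empty result of the last call in the example.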