paddle.fluid.backward.append_backward(loss, parameter_list=None, no_grad_set=None, callbacks=None, checkpoints=None)[source]

This function appends the backward part to main_program.

A complete neural network training process consists of forward and backward propagation. However, when we configure a network, we only need to specify its forward part. This function uses the chain rule to automatically generate the backward part from the forward part.

In most cases, users do not need to invoke this function manually. It will be automatically invoked by the optimizer’s minimize function.

Parameters

  • loss (Variable) – The loss variable of the network.

  • parameter_list (list of str, optional) – Names of parameters that need to be updated by optimizers. If it is None, all parameters will be updated. Default: None.

  • no_grad_set (set of str, optional) – Names of variables in Block 0 whose gradients should be ignored. All variables with stop_gradient=True from all blocks are automatically added to this set. If this parameter is not None, the names it contains are added to the default set. Default: None.

  • callbacks (list of callable object, optional) – List of callback functions used to do custom jobs while the backward part is being built. Every callable object in the list is invoked once each time a new gradient operator is added into the program. A callable object must have two input parameters: ‘block’ and ‘context’. ‘block’ is the Block to which the new gradient operator will be added. ‘context’ is a map whose keys are gradient variable names and whose values are the corresponding original Variables. In addition, ‘context’ has one special key-value pair: the key is the string ‘__current_op_desc__’ and the value is the op_desc of the gradient operator that has just triggered the callable object. Default: None.


Returns

A list of (parameter, gradient) pairs: in each tuple, the first element is a parameter and the second is its gradient variable.

Return type

list of tuple (Variable, Variable)


Raises

AssertionError – If loss is not an instance of Variable.


Examples

import paddle.fluid as fluid
x = fluid.data(name='x', shape=[None, 13], dtype='float32')
y = fluid.data(name='y', shape=[None, 1], dtype='float32')

y_predict = fluid.layers.fc(input=x, size=1, act=None)
loss = fluid.layers.square_error_cost(input=y_predict, label=y)

avg_loss = fluid.layers.mean(loss)
param_grad_list = fluid.backward.append_backward(loss=avg_loss)
p_g_list1 = fluid.backward.append_backward(loss=avg_loss)  # len(p_g_list1) == 2
p_g_list2 = fluid.backward.append_backward(loss=avg_loss, parameter_list=[p_g_list1[0][0].name])  # len(p_g_list2) == 1
p_g_list3 = fluid.backward.append_backward(loss=avg_loss, no_grad_set=set([p_g_list1[0][0].name]))  # len(p_g_list3) == 1
p_g_list4 = fluid.backward.append_backward(loss=avg_loss, parameter_list=[p_g_list1[0][0].name], no_grad_set=set([p_g_list1[0][0].name]))  # len(p_g_list4) == 0


paddle.fluid.backward.gradients(targets, inputs, target_gradients=None, no_grad_set=None)[source]

Backpropagate the gradients of targets to inputs.

Parameters

  • targets (Variable|list[Variable]) – The target variables.

  • inputs (Variable|list[Variable]) – The input variables.

  • target_gradients (Variable|list[Variable], optional) – The gradient variables of targets, which must have the same shapes as targets. If None, ones will be created for them. Default: None.

  • no_grad_set (set of str, optional) – Names of variables in Block 0 whose gradients should be ignored. All variables with stop_gradient=True from all blocks are automatically added. Default: None.


Returns

A list of gradients for inputs. If an input does not affect targets, the corresponding gradient variable will be None.

Return type

list[Variable]

Examples

import paddle.fluid as fluid

x = fluid.layers.data(name='x', shape=[2,8,8], dtype='float32')
y = fluid.layers.conv2d(x, 4, 1, bias_attr=False)
y = fluid.layers.relu(y)
y = fluid.layers.conv2d(y, 4, 1, bias_attr=False)
y = fluid.layers.relu(y)
z = fluid.gradients([y], x)