paddle.autograd.hessian(ys: paddle.Tensor, xs: Union[paddle.Tensor, Tuple[paddle.Tensor, ...]], batch_axis: Optional[int] = None) -> Union[Tuple[Tuple[paddle.autograd.autograd.Hessian, ...], ...], paddle.autograd.autograd.Hessian]

Computes the Hessian of the dependent variable ys with respect to the independent variable xs.

Here ys is the output of some operation applied to xs. ys must be a single Tensor, xs may be a Tensor or a tuple of Tensors, and batch_axis is the position of the batch dimension in the data.

When xs is a tuple of Tensors, the result is a nested tuple of Hessian objects. For example, if the Tensors in xs have shapes ([M1, ], [M2, ]), the returned Hessians have shapes (([M1, M1], [M1, M2]), ([M2, M1], [M2, M2])).
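The block layout above can be written out explicitly. This is a sketch of the standard definition for a scalar output, assuming $y$ denotes ys and $x_1 \in \mathbb{R}^{M_1}$, $x_2 \in \mathbb{R}^{M_2}$ denote the two entries of xs; the symbols $H_{ij}$, $a$ and $b$ are notation introduced here, not names from the API.

```latex
H =
\begin{pmatrix}
H_{11} & H_{12} \\
H_{21} & H_{22}
\end{pmatrix},
\qquad
H_{ij} \in \mathbb{R}^{M_i \times M_j},
\qquad
\left(H_{ij}\right)_{ab} = \frac{\partial^2 y}{\partial (x_i)_a \, \partial (x_j)_b}
```

Under this reading, the block $H_{ij}$ corresponds to indexing the returned tuple as H[i-1][j-1], which matches the shapes listed above.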

  • When batch_axis=None, only 0-dimensional or 1-dimensional Tensors are supported. If xs has shape [N, ] and ys has shape [ ] (a 0-dimensional Tensor), the result is a single Hessian matrix of shape [N, N].

  • When batch_axis=0, only 1-dimensional or 2-dimensional Tensors are supported. If xs has shape [B, N] and ys has shape [B, ], the resulting Hessian has shape [B, N, N].

Creating the Hessian object does not run the full computation; the matrix is evaluated lazily. Indexing the object along multiple dimensions, to obtain either the whole Hessian matrix or a sub-matrix, triggers the actual computation and returns the result. Sub-matrices that have already been computed are cached, so later indexing does not recompute them.
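The lazy, cached evaluation described above can be illustrated with a small standalone sketch. It is not Paddle's implementation: LazyHessian is a hypothetical helper written for this illustration, it works on plain Python lists, and it approximates second derivatives with central finite differences rather than automatic differentiation. It only shows the pattern of computing entries on first index and reusing them afterwards.

```python
from typing import Callable, Dict, List, Sequence, Tuple


class LazyHessian:
    """Toy lazily evaluated Hessian of a scalar function.

    Hypothetical helper for illustration only; not paddle.autograd.Hessian.
    """

    def __init__(self, f: Callable[[List[float]], float],
                 x: Sequence[float], eps: float = 1e-4) -> None:
        self.f = f
        self.x = list(x)
        self.eps = eps
        self._cache: Dict[Tuple[int, int], float] = {}
        self.computed = 0  # number of entries actually evaluated so far

    def _second_difference(self, i: int, j: int) -> float:
        # Central finite-difference estimate of d2f / (dx_i dx_j).
        def shifted(di: int, dj: int) -> float:
            p = list(self.x)
            p[i] += di * self.eps
            p[j] += dj * self.eps
            return self.f(p)

        return (shifted(1, 1) - shifted(1, -1)
                - shifted(-1, 1) + shifted(-1, -1)) / (4 * self.eps ** 2)

    def _entry(self, i: int, j: int) -> float:
        # Evaluate an entry only on first access, then serve it from the cache.
        if (i, j) not in self._cache:
            self._cache[(i, j)] = self._second_difference(i, j)
            self.computed += 1
        return self._cache[(i, j)]

    def __getitem__(self, index) -> List[List[float]]:
        # index is a (rows, cols) pair; each part is an int or a slice.
        rows, cols = index
        n = len(self.x)
        row_ids = range(n)[rows] if isinstance(rows, slice) else [rows]
        col_ids = range(n)[cols] if isinstance(cols, slice) else [cols]
        return [[self._entry(i, j) for j in col_ids] for i in row_ids]


def f(p: List[float]) -> float:
    return p[0] ** 2 * p[1] + p[1] ** 3


H = LazyHessian(f, [1.0, 2.0])  # nothing is computed yet
top_left = H[0:1, 0:1]          # evaluates a single entry
count_after_block = H.computed
full = H[:, :]                  # evaluates only the three missing entries
count_after_full = H.computed
```

Indexing a sub-matrix first and the full matrix afterwards evaluates each entry once, which is the behaviour the caching in the paragraph above is meant to provide.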

Parameters

  • ys (paddle.Tensor) – Output derived from xs. It must contain a single element (one element per batch sample when batch_axis is set).

  • xs (Union[paddle.Tensor, Tuple[paddle.Tensor, ...]]) – Input or tuple of inputs.

  • batch_axis (Optional[int], optional) – Index of batch axis. Defaults to None.


Returns

Hessian(s) of ys with respect to xs.

Return type

Union[Tuple[Tuple[Hessian, …], …], Tuple[Hessian, …], Hessian]


Examples

import paddle

x1 = paddle.randn([3, ])
x2 = paddle.randn([4, ])
x1.stop_gradient = False
x2.stop_gradient = False

y = x1.sum() + x2.sum()

H = paddle.autograd.hessian(y, (x1, x2))
H_y_x1_x1 = H[0][0][:]  # evaluate d2y/(dx1 dx1)
H_y_x1_x2 = H[0][1][:]  # evaluate d2y/(dx1 dx2)
H_y_x2_x1 = H[1][0][:]  # evaluate d2y/(dx2 dx1)
H_y_x2_x2 = H[1][1][:]  # evaluate d2y/(dx2 dx2)

print(H_y_x1_x1.shape) # [3, 3]
print(H_y_x1_x2.shape) # [3, 4]
print(H_y_x2_x1.shape) # [4, 3]
print(H_y_x2_x2.shape) # [4, 4]