amax

paddle.amax(x: Tensor, axis: int | Sequence[int] | None = None, keepdim: bool = False, name: str | None = None) → Tensor

Computes the maximum of tensor elements over the given axis.

Note

max and amax differ in how they handle gradients when there are multiple maximum elements: amax distributes the gradient evenly among these equal values, while max propagates the full gradient to each of them.
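The two gradient rules in this note can be sketched in plain Python for a reduction over all elements. This is only an illustration of the rule as stated here, not Paddle's implementation; `tie_gradients` and its `split_evenly` flag are hypothetical names introduced for the sketch. The eight values are those of the tensor x used in the Examples, flattened.

```python
def tie_gradients(values, split_evenly):
    """Gradient of a full-reduction maximum with respect to each input.

    Hypothetical helper, not part of Paddle.
    split_evenly=True follows the amax rule described in the note;
    split_evenly=False follows the max rule described in the note.
    """
    peak = max(values)
    # Count how many elements are tied for the maximum.
    ties = sum(1 for v in values if v == peak)
    share = 1.0 / ties if split_evenly else 1.0
    # Only elements equal to the maximum receive a gradient.
    return [share if v == peak else 0.0 for v in values]


flat = [0.1, 0.9, 0.9, 0.9, 0.9, 0.9, 0.6, 0.7]
amax_like = tie_gradients(flat, split_evenly=True)
max_like = tie_gradients(flat, split_evenly=False)
print(amax_like)
print(max_like)
```

Under the amax rule the gradients of the tied elements sum to 1, so the total gradient flowing back does not grow with the number of ties.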

Parameters

x (Tensor) – The input tensor. Supported data types are float32, float64, int32, and int64. The tensor may have at most 4 dimensions.
axis (int|list|tuple|None, optional) – The axis or axes along which the maximum is computed. If None, the maximum is computed over all elements of x and a Tensor with a single element is returned. Otherwise, each axis must be in the range \([-x.ndim, x.ndim)\). If \(axis[i] < 0\), the axis to reduce is \(x.ndim + axis[i]\). Default is None.
keepdim (bool, optional) – Whether to retain the reduced dimensions in the output Tensor. Unless keepdim is True, the output has one fewer dimension than x for each reduced axis. Default is False.
name (str|None, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
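How axis and keepdim determine the output shape can be sketched in plain Python. The helper `reduced_shape` is hypothetical and only mirrors the rules described above (negative axes wrap around by adding the number of dimensions, reduced axes are dropped or kept with size 1). It assumes that axis=None with keepdim=True keeps every dimension with size 1, as is usual for reductions. It does not call Paddle.

```python
def reduced_shape(shape, axis=None, keepdim=False):
    """Shape of a reduction result under the axis/keepdim rules above.

    Hypothetical helper for illustration; not part of Paddle.
    """
    ndim = len(shape)
    if axis is None:
        # Reduce over every dimension.
        axes = set(range(ndim))
    else:
        if isinstance(axis, int):
            axis = [axis]
        axes = set()
        for a in axis:
            if not -ndim <= a < ndim:
                raise ValueError(f"axis {a} is out of range for ndim {ndim}")
            # A negative axis refers to dimension ndim + axis.
            axes.add(a + ndim if a < 0 else a)
    if keepdim:
        # Reduced dimensions are kept with size 1.
        return [1 if i in axes else d for i, d in enumerate(shape)]
    # Reduced dimensions are removed.
    return [d for i, d in enumerate(shape) if i not in axes]


print(reduced_shape([2, 4]))                        # all elements
print(reduced_shape([2, 4], axis=-1))               # last axis
print(reduced_shape([2, 4], axis=1, keepdim=True))  # keep reduced axis
print(reduced_shape([2, 2, 2], axis=[1, 2]))        # several axes
```

The four calls correspond to the shapes of result1, result3, result4, and result5 in the Examples.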

Returns

Tensor, the maximum of the input tensor along the specified axis. Its data type is the same as that of x.

Examples

>>> import paddle
>>> # x is a Tensor with shape [2, 4] that contains multiple maximum elements
>>> # the axis is an int

>>> x = paddle.to_tensor([[0.1, 0.9, 0.9, 0.9],
...                         [0.9, 0.9, 0.6, 0.7]],
...                         dtype='float64', stop_gradient=False)
>>> # There are 5 maximum elements:
>>> # 1) amax evenly distributes gradient between these equal values,
>>> #    thus the corresponding gradients are 1/5=0.2;
>>> # 2) while max propagates gradient to all of them,
>>> #    thus the corresponding gradients are 1.
>>> result1 = paddle.amax(x)
>>> result1.backward()
>>> result1
Tensor(shape=[], dtype=float64, place=Place(cpu), stop_gradient=False,
0.90000000)
>>> x.grad
Tensor(shape=[2, 4], dtype=float64, place=Place(cpu), stop_gradient=False,
[[0.        , 0.20000000, 0.20000000, 0.20000000],
 [0.20000000, 0.20000000, 0.        , 0.        ]])

>>> x.clear_grad()
>>> result1_max = paddle.max(x)
>>> result1_max.backward()
>>> result1_max
Tensor(shape=[], dtype=float64, place=Place(cpu), stop_gradient=False,
0.90000000)
>>> x.grad
Tensor(shape=[2, 4], dtype=float64, place=Place(cpu), stop_gradient=False,
[[0., 1., 1., 1.],
 [1., 1., 0., 0.]])

>>> x.clear_grad()
>>> result2 = paddle.amax(x, axis=0)
>>> result2.backward()
>>> result2
Tensor(shape=[4], dtype=float64, place=Place(cpu), stop_gradient=False,
[0.90000000, 0.90000000, 0.90000000, 0.90000000])
>>> x.grad
Tensor(shape=[2, 4], dtype=float64, place=Place(cpu), stop_gradient=False,
[[0.        , 0.50000000, 1.        , 1.        ],
 [1.        , 0.50000000, 0.        , 0.        ]])

>>> x.clear_grad()
>>> result3 = paddle.amax(x, axis=-1)
>>> result3.backward()
>>> result3
Tensor(shape=[2], dtype=float64, place=Place(cpu), stop_gradient=False,
[0.90000000, 0.90000000])
>>> x.grad
Tensor(shape=[2, 4], dtype=float64, place=Place(cpu), stop_gradient=False,
[[0.        , 0.33333333, 0.33333333, 0.33333333],
 [0.50000000, 0.50000000, 0.        , 0.        ]])

>>> x.clear_grad()
>>> result4 = paddle.amax(x, axis=1, keepdim=True)
>>> result4.backward()
>>> result4
Tensor(shape=[2, 1], dtype=float64, place=Place(cpu), stop_gradient=False,
[[0.90000000],
 [0.90000000]])
>>> x.grad
Tensor(shape=[2, 4], dtype=float64, place=Place(cpu), stop_gradient=False,
[[0.        , 0.33333333, 0.33333333, 0.33333333],
 [0.50000000, 0.50000000, 0.        , 0.        ]])

>>> # y is a Tensor with shape [2, 2, 2]
>>> # the axis is a list
>>> y = paddle.to_tensor([[[0.1, 0.9], [0.9, 0.9]],
...                         [[0.9, 0.9], [0.6, 0.7]]],
...                         dtype='float64', stop_gradient=False)
>>> result5 = paddle.amax(y, axis=[1, 2])
>>> result5.backward()
>>> result5
Tensor(shape=[2], dtype=float64, place=Place(cpu), stop_gradient=False,
[0.90000000, 0.90000000])
>>> y.grad
Tensor(shape=[2, 2, 2], dtype=float64, place=Place(cpu), stop_gradient=False,
[[[0.        , 0.33333333],
  [0.33333333, 0.33333333]],
 [[0.50000000, 0.50000000],
  [0.        , 0.        ]]])

>>> y.clear_grad()
>>> result6 = paddle.amax(y, axis=[0, 1])
>>> result6.backward()
>>> result6
Tensor(shape=[2], dtype=float64, place=Place(cpu), stop_gradient=False,
[0.90000000, 0.90000000])
>>> y.grad
Tensor(shape=[2, 2, 2], dtype=float64, place=Place(cpu), stop_gradient=False,
[[[0.        , 0.33333333],
  [0.50000000, 0.33333333]],
 [[0.50000000, 0.33333333],
  [0.        , 0.        ]]])