paddle.nn.functional.multi_margin_loss(input, label, p: int = 1, margin: float = 1.0, weight=None, reduction='mean', name=None) [source]

Measures a multi-class classification hinge loss between input \(input\) and label \(label\):

For the i-th mini-batch sample, the loss in terms of the 1D input \(input_i\) and the scalar label \(label_i\) is:

\[\text{loss}(input_i, label_i) = \frac{\sum_{j} \max(0, \text{margin} - input_i[label_i] + input_i[j])^p}{\text{C}}\]

where \(0 \leq j \leq \text{C}-1\), \(0 \leq i \leq \text{N}-1\) and \(j \neq label_i\).
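The formula can be checked with a plain-NumPy sketch of the per-sample (reduction='none') loss; this is an illustration of the definition above, not the Paddle implementation:

```python
import numpy as np

def multi_margin_loss_ref(input, label, p=1, margin=1.0):
    # Reference computation of the per-sample loss above
    # (unweighted, reduction='none'). A NumPy sketch for
    # illustration only.
    N, C = input.shape
    losses = np.zeros(N)
    for i in range(N):
        correct = input[i, label[i]]
        for j in range(C):
            if j == label[i]:
                continue  # the sum skips j == label_i
            losses[i] += max(0.0, margin - correct + input[i, j]) ** p
        losses[i] /= C  # divide by the number of classes C
    return losses

x = np.array([[1., 5., 3.], [0., 3., 2.], [1., 4., 1.]])
y = np.array([1, 2, 1])
print(multi_margin_loss_ref(x, y))  # [0.         0.66666667 0.        ]
```

The result matches the doctest example at the end of this page.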

Optionally, you can give non-equal weighting to the classes by passing a 1D weight tensor via the weight argument.

The loss function for the i-th sample then becomes:

\[\text{loss}(input_i, label_i) = \frac{\sum_{j} \max(0, weight[label_i] * (\text{margin} - input_i[label_i] + input_i[j]))^p}{\text{C}}\]
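The weighted variant can be sketched in NumPy the same way; the weight vector below is purely illustrative:

```python
import numpy as np

def weighted_multi_margin_loss_ref(input, label, weight, p=1, margin=1.0):
    # NumPy sketch of the weighted per-sample loss above
    # (reduction='none'); weight[label_i] scales the margin term
    # inside max(0, .), as in the formula.
    N, C = input.shape
    losses = np.zeros(N)
    for i in range(N):
        w = weight[label[i]]
        correct = input[i, label[i]]
        for j in range(C):
            if j == label[i]:
                continue
            losses[i] += max(0.0, w * (margin - correct + input[i, j])) ** p
        losses[i] /= C
    return losses

x = np.array([[1., 5., 3.], [0., 3., 2.], [1., 4., 1.]])
y = np.array([1, 2, 1])
w = np.array([1.0, 2.0, 0.5])  # illustrative class weights, not from the docs
print(weighted_multi_margin_loss_ref(x, y, w))  # [0.         0.33333333 0.        ]
```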
Parameters

  • input (Tensor) – Input tensor, the data type is float32 or float64. Shape is (N, C), where C is the number of classes.

  • label (Tensor) – Label tensor, the data type is int32 or int64. The shape of label is (N,).

  • p (int, optional) – The exponent applied to each margin term. Default: \(1\).

  • margin (float, optional) – The margin value in the formula above. Default: \(1.0\).

  • weight (Tensor, optional) – A manual rescaling weight given to each class. If given, it has to be a Tensor of shape (C,) with data type float32 or float64. Default: None.

  • reduction (str, optional) – Indicates how to reduce the loss over the batch. The candidates are 'none' | 'mean' | 'sum'. If reduction is 'none', the unreduced per-sample loss is returned; if reduction is 'mean', the mean of the loss is returned; if reduction is 'sum', the summed loss is returned. Default: 'mean'.

  • name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.
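The three reduction modes aggregate the per-sample losses in the obvious way; a minimal NumPy sketch, using the per-sample values from the doctest example below:

```python
import numpy as np

# Per-sample losses, i.e. what reduction='none' would return.
per_sample = np.array([0.0, 2.0 / 3.0, 0.0])

mean_loss = per_sample.mean()  # reduction='mean': average over the batch
sum_loss = per_sample.sum()    # reduction='sum': total over the batch
print(mean_loss, sum_loss)
```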


Returns

Tensor. The tensor variable storing the multi_margin_loss of input and label.

Examples

>>> import paddle
>>> import paddle.nn.functional as F

>>> input = paddle.to_tensor([[1, 5, 3], [0, 3, 2], [1, 4, 1]], dtype=paddle.float32)
>>> label = paddle.to_tensor([1, 2, 1], dtype=paddle.int32)
>>> loss = F.multi_margin_loss(input, label, margin=1.0, reduction='none')
>>> print(loss)
Tensor(shape=[3], dtype=float32, place=Place(cpu), stop_gradient=True,
       [0.        , 0.66666663, 0.        ])