class paddle.nn.MultiMarginLoss(p: int = 1, margin: float = 1.0, weight=None, reduction='mean', name=None)

Creates a criterion that optimizes a multi-class classification hinge loss (a margin-based loss) between input \(input\) and label \(label\).

For the i-th mini-batch sample, the loss in terms of the 1D input \(input_i\) and the scalar label \(label_i\) is:

\[\text{loss}(input_i, label_i) = \frac{\sum_{j} \max(0, \text{margin} - input_i[label_i] + input_i[j])^p}{\text{C}}\]

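where the sum runs over all class indices \(0 \leq j \leq \text{C}-1\) with \(j \neq label_i\), for each sample \(0 \leq i \leq \text{N}-1\). Here \(\text{C}\) is the number of classes and \(\text{N}\) is the batch size.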
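
The per-sample formula above can be spelled out in plain Python as a sanity check. This is a minimal sketch, not Paddle's implementation: the helper name `multi_margin_loss_sample` is hypothetical, and the scores are a plain list rather than a Tensor.

```python
def multi_margin_loss_sample(x, label, p=1, margin=1.0):
    """Unweighted multi-margin loss for one sample (hypothetical helper).

    x is a list of C class scores; label is the index of the true class.
    """
    num_classes = len(x)
    total = 0.0
    for j in range(num_classes):
        if j == label:
            continue  # the sum skips the true class, j == label
        total += max(0.0, margin - x[label] + x[j]) ** p
    return total / num_classes  # divide by C, as in the formula


# Only class 2 violates the margin: max(0, 1 - 1 + 3) = 3, and 3 / 3 = 1.0
print(multi_margin_loss_sample([1.0, -2.0, 3.0], 0))  # 1.0
```

Note that the divisor is \(\text{C}\), not \(\text{C}-1\), even though the true class is excluded from the sum.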

Optionally, you can weight the classes unequally by passing a 1D weight tensor to the constructor.

The loss function for the i-th sample then becomes:

\[\text{loss}(input_i, label_i) = \frac{\sum_{j} \max(0, weight[label_i] * (\text{margin} - input_i[label_i] + input_i[j]))^p}{\text{C}}\]
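
The weighted formula can be sketched the same way. The code below follows the formula exactly as written above (the weight of the true class scales the margin term inside the max); it is an illustration under that reading, not Paddle's source, and `weighted_multi_margin_loss_sample` is a hypothetical name.

```python
def weighted_multi_margin_loss_sample(x, label, weight, p=1, margin=1.0):
    """Weighted multi-margin loss for one sample (hypothetical helper).

    x: list of C class scores; label: true class index;
    weight: list of C per-class weights.
    """
    num_classes = len(x)
    total = 0.0
    for j in range(num_classes):
        if j == label:
            continue  # the sum skips the true class
        # Scale the margin term by the weight of the true class, then clamp at 0.
        hinge = max(0.0, weight[label] * (margin - x[label] + x[j]))
        total += hinge ** p
    return total / num_classes


scores = [1.0, -2.0, 3.0]
# Class 0 has weight 2: max(0, 2 * (1 - 1 + 3)) = 6, and 6 / 3 = 2.0
print(weighted_multi_margin_loss_sample(scores, 0, [2.0, 1.0, 1.0]))  # 2.0
```

With all weights equal to 1 this reduces to the unweighted loss.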

Parameters:

  • p (int, optional) – The exponent \(p\) applied to each hinge term (the norm degree). Default: \(1\).

  • margin (float, optional) – The margin \(\text{margin}\) used in the hinge term. Default: \(1.0\).

  • weight (Tensor, optional) – A manual rescaling weight given to each class. If given, it must be a Tensor of shape (C,) with data type float32 or float64. Default: None.

  • reduction (str, optional) – Specifies how the per-sample losses are reduced over the batch. The candidates are 'none' | 'mean' | 'sum'. If reduction is 'none', the unreduced loss is returned; if reduction is 'mean', the mean of the losses is returned; if reduction is 'sum', the sum of the losses is returned. Default: 'mean'.

  • name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name.

Call parameters:

input (Tensor): Input tensor. The data type is float32 or float64.

label (Tensor): Label tensor with values satisfying 0 <= label < input.shape[1]. The data type is int32 or int64.


Shape:

input: 2-D Tensor of shape [N, C], where N is the batch size and C is the number of classes.

label: 1-D Tensor of shape [N,].

output: a scalar when reduction is 'mean' or 'sum'. If reduction is 'none', the output has the same shape as label, [N,].
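
The effect of reduction on the output shape can be illustrated with a small plain-Python sketch. The helper `reduce_losses` is hypothetical and works on a list of per-sample losses rather than a Tensor. The values 1, 2 and 1/3 are what the unweighted formula gives, with the default p and margin, for the three rows of the example input on this page.

```python
def reduce_losses(per_sample, reduction="mean"):
    """Reduce a list of N per-sample losses (hypothetical helper)."""
    if reduction == "none":
        return list(per_sample)  # one loss per sample, shape [N]
    if reduction == "sum":
        return sum(per_sample)  # a single scalar
    if reduction == "mean":
        return sum(per_sample) / len(per_sample)  # a single scalar
    raise ValueError(f"unsupported reduction: {reduction!r}")


per_sample = [1.0, 2.0, 1.0 / 3.0]
print(reduce_losses(per_sample, "none"))  # a list of 3 values
print(reduce_losses(per_sample, "sum"))   # about 3.3333
print(reduce_losses(per_sample, "mean"))  # about 1.1111
```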


Returns:

A callable object of MultiMarginLoss.


Examples:

>>> import paddle
>>> import paddle.nn as nn

>>> input = paddle.to_tensor([[1, -2, 3], [0, -1, 2], [1, 0, 1]], dtype=paddle.float32)
>>> label = paddle.to_tensor([0, 1, 2], dtype=paddle.int32)

>>> multi_margin_loss = nn.MultiMarginLoss(reduction='mean')
>>> loss = multi_margin_loss(input, label)
>>> print(loss)
Tensor(shape=[], dtype=float32, place=Place(cpu), stop_gradient=True,

forward(input, label)


Defines the computation performed at every call. Should be overridden by all subclasses.

  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments