softmax_with_cross_entropy
- paddle.nn.functional.softmax_with_cross_entropy(logits, label, soft_label=False, ignore_index=-100, numeric_stable_mode=True, return_softmax=False, axis=-1)
- 
         This operator implements the cross entropy loss function with softmax. It combines the softmax operation and the cross entropy loss calculation to provide a more numerically stable gradient.
         Because this operator performs a softmax on logits internally, it expects unscaled logits. It should not be used with the output of a softmax operator, since that would produce incorrect results.
         When the attribute soft_label is set to False, this operator expects mutually exclusive hard labels: each sample in a batch belongs to exactly one class with a probability of 1.0, so each sample has a single label.
         The equation is as follows:
         - Hard label (one-hot label, so every sample has exactly one class)
 \[loss_j = -\text{logits}_{label_j} + \log\left(\sum_{i=0}^{K}\exp(\text{logits}_i)\right), \quad j = 1, ..., K\]
         - Soft label (each sample can have a distribution over all classes)
 \[loss_j = -\sum_{i=0}^{K}\text{label}_i\left(\text{logits}_i - \log\left(\sum_{k=0}^{K}\exp(\text{logits}_k)\right)\right), \quad j = 1, ..., K\]
         - If numeric_stable_mode is True, softmax is calculated first by:
 \[\begin{split}max_j &= \max_{i=0}^{K}{\text{logits}_i} \\ log\_max\_sum_j &= \log\sum_{i=0}^{K}\exp(\text{logits}_i - max_j) \\ softmax_j &= \exp(\text{logits}_j - max_j - {log\_max\_sum}_j)\end{split}\]
         and then the cross entropy loss is calculated from softmax and label.
- Parameters
- 
- logits (Tensor) – A multi-dimensional Tensor with data type float32 or float64. The input tensor of unscaled log probabilities.
- label (Tensor) – The ground truth Tensor, with the same data type as logits. If soft_label is set to True, label is a Tensor with the same shape as logits. If soft_label is set to False, label is a Tensor with the same shape as logits except that the size of dimension axis is 1.
- soft_label (bool, optional) – A flag indicating whether to interpret the given labels as soft labels. Default: False.
- ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. Only valid if soft_label is set to False. Default: kIgnoreIndex (-100).
- numeric_stable_mode (bool, optional) – A flag indicating whether to use a more numerically stable algorithm. Only valid when soft_label is False and GPU is used. When soft_label is True or CPU is used, the algorithm is always numerically stable. Note that the stable algorithm may be slower. Default: True.
- return_softmax (bool, optional) – A flag indicating whether to return the softmax along with the cross entropy loss. Default: False.
- axis (int, optional) – The index of the dimension along which to perform the softmax calculation. It should be in the range \([-1, rank - 1]\), where \(rank\) is the rank of the input logits. Default: -1.
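The hard-label, soft-label, and numerically stable formulas above can be sketched for a single sample in plain Python. This is only an illustrative reference under stated assumptions, not Paddle's implementation: the helper names (log_sum_exp, stable_softmax, hard_label_loss, soft_label_loss) are hypothetical, and the input is assumed to be a one-dimensional list of logits.

```python
import math

def log_sum_exp(logits):
    """Stable log(sum(exp(x))): subtract the maximum before exponentiating."""
    m = max(logits)
    return m + math.log(sum(math.exp(x - m) for x in logits))

def stable_softmax(logits):
    """softmax_j = exp(logits_j - max - log_max_sum), as in the stable formula."""
    lse = log_sum_exp(logits)
    return [math.exp(x - lse) for x in logits]

def hard_label_loss(logits, label):
    """loss = -logits[label] + log(sum(exp(logits)))."""
    return -logits[label] + log_sum_exp(logits)

def soft_label_loss(logits, label_probs):
    """loss = -sum_i label_i * (logits_i - log(sum(exp(logits))))."""
    lse = log_sum_exp(logits)
    return -sum(p * (x - lse) for p, x in zip(label_probs, logits))

logits = [0.4, 0.6, 0.9]
hard = hard_label_loss(logits, 1)                # about 1.1533
soft = soft_label_loss(logits, [0.0, 1.0, 0.0])  # one-hot soft label gives the same value

# Large logits: a direct math.exp(1000.0) would overflow, but the
# max-subtracted form stays finite.
large = hard_label_loss([1000.0, 1001.0, 1002.0], 2)

print(hard, soft, large)
```

With a one-hot distribution the soft-label formula reduces to the hard-label one, which is why the two results agree here.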
 
- Returns
- 
           
           Returns the cross entropy loss if return_softmax is False; otherwise returns the tuple (loss, softmax). softmax has the same shape as the input logits, and the cross entropy loss has the same shape as logits except that the size of dimension axis is 1.
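To make the returned shapes concrete, here is a minimal pure-Python sketch for 2-D logits of shape [N, K] with hard labels of shape [N, 1] and axis=-1. The function name sketch_softmax_with_cross_entropy is hypothetical and is not part of Paddle. The handling of ignore_index is an assumption: the parameter description says only that ignored targets do not contribute to the gradient, and this sketch additionally emits a loss of 0.0 for them.

```python
import math

def sketch_softmax_with_cross_entropy(logits, label, ignore_index=-100,
                                      return_softmax=False):
    """Hypothetical stand-in: logits is [N, K], label is [N, 1], axis is -1."""
    losses, softmaxes = [], []
    for row, (target,) in zip(logits, label):
        m = max(row)
        lse = m + math.log(sum(math.exp(x - m) for x in row))
        softmaxes.append([math.exp(x - lse) for x in row])
        # ASSUMPTION: a sample whose label equals ignore_index gets loss 0.0.
        losses.append([0.0 if target == ignore_index else lse - row[target]])
    return (losses, softmaxes) if return_softmax else losses

logits = [[0.4, 0.6, 0.9], [2.0, 1.0, 0.1]]
label = [[1], [-100]]  # the second sample is ignored

loss, softmax = sketch_softmax_with_cross_entropy(logits, label, return_softmax=True)
loss_only = sketch_softmax_with_cross_entropy(logits, label)

print(len(loss), len(loss[0]))        # loss shape: N x 1
print(len(softmax), len(softmax[0]))  # softmax shape: N x K
```

The loss keeps the rank of logits but has size 1 along the softmax dimension, while softmax matches logits exactly.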
 
- Return type
- 
           Tensor or tuple of two Tensors
 Examples
    import paddle

    # Unscaled logits for three classes and a single hard label (class index 1)
    logits = paddle.to_tensor([0.4, 0.6, 0.9], dtype="float32")
    label = paddle.to_tensor([1], dtype="int64")

    out = paddle.nn.functional.softmax_with_cross_entropy(logits=logits, label=label)
    print(out)
    # Tensor(shape=[1], dtype=float32, place=Place(gpu:0), stop_gradient=True,
    #        [1.15328646])
