sampled_softmax_with_cross_entropy(logits, label, num_samples, num_true=1, remove_accidental_hits=True, use_customized_samples=False, customized_samples=None, customized_probabilities=None, seed=0)
Sampled Softmax With Cross Entropy Operator.
Cross-entropy loss with sampled softmax is widely used as the output layer when the number of output classes is large. This operator draws a set of sampled classes for every example, computes softmax-normalized values over each row of the sampled tensor, and then computes the cross-entropy loss.
Because this operator performs a softmax on logits internally, it expects unscaled logits. Do not pass it the output of a softmax operator, since that would produce incorrect results.
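To illustrate why already-normalized inputs give wrong results, here is a minimal NumPy sketch (not Paddle code; the helper names `softmax` and `cross_entropy_from_logits` are hypothetical). It compares the loss computed from raw logits with the loss obtained when softmax output is mistakenly fed in as if it were logits.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_from_logits(logits, label):
    # Applies softmax internally, as the operator described here does.
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(label)), label])

logits = np.array([[4.0, 1.0, -2.0]])  # unscaled scores for 3 classes
label = np.array([0])                  # index of the true class

loss_correct = cross_entropy_from_logits(logits, label)
# Mistake: passing probabilities, so softmax is applied twice.
loss_double = cross_entropy_from_logits(softmax(logits), label)

print(loss_correct, loss_double)
```

The second softmax flattens the distribution, so the doubly normalized loss is larger than the loss from the raw logits even though the prediction is confident and correct.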
For an example with T true labels (T >= 1), each true label is assumed to have probability 1/T. For each example, S samples are drawn from a log-uniform distribution. The true labels are concatenated with these samples to form T + S samples per example. So, if the shape of logits is [N x K], the shape of samples is [N x (T+S)]. For each sampled label, a probability is calculated, which corresponds to Q(y|x) in [Jean et al., 2014](http://arxiv.org/abs/1412.2007).
Logits are gathered according to the sampled labels. If remove_accidental_hits is True and a sample[i, j] accidentally hits a true label, 1e20 is subtracted from the corresponding sampled_logits[i, j] so that its softmax result is close to zero. Then log Q(y|x) is subtracted from the sampled logits, and the resulting logits and the re-indexed labels are used to compute a softmax with cross entropy.
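The sampling step can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the operator's kernel: it assumes the common log-uniform form P(k) = (log(k + 2) - log(k + 1)) / log(K + 1), draws with replacement, and uses the raw class probability as Q(y|x). The names `log_uniform_prob` and `build_samples` are hypothetical.

```python
import numpy as np

def log_uniform_prob(class_ids, num_classes):
    # Assumed log-uniform (Zipfian) probability of drawing each class id.
    ids = np.asarray(class_ids, dtype=np.float64)
    return (np.log(ids + 2.0) - np.log(ids + 1.0)) / np.log(num_classes + 1.0)

def build_samples(label, num_samples, num_classes, seed=0):
    # label: int array of shape [N, T]; returns samples and Q, both [N, T + S].
    rng = np.random.default_rng(seed)
    n = label.shape[0]
    all_probs = log_uniform_prob(np.arange(num_classes), num_classes)
    all_probs = all_probs / all_probs.sum()  # guard against rounding error
    # Draw S class ids per example (with replacement, a simplification).
    negatives = rng.choice(num_classes, size=(n, num_samples), p=all_probs)
    # True labels first, then the sampled classes.
    samples = np.concatenate([label, negatives], axis=1)
    return samples, log_uniform_prob(samples, num_classes)

label = np.array([[3], [7]])  # N = 2 examples, T = 1 true label each
samples, q = build_samples(label, num_samples=5, num_classes=20)
print(samples.shape, q.shape)
```

Low class ids are drawn far more often than high ones under this distribution, which suits vocabularies sorted by decreasing frequency.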
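The steps above can be sketched in NumPy as shown here. This is a simplified illustration, not Paddle's implementation: the function name `sampled_softmax_ce` is hypothetical, and it takes precomputed `samples` and `q` arrays of shape [N, T + S] with the true labels in the first T columns.

```python
import numpy as np

def sampled_softmax_ce(logits, samples, q, num_true=1, remove_accidental_hits=True):
    # Gather the logits of the sampled classes: [N, K] -> [N, T + S].
    rows = np.arange(logits.shape[0])[:, None]
    sampled_logits = logits[rows, samples].astype(np.float64)

    if remove_accidental_hits:
        # Mark sampled (non-true) columns whose class equals a true label.
        true_ids = samples[:, :num_true]
        hits = (samples[:, :, None] == true_ids[:, None, :]).any(axis=2)
        hits[:, :num_true] = False  # never mask the true columns themselves
        sampled_logits[hits] -= 1e20

    # Correct for the sampling distribution by subtracting log Q(y|x).
    sampled_logits -= np.log(q)

    # Log-softmax over the T + S sampled columns.
    shifted = sampled_logits - sampled_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Re-indexed true labels are columns 0..T-1, each with weight 1/T.
    return -log_probs[:, :num_true].mean(axis=1, keepdims=True)

logits = np.zeros((2, 5))
samples = np.array([[0, 1, 2], [3, 3, 4]])  # second row has an accidental hit
q = np.full((2, 3), 0.2)
loss = sampled_softmax_ce(logits, samples, q)
print(loss)
```

In the second row, the sampled class 3 collides with the true label, so it is masked out and the softmax is effectively taken over two candidates instead of three.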
logits (Variable) – The unscaled log probabilities, a 2-D tensor with shape [N x K]. N is the batch size, and K is the number of classes.
label (Variable) – The ground truth which is a 2-D tensor. Label is a Tensor<int64> with shape [N x T], where T is the number of true labels per example.
num_samples (int) – The number of samples to draw for each example. num_samples should be less than the number of classes.
num_true (int) – The number of target classes per training example.
remove_accidental_hits (bool) – A flag indicating whether to remove accidental hits when sampling. If True and a sample[i, j] accidentally hits a true label, 1e20 is subtracted from the corresponding sampled_logits[i, j] so that its softmax result is close to zero. Default is True.
use_customized_samples (bool) – Whether to use customized samples and probabilities to sample logits. Default is False.
customized_samples (Variable) – User defined samples, which is a 2-D tensor with shape [N, T + S]. S is the num_samples, and T is the number of true labels per example.
customized_probabilities (Variable) – User defined probabilities of samples, a 2-D tensor with the same shape as customized_samples.
seed (int) – The random seed used to generate random numbers during sampling. Default is 0.
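As a rough illustration of the array layout expected when use_customized_samples is True, the NumPy sketch below builds a [N, T + S] sample matrix and a matching probability matrix from a uniform sampler. The helper `uniform_customized_inputs` is hypothetical, and the choice of the per-class sampling probability 1/K as the probability value is an assumption based on the Q(y|x) description above.

```python
import numpy as np

def uniform_customized_inputs(label, num_samples, num_classes, seed=0):
    # label: int array of shape [N, T].
    rng = np.random.default_rng(seed)
    n = label.shape[0]
    # S uniformly drawn class ids per example.
    negatives = rng.integers(0, num_classes, size=(n, num_samples), dtype=np.int64)
    # True labels occupy the first T columns, sampled classes the remaining S.
    samples = np.concatenate([label.astype(np.int64), negatives], axis=1)
    # Under uniform sampling every class has probability 1 / K (assumption).
    probabilities = np.full(samples.shape, 1.0 / num_classes, dtype=np.float32)
    return samples, probabilities

label = np.array([[2], [9]])  # N = 2, T = 1
custom_samples, custom_probs = uniform_customized_inputs(label, num_samples=4, num_classes=10)
print(custom_samples.shape, custom_probs.shape)
```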
- Returns the cross-entropy loss, a 2-D tensor with shape [N x 1].
- Return type
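import paddle.fluid as fluid

# Input features; the feature size 256 is an illustrative assumption.
input = fluid.layers.data(name='data', shape=[256], dtype='float32')
# One true class per example (num_true=1), so the label shape is [1].
label = fluid.layers.data(name='label', shape=[1], dtype='int64')
# Project the input to unscaled logits over 100 classes.
fc = fluid.layers.fc(input=input, size=100)
# Draw 25 sampled classes per example and compute the loss, shape [N x 1].
out = fluid.layers.sampled_softmax_with_cross_entropy(
    logits=fc, label=label, num_samples=25)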