paddle.static.nn.nce(input, label, num_total_classes, sample_weight=None, param_attr=None, bias_attr=None, num_neg_samples=None, name=None, sampler='uniform', custom_dist=None, seed=0, is_sparse=False)

Static Graph

Compute and return the noise-contrastive estimation (NCE) training loss. See "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models" for background. By default, this operator samples negative classes from a uniform distribution.
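
To make the quantity being computed more concrete, here is a minimal sketch of the textbook NCE loss for a single example, written in plain Python. It is an illustration under stated assumptions, not necessarily the exact computation performed by the operator: the helper name `nce_loss`, its arguments, and the use of `exp(score)` as the unnormalized model probability are all hypothetical choices made for this example.

```python
import math

def nce_loss(true_score, noise_scores, q_true, q_noise):
    """Textbook NCE loss for one example (hypothetical helper, not Paddle's API).

    true_score:   unnormalized model score (logit) of the true class.
    noise_scores: logits of the k sampled negative classes.
    q_true:       noise-distribution probability of the true class.
    q_noise:      noise-distribution probabilities of the sampled classes.
    """
    k = len(noise_scores)
    # Term for the true class: probability that it came from the data
    # rather than from the noise distribution.
    u = math.exp(true_score)
    loss = -math.log(u / (u + k * q_true))
    # Terms for the negatives: probability that each sampled class
    # came from the noise distribution.
    for score, q in zip(noise_scores, q_noise):
        u = math.exp(score)
        loss += -math.log(k * q / (u + k * q))
    return loss

# Uniform noise over 4 classes with 2 negative samples.
loss = nce_loss(true_score=2.0, noise_scores=[-1.0, 0.5],
                q_true=0.25, q_noise=[0.25, 0.25])
```

In this sketch the loss shrinks as the true class's score rises and as the sampled negatives' scores fall, which is the behavior the training objective is meant to encourage.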

Parameters

  • input (Tensor) – Input tensor. A 2-D tensor with shape [batch_size, dim] and data type float32 or float64.

  • label (Tensor) – Input label. A 2-D tensor with shape [batch_size, num_true_class] and data type int64.

  • num_total_classes (int) – Total number of classes in all samples.

  • sample_weight (Tensor|None) – A Tensor of shape [batch_size, 1] storing a weight for each sample. The default weight for each sample is 1.0.

  • param_attr (ParamAttr|None) – The attribute of the weight parameter. Default: None, which means the default weight parameter property is used. See ParamAttr for usage details.

  • bias_attr (ParamAttr|None) – The attribute of the bias parameter. Default: None, which means the default bias parameter property is used. See ParamAttr for usage details.

  • num_neg_samples (int|None) – The number of negative classes to sample. Default: None, in which case 10 is used.

  • name (str|None) – For details, see Name. Usually there is no need to set name; it is None by default.

  • sampler (str, optional) – The sampler used to draw negative classes. It can be 'uniform', 'log_uniform' or 'custom_dist'. Default: 'uniform'.

  • custom_dist (numpy.ndarray|None) – A numpy ndarray with size=num_total_classes. It is used only when sampler is set to 'custom_dist'. custom_dist[i] is the probability that the i-th class is sampled. Default: None.

  • seed (int, optional) – The seed used by the sampler. Default: 0, which means no random seed is set.

  • is_sparse (bool, optional) – Whether to use sparse updates. If True, weight@GRAD and bias@GRAD are changed to SelectedRows. Default: False.
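
The three sampler options above can be pictured as three different probability tables over class ids. The following plain-Python sketch builds such a table and draws negatives from it. It is illustrative only: `noise_distribution` and `sample_negatives` are hypothetical helpers, the log-uniform formula is the commonly used Zipfian form and is an assumption rather than a statement of the operator's exact formula, and the sketch always seeds its generator, unlike the operator's seed=0.

```python
import math
import random

def noise_distribution(sampler, num_total_classes, custom_dist=None):
    """Return a list p where p[i] is the probability of sampling class i."""
    n = num_total_classes
    if sampler == "uniform":
        return [1.0 / n] * n
    if sampler == "log_uniform":
        # Assumed Zipfian form: low class ids are sampled more often.
        return [math.log((i + 2) / (i + 1)) / math.log(n + 1) for i in range(n)]
    if sampler == "custom_dist":
        if custom_dist is None or len(custom_dist) != n:
            raise ValueError("custom_dist must have num_total_classes entries")
        return list(custom_dist)
    raise ValueError("unknown sampler: " + sampler)

def sample_negatives(probs, num_neg_samples, seed):
    """Draw class ids (with replacement) according to probs."""
    rng = random.Random(seed)  # this sketch always uses a fixed seed
    return rng.choices(range(len(probs)), weights=probs, k=num_neg_samples)

# Same custom distribution as in the example script: class 1 is the most likely.
probs = noise_distribution("custom_dist", 5, [0.05, 0.5, 0.1, 0.3, 0.05])
negatives = sample_negatives(probs, num_neg_samples=3, seed=1)
```

With 'log_uniform', class ids are assumed to be ordered by decreasing frequency, so this choice mainly makes sense when the vocabulary is sorted that way.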


Returns

The output NCE loss.

Return type

Tensor

Examples

import paddle
import numpy as np

# paddle.static APIs require static graph mode.
paddle.enable_static()

window_size = 5
words = []
for i in range(window_size):
    # One int64 input placeholder per word in the window.
    words.append(paddle.static.data(
        name='word_{0}'.format(i), shape=[-1, 1], dtype='int64'))

dict_size = 10000
label_word = int(window_size / 2) + 1

embs = []
for i in range(window_size):
    if i == label_word:
        continue  # the label word is the prediction target, not an input

    emb = paddle.static.nn.embedding(input=words[i], size=[dict_size, 32],
                                     param_attr='embed', is_sparse=True)
    embs.append(emb)

embs = paddle.concat(x=embs, axis=1)
loss = paddle.static.nn.nce(input=embs, label=words[label_word],
                            num_total_classes=dict_size, param_attr='nce.w_0',
                            bias_attr='nce.b_0')

# Or use a custom distribution.
dist = np.array([0.05, 0.5, 0.1, 0.3, 0.05])
loss = paddle.static.nn.nce(input=embs, label=words[label_word],
                            num_total_classes=5, param_attr='nce.w_1',
                            bias_attr='nce.b_1',
                            num_neg_samples=3,
                            sampler='custom_dist',
                            custom_dist=dist)