class paddle.fluid.dygraph.nn.NCE(num_total_classes, dim, sample_weight=None, param_attr=None, bias_attr=None, num_neg_samples=None, sampler='uniform', custom_dist=None, seed=0, is_sparse=False, dtype='float32')

This interface constructs a callable object of the NCE class that computes and returns the noise-contrastive estimation (NCE) training loss. By default, negative classes are sampled from a uniform distribution. For usage, refer to the code example. For background, see "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models".
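To make the loss concrete, the following is a minimal NumPy sketch of the NCE objective for a single example, using the common binary-classification formulation (the true class against `k` sampled noise classes). It illustrates the general technique only and is not necessarily the exact formula used inside Paddle's kernel. The function name `nce_loss_sketch` and all of its arguments are hypothetical.

```python
import numpy as np

def nce_loss_sketch(x, weight, bias, true_label, neg_labels, noise_prob):
    """Illustrative NCE loss for one example (not Paddle's implementation).

    x:          (dim,) input vector
    weight:     (num_total_classes, dim) class weights
    bias:       (num_total_classes,) class biases
    true_label: int, index of the true class
    neg_labels: 1-D int array of sampled noise classes
    noise_prob: (num_total_classes,) noise distribution q
    """
    k = len(neg_labels)
    labels = np.concatenate(([true_label], neg_labels))
    # Unnormalized log-scores for the true class and each noise class.
    scores = weight[labels] @ x + bias[labels]
    # Logit of "this class came from the data": s - log(k * q(class)).
    logits = scores - np.log(k * noise_prob[labels])
    # Numerically stable log(sigmoid(z)) and log(1 - sigmoid(z)).
    log_p_data = -np.logaddexp(0.0, -logits)
    log_p_noise = -np.logaddexp(0.0, logits)
    # The true class should look like data; sampled classes like noise.
    return -(log_p_data[0] + log_p_noise[1:].sum())

rng = np.random.default_rng(0)
num_total_classes, dim, num_neg_samples = 20, 8, 5
x = rng.normal(size=dim)
weight = rng.normal(size=(num_total_classes, dim))
bias = np.zeros(num_total_classes)
# Uniform noise distribution, matching the default sampler='uniform'.
noise_prob = np.full(num_total_classes, 1.0 / num_total_classes)
neg_labels = rng.choice(num_total_classes, size=num_neg_samples, p=noise_prob)
loss = nce_loss_sketch(x, weight, bias, 3, neg_labels, noise_prob)
```

Because only `1 + k` classes are scored per example, the cost does not grow with `num_total_classes`, which is the main reason to prefer NCE over a full softmax for large vocabularies.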

  • num_total_classes (int) – Total number of classes in all samples.

  • dim (int) – Dimension of input (possibly embedding dim).

  • param_attr (ParamAttr, optional) – The parameter attribute for the learnable weights (Parameter) of nce. If it is set to None or one attribute of ParamAttr, nce will create a ParamAttr as param_attr. If the Initializer of param_attr is not set, the parameter is initialized with Xavier. Default: None.

  • bias_attr (ParamAttr or bool, optional) – The attribute for the bias of nce. If it is set to False, no bias will be added to the output units. If it is set to None or one attribute of ParamAttr, nce will create a ParamAttr as bias_attr. If the Initializer of bias_attr is not set, the bias is initialized to zero. Default: None.

  • num_neg_samples (int, optional) – The number of negative classes to sample. If it is set to None, 10 is used. Default: None.

  • sampler (str, optional) – The sampler used to draw negative classes. It can be ‘uniform’, ‘log_uniform’ or ‘custom_dist’. Default: ‘uniform’.

  • custom_dist (float[], optional) – A float[] with size=num_total_classes. It is used when sampler is set to ‘custom_dist’. custom_dist[i] is the probability of i-th class to be sampled. Default: None.

  • seed (int, optional) – The seed used in sampler. Default: 0.

  • is_sparse (bool, optional) – The flag indicating whether to use sparse update. If is_sparse is True, the weight@GRAD and bias@GRAD will be changed to SelectedRows. Default: False.

  • dtype (str, optional) – Data type, it can be “float32” or “float64”. Default: “float32”.
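As an illustration of the custom_dist parameter, one way to build the list is to normalize per-class counts into probabilities. This is a sketch only: the `class_counts` values are made-up placeholder data, and any non-negative vector of length num_total_classes that sums to 1 would serve equally well.

```python
import numpy as np

num_total_classes = 20

# Hypothetical per-class occurrence counts (for example, word frequencies).
class_counts = np.arange(1, num_total_classes + 1, dtype='float64')

# Normalize so that custom_dist[i] is the probability of sampling class i.
custom_dist = (class_counts / class_counts.sum()).astype('float32').tolist()

# The list must have exactly one entry per class.
assert len(custom_dist) == num_total_classes
```

The resulting list would then be passed as `custom_dist=custom_dist` together with `sampler='custom_dist'`.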


weight (Parameter): the learnable weights of this layer.

bias (Parameter or None): the learnable bias of this layer.




import numpy as np
import paddle.fluid as fluid

window_size = 5
dict_size = 20
label_word = int(window_size // 2) + 1
inp_word = np.array([[1], [2], [3], [4], [5]]).astype('int64')
nid_freq_arr = np.random.dirichlet(np.ones(20) * 1000).astype('float32')

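with fluid.dygraph.guard():
    # Wrap each word id as a dygraph variable.
    words = []
    for i in range(window_size):
        words.append(fluid.dygraph.to_variable(inp_word[i]))

    emb = fluid.Embedding(
        size=[dict_size, 32],
        is_sparse=False)

    # Embed every context word, skipping the label word.
    embs3 = []
    for i in range(window_size):
        if i == label_word:
            continue

        emb_rlt = emb(words[i])
        embs3.append(emb_rlt)

    embs3 = fluid.layers.concat(input=embs3, axis=1)
    # Sample negative classes from the custom distribution defined above.
    nce = fluid.NCE(
        num_total_classes=dict_size,
        dim=embs3.shape[1],
        sampler='custom_dist',
        custom_dist=nid_freq_arr.tolist())

    wl = fluid.layers.unsqueeze(words[label_word], axes=[0])
    nce_loss3 = nce(embs3, wl)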

forward(input, label, sample_weight=None)

Defines the computation performed at every call. Should be overridden by all subclasses.

  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments