crf_decoding

paddle.fluid.layers.crf_decoding(input, param_attr, label=None, length=None)

The crf_decoding operator reads the emission feature weights and the transition feature weights learned by the linear_chain_crf operator and performs decoding. It implements the Viterbi algorithm, a dynamic programming algorithm for finding the most likely sequence of hidden states, called the Viterbi path, that results in a sequence of observed tags.
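The dynamic programming idea behind Viterbi decoding can be illustrated with a minimal pure-Python sketch. It is not the operator's implementation: the function name viterbi_decode, the separate start and end score vectors, and the convention that transition[i][j] scores a move from tag i to tag j are all assumptions made for illustration.

```python
def viterbi_decode(emission, transition, start=None, end=None):
    """Return the highest-scoring tag sequence for a single sequence.

    emission:   list of T rows, each holding D emission scores
    transition: D x D nested list; transition[i][j] scores tag i -> tag j
    start, end: optional length-D boundary scores (assumed layout)
    """
    if not emission:
        return []
    num_tags = len(emission[0])
    start = start or [0.0] * num_tags
    end = end or [0.0] * num_tags

    # score[j]: best score of any partial path that ends in tag j
    score = [start[j] + emission[0][j] for j in range(num_tags)]
    backpointers = []

    for t in range(1, len(emission)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # best previous tag for reaching tag j at step t
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transition[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transition[best_prev][j]
                             + emission[t][j])
        score = new_score
        backpointers.append(pointers)

    # choose the best final tag, then follow the backpointers to the start
    last = max(range(num_tags), key=lambda j: score[j] + end[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Transitions that reward staying in the same tag override the emissions
emission = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
sticky = [[2.0, -2.0], [-2.0, 2.0]]
print(viterbi_decode(emission, sticky))  # [0, 0, 0]
```

With an all-zero transition matrix the same emissions decode to [0, 1, 0], since each step then simply picks its highest emission score.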

The output of this operator changes according to whether Input(Label) is given:

  1. Input(Label) is given: This happens in training, where this operator works together with the chunk_eval operator. When Input(Label) is given, the crf_decoding operator returns a tensor with the same shape as Input(Label) whose values are either 0, indicating that a tag is incorrectly predicted, or 1, indicating that a tag is correctly predicted. This output is the input to the chunk_eval operator.

  2. Input(Label) is not given: This is the standard decoding process.

In the standard decoding case, the crf_decoding operator returns a tensor with shape [N x 1] when the inputs are LoDTensors, or [B x S] when the inputs are common (padded) tensors. Its values range from 0 to the maximum tag number - 1, and each element is the index of a predicted tag.
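The difference between the two output modes can be sketched in a few lines of plain Python. The helper name correctness_mask is hypothetical and the tag values are made up; the sketch only mirrors the behaviour described above, where a decoded path is compared element by element with the ground truth.

```python
def correctness_mask(decoded_path, label):
    """Return 1 where the decoded tag equals the ground-truth tag, else 0."""
    if len(decoded_path) != len(label):
        raise ValueError("decoded path and label must have the same length")
    return [int(pred == gold) for pred, gold in zip(decoded_path, label)]


decoded = [0, 2, 1, 1]   # tag indices, as returned when no label is given
gold = [0, 1, 1, 3]      # ground-truth tag indices
print(correctness_mask(decoded, gold))  # [1, 0, 1, 0]
```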

Parameters
  • input (Variable) – (Tensor/LoDTensor). For a LoDTensor input, the shape is [N x D], where N is the total sequence length of the mini-batch and D is the total tag number. For a tensor input, the shape is [B x S x D], where B is the batch size and S is the sequence length of each sample after padding. This input is the unscaled emission weight matrix of the linear_chain_crf operator. The data type is float32 or float64.

  • param_attr (ParamAttr|None) – Specifies the weight parameter attribute. Default: None, which means the default weight parameter property is used. See ParamAttr for usage details.

  • label (Variable, optional) – (Tensor/LoDTensor). The ground truth with shape [N x 1] (for LoDTensor) or [B x S] (for Tensor). This input is optional; see the operator description above for how it changes the output. The data type is int64.

  • length (Variable, optional) – (Tensor). The actual length of each sample before padding, with shape [B x 1]. When this input is given, Input(Emission), Input(Label) and Output(ViterbiPath) are treated as common tensors with padding. The data type is int64.
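The relationship between the two input layouts can be sketched as follows: a flat [N x D] batch plus per-sequence lengths is rearranged into a padded [B x S x D] batch with a [B x 1] length tensor. This is an illustration in plain Python lists, not part of the Paddle API; the helper name pad_sequences and the zero padding row are assumptions.

```python
def pad_sequences(flat_rows, seq_lengths, pad_row):
    """Split N flat rows into B sequences and pad each one to length S.

    flat_rows:   N rows of D scores (the LoDTensor-style layout)
    seq_lengths: B integers that sum to N
    pad_row:     D filler values used for padded positions (assumed zeros)
    Returns (padded, length) with shapes [B x S x D] and [B x 1].
    """
    if sum(seq_lengths) != len(flat_rows):
        raise ValueError("sequence lengths must sum to the number of rows")
    max_len = max(seq_lengths) if seq_lengths else 0
    padded, offset = [], 0
    for n in seq_lengths:
        seq = [list(row) for row in flat_rows[offset:offset + n]]
        seq += [list(pad_row) for _ in range(max_len - n)]
        padded.append(seq)
        offset += n
    length = [[n] for n in seq_lengths]
    return padded, length


flat = [[1, 2], [3, 4], [5, 6]]          # N = 3, D = 2
padded, length = pad_sequences(flat, [1, 2], pad_row=[0, 0])
print(padded)   # [[[1, 2], [0, 0]], [[3, 4], [5, 6]]]
print(length)   # [[1], [2]]
```

The length values are what allow padded positions to be told apart from real time steps, which is why the padded layout needs the extra input.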

Returns

(Tensor/LoDTensor). The decoding results. What is returned depends on whether Input(Label) (the ground truth) is given; see the operator description above for details. The data type is int64.

Return type

Variable

Examples

import paddle.fluid as fluid

# LoDTensor-based example
num_labels = 10
feature = fluid.data(name='word_emb', shape=[-1, 784], dtype='float32', lod_level=1)
label = fluid.data(name='label', shape=[-1, 1], dtype='int64', lod_level=1)
emission = fluid.layers.fc(input=feature, size=num_labels)

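# linear_chain_crf learns the transition weights stored in the parameter "crfw"
crf_cost = fluid.layers.linear_chain_crf(input=emission, label=label,
                                         param_attr=fluid.ParamAttr(name="crfw"))
# crf_decoding reuses the same parameter by referring to the same name
crf_decode = fluid.layers.crf_decoding(input=emission,
                                       param_attr=fluid.ParamAttr(name="crfw"))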

# Common tensor example
num_labels, max_len = 10, 20
feature = fluid.data(name='word_emb_pad', shape=[-1, max_len, 784], dtype='float32')
label = fluid.data(name='label_pad', shape=[-1, max_len, 1], dtype='int64')
length = fluid.data(name='length', shape=[-1, 1], dtype='int64')
emission = fluid.layers.fc(input=feature, size=num_labels,
                           num_flatten_dims=2)

crf_cost = fluid.layers.linear_chain_crf(input=emission, label=label, length=length,
                                         param_attr=fluid.ParamAttr(name="crfw_pad"))
crf_decode = fluid.layers.crf_decoding(input=emission, length=length,
                                       param_attr=fluid.ParamAttr(name="crfw_pad"))