declarative programming (static graph)
crf_decoding(input, param_attr, label=None, length=None)
The crf_decoding operator reads the emission feature weights and the transition feature weights learned by the linear_chain_crf operator and performs decoding. It implements the Viterbi algorithm, a dynamic programming algorithm for finding the most likely sequence of hidden states, called the Viterbi path, that results in a sequence of observed tags.
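The dynamic programming step described above can be sketched in plain Python. This is a minimal illustration for a single sequence, not the operator's actual implementation: the function name viterbi_decode and the separate start, end and transition arguments are hypothetical, chosen for readability rather than taken from the operator's parameter layout.

```python
def viterbi_decode(emission, transition, start, end):
    """Return the highest-scoring tag path for one sequence (illustrative only).

    emission:   list of S rows, each holding D per-tag scores
    transition: D x D nested list; transition[i][j] scores moving from tag i to tag j
    start, end: length-D lists scoring a path that starts / ends in each tag
    (All four arguments are hypothetical names for this sketch.)
    """
    num_tags = len(start)
    # score[j] = best score of any partial path that ends in tag j
    score = [start[j] + emission[0][j] for j in range(num_tags)]
    backpointers = []
    for t in range(1, len(emission)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Pick the previous tag that maximizes the score of reaching tag j
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transition[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transition[best_prev][j]
                             + emission[t][j])
        score = new_score
        backpointers.append(pointers)
    # Choose the best final tag, then follow the back-pointers to the start
    last = max(range(num_tags), key=lambda j: score[j] + end[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


emission = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
zeros = [0.0, 0.0]
free_switch = [[0.0, 0.0], [0.0, 0.0]]
costly_switch = [[0.0, -5.0], [-5.0, 0.0]]
print(viterbi_decode(emission, free_switch, zeros, zeros))    # [0, 1, 0]
print(viterbi_decode(emission, costly_switch, zeros, zeros))  # [0, 0, 0]
```

With no transition cost the path simply follows the best emission score at each step; once switching tags is penalized, the decoder prefers to stay in tag 0 throughout, which is the behavior that distinguishes Viterbi decoding from a per-step argmax.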
The output of this operator changes according to whether Input(Label) is given:
Input(Label) is given: This happens in training, where this operator works together with the chunk_eval operator. The crf_decoding operator returns a tensor with the same shape as Input(Label) whose values are either 0, indicating an incorrectly predicted tag, or 1, indicating a correctly predicted tag. This output is the input to the chunk_eval operator.
Input(Label) is not given: This is the standard decoding process.
The crf_decoding operator returns a tensor with shape [N x 1] (when the inputs are LoDTensors) or [B x S] (when the inputs are common tensors), whose values range from 0 to the maximum tag number - 1. Each element is the index of a predicted tag.
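The difference between the two output modes can be illustrated with a small plain-Python sketch. The helper name compare_with_label is hypothetical and only mirrors the behavior described above for a single sequence; it is not part of the Paddle API.

```python
def compare_with_label(decoded_path, label):
    """Return 1 where the decoded tag matches the ground truth, else 0.

    Hypothetical helper: mimics the described output when Input(Label) is given.
    """
    return [1 if predicted == truth else 0
            for predicted, truth in zip(decoded_path, label)]


decoded_path = [0, 1, 0]   # tag indices, as returned when no label is given
label = [0, 1, 1]          # ground-truth tag indices
print(compare_with_label(decoded_path, label))  # [1, 1, 0]
```

Without a label, the result holds tag indices; with a label, it holds only per-position correctness flags, so the predicted tags themselves are no longer recoverable from it.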
input (Variable) – (Tensor/LoDTensor). For a LoDTensor input, its shape is [N x D], where N is the total sequence length of the mini-batch and D is the total tag number. For a Tensor input, its shape is [B x S x D], where B is the batch size and S is the sequence length of each sample after padding. This input is the unscaled emission weight matrix of the linear_chain_crf operator. The data type is float32 or float64.
param_attr (ParamAttr|None) – Specifies the weight parameter attribute. Default: None, which means the default weight parameter property is used. See ParamAttr for usage details.
label (Variable, optional) – (Tensor/LoDTensor). The ground truth with shape [N x 1] (for LoDTensor) or [B x S] (for Tensor). This input is optional. See more details in the operator’s comments. The data type is int64.
length (Variable, optional) – (Tensor). The actual length of each sample before padding, with shape [B x 1]. When this input is given, Input(Emission), Input(Label) and Output(ViterbiPath) are common tensors with padding. The data type is int64.
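To make the relationship between the padded [B x S] layout and the length input concrete, the following plain-Python sketch trims each padded row back to its actual length. The helper name strip_padding and the sample values are hypothetical; the sketch only assumes the shapes stated above ([B x S] for the padded result and [B x 1] for length).

```python
def strip_padding(padded_paths, lengths):
    """Cut each padded row back to its true length.

    padded_paths: B rows of S tag indices (padded layout, [B x S])
    lengths:      B rows holding one integer each ([B x 1])
    Hypothetical helper for illustration; not part of the Paddle API.
    """
    return [list(row[:n[0]]) for row, n in zip(padded_paths, lengths)]


padded_paths = [[3, 1, 4, 0, 0],   # true length 3, last two entries are padding
                [2, 2, 0, 0, 0]]   # true length 2
lengths = [[3], [2]]
print(strip_padding(padded_paths, lengths))  # [[3, 1, 4], [2, 2]]
```

Entries beyond the actual length carry no meaning, so downstream code should rely on length rather than on any particular padding value.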
(Tensor/LoDTensor). The decoding results. What is returned depends on whether Input(Label) (the ground truth) is given. See more details in the operator’s comments. The data type is int64.
- Return type
import paddle.fluid as fluid

# LoDTensor-based example
num_labels = 10
feature = fluid.data(name='word_emb', shape=[-1, 784], dtype='float32', lod_level=1)
label = fluid.data(name='label', shape=[-1, 1], dtype='int64', lod_level=1)
emission = fluid.layers.fc(input=feature, size=num_labels)

crf_cost = fluid.layers.linear_chain_crf(input=emission, label=label,
                                         param_attr=fluid.ParamAttr(name="crfw"))
crf_decode = fluid.layers.crf_decoding(input=emission,
                                       param_attr=fluid.ParamAttr(name="crfw"))

# Common tensor example
num_labels, max_len = 10, 20
feature = fluid.data(name='word_emb_pad', shape=[-1, max_len, 784], dtype='float32')
label = fluid.data(name='label_pad', shape=[-1, max_len, 1], dtype='int64')
length = fluid.data(name='length', shape=[-1, 1], dtype='int64')
emission = fluid.layers.fc(input=feature, size=num_labels, num_flatten_dims=2)

crf_cost = fluid.layers.linear_chain_crf(input=emission, label=label, length=length,
                                         param_attr=fluid.ParamAttr(name="crfw_pad"))
crf_decode = fluid.layers.crf_decoding(input=emission, length=length,
                                       param_attr=fluid.ParamAttr(name="crfw_pad"))