beam_search(pre_ids, pre_scores, ids, scores, beam_size, end_id, level=0, is_accumulated=True, name=None, return_parent_idx=False)
Beam search is a classical algorithm for selecting candidate words in a machine translation task.
Refer to Beam search for more details.
This operator only supports LoDTensor. It is used after the scores have been computed, to perform beam search for one time step. Specifically, after the scores have been produced, it selects the top-K (K is beam_size) candidate word ids of the current step from ids according to the corresponding scores. Additionally, pre_ids and pre_scores are the output of beam_search at the previous step; they are needed for the special handling of candidate translations that have already ended.
Note that if is_accumulated is True, the scores passed in should be accumulated scores. Otherwise, the scores are considered as the probabilities of a single step; they would be transformed into the log field and added to pre_scores to produce the final scores in this operator. If a length penalty is needed, it should be applied with extra operators before calculating the accumulated scores.
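The accumulation performed when is_accumulated is False can be sketched in NumPy as follows. This is only a conceptual illustration of the log transform and addition described above; the array values are made-up data, not part of the API:

```python
import numpy as np

# Hypothetical setup: one sample with beam_size=2 live beams, each
# producing K=3 candidate probabilities at the current step.
step_probs = np.array([[0.6, 0.3, 0.1],
                       [0.5, 0.4, 0.1]])   # single-step probabilities
pre_scores = np.array([-0.2, -0.9])        # accumulated log-probs so far

# With is_accumulated=False, the operator effectively computes:
accu_scores = np.log(step_probs) + pre_scores[:, None]

# With is_accumulated=True, the caller passes accumulated scores
# directly, as the demo below does via log(topk_scores) + pre_scores.
print(accu_scores.shape)  # (2, 3)
```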
Please see the following demo for a full beam search usage example:
pre_ids (Variable) – A LoDTensor variable (lod level is 2), representing the selected ids of the previous step. It is the output of beam_search at the previous step. Its shape is [batch_size, 1] and its lod is [[0, 1, …, batch_size], [0, 1, …, batch_size]] at the first step. The data type should be int64.
pre_scores (Variable) – A LoDTensor variable that has the same shape and lod as pre_ids, representing the accumulated scores corresponding to the selected ids of the previous step. It is the output of beam_search at the previous step. The data type should be float32.
ids (Variable|None) – A LoDTensor variable containing the candidate ids. It has the same lod as pre_ids, and its shape should be [batch_size * beam_size, K], where K is supposed to be greater than beam_size and the first dimension size (which decreases as samples reach the end) should be the same as that of pre_ids. The data type should be int64. It can be None, in which case the indices in scores are used as ids.
scores (Variable) – A LoDTensor variable containing the accumulated scores corresponding to ids. Both its shape and lod are the same as those of ids. The data type should be float32.
beam_size (int) – The beam width used in beam search.
end_id (int) – The id of end token.
level (int) – It can be ignored and must not be changed currently. The 2-level lod used in this operator has the following meaning: the first level describes how many beams each sample has, which reduces to 0 when all beams of the sample end (batch reduction); the second level describes how many times each beam is selected. Default 0, which should not be changed currently.
is_accumulated (bool) – Whether the input scores are accumulated scores. Default True.
name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set this, and it is None by default.
return_parent_idx (bool, optional) – Whether to return an extra Tensor variable in the output, which stores the selected ids' parent indices in pre_ids and can be used to update an RNN's states with the gather operator. Default False.
The tuple contains two or three LoDTensor variables. The two LoDTensors, representing the selected ids and the corresponding accumulated scores of the current step, both have the shape [batch_size, beam_size] and a lod with 2 levels, and have data types int64 and float32 respectively. If return_parent_idx is True, an extra Tensor variable preserving the selected ids' parent indices is included, whose shape is [batch_size * beam_size] and whose data type is int64.
- Return type: tuple
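To illustrate what the selected ids, scores, and parent indices represent, the core selection step can be sketched in NumPy: pick the best beam_size candidates over the flattened beam-by-candidate scores, and record which previous beam each winner came from. The values below are made-up illustrative data, not part of the API:

```python
import numpy as np

beam_size = 2
# Hypothetical accumulated scores for one sample: 2 live beams x 3 candidates.
accu_scores = np.array([[-0.5, -1.2, -2.0],
                        [-0.8, -0.9, -3.0]])
cand_ids = np.array([[7, 3, 5],
                     [2, 9, 4]])

flat = accu_scores.reshape(-1)
top = np.argsort(-flat)[:beam_size]        # best beam_size candidates overall
selected_scores = flat[top]
selected_ids = cand_ids.reshape(-1)[top]
parent_idx = top // accu_scores.shape[1]   # previous beam each winner came from

print(selected_ids)   # [7 2]
print(parent_idx)     # [0 1]
```

The parent indices are what return_parent_idx=True exposes; they can be fed to a gather operation to rearrange recurrent states so that each surviving beam carries the state of the beam it branched from.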
import paddle.fluid as fluid

# Suppose `probs` contains predicted results from the computation
# cell, and `pre_ids` and `pre_scores` are the output of beam_search
# at the previous step.
beam_size = 4
end_id = 1
pre_ids = fluid.data(
    name='pre_id', shape=[None, 1], lod_level=2, dtype='int64')
pre_scores = fluid.data(
    name='pre_scores', shape=[None, 1], lod_level=2, dtype='float32')
probs = fluid.data(
    name='probs', shape=[None, 10000], dtype='float32')
topk_scores, topk_indices = fluid.layers.topk(probs, k=beam_size)
accu_scores = fluid.layers.elementwise_add(
    x=fluid.layers.log(x=topk_scores),
    y=fluid.layers.reshape(pre_scores, shape=[-1]),
    axis=0)
selected_ids, selected_scores = fluid.layers.beam_search(
    pre_ids=pre_ids,
    pre_scores=pre_scores,
    ids=topk_indices,
    scores=accu_scores,
    beam_size=beam_size,
    end_id=end_id)
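For reference, the 2-level lod mentioned under the level parameter can be interpreted as follows. This is a plain-Python sketch with made-up values, only meant to show how the offsets encode beams per sample and selections per beam:

```python
# Hypothetical 2-level lod for a batch of 2 samples with beam_size=2.
lod = [[0, 2, 4],        # level 1: sample 0 owns beams 0-1, sample 1 owns beams 2-3
       [0, 1, 2, 3, 4]]  # level 2: each beam was selected once at this step

# Number of beams still alive for each sample; this becomes 0 for a
# sample once all of its beams have produced end_id (batch reduction).
beams_per_sample = [lod[0][i + 1] - lod[0][i] for i in range(len(lod[0]) - 1)]
print(beams_per_sample)  # [2, 2]
```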