TrainingHelper

class paddle.fluid.layers.TrainingHelper(inputs, sequence_length, time_major=False)[source]

TrainingHelper is a subclass of DecodeHelper. It is a decoding helper that slices the full sequence inputs to produce the inputs for the corresponding decoding step, and uses argmax to sample ids from the outputs of cell.call().

Since it requires the full sequence inputs, it is mostly used for teacher-forcing MLE (maximum likelihood) training, and the sampled ids are not fed back as inputs.

Examples

import paddle.fluid as fluid
import paddle.fluid.layers as layers

# Target-side embeddings, shape [batch_size, seq_len, embedding_dim]
trg_emb = fluid.data(name="trg_emb",
                     shape=[None, None, 128],
                     dtype="float32")
# Actual length of each target sequence in the batch
trg_seq_length = fluid.data(name="trg_seq_length",
                            shape=[None],
                            dtype="int64")
helper = layers.TrainingHelper(trg_emb, trg_seq_length)
decoder_cell = layers.GRUCell(hidden_size=128)
decoder = layers.BasicDecoder(decoder_cell, helper)
outputs = layers.dynamic_decode(
    decoder,
    inits=decoder_cell.get_initial_states(trg_emb),
    is_test=False)
initialize()

TrainingHelper initialization produces inputs for the first decoding step by slicing at the first time step of the full sequence inputs, and gives the initial status telling whether each sequence in the batch is finished. It is part of the initialization performed by BasicDecoder.

Returns

A tuple( (initial_inputs, initial_finished) ). initial_inputs is a (possibly nested structure of) tensor variable[s], and the tensor’s shape is [batch_size, …]. initial_finished is a bool tensor with shape [batch_size].

Return type

tuple
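The slicing and the initial finished status can be illustrated with a small NumPy sketch (a hedged illustration of the semantics described above for the batch-major, i.e. time_major=False, case; it is not Paddle's actual implementation, and all values are made up):

```python
import numpy as np

# Toy batch: 2 sequences, max length 3, feature size 4 (illustrative values)
inputs = np.arange(2 * 3 * 4, dtype="float32").reshape(2, 3, 4)
sequence_length = np.array([3, 0], dtype="int64")

# initialize(): slice the first time step as the first decoder inputs,
# and mark sequences of length 0 as already finished.
initial_inputs = inputs[:, 0]            # shape [batch_size, ...]
initial_finished = 0 >= sequence_length  # bool tensor, shape [batch_size]

print(initial_inputs.shape)   # (2, 4)
print(initial_finished)       # [False  True]
```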

sample(time, outputs, states)

Perform sampling by applying argmax to the outputs. Mostly the sampled ids would not be used, since the inputs for the next decoding step are obtained by slicing.

Parameters
  • time (Variable) – An int64 tensor with shape [1] provided by the caller, representing the current time step number of decoding.

  • outputs (Variable) – A tensor variable. Usually its data type is float32 or float64, and its shape is [batch_size, vocabulary_size], representing the predicted logits of the current step. It is the same as the outputs returned by BasicDecoder.output_fn(BasicDecoder.cell.call()).

  • states (Variable) – A (possibly nested structure of) tensor variable[s]. It is the same as the new_states returned by BasicDecoder.cell.call().

Returns

An int64 tensor with shape [batch_size], representing the sampled ids.

Return type

Variable
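The argmax sampling over the logits can be sketched in NumPy (an illustration of the behavior described above, not the actual implementation; the logit values are made up):

```python
import numpy as np

# Toy logits for a batch of 2 over a vocabulary of 3
outputs = np.array([[0.1, 2.0, 0.3],
                    [1.5, 0.2, 0.1]], dtype="float32")

# sample(): take the argmax id per batch entry
sample_ids = np.argmax(outputs, axis=-1).astype("int64")  # shape [batch_size]

print(sample_ids)  # [1 0]
```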

next_inputs(time, outputs, states, sample_ids)

Generate inputs for the next decoding step by slicing at the corresponding step of the full sequence inputs. Simultaneously, produce the states for the next time step by directly passing through the input states, and emit the status telling whether each minibatch entry has reached its corresponding sequence length.

Parameters
  • time (Variable) – An int64 tensor with shape [1] provided by the caller, representing the current time step number of decoding.

  • outputs (Variable) – A tensor variable. Usually its data type is float32 or float64, and its shape is [batch_size, vocabulary_size], representing the predicted logits of the current step. It is the same as the outputs returned by BasicDecoder.output_fn(BasicDecoder.cell.call()).

  • states (Variable) – A (possibly nested structure of) tensor variable[s]. It is the same as the new_states returned by BasicDecoder.cell.call().

  • sample_ids (Variable) – An int64 tensor variable shaped [batch_size]. It is the same as the sample_ids returned by sample().

Returns

A tuple( (finished, next_inputs, next_states) ). next_inputs and next_states are both (possibly nested structures of) tensor variable[s], and the tensor’s shape is [batch_size, …]. next_states is identical to the input argument states. finished is a bool Tensor with shape [batch_size].

Return type

tuple
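The next-step slicing and the finished status can likewise be sketched in NumPy (a hedged illustration of the time_major=False case as described above, not the actual implementation; shapes and values are made up, and next_states simply passes states through):

```python
import numpy as np

inputs = np.arange(2 * 3 * 4, dtype="float32").reshape(2, 3, 4)
sequence_length = np.array([3, 2], dtype="int64")
states = np.zeros((2, 8), dtype="float32")  # placeholder cell states
time = 0                                    # current decoding step

# next_inputs(): slice step time+1 as the next inputs, pass the states
# through unchanged, and mark entries whose length has been reached.
next_time = time + 1
finished = next_time >= sequence_length     # bool, shape [batch_size]
next_inputs = inputs[:, next_time]          # shape [batch_size, ...]
next_states = states                        # identical to input states

print(finished)            # [False False]
print(next_inputs.shape)   # (2, 4)
```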