conll05

Conll05 dataset. The Paddle semantic role labeling (SRL) Book chapter and demo use this dataset as an example. Because the Conll05 dataset is not freely available to the public, the default download URL points to the Conll05 test set, which is public. Users can change the URL and MD5 to point to their own copy of the Conll dataset. A pre-trained word vector model based on a Wikipedia corpus is used to initialize the SRL model.

paddle.dataset.conll05.get_dict()

Get the word, verb, and label dictionaries of the Wikipedia corpus.

paddle.dataset.conll05.get_embedding()

Get the pre-trained word vectors based on the Wikipedia corpus.

paddle.dataset.conll05.test()

Conll05 test set creator.

Because the training dataset is not free, the test dataset is used for training. This function returns a reader creator. Each sample in the reader consists of nine features, including the sentence sequence, the predicate, the predicate context, the predicate context flag, and the tagged sequence.

Returns

Training reader creator

Return type

callable
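
The reader creator pattern described above can be sketched without downloading the dataset. The snippet below is a minimal illustration, not the library's implementation: `toy_test` is a hypothetical stand-in for `paddle.dataset.conll05.test()`, the ids are made up, and the feature order (word ids, five predicate-context columns, predicate, context flag, labels) is an assumption chosen only so that each sample has nine features of equal length.

```python
from typing import Callable, Iterator, List, Tuple

Sample = Tuple[List[int], ...]


def toy_test() -> Callable[[], Iterator[Sample]]:
    """Hypothetical stand-in for paddle.dataset.conll05.test().

    Returns a reader: a callable that yields samples when invoked.
    """

    def reader() -> Iterator[Sample]:
        words = [3, 7, 1, 9]  # made-up word ids for a 4-token sentence
        n = len(words)
        # Assumed layout: five context-word columns, each id repeated per token.
        context = [[word_id] * n for word_id in (3, 7, 1, 9, 0)]
        predicate = [5] * n        # made-up predicate id, repeated per token
        context_flag = [1, 1, 1, 1]  # 1 where a token lies in the context window
        labels = [0, 2, 1, 0]      # made-up label ids, one per token
        yield (words, *context, predicate, context_flag, labels)

    return reader


# With Paddle installed, the real call would be:
#   reader = paddle.dataset.conll05.test()
reader = toy_test()
for sample in reader():
    # Every feature is a sequence aligned with the sentence tokens.
    print(len(sample), [len(feature) for feature in sample])
```

Because the return value is a callable rather than a list, the data can be iterated repeatedly (once per training pass) by calling `reader()` again.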