Movielens

class paddle.text.datasets.Movielens(data_file=None, mode='train', test_ratio=0.1, rand_seed=0, download=True)

Implementation of the MovieLens 1M dataset.

Parameters
  • data_file (str) – Path to the dataset archive file. Can be None if download is True. Default None.

  • mode (str) – Which split to load, ‘train’ or ‘test’. Default ‘train’.

  • test_ratio (float) – Fraction of the samples assigned to the test split. Default 0.1.

  • rand_seed (int) – Random seed used for the train/test split. Default 0.

  • download (bool) – Whether to download the dataset automatically if data_file is not set. Default True.
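To illustrate how mode, test_ratio, and rand_seed can interact, here is a minimal sketch of a seeded random split. It is not taken from Paddle's source: the helper name split_samples and the per-sample draw are assumptions made for illustration, and the integers stand in for rating records.

```python
import random


def split_samples(samples, mode="train", test_ratio=0.1, rand_seed=0):
    """Return the 'train' or 'test' subset of samples (hypothetical helper)."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    # Seeding identically for both modes makes every sample get the same draw,
    # so the two subsets are disjoint and together cover all samples.
    rng = random.Random(rand_seed)
    is_test = mode == "test"
    # A sample goes to the test split when its draw is below test_ratio.
    return [s for s in samples if (rng.random() < test_ratio) == is_test]


ratings = list(range(1000))  # stand-in for rating records
train = split_samples(ratings, mode="train")
test = split_samples(ratings, mode="test")
print(len(train), len(test))
```

Under this scheme the test split holds roughly test_ratio of the samples rather than an exact count, and changing rand_seed changes which samples land in each split.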

Returns

An instance of the MovieLens 1M dataset.

Return type

Dataset

Examples

import paddle
from paddle.text.datasets import Movielens

class SimpleNet(paddle.nn.Layer):
    def __init__(self):
        super().__init__()

    def forward(self, category, title, rating):
        # Reduce each input to a scalar tensor.
        return paddle.sum(category), paddle.sum(title), paddle.sum(rating)

# Run in dynamic graph (imperative) mode.
paddle.disable_static()

# With data_file unset and download=True, the dataset is downloaded automatically.
movielens = Movielens(mode='train')

# Create the model once instead of on every iteration.
model = SimpleNet()

for i in range(10):
    # The last three fields of each sample are category, title, and rating.
    category, title, rating = movielens[i][-3:]
    category = paddle.to_tensor(category)
    title = paddle.to_tensor(title)
    rating = paddle.to_tensor(rating)

    category, title, rating = model(category, title, rating)
    print(category.numpy().shape, title.numpy().shape, rating.numpy().shape)