MNIST
class paddle.vision.datasets.MNIST(image_path=None, label_path=None, mode='train', transform=None, download=True, backend=None)
Implementation of the MNIST dataset.
Parameters
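image_path (str, optional) - Path to the image file. May be None when download is True. Default: None, in which case the file is stored under ~/.cache/paddle/dataset/mnist.
label_path (str, optional) - Path to the label file. May be None when download is True. Default: None, in which case the file is stored under ~/.cache/paddle/dataset/mnist.
mode (str, optional) - Either 'train' or 'test'. Default: 'train'.
transform (Callable, optional) - Preprocessing applied to each image. None means no preprocessing. Default: None.
download (bool, optional) - Whether to download the dataset files automatically when image_path or label_path is None. Default: True.
backend (str, optional) - Type of image to return: PIL.Image or numpy.ndarray. Must be one of {'pil', 'cv2'}. If not set, the value is taken from paddle.vision.get_image_backend. Default: None.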
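To make the image_path and label_path parameters more concrete, the sketch below writes and reads two tiny files in the gzip-compressed IDX layout used by the original MNIST distribution (a big-endian header followed by unsigned bytes). It uses only the standard library. The helper names (write_idx_images, read_idx_labels, and so on) are hypothetical, and it is an assumption, not something this page states, that the class expects exactly this layout.

```python
import gzip
import os
import struct
import tempfile

IMAGE_MAGIC = 2051  # 0x00000803: IDX magic number for a 3-D unsigned-byte array
LABEL_MAGIC = 2049  # 0x00000801: IDX magic number for a 1-D unsigned-byte array


def write_idx_images(path, images):
    """Write equally sized 2-D pixel grids (ints in 0..255) as a gzipped IDX file."""
    rows, cols = len(images[0]), len(images[0][0])
    with gzip.open(path, "wb") as f:
        # Header: magic, image count, rows, cols (four big-endian uint32 values).
        f.write(struct.pack(">IIII", IMAGE_MAGIC, len(images), rows, cols))
        for image in images:
            for row in image:
                f.write(bytes(row))


def write_idx_labels(path, labels):
    """Write integer labels (0..255) as a gzipped IDX file."""
    with gzip.open(path, "wb") as f:
        # Header: magic, label count (two big-endian uint32 values).
        f.write(struct.pack(">II", LABEL_MAGIC, len(labels)))
        f.write(bytes(labels))


def read_idx_images(path):
    """Read a gzipped IDX image file back into nested lists of ints."""
    with gzip.open(path, "rb") as f:
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != IMAGE_MAGIC:
            raise ValueError(f"unexpected image magic number: {magic}")
        data = f.read(count * rows * cols)
    size = rows * cols
    return [
        [list(data[i * size + r * cols : i * size + (r + 1) * cols]) for r in range(rows)]
        for i in range(count)
    ]


def read_idx_labels(path):
    """Read a gzipped IDX label file back into a list of ints."""
    with gzip.open(path, "rb") as f:
        magic, count = struct.unpack(">II", f.read(8))
        if magic != LABEL_MAGIC:
            raise ValueError(f"unexpected label magic number: {magic}")
        return list(f.read(count))


# Two synthetic 2x3 "images" with labels; real MNIST images are 28x28.
sample_images = [[[0, 128, 255], [1, 2, 3]], [[9, 8, 7], [6, 5, 4]]]
sample_labels = [5, 0]

with tempfile.TemporaryDirectory() as tmp_dir:
    # File names follow the original MNIST distribution.
    image_file = os.path.join(tmp_dir, "train-images-idx3-ubyte.gz")
    label_file = os.path.join(tmp_dir, "train-labels-idx1-ubyte.gz")
    write_idx_images(image_file, sample_images)
    write_idx_labels(label_file, sample_labels)
    roundtrip_images = read_idx_images(image_file)
    roundtrip_labels = read_idx_labels(label_file)
```

Under that assumption, locally stored files of this kind would be passed as image_path and label_path, with download=False to avoid fetching anything.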
Code Example
>>> import itertools
>>> import paddle.vision.transforms as T
>>> from paddle.vision.datasets import MNIST
>>> mnist = MNIST()
>>> print(len(mnist))
60000
>>> for i in range(5):  # only show first 5 images
...     img, label = mnist[i]
...     # do something with img and label
...     print(type(img), img.size, label)
...     # <class 'PIL.Image.Image'> (28, 28) [5]
>>> transform = T.Compose(
...     [
...         T.ToTensor(),
...         T.Normalize(
...             mean=[127.5],
...             std=[127.5],
...         ),
...     ]
... )
>>> mnist_test = MNIST(
...     mode="test",
...     transform=transform,  # apply transform to every image
...     backend="cv2",  # use OpenCV as image transform backend
... )
>>> print(len(mnist_test))
10000
>>> for img, label in itertools.islice(iter(mnist_test), 5):  # only show first 5 images
...     # do something with img and label
...     print(type(img), img.shape, label)
...     # <class 'paddle.Tensor'> [1, 28, 28] [7]
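
The example above relies on two access patterns: integer indexing (mnist[i]) and iteration with a per-sample transform. The sketch below illustrates that general pattern with a hypothetical TinyDataset class in plain Python. It is not Paddle's implementation; the class, the scale_to_unit helper, and the sample data are all made up for illustration.

```python
import itertools


class TinyDataset:
    """Hypothetical map-style dataset: integer indexing plus an optional transform."""

    def __init__(self, samples, labels, transform=None):
        if len(samples) != len(labels):
            raise ValueError("samples and labels must have the same length")
        self.samples = samples
        self.labels = labels
        self.transform = transform

    def __getitem__(self, idx):
        # Indexing past the end raises IndexError, which also ends iteration.
        sample, label = self.samples[idx], self.labels[idx]
        if self.transform is not None:
            sample = self.transform(sample)  # applied lazily, once per access
        return sample, label

    def __len__(self):
        return len(self.samples)


def scale_to_unit(pixels):
    """Map byte values in 0..255 to floats in 0.0..1.0."""
    return [p / 255.0 for p in pixels]


pixels = [[0, 255], [51, 102]]
labels = [7, 2]

raw = TinyDataset(pixels, labels)  # transform=None: samples are returned unchanged
scaled = TinyDataset(pixels, labels, transform=scale_to_unit)

first_raw = raw[0]
first_two = list(itertools.islice(iter(scaled), 2))
```

Because the transform runs inside __getitem__, the stored data is never modified, and the same raw samples can be wrapped with different transforms for training and evaluation.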