DataFeeder

- class paddle.fluid.data_feeder.DataFeeder(feed_list, place, program=None) [source]

- Api_attr

  Static Graph

DataFeeder converts the data returned by a reader into a data structure that can be fed into Executor. The reader is usually a python generator that returns a list of mini-batch data entries.

- Parameters
  - feed_list (list) – Variables or names of Variables that need to be fed.
  - place (CPUPlace | CUDAPlace) – the device (CPU | GPU) the data will be fed into. To feed data into GPU, use fluid.CUDAPlace(i) (i represents the GPU id); to feed data into CPU, use fluid.CPUPlace(). A minimal place-selection sketch is shown after the Raises entry below.
  - program (Program, optional) – the Program that the data will be fed into. If program is None, default_main_program() is used. Default None.
- Raises

  - ValueError
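The place argument decides which device the converted tensors are created on. Below is a minimal sketch of one common way to pick it; it assumes fluid.is_compiled_with_cuda() is available to detect a CUDA-enabled build, and the GPU id (0) and the variable name img are arbitrary choices for illustration.

```python
import paddle.fluid as fluid

# Assumption for illustration: fall back to CPU when the installed PaddlePaddle
# build has no CUDA support; otherwise feed data into GPU 0.
use_cuda = fluid.is_compiled_with_cuda()
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()

# 'img' is an illustrative variable name, not part of the official example.
img = fluid.data(name='img', shape=[None, 3], dtype='float32')
feeder = fluid.DataFeeder(feed_list=[img], place=place)
```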
Example

```python
import numpy as np
import paddle
import paddle.fluid as fluid

place = fluid.CPUPlace()

def reader():
    for _ in range(4):
        yield np.random.random([4]).astype('float32'), np.random.random([3]).astype('float32'),

main_program = fluid.Program()
startup_program = fluid.Program()

with fluid.program_guard(main_program, startup_program):
    data_1 = fluid.data(name='data_1', shape=[None, 2, 2], dtype='float32')
    data_2 = fluid.data(name='data_2', shape=[None, 1, 3], dtype='float32')
    out = fluid.layers.fc(input=[data_1, data_2], size=2)
    # ...

feeder = fluid.DataFeeder([data_1, data_2], place)

exe = fluid.Executor(place)
exe.run(startup_program)

feed_data = feeder.feed(reader())

# print feed_data to view feed results
# print(feed_data['data_1'])
# print(feed_data['data_2'])

outs = exe.run(program=main_program,
               feed=feed_data,
               fetch_list=[out])
print(outs)
```
            
feed(iterable)
According to the feed_list of the DataFeeder and the given iterable, convert the input into a data structure that can be fed into Executor.

- Parameters

  - iterable (generator) – a user-defined python generator used to read the raw input data.

- Returns

  a dict that contains (variable name - converted tensor) pairs

- Return type

  dict
Example

```python
# In this example, the reader (a generator) returns 3 ndarrays per sample.
# The feed API converts each ndarray input into a tensor.
# The returned result is a dict with keys: data_1, data_2, data_3.
# result['data_1'] is a LoD-Tensor with shape [5, 2, 1, 3]; 5 is the batch size,
# and [2, 1, 3] is the real shape of data_1.
# result['data_2'] and result['data_3'] are similar.
import numpy as np
import paddle.fluid as fluid

def reader(limit=5):
    for i in range(1, limit + 1):
        yield np.ones([6]).astype('float32') * i, np.ones([1]).astype('int64') * i, np.random.random([9]).astype('float32')

data_1 = fluid.data(name='data_1', shape=[None, 2, 1, 3])
data_2 = fluid.data(name='data_2', shape=[None, 1], dtype='int64')
data_3 = fluid.data(name='data_3', shape=[None, 3, 3], dtype='float32')
feeder = fluid.DataFeeder(['data_1', 'data_2', 'data_3'], fluid.CPUPlace())

result = feeder.feed(reader())
print(result['data_1'])
print(result['data_2'])
print(result['data_3'])
```
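The examples pair each variable in feed_list with one entry in every sample the generator yields; for a single variable this means yielding a 1-element tuple, hence the trailing comma seen in the readers above. Here is a minimal sketch under that assumption; the variable name feat is illustrative.

```python
# A minimal sketch, assuming each sample yielded by the generator contains one
# entry per variable in feed_list (the trailing comma turns a single ndarray
# into a 1-element tuple). The name 'feat' is illustrative only.
import numpy as np
import paddle.fluid as fluid

feat = fluid.data(name='feat', shape=[None, 3], dtype='float32')
feeder = fluid.DataFeeder(feed_list=[feat], place=fluid.CPUPlace())

def single_var_reader():
    for i in range(4):
        yield np.full([3], i, dtype='float32'),   # one entry per feed variable

batch = feeder.feed(single_var_reader())
print(batch['feat'])   # a tensor holding the 4 stacked samples, shape [4, 3]
```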
feed_parallel(iterable, num_places=None)
Similar to the feed function, feed_parallel is used with multiple devices (CPU | GPU). Here iterable is a list of python generators. The data returned by each generator in the list will be fed into a separate device.

- Parameters

  - iterable (list|tuple) – list of user-defined python generators. The number of elements should match num_places.
  - num_places (int, optional) – the number of devices. If not provided (None), all available devices on the machine will be used. Default None.

- Returns

  a generator that generates dicts containing (variable name - converted tensor) pairs; the total number of dicts generated matches num_places

- Return type

  generator
Note

The number of devices (num_places) should equal the number of generators (elements of iterable).

Example

```python
import numpy as np
import paddle.fluid as fluid

def generate_reader(batch_size, base=0, factor=1):
    def _reader():
        for i in range(batch_size):
            yield np.ones([4]) * factor + base, np.ones([4]) * factor + base + 5
    return _reader()

x = fluid.data(name='x', shape=[None, 2, 2])
y = fluid.data(name='y', shape=[None, 2, 2], dtype='float32')
z = fluid.layers.elementwise_add(x, y)

feeder = fluid.DataFeeder(['x', 'y'], fluid.CPUPlace())
place_num = 2
places = [fluid.CPUPlace() for x in range(place_num)]
data = []
exe = fluid.Executor(fluid.CPUPlace())
exe.run(fluid.default_startup_program())
program = fluid.CompiledProgram(fluid.default_main_program()).with_data_parallel(places=places)

# print a sample feed_parallel result
# for item in list(feeder.feed_parallel([generate_reader(5, 0, 1), generate_reader(3, 10, 2)], 2)):
#     print(item['x'])
#     print(item['y'])

reader_list = [generate_reader(5, 0, 1), generate_reader(3, 10, 2)]
res = exe.run(program=program,
              feed=list(feeder.feed_parallel(reader_list, 2)),
              fetch_list=[z])
print(res)
```
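To see what feed_parallel yields without running an Executor, here is a minimal CPU-only sketch. It assumes, as described above, that each generator in the list supplies the whole mini-batch for one device, so two generators are paired with num_places=2.

```python
# A minimal CPU-only sketch: inspect the per-device feed dicts yielded by
# feed_parallel without running an Executor. Two generators with num_places=2,
# so exactly two dicts are produced, one per device.
import numpy as np
import paddle.fluid as fluid

x = fluid.data(name='x', shape=[None, 2], dtype='float32')
feeder = fluid.DataFeeder(['x'], fluid.CPUPlace())

def make_reader(base):
    def _reader():
        for i in range(3):
            yield np.full([2], base + i, dtype='float32'),
    return _reader()

for device_feed in feeder.feed_parallel([make_reader(0), make_reader(10)], num_places=2):
    print(device_feed['x'])   # the converted tensor destined for one device
```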
decorate_reader(reader, multi_devices, num_places=None, drop_last=True)
Decorate the reader (generator) to fit multiple devices. The reader generates multiple mini-batches. Each mini-batch will be fed into a single device.

- Parameters

  - reader (generator) – a user-defined python generator used to get mini-batches of data. A mini-batch can be regarded as a python generator that returns batches of input entities, just like _mini_batch in the code example below.
  - multi_devices (bool) – whether to use multiple devices or not.
  - num_places (int, optional) – if multi_devices is True, you can specify the number of devices (CPU | GPU) to use; if num_places is None, the function will use all the devices of the current machine. Default None.
  - drop_last (bool, optional) – whether to drop the last round of data if it is not enough to feed all devices. Default True.

- Returns

  a new generator which returns converted dicts that can be fed into Executor

- Return type

  generator

- Raises

  - ValueError – if drop_last is False and the data cannot fit the devices perfectly (a minimal sketch of this case follows the example below).
Example

```python
import numpy as np
import paddle
import paddle.fluid as fluid
import paddle.fluid.compiler as compiler

def reader():
    def _mini_batch(batch_size):
        for i in range(batch_size):
            yield np.random.random([16]).astype('float32'), np.random.randint(10, size=[1])

    for _ in range(10):
        yield _mini_batch(np.random.randint(1, 10))

place_num = 3
places = [fluid.CPUPlace() for _ in range(place_num)]

# a simple network sample
data = fluid.data(name='data', shape=[None, 4, 4], dtype='float32')
label = fluid.data(name='label', shape=[None, 1], dtype='int64')
hidden = fluid.layers.fc(input=data, size=10)

feeder = fluid.DataFeeder(place=places[0], feed_list=[data, label])
reader = feeder.decorate_reader(reader, multi_devices=True, num_places=3, drop_last=True)

exe = fluid.Executor(places[0])
exe.run(fluid.default_startup_program())
compiled_prog = compiler.CompiledProgram(
    fluid.default_main_program()).with_data_parallel(places=places)

for i, data in enumerate(reader()):
    # print data if you like
    # print(i, data)
    ret = exe.run(compiled_prog, feed=data, fetch_list=[hidden])
    print(ret)
```
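The ValueError documented above can be observed with a reader whose final round does not provide a mini-batch for every device. This is a minimal sketch under that assumption, not part of the official example; the shapes and names are illustrative.

```python
# A minimal sketch of the ValueError described above: drop_last=False while the
# last round has only one mini-batch for two requested devices, so iterating
# the decorated reader is expected to raise ValueError.
import numpy as np
import paddle.fluid as fluid

def reader():
    def _mini_batch(batch_size):
        for _ in range(batch_size):
            yield np.random.random([2, 2]).astype('float32'),
    yield _mini_batch(4)   # only one mini-batch, but two devices are requested

data = fluid.data(name='data', shape=[None, 2, 2], dtype='float32')
feeder = fluid.DataFeeder(place=fluid.CPUPlace(), feed_list=[data])
decorated = feeder.decorate_reader(reader, multi_devices=True, num_places=2, drop_last=False)

try:
    for batch in decorated():
        pass
except ValueError as e:
    print('ValueError raised as documented:', e)
```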
 
