QueueDataset
- class paddle.fluid.dataset.QueueDataset
         QueueDataset processes data as a stream.

         Examples
           import paddle.fluid as fluid
           dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
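As a rough illustration of what processing data as a stream means, the sketch below reads records lazily and groups them into batches without holding the whole input in memory. It is plain Python, not Paddle's implementation; `stream_batches` and the in-memory stand-in file are hypothetical names introduced only for this example.

```python
import io
from itertools import islice


def stream_batches(line_iter, batch_size):
    """Yield lists of up to batch_size records, pulling lazily from line_iter."""
    it = iter(line_iter)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for an input file; a real reader would iterate over open(path).
fake_file = io.StringIO("a\nb\nc\nd\ne\n")
lines = (line.rstrip("\n") for line in fake_file)

batches = list(stream_batches(lines, batch_size=2))
print(batches)
```

Each record is seen once and in arrival order, so there is never a complete dataset available to permute. That is consistent with the shuffle methods of this class raising NotImplementedError.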
            
 - local_shuffle()
           Local shuffle data. Local shuffle is not supported in QueueDataset; NotImplementedError will be raised.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
             dataset.local_shuffle()

           Raises
             NotImplementedError – QueueDataset does not support local shuffle
 
 - global_shuffle(fleet=None)
           Global shuffle data. Global shuffle is not supported in QueueDataset; NotImplementedError will be raised.

           Parameters
             fleet (Fleet) – fleet singleton. Default None.

           Examples
             import paddle.fluid as fluid
             from paddle.fluid.incubate.fleet.parameter_server.pslib import fleet
             dataset = fluid.DatasetFactory().create_dataset("QueueDataset")
             dataset.global_shuffle(fleet)

           Raises
             NotImplementedError – QueueDataset does not support global shuffle
 
 - desc()
           Returns a protobuf message for this DataFeedDesc.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             print(dataset.desc())

           Returns
             A string message
 
 - set_batch_size(batch_size)
           Set the batch size. It takes effect during training.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_batch_size(128)

           Parameters
             batch_size (int) – batch size
 
 - set_download_cmd(download_cmd)
           Set a customized download command: download_cmd.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_download_cmd("./read_from_afs")

           Parameters
             download_cmd (str) – customized download command
 
 - set_fea_eval(record_candidate_size, fea_eval=True)
           Set fea eval mode for slots shuffle, which is used to debug the importance level of slots (features). fea_eval must be set to True for slots shuffle.

           Parameters
             - record_candidate_size (int) – size of the instance candidate pool used to shuffle one slot
             - fea_eval (bool) – whether to enable fea eval mode, which enables slots shuffle. Default is True.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
             dataset.set_fea_eval(1000000, True)
 - set_filelist(filelist)
           Set the file list in the current worker.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_filelist(['a.txt', 'b.txt'])

           Parameters
             filelist (list) – file list
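Since the file list is set per worker, a multi-worker job usually needs to give each worker its own slice of the input files first. The helper below is a minimal sketch of one way to do that, assuming a round-robin split; `shard_filelist`, `worker_index`, and `worker_num` are hypothetical names and not part of the Paddle API.

```python
def shard_filelist(filelist, worker_index, worker_num):
    """Return the files assigned to one worker under a round-robin split."""
    if worker_num <= 0 or not 0 <= worker_index < worker_num:
        raise ValueError("worker_index must be in [0, worker_num)")
    # Sort first so that every worker derives the same global ordering.
    return sorted(filelist)[worker_index::worker_num]


all_files = ["part-3.txt", "part-0.txt", "part-2.txt", "part-1.txt", "part-4.txt"]
local_files = shard_filelist(all_files, worker_index=1, worker_num=2)
print(local_files)
```

The resulting list is what a worker would then pass to set_filelist.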
 
 - set_hdfs_config(fs_name, fs_ugi)
           Set the HDFS config: fs name and ugi.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_hdfs_config("my_fs_name", "my_fs_ugi")

           Parameters
             - fs_name (str) – fs name
             - fs_ugi (str) – fs ugi
 
 
 - set_pipe_command(pipe_command)
           Set the pipe command of the current dataset. A pipe command is a UNIX pipeline command.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_pipe_command("python my_script.py")

           Parameters
             pipe_command (str) – pipe command
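To make the idea of a pipe command concrete, here is a hedged sketch of what a script such as my_script.py might contain: a filter that reads raw lines from its input and writes parsed lines to its output. The raw format (`label<TAB>id id id`) and the function names are hypothetical. The output layout, where each slot is written as a count followed by its values, is an assumption about the multi-slot text format and should be checked against the data feed in use.

```python
import io
import sys


def to_multislot_line(slots):
    """Serialize slots as '<count> <v1> ... <vn>' for each slot, space separated."""
    parts = []
    for values in slots:
        parts.append(str(len(values)))
        parts.extend(str(v) for v in values)
    return " ".join(parts)


def parse_raw_line(line):
    """Parse the hypothetical raw format 'label<TAB>id id id' into [ids, [label]]."""
    label, ids = line.rstrip("\n").split("\t")
    return [[int(x) for x in ids.split()], [int(label)]]


def run_pipe(stdin, stdout):
    """Filter loop: one parsed output line per non-empty input line."""
    for line in stdin:
        if line.strip():
            stdout.write(to_multislot_line(parse_raw_line(line)) + "\n")


# Demonstration with in-memory streams; a real script would instead call
# run_pipe(sys.stdin, sys.stdout).
demo_out = io.StringIO()
run_pipe(io.StringIO("1\t10 20 30\n0\t7\n"), demo_out)
sys.stdout.write(demo_out.getvalue())
```

In this sketch the slot order written by the script (ids first, then label) is assumed to have to match the order of the variables the dataset is configured to use.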
 
 - set_pv_batch_size(pv_batch_size)
           Set the pv batch size. It takes effect when enable_pv_merge is used.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_pv_batch_size(128)

           Parameters
             pv_batch_size (int) – pv batch size
 
 - set_rank_offset(rank_offset)
           Set rank_offset for merge_pv. It sets the message of Pv.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_rank_offset("rank_offset")

           Parameters
             rank_offset (str) – rank_offset's name
 
 - set_so_parser_name(so_parser_name)
           Set the so parser name of the current dataset.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_so_parser_name("./abc.so")

           Parameters
             so_parser_name (str) – so parser name
 
 - set_thread(thread_num)
           Set the thread number, which is the number of readers.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_thread(12)

           Parameters
             thread_num (int) – thread num
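The notion of several reader threads can be pictured with the following conceptual sketch, in which a fixed number of threads each pull whole sources from a shared work queue and push records into a common output queue. This is an illustration in plain Python under simplifying assumptions, not how Paddle implements its readers; `read_with_threads` and the list-based sources are hypothetical.

```python
import queue
import threading


def read_with_threads(sources, thread_num):
    """Drain every source using thread_num reader threads; return all records."""
    pending = queue.Queue()
    for src in sources:
        pending.put(src)
    out = queue.Queue()

    def reader():
        # Each reader claims one source at a time until none are left.
        while True:
            try:
                src = pending.get_nowait()
            except queue.Empty:
                return
            for record in src:
                out.put(record)

    threads = [threading.Thread(target=reader) for _ in range(thread_num)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    records = []
    while not out.empty():
        records.append(out.get())
    return records


records = read_with_threads([["a", "b"], ["c"], ["d", "e"]], thread_num=2)
print(sorted(records))
```

Records from different sources may interleave in any order, which is why the example sorts them before printing. Using more readers than sources leaves the extra threads idle in this sketch.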
 
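 - set_use_var(var_list)
           Set the Variables that you will use.

           Examples
             import paddle.fluid as fluid
             # illustrative input variables; names and shapes are placeholders
             data = fluid.layers.data(name="data", shape=[1], dtype="int64", lod_level=1)
             label = fluid.layers.data(name="label", shape=[1], dtype="int64")
             dataset = fluid.DatasetFactory().create_dataset()
             dataset.set_use_var([data, label])

           Parameters
             var_list (list) – variable list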
 
 - slots_shuffle(slots)
           Slots shuffle. Slots shuffle is a shuffle method at the slot level, usually used for sparse features with a large number of instances. Comparing a metric such as AUC after shuffling one or several slots against the baseline evaluates the importance level of those slots (features).

           Parameters
             slots (list[string]) – the set of slots (strings) to shuffle.

           Examples
             import paddle.fluid as fluid
             dataset = fluid.DatasetFactory().create_dataset("InMemoryDataset")
             dataset.set_merge_by_lineid()
             # suppose there is a slot 0
             dataset.slots_shuffle(['0'])
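The idea behind slot-level shuffling can be sketched in plain Python: the values of the chosen slots are permuted across instances while every other slot stays attached to its original instance. This is a conceptual illustration only, assuming each instance is a dict keyed by slot name; `shuffle_slots` is a hypothetical helper and does not reproduce Paddle's candidate-based implementation.

```python
import random


def shuffle_slots(instances, slots, seed=None):
    """Return copies of instances with the given slots permuted across instances."""
    rng = random.Random(seed)
    shuffled = [dict(ins) for ins in instances]  # leave the input untouched
    for slot in slots:
        values = [ins[slot] for ins in shuffled]
        rng.shuffle(values)
        for ins, value in zip(shuffled, values):
            ins[slot] = value
    return shuffled


instances = [
    {"0": [11, 12], "1": [5], "label": 1},
    {"0": [21], "1": [6], "label": 0},
    {"0": [31, 32, 33], "1": [7], "label": 1},
]
result = shuffle_slots(instances, ["0"], seed=0)
for ins in result:
    print(ins)
```

If a metric computed on the shuffled data drops sharply relative to the baseline, the shuffled slot was carrying useful signal; a negligible change suggests the slot matters little.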
 
           
