Single-node training


To perform single-node training in PaddlePaddle Fluid, you need to read Prepare Data and Set up Simple Model first. After finishing Set up Simple Model, you will have two fluid.Program objects, namely startup_program and main_program. By default, you can use fluid.default_startup_program() and fluid.default_main_program() to get the global fluid.Program objects.

For example:

import paddle.fluid as fluid

image = fluid.data(name="image", shape=[None, 784], dtype='float32')
label = fluid.data(name="label", shape=[None, 1], dtype='int64')
hidden = fluid.layers.fc(input=image, size=100, act='relu')
prediction = fluid.layers.fc(input=hidden, size=10, act='softmax')
loss = fluid.layers.cross_entropy(input=prediction, label=label)
loss = fluid.layers.mean(loss)

sgd = fluid.optimizer.SGD(learning_rate=0.001)
sgd.minimize(loss)

# Here the fluid.default_startup_program() and fluid.default_main_program()
# has been constructed.

Once the model is configured as above, fluid.default_startup_program() and fluid.default_main_program() have been fully constructed.

Initialize Parameters

Random Initialization of Parameters

After the model is configured, the parameter initialization operators are written into fluid.default_startup_program(). By running this program in fluid.Executor(), the random initialization of parameters is performed in the global scope, i.e. fluid.global_scope(). For example:

exe = fluid.Executor(fluid.CUDAPlace(0))
exe.run(fluid.default_startup_program())

Load Predefined Parameters

In neural network training, predefined parameters are often loaded in order to continue training. For how to load predefined parameters, please refer to Save, Load Models or Variables & Incremental Learning.

Single-card Training

Single-card training can be performed by calling run() of fluid.Executor() to run the training fluid.Program. At runtime, users can feed data with run(feed=...) and fetch output data with run(fetch_list=...). For example:

import paddle.fluid as fluid
import numpy

train_program = fluid.Program()
startup_program = fluid.Program()
with fluid.program_guard(train_program, startup_program):
    data = fluid.data(name='X', shape=[None, 1], dtype='float32')
    hidden = fluid.layers.fc(input=data, size=10)
    loss = fluid.layers.mean(hidden)
    sgd = fluid.optimizer.SGD(learning_rate=0.001)
    sgd.minimize(loss)

use_cuda = True
place = fluid.CUDAPlace(0) if use_cuda else fluid.CPUPlace()
exe = fluid.Executor(place)

# Run the startup program once and only once.
# No need to optimize/compile the startup program.
exe.run(startup_program)

# Run the main program directly without compile.
x = numpy.random.random(size=(10, 1)).astype('float32')
loss_data, = exe.run(train_program,
                     feed={"X": x},
                     fetch_list=[loss.name])

# Or use CompiledProgram:
compiled_prog = fluid.CompiledProgram(train_program)
loss_data, = exe.run(compiled_prog,
                     feed={"X": x},
                     fetch_list=[loss.name])

Multi-card Training

In multi-card training, you can use fluid.CompiledProgram to compile the fluid.Program and then call with_data_parallel. For example:

import os

# NOTE: If you run the program on CPU, you need to specify CPU_NUM;
# otherwise, fluid will use the number of all logical cores as CPU_NUM.
# In that case, the batch size of the input must be greater than
# CPU_NUM, otherwise the process will fail with an exception.
if not use_cuda:
    os.environ['CPU_NUM'] = str(2)

compiled_prog = fluid.CompiledProgram(
    train_program).with_data_parallel(
    loss_name=loss.name)
loss_data, = exe.run(compiled_prog,
                     feed={"X": x},
                     fetch_list=[loss.name])


  1. CompiledProgram converts the input Program into a computational graph, so compiled_prog is a completely different object from the incoming train_program. At present, compiled_prog cannot be saved.

  2. Multi-card training can also be performed with fluid.ParallelExecutor, but it is now recommended to use CompiledProgram.

  3. If exe is initialized with CUDAPlace, the model runs on GPU. In GPU training mode, all visible GPUs will be occupied. Users can set the CUDA_VISIBLE_DEVICES environment variable to change which GPUs are used.


  4. If exe is initialized with CPUPlace, the model runs on CPU. In this case, multiple threads are used to run the model, and the number of threads is equal to the number of logical cores. Users can set the CPU_NUM environment variable to change the number of threads used.
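Notes 3 and 4 both come down to setting environment variables before the framework initializes the devices. A small sketch (the specific values "0,1" and "2" are arbitrary examples):

```python
import os

# Restrict the process to GPUs 0 and 1 (GPU training). This must be
# set before the framework initializes CUDA.
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'

# Limit the number of CPU devices/threads used when running on CPUPlace.
os.environ['CPU_NUM'] = '2'

print(os.environ['CUDA_VISIBLE_DEVICES'])  # 0,1
print(os.environ['CPU_NUM'])               # 2
```

Both variables can equally be set in the shell before launching the training script; setting them in Python only works if it happens before the first executor run.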