paddle.fluid.io.load_inference_model ( dirname, executor, model_filename=None, params_filename=None, pserver_endpoints=None ) [source]

Load the inference model from a given directory. With this API, you can get the model structure (inference program) and the model parameters. If you only want to load the parameters of a pre-trained model, please use the load_params API instead. You can refer to Save and Load a Model for more details.

  • dirname (str) – One of the following:

    • The given directory path.

    • Set to None when reading the model from memory.

  • executor (Executor) – The executor to run for loading inference model. See Executor for more details about it.

  • model_filename (str, optional) – One of the following:

    • The name of the file to load the inference program from.

    • If it is None, the default filename __model__ will be used.

    • When dirname is None, it must be set to a string containing the model. Default: None.

  • params_filename (str, optional) – It is only used for the case that all parameters were saved in a single binary file. One of the following:

    • The name of the file to load all parameters from.

    • When dirname is None, it must be set to a string containing all the parameters.

    • If parameters were saved in separate files, set it to None. Default: None.

  • pserver_endpoints (list, optional) – It is only needed for distributed inference. If a distributed lookup table was used during training, that table is also needed by the inference process, and its value is a list of pserver endpoints.


The return of this API is a list with three elements: (program, feed_target_names, fetch_targets). The program is a Program (refer to Basic Concept) used for inference. The feed_target_names is a list of str containing the names of variables that need to be fed data in the inference program. The fetch_targets is a list of Variable (refer to Basic Concept) containing the variables from which the inference results can be fetched.

Return type

list

import paddle
import paddle.fluid as fluid
import numpy as np

paddle.enable_static()

# Build the model
main_prog = fluid.Program()
startup_prog = fluid.Program()
with fluid.program_guard(main_prog, startup_prog):
    data = paddle.static.data(name="img", shape=[-1, 64, 784])
    w = paddle.create_parameter(shape=[784, 200], dtype='float32')
    b = paddle.create_parameter(shape=[200], dtype='float32')
    hidden_w = paddle.matmul(x=data, y=w)
    hidden_b = paddle.add(hidden_w, b)
place = fluid.CPUPlace()
exe = fluid.Executor(place)
exe.run(startup_prog)

# Save the inference model
path = "./infer_model"
fluid.io.save_inference_model(dirname=path, feeded_var_names=['img'],
             target_vars=[hidden_b], executor=exe, main_program=main_prog)

# Demo one. No need to set the distributed lookup table, because the
# training doesn't use a distributed lookup table.
[inference_program, feed_target_names, fetch_targets] = (
    fluid.io.load_inference_model(dirname=path, executor=exe))
tensor_img = np.array(np.random.random((1, 64, 784)), dtype=np.float32)
results = exe.run(inference_program,
              feed={feed_target_names[0]: tensor_img},
              fetch_list=fetch_targets)

# Demo two. If the training uses a distributed lookup table, the pserver
# endpoints list should be supplied when loading the inference model.
# The below is just an example.
endpoints = ["127.0.0.1:2023", "127.0.0.1:2024"]
[dist_inference_program, dist_feed_target_names, dist_fetch_targets] = (
    fluid.io.load_inference_model(dirname=path,
                                  executor=exe,
                                  pserver_endpoints=endpoints))
# In this example, the inference program was saved in the file
# "./infer_model/__model__" and the parameters were saved in
# separate files under the directory "./infer_model".
# With the inference program, feed_target_names and
# fetch_targets, we can use an executor to run the inference
# program and get the inference result.