Inference Engine

The inference engine provides interfaces to save an inference model (api_fluid_io_save_inference_model) and to load an inference model (api_fluid_io_load_inference_model).

Format of Saved Inference Model

A saved inference model can take one of two formats, controlled by the model_filename and params_filename parameters of the two interfaces above.

  • Parameters are saved into separate files, e.g. with model_filename set to None and params_filename set to None:

    ls recognize_digits_conv.inference.model/*
    __model__ conv2d_1.w_0 conv2d_2.w_0 fc_1.w_0 conv2d_1.b_0 conv2d_2.b_0 fc_1.b_0
  • Parameters are saved into a single file, e.g. with model_filename set to None and params_filename set to __params__:

    ls recognize_digits_conv.inference.model/*
    __model__ __params__
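The two layouts can be told apart just by looking at the directory contents. The following sketch simulates both layouts with empty placeholder files; detect_param_format is a hypothetical helper written for this illustration, not part of the Fluid API:

```python
import os
import tempfile

def detect_param_format(model_dir):
    # Hypothetical helper (not part of the Fluid API): decide which of the
    # two saved-model layouts a directory uses by looking for __params__.
    files = os.listdir(model_dir)
    return "combined" if "__params__" in files else "separate"

# Simulate the two layouts shown above with empty placeholder files.
separate_dir = tempfile.mkdtemp()
for name in ["__model__", "conv2d_1.w_0", "conv2d_1.b_0", "fc_1.w_0"]:
    open(os.path.join(separate_dir, name), "w").close()

combined_dir = tempfile.mkdtemp()
for name in ["__model__", "__params__"]:
    open(os.path.join(combined_dir, name), "w").close()

print(detect_param_format(separate_dir))  # separate
print(detect_param_format(combined_dir))  # combined
```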

Save Inference Model

To save an inference model, we normally use fluid.io.save_inference_model to tailor the default fluid.Program so that only the parts needed to compute predict_var are kept. After tailoring, the program is saved as ./infer_model/__model__ while the parameters are saved into separate files under ./infer_model .

Sample Code:

exe = fluid.Executor(fluid.CPUPlace())
path = "./infer_model"
fluid.io.save_inference_model(dirname=path, feeded_var_names=['img'],
    target_vars=[predict_var], executor=exe)

Load Inference Model

Sample Code:

exe = fluid.Executor(fluid.CPUPlace())
path = "./infer_model"
[inference_program, feed_target_names, fetch_targets] = \
    fluid.io.load_inference_model(dirname=path, executor=exe)
results = exe.run(inference_program,
                  feed={feed_target_names[0]: tensor_img},
                  fetch_list=fetch_targets)

In this example, we first call fluid.io.load_inference_model to obtain the inference inference_program, feed_target_names (the names of the input variables) and fetch_targets (the output variables); then we call the executor to run inference_program and get the inferred results.
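The tensor_img fed above must already exist as input data. As a minimal sketch, assuming the recognize_digits convolutional model mentioned earlier (one 1-channel 28x28 image per batch; the shape is an assumption for this example), a feed value can be an ordinary float32 NumPy array:

```python
import numpy as np

# Hypothetical input for the recognize_digits example: a batch of one
# 1-channel 28x28 image. Feed values are float32 ndarrays keyed by the
# names returned in feed_target_names.
tensor_img = np.random.uniform(-1.0, 1.0, size=(1, 1, 28, 28)).astype("float32")
print(tensor_img.shape, tensor_img.dtype)
```

The dictionary passed to feed then maps feed_target_names[0] (here 'img') to this array.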