Basic Usage¶
The recommended way to transform dygraph to static graph is source-code-translate based ProgramTranslator. The basic idea is analyzing Python source code and turning into static graph code, then run the static graph code using Executor. Users could use Python syntax including control flow to build neural networks. Besides, PaddlePaddle has another tracing-based API for transforming dygraph to static graph which called TracedLayer. You can use it as a back-up API in case ProgramTranslator has problem.
ProgramTranslator¶
The basic idea of source-code-translate based ProgramTranslator is analyzing Python source code and turning it into static graph code, then run the static graph code using Executor. The basic usage of ProgramTranslator is simple, put a decorator @paddle.jit.to_static
before the definition of the function to transform (the function can also be a method of a class, e.g., the forward
function of user-defined imperative Layer). An example is:
import paddle
@paddle.jit.to_static
def func(input_var):
# if condition depends on the shape of input_var
if input_var.shape[0] > 1:
out = paddle.cast(input_var, "float64")
else:
out = paddle.cast(input_var, "int64")
return out
in_np = np.array([-2]).astype('int')
input_var = paddle.to_tensor(in_np)
func(input_var)
To save the transformed model, we can call paddle.jit.save
. Let’s take a fully connected network called SimpleFcLayer
as an example, we put decorator at the forward
method of SimpleFcLayer
:
import numpy as np
import paddle
class SimpleFcLayer(paddle.nn.Layer):
def __init__(self, batch_size, feature_size, fc_size):
super(SimpleFcLayer, self).__init__()
self._linear = paddle.nn.Linear(feature_size, fc_size)
self._offset = paddle.to_tensor(
np.random.random((batch_size, fc_size)).astype('float32'))
@paddle.jit.to_static
def forward(self, x):
fc = self._linear(x)
return fc + self._offset
Call paddle.jit.save
to save above model:
import paddle
fc_layer = SimpleFcLayer(3, 4, 2)
in_np = np.random.random([3, 4]).astype('float32')
input_var = paddle.to_tensor(in_np)
out = fc_layer(input_var)
paddle.jit.save(fc_layer, "./fc_layer_dy2stat")
TracedLayer¶
Tracing means recording the operators when running a model. TracedLayer is based on this technique. It runs dygraph program once and records all operators, then constructs static graph model and saves it. Now take a glance at an usage example:
Define a simple fully connected network, note that we don’t add a decorator before forward
function as we did in ProgramTranslator example:
import numpy as np
import paddle
class SimpleFcLayer(paddle.nn.Layer):
def __init__(self, batch_size, feature_size, fc_size):
super(SimpleFcLayer, self).__init__()
self._linear = paddle.nn.Linear(feature_size, fc_size)
self._offset = paddle.to_tensor(
np.random.random((batch_size, fc_size)).astype('float32'))
def forward(self, x):
fc = self._linear(x)
return fc + self._offset
Save model by TracedLayer:
import paddle
from paddle.jit import TracedLayer
fc_layer = SimpleFcLayer(3, 4, 2)
in_np = np.random.random([3, 4]).astype('float32')
# Turn numpy ndarray into Tensor
input_var = paddle.to_tensor(in_np)
# Transforming imperative mode into declarative mode by TracerLayer.trace
out_dygraph, static_layer = TracedLayer.trace(fc_layer, inputs=[input_var])
save_dirname = './saved_infer_model'
# Save the transformed model
static_layer.save_inference_model(save_dirname, feed=[0], fetch=[0])
Load model and run it in static graph mode:
place = paddle.CPUPlace()
exe = paddle.Executor(place)
program, feed_vars, fetch_vars = paddle.static.load_inference_model(save_dirname, exe)
fetch, = exe.run(program, feed={feed_vars[0]: in_np}, fetch_list=fetch_vars)
However, as tracing only records operators once, if user’s code contains Tensor-dependent (including Tensor value or Tensor shape) control flow, that is the Tensor can cause different operators being executed, then TracedLayer cannot handle this case. For instance:
import paddle
def func(input_var)
# if condition depends on the shape of input_var
if input_var.shape[0] > 1:
return paddle.cast(input_var, "float64")
else:
return paddle.cast(input_var, "int64")
in_np = np.array([-2]).astype('int')
input_var = paddle.to_tensor(in_np)
out = func(input_var)
If we apply TracedLayer.trace(func, inputs=[input_var]) on above example, tracing can take record of operators in only one branch of if-else, then the model can not be saved as what user orignally means. The similar situations applies to while/for loop.
Comparing ProgramTranslator and TracedLayer¶
Compared to tracing-based TracedLayer, source-code-translate based ProgramTranslator can handle the Tensor-dependent control flow. So we recommend users to use ProgramTranslator, use TracedLayer as a back-up plan when ProgramTranslator doesn’t work.