Basic Concept¶
Program¶
Fluid describes a neural network configuration as an abstract syntax tree, similar to that of a programming language, and the user's description of the computation is written into a Program. Program in Fluid replaces the concept of a model in traditional frameworks. It can describe an arbitrarily complex model through three execution structures: sequential execution, conditional selection, and loop execution. Writing a Program is very close to writing an ordinary program, so if you have programmed before, your experience carries over directly.
In brief:
- A model is a Fluid Program, and a model can contain more than one Program.
- A Program consists of nested Blocks. A Block is analogous to a pair of braces in C++ or Java, or an indentation block in Python.
- Computation in a Block combines three structures: sequential execution, conditional selection, and loop execution, which together form complex computational logic.
- A Block contains descriptions of computation and of the objects being computed. The description of a computation is called an Operator; the object of computation (the input or output of an Operator) is uniformly represented as a Tensor. In Fluid, a Tensor is represented by a level-0 LoD-Tensor.
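The nesting described above can be pictured with a small data-structure sketch. The classes below are hypothetical stand-ins written in plain Python, not Fluid's own classes; the field names (`idx`, `parent_idx`, `sub_block`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Operator:
    """Toy description of one computation step (not Fluid's Operator class)."""
    type: str
    inputs: List[str]
    outputs: List[str]
    sub_block: Optional[int] = None  # index of a nested Block, if the op owns one


@dataclass
class Block:
    idx: int
    parent_idx: int  # -1 marks the root block
    vars: List[str] = field(default_factory=list)
    ops: List[Operator] = field(default_factory=list)


@dataclass
class Program:
    blocks: List[Block] = field(default_factory=list)

    def create_block(self, parent_idx: int = -1) -> Block:
        block = Block(idx=len(self.blocks), parent_idx=parent_idx)
        self.blocks.append(block)
        return block


program = Program()
root = program.create_block()
root.vars += ["x", "y"]
root.ops.append(Operator("scale", inputs=["x"], outputs=["y"]))

# The loop body lives in its own nested block, like a pair of braces in C++.
body = program.create_block(parent_idx=root.idx)
body.vars.append("step_out")
body.ops.append(Operator("elementwise_add", inputs=["y", "y"], outputs=["step_out"]))
root.ops.append(Operator("while", inputs=["y"], outputs=["step_out"], sub_block=body.idx))

for block in program.blocks:
    print(block.idx, block.parent_idx, [op.type for op in block.ops])
```

The root block holds the sequential operators, and the loop operator only stores the index of the block that contains its body.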
Block¶
Block corresponds to the concept of variable scope in high-level languages. In a programming language, a block is a pair of braces that contains local variable definitions and a series of instructions or operators. The control flow structures of programming languages, such as if-else and for, have the following counterparts in deep learning:
| programming languages | Fluid |
|---|---|
| for, while loop | RNN, WhileOP |
| if-else, switch | IfElseOp, SwitchOp |
| execute sequentially | a series of layers |
As mentioned above, a Block in Fluid describes a set of Operators, organized by sequential execution, conditional selection, or loop execution, together with the objects those Operators act on: Tensors.
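As a rough illustration of how the three execution structures compose, the following sketch interprets a block represented as a list of operator descriptions. The dictionary format and the op names `assign`, `if_else`, and `while` are hypothetical and are not Fluid's operators. For brevity, every nested block shares one scope, which is a simplification.

```python
def run_block(block, scope):
    """Execute the op descriptions in `block` in order, mutating `scope`."""
    for op in block:
        kind = op["type"]
        if kind == "assign":
            scope[op["out"]] = op["fn"](scope)
        elif kind == "if_else":
            # Conditional selection: run exactly one of two nested blocks.
            branch = op["true_block"] if op["cond"](scope) else op["false_block"]
            run_block(branch, scope)
        elif kind == "while":
            # Loop execution: rerun the nested block while the condition holds.
            while op["cond"](scope):
                run_block(op["body"], scope)
        else:
            raise ValueError(f"unknown op type: {kind}")
    return scope


# Sum the even numbers below 6 using all three structures.
main_block = [
    {"type": "assign", "out": "i", "fn": lambda s: 0},
    {"type": "assign", "out": "total", "fn": lambda s: 0},
    {
        "type": "while",
        "cond": lambda s: s["i"] < 6,
        "body": [
            {
                "type": "if_else",
                "cond": lambda s: s["i"] % 2 == 0,
                "true_block": [
                    {"type": "assign", "out": "total",
                     "fn": lambda s: s["total"] + s["i"]},
                ],
                "false_block": [],
            },
            {"type": "assign", "out": "i", "fn": lambda s: s["i"] + 1},
        ],
    },
]

result = run_block(main_block, {})
print(result["total"])  # 0 + 2 + 4 = 6
```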
Operator¶
In Fluid, all operations on data are represented by Operators. In Python, Fluid's Operators are wrapped in modules such as paddle.fluid.layers and paddle.fluid.nets.
This is because some common operations on Tensors consist of several more basic operations. For simplicity, the framework wraps the basic Operators, handling details such as creating the learnable parameters an Operator relies on and initializing them, which reduces the cost of further development.
For more information, refer to Fluid Design Idea.
Variable¶
In Fluid, Variable can contain any type of value – in most cases a LoD-Tensor.
All the learnable parameters in a model are kept in memory in the form of Variables. In most cases, you do not need to create the learnable parameters of the network yourself, because Fluid wraps almost all the common basic computing modules of neural networks. Taking the simplest fully connected model as an example, calling fluid.layers.fc directly creates the two learnable parameters of the fully connected layer, namely the connection weight (W) and the bias, without any explicit call to Variable-related interfaces.
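To make the idea of a layer that creates its own parameters concrete, here is a minimal pure-Python sketch. It is not how fluid.layers.fc is implemented; the helpers `create_parameter`, `mul`, and `elementwise_add`, the `parameters` dictionary, and the fixed `fc_0` prefix are all hypothetical names used only for illustration.

```python
import random

parameters = {}  # name -> value; stands in for the framework's parameter storage


def create_parameter(name, shape, init):
    """Create a rows x cols parameter as nested lists and register it by name."""
    rows, cols = shape
    value = [[init() for _ in range(cols)] for _ in range(rows)]
    parameters[name] = value
    return value


def mul(x, w):
    """Basic operator: matrix product of x (n x k) and w (k x m)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]


def elementwise_add(x, b):
    """Basic operator: add the single-row bias b to every row of x."""
    return [[v + bv for v, bv in zip(row, b[0])] for row in x]


def fc(x, size, prefix="fc_0"):
    """Toy fully connected layer: creates W and bias, then applies mul and add."""
    in_dim = len(x[0])
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    w = create_parameter(prefix + ".w_0", (in_dim, size), lambda: rng.uniform(-1, 1))
    b = create_parameter(prefix + ".b_0", (1, size), lambda: 0.0)
    return elementwise_add(mul(x, w), b)


out = fc([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], size=2)
print(sorted(parameters))     # ['fc_0.b_0', 'fc_0.w_0']
print(len(out), len(out[0]))  # 2 2
```

The caller only passes the input and the output size; the weight and bias appear in the parameter store as a side effect of calling the layer.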
Name¶
In Fluid, some layers, such as fluid.layers.fc, take a name parameter. This name is generally used as the prefix that identifies the layer's output and weights. The specific rules are as follows:
- Prefix identification for the output of layers. If name is specified in the layer, Fluid names the output nameValue.tmp_number. If name is not specified, OPName_number.tmp_number is generated automatically to name the output. The numbers are incremented automatically to distinguish different network layers under the same operator.
- Prefix identification for weight or bias variables. If the weight and bias variables are created through param_attr and bias_attr in an operator, such as fluid.layers.embedding and fluid.layers.fc, Fluid generates prefix.w_number or prefix.b_number as a unique identifier to name them, where prefix is the name specified by the user or the OPName_number generated by default. If name is specified in param_attr and bias_attr, no name is generated automatically. Refer to the sample code for details.
In addition, the weights of multiple network layers can be shared by specifying the name parameter in fluid.ParamAttr.
Sample Code:
```python
import paddle.fluid as fluid
import numpy as np

x = fluid.layers.data(name='x', shape=[1], dtype='int64', lod_level=1)
emb = fluid.layers.embedding(input=x, size=(128, 100))  # embedding_0.w_0
emb = fluid.layers.Print(emb)  # Tensor[embedding_0.tmp_0]

# default name
fc_none = fluid.layers.fc(input=emb, size=1)  # fc_0.w_0, fc_0.b_0
fc_none = fluid.layers.Print(fc_none)  # Tensor[fc_0.tmp_1]

fc_none1 = fluid.layers.fc(input=emb, size=1)  # fc_1.w_0, fc_1.b_0
fc_none1 = fluid.layers.Print(fc_none1)  # Tensor[fc_1.tmp_1]

# name in ParamAttr
w_param_attrs = fluid.ParamAttr(name="fc_weight", learning_rate=0.5, trainable=True)
print(w_param_attrs.name)  # fc_weight

# name == 'my_fc'
my_fc1 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs)  # fc_weight, my_fc.b_0
my_fc1 = fluid.layers.Print(my_fc1)  # Tensor[my_fc.tmp_1]

my_fc2 = fluid.layers.fc(input=emb, size=1, name='my_fc', param_attr=w_param_attrs)  # fc_weight, my_fc.b_1
my_fc2 = fluid.layers.Print(my_fc2)  # Tensor[my_fc.tmp_3]

place = fluid.CPUPlace()
x_data = np.array([[1], [2], [3]]).astype("int64")
x_lodTensor = fluid.create_lod_tensor(x_data, [[1, 2]], place)
exe = fluid.Executor(place)
exe.run(fluid.default_startup_program())
ret = exe.run(feed={'x': x_lodTensor}, fetch_list=[fc_none, fc_none1, my_fc1, my_fc2], return_numpy=False)
```
In the example above, fc_none and fc_none1 do not specify the name parameter, so their outputs are named fc_0.tmp_1 and fc_1.tmp_1 in the form OPName_number.tmp_number, where the numbers in fc_0 and fc_1 are incremented automatically to distinguish the two fully connected layers. The other two fully connected layers, my_fc1 and my_fc2, specify the same value for the name parameter, so Fluid distinguishes their outputs by the tmp_number suffix: my_fc.tmp_1 and my_fc.tmp_3.
Variables created in the emb layer and in fc_none and fc_none1 are named with the OPName_number prefix, for example embedding_0.w_0, fc_0.w_0, and fc_0.b_0, so the prefix matches that of the network layer. The my_fc1 and my_fc2 layers give priority to the name fc_weight specified in ParamAttr for the shared weight. Their bias variables, my_fc.b_0 and my_fc.b_1, have no name specified in an attribute and therefore fall back to the name given to the operator as their prefix.
In the example above, the two fully connected layers my_fc1 and my_fc2 share their weight parameter by constructing a ParamAttr and specifying its name parameter.
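The naming rules can be reproduced with a small simulation, which may help when predicting variable names. This is a simplified model and not Fluid's implementation: it assumes one counter per requested key, and it assumes each fully connected layer emits two temporaries (one for the matrix product and one after the bias is added), which is consistent with the tmp_1 and tmp_3 names in the sample. The functions `generate`, `get_parameter`, and this toy `fc` are hypothetical.

```python
from collections import defaultdict

_counters = defaultdict(int)
parameters = {}  # parameter name -> record; a reused name returns the same record


def generate(key):
    """Return key_N, where N counts how often this exact key was requested."""
    name = f"{key}_{_counters[key]}"
    _counters[key] += 1
    return name


def get_parameter(name):
    """Create the parameter on first use; later requests get the same object."""
    return parameters.setdefault(name, {"name": name})


def fc(name=None, param_name=None):
    """Return (weight, bias, output) names for one toy fully connected layer."""
    prefix = name if name is not None else generate("fc")
    weight = get_parameter(param_name or generate(prefix + ".w"))
    bias = get_parameter(generate(prefix + ".b"))
    generate(prefix + ".tmp")           # temporary for the matrix product
    output = generate(prefix + ".tmp")  # temporary after adding the bias
    return weight["name"], bias["name"], output


fc_none = fc()
fc_none1 = fc()
my_fc1 = fc(name="my_fc", param_name="fc_weight")
my_fc2 = fc(name="my_fc", param_name="fc_weight")

print(fc_none)   # ('fc_0.w_0', 'fc_0.b_0', 'fc_0.tmp_1')
print(fc_none1)  # ('fc_1.w_0', 'fc_1.b_0', 'fc_1.tmp_1')
print(my_fc1)    # ('fc_weight', 'my_fc.b_0', 'my_fc.tmp_1')
print(my_fc2)    # ('fc_weight', 'my_fc.b_1', 'my_fc.tmp_3')
```

Because both `my_fc` layers request the parameter `fc_weight`, the store holds a single record for it, which is the sense in which the weight is shared.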
