AvgPool3D

class paddle.compat.nn.AvgPool3D ( kernel_size: Size3, stride: Size3 | None = None, padding: Size3 = 0, ceil_mode: bool = False, count_include_pad: bool = True, divisor_override: int | None = None ) [source]

This operation applies 3D average pooling over the input features based on the input and the kernel_size, stride and padding parameters. Input(X) and Output(Out) are in NCDHW format, where N is batch size, C is the number of channels, D is the depth of the feature, H is the height of the feature, and W is the width of the feature.

Parameters
  • kernel_size (int|list|tuple) – The pool kernel size. If pool kernel size is a tuple or list, it must contain three integers, (kernel_size_Depth, kernel_size_Height, kernel_size_Width). Otherwise, the pool kernel size will be a cube of an int.

  • stride (int|list|tuple|None, optional) – The pool stride size. If pool stride size is a tuple or list, it must contain three integers, (stride_Depth, stride_Height, stride_Width). Otherwise, the pool stride size will be a cube of an int. Default: None, in which case the stride equals kernel_size.

  • padding (str|int|list|tuple, optional) –

    The padding size. Padding could be in one of the following forms.

    1. A string in [‘valid’, ‘same’].

    2. An int, which means the feature map is zero padded by size of padding on each side.

    3. A list[int] or tuple(int) whose length is 3, [pad_depth, pad_height, pad_width] whose value means the padding size of each dimension.

    4. A list[int] or tuple(int) whose length is 6. [pad_depth_front, pad_depth_back, pad_height_top, pad_height_bottom, pad_width_left, pad_width_right] whose value means the padding size of each side.

    5. A list or tuple of pairs of integers. It has the form [[pad_before, pad_after], [pad_before, pad_after], …]. Note that, the batch dimension and channel dimension should be [0,0] or (0,0).

    The default value is 0.

  • ceil_mode (bool, optional) – Whether to use the ceil function instead of floor to calculate the output shape. Default: False.

  • count_include_pad (bool, optional) – Whether to include padding points in average pooling mode, default is True.

  • divisor_override (int|None, optional) – If specified, it will be used as the divisor; otherwise kernel_size will be used. Default: None.

Returns

A callable object of AvgPool3D.

Shape:
  • x(Tensor): The input tensor of avg pool3d operator, which is a 5-D tensor. The data type can be float16, float32, float64.

  • output(Tensor): The output tensor of avg pool3d operator, which is a 5-D tensor. The data type is same as input x.
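With the default ceil_mode=False, the output size of each spatial dimension follows the standard pooling arithmetic (restated here for clarity; the depth formula is shown, and height and width are analogous):

    D_out = floor((D_in + pad_depth_front + pad_depth_back - kernel_size_Depth) / stride_Depth) + 1

With ceil_mode=True, floor is replaced by ceil, so a trailing partial window also produces an output element.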

Examples

>>> import paddle
>>> import paddle.compat.nn as nn

>>> # avg pool3d
>>> input = paddle.uniform([1, 2, 3, 32, 32], dtype="float32", min=-1, max=1)
>>> AvgPool3D = nn.AvgPool3D(kernel_size=2, stride=2, padding=0)
>>> output = AvgPool3D(input)
>>> print(output.shape)
paddle.Size([1, 2, 1, 16, 16])
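
The effect of ceil_mode can be illustrated by continuing the example above (a sketch; the printed Size format follows the compat module's convention shown above). The odd input depth of 3 keeps its trailing partial window when ceil_mode=True:

>>> pool_ceil = nn.AvgPool3D(kernel_size=2, stride=2, padding=0, ceil_mode=True)
>>> print(pool_ceil(input).shape)
paddle.Size([1, 2, 2, 16, 16])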
forward ( input: Tensor ) Tensor

forward

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments

extra_repr ( ) str

extra_repr

Extra representation of this layer; you can provide a custom implementation in your own layer.
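
Examples

A minimal sketch (the ScaledLayer class and its scale field are illustrative; the printed repr assumes the usual ClassName(extra_repr) layout seen elsewhere on this page):

>>> import paddle

>>> class ScaledLayer(paddle.nn.Layer):
...     def __init__(self, scale):
...         super().__init__()
...         self.scale = scale
...
...     def extra_repr(self):
...         # this string is rendered inside the parentheses of print(layer)
...         return f'scale={self.scale}'
...
>>> print(ScaledLayer(2.0))
ScaledLayer(scale=2.0)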

add_module ( name: str, module: paddle.nn.layer.layers.Layer | None ) None

add_module

Adds a sub layer instance. Added layer can be accessed by self.name

Parameters
  • name (str) – name of this sublayer.

  • module (Layer) – an instance of Layer.

Returns

None

add_parameter ( name: str, parameter: Tensor ) Tensor

add_parameter

Adds a Parameter instance.

Added parameter can be accessed by self.name

Parameters
  • name (str) – name of the parameter.

  • parameter (Parameter) – an instance of Parameter.

Returns

Parameter, the parameter passed in.

Examples

>>> import paddle
>>> paddle.seed(100)

>>> class MyLayer(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self._linear = paddle.nn.Linear(1, 1)
...         w_tmp = self.create_parameter([1,1])
...         self.add_parameter("w_tmp", w_tmp)
...
...     def forward(self, input):
...         return self._linear(input)
...
>>> mylayer = MyLayer()
>>> for name, param in mylayer.named_parameters():
...     print(name, param)
w_tmp Parameter containing:
Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False,
[[-1.01448846]])
_linear.weight Parameter containing:
Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False,
[[0.18551230]])
_linear.bias Parameter containing:
Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=False,
[0.])
add_sublayer ( name: str, sublayer: Layer ) Layer

add_sublayer

Adds a sub Layer instance.

Added sublayer can be accessed by self.name

Parameters
  • name (str) – name of this sublayer.

  • sublayer (Layer) – an instance of Layer.

Returns

Layer, the sublayer passed in.

Examples

>>> import paddle

>>> class MySequential(paddle.nn.Layer):
...     def __init__(self, *layers):
...         super().__init__()
...         if len(layers) > 0 and isinstance(layers[0], tuple):
...             for name, layer in layers:
...                 self.add_sublayer(name, layer)
...         else:
...             for idx, layer in enumerate(layers):
...                 self.add_sublayer(str(idx), layer)
...
...     def forward(self, input):
...         for layer in self._sub_layers.values():
...             input = layer(input)
...         return input
...
>>> fc1 = paddle.nn.Linear(10, 3)
>>> fc2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = MySequential(fc1, fc2)
>>> for prefix, layer in model.named_sublayers():
...     print(prefix, layer)
0 Linear(in_features=10, out_features=3, dtype=float32)
1 Linear(in_features=3, out_features=10, dtype=float32)
apply ( fn: Callable[[Self], None] ) Self

apply

Applies fn recursively to every sublayer (as returned by .sublayers()) as well as self. Typical use includes initializing the parameters of a model.

Parameters

fn (function) – a function to be applied to each sublayer

Returns

Layer, self

Examples

>>> import paddle
>>> import paddle.nn as nn
>>> paddle.seed(2023)

>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))

>>> def init_weights(layer):
...     if type(layer) == nn.Linear:
...         print('before init weight:', layer.weight.numpy())
...         new_weight = paddle.full(shape=layer.weight.shape, dtype=layer.weight.dtype, fill_value=0.9)
...         layer.weight.set_value(new_weight)
...         print('after init weight:', layer.weight.numpy())
...
>>> net.apply(init_weights)

>>> print(net.state_dict())
before init weight: [[ 0.89611185  0.04935038]
                     [-0.5888344   0.99266374]]
after init weight: [[0.9 0.9]
                    [0.9 0.9]]
before init weight: [[-0.18615901 -0.22924072]
                     [ 1.1517721   0.59859073]]
after init weight: [[0.9 0.9]
                    [0.9 0.9]]
OrderedDict([('0.weight', Parameter containing:
Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False,
[[0.89999998, 0.89999998],
 [0.89999998, 0.89999998]])), ('0.bias', Parameter containing:
Tensor(shape=[2], dtype=float32, place=Place(cpu), stop_gradient=False,
[0., 0.])), ('1.weight', Parameter containing:
Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False,
[[0.89999998, 0.89999998],
 [0.89999998, 0.89999998]])), ('1.bias', Parameter containing:
Tensor(shape=[2], dtype=float32, place=Place(cpu), stop_gradient=False,
[0., 0.]))])
astype ( dtype: DTypeLike | None = None ) Self

astype

Casts all parameters and buffers to dtype and then return the Layer.

Parameters

dtype (str|paddle.dtype|numpy.dtype) – target data type of layer. If set str, it can be “bool”, “bfloat16”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8”, “complex64”, “complex128”. Default: None

Returns

Layer, self

Examples

>>> import paddle
>>> import paddle.nn as nn
>>> weight_attr = paddle.ParamAttr(name="weight",initializer=paddle.nn.initializer.Constant(value=1.5))
>>> bias_attr = paddle.ParamAttr(name="bias",initializer=paddle.nn.initializer.Constant(value=2.5))

>>> linear = paddle.nn.Linear(2, 2, weight_attr=weight_attr, bias_attr=bias_attr).to(device="cpu",dtype="float32")
>>> print(linear)
Linear(in_features=2, out_features=2, dtype=float32)
>>> print(linear.parameters())
[Parameter containing:
Tensor(shape=[2, 2], dtype=float32, place=Place(cpu), stop_gradient=False,
    [[1.50000000, 1.50000000],
        [1.50000000, 1.50000000]]), Parameter containing:
Tensor(shape=[2], dtype=float32, place=Place(cpu), stop_gradient=False,
    [2.50000000, 2.50000000])]

>>> linear=linear.astype("int8")
>>> print(linear)
Linear(in_features=2, out_features=2, dtype=paddle.int8)
>>> print(linear.parameters())
[Parameter containing:
Tensor(shape=[2, 2], dtype=int8, place=Place(cpu), stop_gradient=False,
    [[1, 1],
        [1, 1]]), Parameter containing:
Tensor(shape=[2], dtype=int8, place=Place(cpu), stop_gradient=False,
    [2, 2])]
bfloat16 ( excluded_layers: Layer | Sequence[Layer] | None = None ) Self

bfloat16

Casts all floating point parameters and buffers to bfloat16 data type.

Note

nn.BatchNorm does not support bfloat16 weights, so it would not be converted by default.

Parameters

excluded_layers (nn.Layer|list|tuple|None, optional) – Specify the layers that need to be kept in their original data type. If excluded_layers is None, casts all floating point parameters and buffers except nn.BatchNorm. Default: None.

Returns

self

Return type

Layer

Examples

>>> import paddle

>>> class Model(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self.linear = paddle.nn.Linear(1, 1)
...         self.dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         out = self.linear(input)
...         out = self.dropout(out)
...         return out
...
>>> model = Model()
>>> model.bfloat16()
>>> #UserWarning: Paddle compiled by the user does not support bfloat16, so keep original data type.
Model(
    (linear): Linear(in_features=1, out_features=1, dtype=float32)
    (dropout): Dropout(p=0.5, axis=None, mode=upscale_in_train)
)
buffers ( include_sublayers: bool = True ) list[paddle.Tensor]

buffers

Returns a list of all buffers from current layer and its sub-layers.

Parameters

include_sublayers (bool, optional) – Whether include the buffers of sublayers. If True, also include the buffers from sublayers. Default: True.

Returns

list of Tensor, a list of buffers.

Examples

>>> import numpy as np
>>> import paddle

>>> linear = paddle.nn.Linear(10, 3)
>>> value = np.array([0]).astype("float32")
>>> buffer = paddle.to_tensor(value)
>>> linear.register_buffer("buf_name", buffer, persistable=True)

>>> print(linear.buffers())
[Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
[0.])]
children ( ) Iterable[Layer]

children

Returns an iterator over immediate children layers.

Yields

Layer – a child layer

Examples

>>> import paddle

>>> linear1 = paddle.nn.Linear(10, 3)
>>> linear2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = paddle.nn.Sequential(linear1, linear2)

>>> layer_list = list(model.children())

>>> print(layer_list)
[Linear(in_features=10, out_features=3, dtype=float32), Linear(in_features=3, out_features=10, dtype=float32)]
clear_gradients ( set_to_zero: bool = True ) None

clear_gradients

Clear the gradients of all parameters for this layer.

Parameters

set_to_zero (bool, optional) – Whether to set the trainable parameters’ gradients to zero or None. Default is True.

Returns

None

Examples

>>> import paddle
>>> import numpy as np

>>> value = np.arange(26).reshape(2, 13).astype("float32")
>>> a = paddle.to_tensor(value)
>>> linear = paddle.nn.Linear(13, 5)
>>> adam = paddle.optimizer.Adam(learning_rate=0.01,
...                              parameters=linear.parameters())
>>> out = linear(a)
>>> out.backward()
>>> adam.step()
>>> linear.clear_gradients()
cpu ( ) Self

cpu

Move all model parameters and buffers to the CPU.

Returns

self

Return type

Layer
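
Examples

A minimal sketch (on a CPU-only build the call is a no-op; the printed place follows the Place(cpu) convention shown elsewhere on this page):

>>> import paddle

>>> linear = paddle.nn.Linear(2, 2).cpu()
>>> print(linear.weight.place)
Place(cpu)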

create_parameter ( shape: ShapeLike, attr: ParamAttrLike | None = None, dtype: DTypeLike | None = None, is_bias: bool = False, default_initializer: Initializer | None = None, device: PlaceLike | None = None ) Tensor

create_parameter

Create parameters for this layer.

Parameters
  • shape (list) – Shape of the parameter. The data type in the list must be int.

  • attr (ParamAttr, optional) – Parameter attribute of weight. Please refer to ParamAttr. Default: None.

  • dtype (str, optional) – Data type of this parameter. If set str, it can be “bool”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8” or “uint16”. Default: “float32”.

  • is_bias (bool, optional) – if this is a bias parameter. Default: False.

  • default_initializer (Initializer, optional) – the default initializer for this parameter. If set None, default initializer will be set to paddle.nn.initializer.Xavier and paddle.nn.initializer.Constant for non-bias and bias parameter, respectively. Default: None.

  • device (PlaceLike, optional) – the device place for the parameter. Default: None.

Returns

Tensor, created parameter.

Examples

>>> import paddle
>>> paddle.seed(2023)

>>> class MyLayer(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self._linear = paddle.nn.Linear(1, 1)
...         w_tmp = self.create_parameter([1,1])
...         self.add_parameter("w_tmp", w_tmp)
...
...     def forward(self, input):
...         return self._linear(input)
...
>>> mylayer = MyLayer()
>>> for name, param in mylayer.named_parameters():
...     print(name, param)      # will print w_tmp,_linear.weight,_linear.bias
w_tmp Parameter containing:
Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False,
[[0.06979191]])
_linear.weight Parameter containing:
Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False,
[[1.26729357]])
_linear.bias Parameter containing:
Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=False,
[0.])
create_tensor ( name: str | None = None, persistable: bool | None = None, dtype: DTypeLike | None = None ) Tensor

create_tensor

Create Tensor for this layer.

Parameters
  • name (str, optional) – name of the tensor. Please refer to api_guide_Name . Default: None.

  • persistable (bool, optional) – whether this tensor is persistable. Default: False.

  • dtype (str, optional) – data type of this parameter. If set str, it can be “bool”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8” or “uint16”. If set None, it will be “float32”. Default: None.

Returns

Tensor, created Tensor.

Examples

>>> import paddle

>>> class MyLinear(paddle.nn.Layer):
...     def __init__(self,
...                  in_features,
...                  out_features):
...         super().__init__()
...         self.linear = paddle.nn.Linear(10, 10)
...
...         self.back_var = self.create_tensor(name = "linear_tmp_0", dtype=self._dtype)
...
...     def forward(self, input):
...         out = self.linear(input)
...         paddle.assign(out, self.back_var)
...
...         return out
create_variable ( name: str | None = None, persistable: bool | None = None, dtype: DTypeLike | None = None ) Tensor

create_variable

Warning

API “paddle.nn.layer.layers.create_variable” is deprecated since 2.0.0, and will be removed in future versions. Please use “paddle.nn.Layer.create_tensor” instead. Reason: New api in create_tensor, easier to use.

Create Tensor for this layer.

Parameters
  • name (str, optional) – name of the tensor. Please refer to api_guide_Name . Default: None

  • persistable (bool, optional) – whether this tensor is persistable. Default: False

  • dtype (str, optional) – data type of this parameter. If set str, it can be “bool”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8” or “uint16”. If set None, it will be “float32”. Default: None

Returns

Tensor, created Tensor.

Examples

>>> import paddle

>>> class MyLinear(paddle.nn.Layer):
...     def __init__(self,
...                 in_features,
...                 out_features):
...         super().__init__()
...         self.linear = paddle.nn.Linear(10, 10)
...
...         self.back_var = self.create_variable(name = "linear_tmp_0", dtype=self._dtype)
...
...     def forward(self, input):
...         out = self.linear(input)
...         paddle.assign(out, self.back_var)
...
...         return out
cuda ( device: int | PlaceLike | None = None ) Self

cuda

Move all model parameters and buffers to the GPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing the optimizer if the layer will live on GPU while being optimized.

Parameters

device (int, optional) – if specified, all parameters will be copied to that device.

Returns

self

Return type

Layer

double ( ) Self

double

Casts all floating point parameters and buffers to double datatype.

Returns

self

Return type

Module
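
Examples

A minimal sketch:

>>> import paddle

>>> linear = paddle.nn.Linear(2, 2).double()
>>> print(linear.weight.dtype)
paddle.float64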

eval ( ) Self

eval

Sets this Layer and all its sublayers to evaluation mode. This only affects certain modules like Dropout and BatchNorm.

Returns

self

Return type

Layer

Examples

>>> import paddle
>>> paddle.seed(100)
>>> class MyLayer(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self._linear = paddle.nn.Linear(1, 1)
...         self._dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         temp = self._linear(input)
...         temp = self._dropout(temp)
...         return temp
...
>>> x = paddle.randn([10, 1], 'float32')
>>> mylayer = MyLayer()
>>> mylayer.eval()  # set mylayer._dropout to eval mode
>>> out = mylayer(x)
>>> print(out)
Tensor(shape=[10, 1], dtype=float32, place=Place(cpu), stop_gradient=False,
[[-1.72439659],
 [ 0.31532824],
 [ 0.01192369],
 [-0.36912638],
 [-1.63426113],
 [-0.93169814],
 [ 0.32222399],
 [-1.61092973],
 [ 0.77209264],
 [-0.34038994]])
float ( excluded_layers: Layer | Sequence[Layer] | None = None ) Self

float

Casts all floating point parameters and buffers to float data type.

Parameters

excluded_layers (nn.Layer|list|tuple|None, optional) – Specify the layers that need to be kept in their original data type. If excluded_layers is None, casts all floating point parameters and buffers. Default: None.

Returns

self

Return type

Layer

Examples

>>> import paddle

>>> class Model(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self.linear = paddle.nn.Linear(1, 1)
...         self.dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         out = self.linear(input)
...         out = self.dropout(out)
...         return out
>>> model = Model()
>>> model.float()
Model(
    (linear): Linear(in_features=1, out_features=1, dtype=paddle.float32)
    (dropout): Dropout(p=0.5, axis=None, mode=upscale_in_train, inplace=False)
)
float16 ( excluded_layers: Layer | Sequence[Layer] | None = None ) Self

float16

Casts all floating point parameters and buffers to float16 data type.

Note

nn.BatchNorm does not support float16 weights, so it would not be converted by default.

Parameters

excluded_layers (nn.Layer|list|tuple|None, optional) – Specify the layers that need to be kept in their original data type. If excluded_layers is None, casts all floating point parameters and buffers except nn.BatchNorm. Default: None.

Returns

self

Return type

Layer

Examples

>>> import paddle

>>> class Model(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self.linear = paddle.nn.Linear(1, 1)
...         self.dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         out = self.linear(input)
...         out = self.dropout(out)
...         return out
...
>>> model = Model()
>>> model.float16()
Model(
    (linear): Linear(in_features=1, out_features=1, dtype=float32)
    (dropout): Dropout(p=0.5, axis=None, mode=upscale_in_train)
)
full ( aoa_config: dict[str, list[str]] | None = None, **kwargs )

full

Returns an iterator over the full, unsharded model parameters. The output parameters can be customized using the aoa_config argument.

Parameters
  • sharded_state_dict (ShardedStateDict) – The state dict containing parameter shards local to the current process.

  • aoa_config (dict[str, list[str]]|None, optional) – AoA (Almost AllReduce) configuration. Default is None.

  • kwargs – Optional keyword arguments:

    • h_group: The horizontal communication group. If using group communication, both h_group and v_group must be provided.

    • v_group: The vertical communication group.

    • process_group: The communication group in single-group setups (when h_group and v_group are not used).

    • num_splits (int): The number of splits to divide the parameters.

    • shard_idx (int): The index of the split handled by the current process. Default is 0.

    • memory_growth_threshold (int): The memory threshold (in bytes) for controlling memory growth during parameter assembly. Default is 8 * (2 ** 30), i.e., 8GB.

Returns

An iterator over the full, unsharded model parameters, optionally filtered and customized according to aoa_config.

Return type

Iterator

full_name ( ) str

full_name

Full name for this layer, composed by name_scope + “/” + MyLayer.__class__.__name__

Returns

str, full name of this layer.

Examples

>>> import paddle

>>> class LinearNet(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__(name_scope = "demo_linear_net")
...         self._linear = paddle.nn.Linear(1, 1)
...
...     def forward(self, x):
...         return self._linear(x)
...
>>> linear_net = LinearNet()
>>> print(linear_net.full_name())
demo_linear_net_0
get_buffer ( target: str ) Tensor

get_buffer

Return the buffer given by target if it exists, otherwise throw an error.

See the docstring for get_sublayer for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Parameters

target (str) – The fully-qualified string name of the buffer to look for.

Returns

The buffer referenced by target.

Return type

Tensor
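
Examples

A minimal sketch (the buffer registration mirrors the register_buffer example elsewhere on this page; the identity check avoids relying on a particular tensor repr):

>>> import paddle

>>> linear = paddle.nn.Linear(2, 2)
>>> linear.register_buffer("scale", paddle.ones([1]))
>>> print(linear.get_buffer("scale") is linear.scale)
True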

get_parameter ( target: str ) Parameter

get_parameter

Return the parameter given by target if it exists, otherwise throw an error.

Parameters

target (str) – The fully-qualified string name of the Parameter to look for.

Returns

The Parameter referenced by target.

Return type

Parameter

get_sublayer ( target: str ) Layer

get_sublayer

Return the submodule given by target if it exists, otherwise throw an error.

Parameters

target (str) – The fully-qualified string name of the submodule to look for.

Returns

The sublayer referenced by target.

Return type

Layer
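
Examples

A minimal sketch (sublayers of a Sequential are addressed by their index names, as in the named_sublayers output shown elsewhere on this page):

>>> import paddle

>>> model = paddle.nn.Sequential(paddle.nn.Linear(10, 3), paddle.nn.Linear(3, 10))
>>> print(model.get_sublayer("1"))
Linear(in_features=3, out_features=10, dtype=float32)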

get_submodule ( target: str ) Layer

get_submodule

Return the submodule given by target if it exists, otherwise throw an error.

Parameters

target (str) – The fully-qualified string name of the submodule to look for.

Returns

The sublayer referenced by target.

Return type

Layer

half ( ) Self

half

Casts all floating point parameters and buffers to half datatype.

Returns

self

Return type

Module

load_dict ( state_dict: Union[dict[str, paddle.Tensor], OrderedDict[str, Tensor]], use_structured_name: bool = True ) tuple[list[str], list[str]]

load_dict

Set parameters and persistable buffers from state_dict. All the parameters and buffers will be reset by the tensor in the state_dict

Parameters
  • state_dict (dict) – Dict contains all the parameters and persistable buffers.

  • use_structured_name (bool, optional) – If true, use structured name as key, otherwise, use parameter or buffer name as key. Default: True.

Returns

missing_keys is a list of str containing the missing keys; unexpected_keys is a list of str containing the unexpected keys.

Return type

tuple[list[str], list[str]]

Examples

>>> import paddle

>>> emb = paddle.nn.Embedding(10, 10)

>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
>>> para_state_dict = paddle.load("paddle_dy.pdparams")
>>> emb.set_state_dict(para_state_dict)
load_state_dict ( state_dict: Mapping[str, Any], strict: bool = True, assign: bool = False )

load_state_dict

Copy parameters and buffers from state_dict into this module and its descendants.

If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Parameters
  • state_dict (dict) – a dict containing parameters and persistent buffers.

  • strict (bool, optional) – whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True

  • assign (bool, optional) – When set to False, the properties of the tensors in the current module are preserved whereas setting it to True preserves properties of the Tensors in the state dict. The only exception is the requires_grad field of Parameter for which the value from the module is preserved. Default: False

Returns

  • missing_keys is a list of str containing any keys that are expected by this module but missing from the provided state_dict.

  • unexpected_keys is a list of str containing the keys that are not expected by this module but present in the provided state_dict.

Return type

NamedTuple with missing_keys and unexpected_keys fields
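
Examples

A minimal sketch (assuming the returned named tuple exposes the two fields described above; loading a layer's own state dict leaves both lists empty):

>>> import paddle

>>> emb = paddle.nn.Embedding(10, 10)
>>> result = emb.load_state_dict(emb.state_dict())
>>> print(result.missing_keys, result.unexpected_keys)
[] []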

modules ( ) Iterator[Layer]

modules

Return an iterator over all modules in the network.

Yields

Layer – a layer in the network.
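
Examples

A minimal sketch (assuming the layer itself is yielded first, as with named_sublayers(include_self=True)):

>>> import paddle

>>> model = paddle.nn.Sequential(paddle.nn.Linear(2, 2), paddle.nn.ReLU())
>>> for m in model.modules():
...     print(type(m).__name__)
Sequential
Linear
ReLU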

named_buffers ( prefix: str = '', include_sublayers: bool = True, remove_duplicate: bool = True ) Iterable[tuple[str, Tensor]]

named_buffers

Returns an iterator over all buffers in the Layer, yielding tuple of name and Tensor.

Parameters
  • prefix (str, optional) – Prefix to prepend to all buffer names. Default: ‘’.

  • include_sublayers (bool, optional) – Whether include the buffers of sublayers. If True, also include the named buffers from sublayers. Default: True.

  • remove_duplicate (bool, optional) – Whether to remove duplicated buffers in the result. Default: True.

Yields

(string, Tensor) – Tuple of name and tensor

Examples

>>> import numpy as np
>>> import paddle

>>> fc1 = paddle.nn.Linear(10, 3)
>>> buffer1 = paddle.to_tensor(np.array([0]).astype("float32"))
>>> # register a tensor as buffer by specific `persistable`
>>> fc1.register_buffer("buf_name_1", buffer1, persistable=True)

>>> fc2 = paddle.nn.Linear(3, 10)
>>> buffer2 = paddle.to_tensor(np.array([1]).astype("float32"))
>>> # register a buffer by assigning an attribute with Tensor.
>>> # The `persistable` can only be False by this way.
>>> fc2.buf_name_2 = buffer2

>>> model = paddle.nn.Sequential(fc1, fc2)

>>> # get all named buffers
>>> for name, buffer in model.named_buffers():
...     print(name, buffer)
0.buf_name_1 Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
[0.])
1.buf_name_2 Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
[1.])
named_children ( ) Iterable[tuple[str, Layer]]

named_children

Returns an iterator over immediate children layers, yielding both the name of the layer as well as the layer itself.

Yields

(string, Layer) – Tuple containing a name and child layer

Examples

>>> import paddle

>>> linear1 = paddle.nn.Linear(10, 3)
>>> linear2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = paddle.nn.Sequential(linear1, linear2)
>>> for prefix, layer in model.named_children():
...     print(prefix, layer)
0 Linear(in_features=10, out_features=3, dtype=float32)
1 Linear(in_features=3, out_features=10, dtype=float32)
named_modules ( memo: Optional[set[paddle.nn.layer.layers.Layer]] = None, prefix: str = '', remove_duplicate: bool = True )

named_modules

Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer. The duplicate sublayer will only be yielded once.

Parameters
  • memo (set, optional) – The set to record duplicate sublayers. Default: None.

  • prefix (str, optional) – Prefix to prepend to all parameter names. Default: ‘’.

  • remove_duplicate (bool, optional) – Whether to remove duplicated sublayers in the result. Default: True.

Yields

(string, Layer) – Tuple of name and Layer
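
Examples

A minimal sketch (assuming the root layer is yielded with an empty name, matching the named_sublayers(include_self=True) output below):

>>> import paddle

>>> model = paddle.nn.Sequential(paddle.nn.Linear(2, 2), paddle.nn.Linear(2, 2))
>>> for name, layer in model.named_modules():
...     print(repr(name), type(layer).__name__)
'' Sequential
'0' Linear
'1' Linear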

named_parameters ( prefix: str = '', include_sublayers: bool = True, remove_duplicate: bool = True ) Iterable[tuple[str, Tensor]]

named_parameters

Returns an iterator over all parameters in the Layer, yielding tuple of name and parameter.

Parameters
  • prefix (str, optional) – Prefix to prepend to all parameter names. Default: ‘’.

  • include_sublayers (bool, optional) – Whether include the parameters of sublayers. If True, also include the named parameters from sublayers. Default: True.

  • remove_duplicate (bool, optional) – Whether to remove duplicated parameters in the result. Default: True.

Yields

(string, Parameter) – Tuple of name and Parameter

Examples

>>> import paddle
>>> paddle.seed(100)

>>> fc1 = paddle.nn.Linear(10, 3)
>>> fc2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = paddle.nn.Sequential(fc1, fc2)
>>> for name, param in model.named_parameters():
...     print(name, param)
0.weight Parameter containing:
Tensor(shape=[10, 3], dtype=float32, place=Place(cpu), stop_gradient=False,
[[ 0.07276392, -0.39791510, -0.66356444],
 [ 0.02143478, -0.18519843, -0.32485050],
 [-0.42249614,  0.08450919, -0.66838276],
 [ 0.38208580, -0.24303678,  0.55127048],
 [ 0.47745085,  0.62117910, -0.08336520],
 [-0.28653207,  0.47237599, -0.05868882],
 [-0.14385653,  0.29945642,  0.12832761],
 [-0.21237159,  0.38539791, -0.62760031],
 [ 0.02637231,  0.20621127,  0.43255770],
 [-0.19984481, -0.26259184, -0.29696006]])
0.bias Parameter containing:
Tensor(shape=[3], dtype=float32, place=Place(cpu), stop_gradient=False,
[0., 0., 0.])
1.weight Parameter containing:
Tensor(shape=[3, 10], dtype=float32, place=Place(cpu), stop_gradient=False,
[[ 0.01985580, -0.40268910,  0.41172385, -0.47249708, -0.09002256,
 -0.00533628, -0.52048630,  0.62360322,  0.20848787, -0.02033746],
 [ 0.58281910,  0.12841827,  0.12907702,  0.02325618, -0.07746267,
 0.31950659, -0.37924835, -0.59209681, -0.11732036, -0.58378261],
 [-0.62100595,  0.22293305,  0.28229684, -0.03687060, -0.59323978,
 0.08411229,  0.53275704,  0.40431368,  0.03171402, -0.17922515]])
named_sublayers ( prefix: str = '', include_self: bool = False, layers_set: set[Layer] | None = None, remove_duplicate: bool = True ) Iterable[tuple[str, Layer]]

named_sublayers

Returns an iterator over all sublayers in the Layer, yielding tuple of name and sublayer. The duplicate sublayer will only be yielded once.

Parameters
  • prefix (str, optional) – Prefix to prepend to all parameter names. Default: ‘’.

  • include_self (bool, optional) – Whether include the Layer itself. Default: False.

  • layers_set (set, optional) – The set to record duplicate sublayers. Default: None.

  • remove_duplicate (bool, optional) – Whether to remove duplicated sublayers in the result. Default: True.

Yields

(string, Layer) – Tuple of name and Layer

Examples

>>> import paddle

>>> fc1 = paddle.nn.Linear(10, 3)
>>> fc2 = paddle.nn.Linear(3, 10, bias_attr=False)
>>> model = paddle.nn.Sequential(fc1, fc2)
>>> for prefix, layer in model.named_sublayers():
...     print(prefix, layer)
0 Linear(in_features=10, out_features=3, dtype=float32)
1 Linear(in_features=3, out_features=10, dtype=float32)

>>> l = paddle.nn.Linear(10, 3)
>>> model = paddle.nn.Sequential(l, l)
>>> for prefix, layer in model.named_sublayers(include_self=True, remove_duplicate=True):
...     print(prefix, layer)
 Sequential(
  (0): Linear(in_features=10, out_features=3, dtype=float32)
  (1): Linear(in_features=10, out_features=3, dtype=float32)
)
0 Linear(in_features=10, out_features=3, dtype=float32)

>>> l = paddle.nn.Linear(10, 3)
>>> model = paddle.nn.Sequential(l, l)
>>> for prefix, layer in model.named_sublayers(include_self=True, remove_duplicate=False):
...     print(prefix, layer)
 Sequential(
  (0): Linear(in_features=10, out_features=3, dtype=float32)
  (1): Linear(in_features=10, out_features=3, dtype=float32)
)
0 Linear(in_features=10, out_features=3, dtype=float32)
1 Linear(in_features=10, out_features=3, dtype=float32)
parameters ( include_sublayers: bool = True ) list[paddle.Tensor]

parameters

Returns a list of all Parameters from current layer and its sub-layers.

Parameters

include_sublayers (bool, optional) – Whether to return the parameters of the sublayer. If True, the returned list contains the parameters of the sublayer. Default: True.

Returns

list, list of Tensor, a list of Parameters.

Examples

>>> import paddle
>>> paddle.seed(100)

>>> linear = paddle.nn.Linear(1, 1)
>>> print(linear.parameters())
[Parameter containing:
Tensor(shape=[1, 1], dtype=float32, place=Place(cpu), stop_gradient=False,
[[0.18551230]]), Parameter containing:
Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=False,
[0.])]
register_buffer ( name: str, tensor: Tensor, persistable: bool = True ) None

register_buffer

Registers a tensor as buffer into the layer.

buffer is a non-trainable tensor and will not be updated by optimizer, but is necessary for evaluation and inference. For example, the mean and variance in BatchNorm layers. The registered buffer is persistable by default, and will be saved into state_dict alongside parameters. If set persistable=False, it registers a non-persistable buffer, so that it will not be a part of state_dict .

Buffers can be accessed as attributes using given names.

Parameters
  • name (string) – name of the buffer. The buffer can be accessed from this layer using the given name

  • tensor (Tensor) – the tensor to be registered as buffer.

  • persistable (bool) – whether the buffer is part of this layer’s state_dict.

Returns

None

Examples

>>> import numpy as np
>>> import paddle

>>> linear = paddle.nn.Linear(10, 3)
>>> value = np.array([0]).astype("float32")
>>> buffer = paddle.to_tensor(value)
>>> linear.register_buffer("buf_name", buffer, persistable=True)

>>> # get the buffer by attribute.
>>> print(linear.buf_name)
Tensor(shape=[1], dtype=float32, place=Place(cpu), stop_gradient=True,
[0.])
register_forward_hook ( hook: Union[Callable[[Layer, Tensor, Tensor], Tensor], Callable[[Layer, Tensor, dict[str, Any], Tensor], Tensor]], *, prepend: bool = False, with_kwargs: bool = False, always_call: bool = False ) HookRemoveHelper

register_forward_hook

Register a forward post-hook for Layer. The hook will be called after forward function has been computed.

It should have the following form: the input and output of the hook are the input and output of the Layer, respectively. Users can use a forward post-hook to change the output of the Layer or to perform information statistics tasks on the Layer.

hook(Layer, input, output) -> None or modified output

Parameters
  • hook (function) – a function registered as a forward post-hook

  • prepend (bool) – If True, the provided hook will be fired before all existing forward_post hooks on this paddle.nn.Layer. Default: False

  • with_kwargs (bool) – If True, the hook will be passed the kwargs given to the forward function. Default: False

  • always_call (bool) – If True the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns

HookRemoveHelper, a HookRemoveHelper object that can be used to remove the added hook by calling hook_remove_helper.remove() .

Examples

>>> import paddle
>>> import numpy as np

>>> # the forward_post_hook change the output of the layer: output = output * 2
>>> def forward_post_hook(layer, input, output):
...     # user can use layer, input and output for information statistics tasks
...
...     # change the output
...     return output * 2
...
>>> linear = paddle.nn.Linear(13, 5)

>>> # register the hook
>>> forward_post_hook_handle = linear.register_forward_hook(forward_post_hook)

>>> value1 = np.arange(26).reshape(2, 13).astype("float32")
>>> in1 = paddle.to_tensor(value1)

>>> out0 = linear(in1)

>>> # remove the hook
>>> forward_post_hook_handle.remove()

>>> out1 = linear(in1)

>>> # hook change the linear's output to output * 2, so out0 is equal to out1 * 2.
>>> assert (out0.numpy() == out1.numpy() * 2).all()
register_forward_post_hook ( hook: Union[Callable[[Layer, Tensor, Tensor], Tensor], Callable[[Layer, Tensor, dict[str, Any], Tensor], Tensor]], *, prepend: bool = False, with_kwargs: bool = False, always_call: bool = False ) HookRemoveHelper

register_forward_post_hook

Register a forward post-hook for Layer. The hook will be called after forward function has been computed.

It should have the following form: the input and output of the hook are the input and output of the Layer, respectively. Users can use a forward post-hook to change the output of the Layer or to perform information statistics tasks on the Layer.

hook(Layer, input, output) -> None or modified output

Parameters
  • hook (function) – a function registered as a forward post-hook

  • prepend (bool) – If True, the provided hook will be fired before all existing forward_post hooks on this paddle.nn.Layer. Default: False

  • with_kwargs (bool) – If True, the hook will be passed the kwargs given to the forward function. Default: False

  • always_call (bool) – If True the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns

HookRemoveHelper, a HookRemoveHelper object that can be used to remove the added hook by calling hook_remove_helper.remove() .

Examples

>>> import paddle
>>> import numpy as np

>>> # the forward_post_hook change the output of the layer: output = output * 2
>>> def forward_post_hook(layer, input, output):
...     # user can use layer, input and output for information statistics tasks
...
...     # change the output
...     return output * 2
...
>>> linear = paddle.nn.Linear(13, 5)

>>> # register the hook
>>> forward_post_hook_handle = linear.register_forward_post_hook(forward_post_hook)

>>> value1 = np.arange(26).reshape(2, 13).astype("float32")
>>> in1 = paddle.to_tensor(value1)

>>> out0 = linear(in1)

>>> # remove the hook
>>> forward_post_hook_handle.remove()

>>> out1 = linear(in1)

>>> # hook change the linear's output to output * 2, so out0 is equal to out1 * 2.
>>> assert (out0.numpy() == out1.numpy() * 2).all()
register_forward_pre_hook ( hook: Union[Callable[[Layer, Tensor], Tensor], Callable[[Layer, Tensor, dict[str, Any]], tuple[paddle.Tensor, dict[str, Any]]]], *, prepend: bool = False, with_kwargs: bool = False ) HookRemoveHelper

register_forward_pre_hook

Register a forward pre-hook for Layer. The hook will be called before forward function has been computed.

It should have the following form: the input of the hook is the input of the Layer. The hook can return either a tuple or a single modified value; a single return value will be wrapped into a tuple (unless it is already a tuple). Users can use a forward pre-hook to change the input of the Layer or to perform information statistics tasks on the Layer.

hook(Layer, input) -> None or modified input

Parameters
  • hook (function) – a function registered as a forward pre-hook

  • prepend (bool) – If True, the provided hook will be fired before all existing forward_pre hooks on this paddle.nn.Layer. Default: False

  • with_kwargs (bool) – If true, the hook will be passed the kwargs given to the forward function. Default: False

Returns

HookRemoveHelper, a HookRemoveHelper object that can be used to remove the added hook by calling hook_remove_helper.remove() .

Examples

>>> import paddle
>>> import numpy as np

>>> # the forward_pre_hook change the input of the layer: input = input * 2
>>> def forward_pre_hook(layer, input):
...     # user can use layer and input for information statistics tasks
...
...     # change the input
...     input_return = (input[0] * 2)
...     return input_return
...
>>> linear = paddle.nn.Linear(13, 5)

>>> # register the hook
>>> forward_pre_hook_handle = linear.register_forward_pre_hook(forward_pre_hook)

>>> value0 = np.arange(26).reshape(2, 13).astype("float32")
>>> in0 = paddle.to_tensor(value0)
>>> out0 = linear(in0)

>>> # remove the hook
>>> forward_pre_hook_handle.remove()

>>> value1 = value0 * 2
>>> in1 = paddle.to_tensor(value1)
>>> out1 = linear(in1)

>>> # hook change the linear's input to input * 2, so out0 is equal to out1.
>>> assert (out0.numpy() == out1.numpy()).all()
register_module ( name: str, module: paddle.nn.layer.layers.Layer | None ) None

register_module

Adds a sub layer instance. Added layer can be accessed by self.name

Parameters
  • name (str) – name of this sublayer.

  • module (Layer) – an instance of Layer.

Returns

None

register_parameter ( name: str, param: paddle.base.framework.Parameter | None ) None

register_parameter

Adds a Parameter instance. Added parameter can be accessed by self.name

Parameters
  • name (str) – name of the parameter.

  • param (Parameter|None) – an instance of Parameter.

Returns

None
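
Examples

A minimal sketch (a bare paddle.nn.Layer is used as the container; the identity check avoids relying on a particular parameter repr):

>>> import paddle

>>> layer = paddle.nn.Layer()
>>> w = layer.create_parameter([2, 2])
>>> layer.register_parameter("w", w)
>>> print(layer.w is w)
True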

requires_grad_ ( requires_grad: bool = True ) Self

requires_grad_

Change if autograd should record operations on parameters in this layer.

Parameters

requires_grad (bool) – whether autograd should record operations on parameters in this layer. Default: True.

Returns

self

Return type

Layer
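
Examples

A minimal sketch (assuming requires_grad maps onto each parameter's stop_gradient flag, the inverse convention visible in the tensor reprs on this page):

>>> import paddle

>>> linear = paddle.nn.Linear(2, 2)
>>> linear = linear.requires_grad_(False)
>>> print(all(p.stop_gradient for p in linear.parameters()))
True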

set_dict ( state_dict: Union[dict[str, paddle.Tensor], OrderedDict[str, Tensor]], use_structured_name: bool = True ) tuple[list[str], list[str]]

set_dict

Set parameters and persistable buffers from state_dict. All the parameters and buffers will be reset by the tensor in the state_dict

Parameters
  • state_dict (dict) – Dict contains all the parameters and persistable buffers.

  • use_structured_name (bool, optional) – If true, use structured name as key, otherwise, use parameter or buffer name as key. Default: True.

Returns

missing_keys is a list of str containing the missing keys; unexpected_keys is a list of str containing the unexpected keys.

Return type

tuple[list[str], list[str]]

Examples

>>> import paddle

>>> emb = paddle.nn.Embedding(10, 10)

>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
>>> para_state_dict = paddle.load("paddle_dy.pdparams")
>>> emb.set_state_dict(para_state_dict)
set_state_dict ( state_dict: Union[dict[str, paddle.Tensor], OrderedDict[str, Tensor]], use_structured_name: bool = True ) tuple[list[str], list[str]]

set_state_dict

Set parameters and persistable buffers from state_dict. All the parameters and buffers will be reset by the tensor in the state_dict

Parameters
  • state_dict (dict) – Dict contains all the parameters and persistable buffers.

  • use_structured_name (bool, optional) – If true, use structured name as key, otherwise, use parameter or buffer name as key. Default: True.

Returns

missing_keys is a list of str containing the missing keys; unexpected_keys is a list of str containing the unexpected keys.

Return type

tuple[list[str], list[str]]

Examples

>>> import paddle

>>> emb = paddle.nn.Embedding(10, 10)

>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
>>> para_state_dict = paddle.load("paddle_dy.pdparams")
>>> emb.set_state_dict(para_state_dict)
set_sublayer ( target: str, layer: Layer, strict: bool = False ) None

set_sublayer

Set the sublayer given by target if it exists, otherwise throw an error.

Parameters
  • target (str) – The fully-qualified string name of the sublayer to look for.

  • layer (Layer) – The layer to set the sublayer to.

  • strict (bool) – If False, the method will replace an existing sublayer or create a new sublayer if the parent module exists. If True, the method will only attempt to replace an existing sublayer and throw an error if the sublayer doesn’t already exist.
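
Examples

A minimal sketch (replacing an activation inside a Sequential; the printed repr assumes the usual layout seen elsewhere on this page):

>>> import paddle

>>> model = paddle.nn.Sequential(paddle.nn.Linear(10, 3), paddle.nn.ReLU())
>>> model.set_sublayer("1", paddle.nn.Sigmoid())
>>> print(model.get_sublayer("1"))
Sigmoid()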

set_submodule ( target: str, layer: Layer, strict: bool = False ) None

set_submodule

Set the sublayer given by target if it exists, otherwise throw an error.

Parameters
  • target (str) – The fully-qualified string name of the sublayer to look for.

  • layer (Layer) – The layer to set the sublayer to.

  • strict (bool) – If False, the method will replace an existing sublayer or create a new sublayer if the parent module exists. If True, the method will only attempt to replace an existing sublayer and throw an error if the sublayer doesn’t already exist.

sharded_state_dict ( structured_name_prefix: str = '' ) Union[dict[str, paddle.distributed.flex_checkpoint.dcp.sharded_weight.ShardedWeight], OrderedDict[str, ShardedWeight]]

sharded_state_dict

Recursively builds a sharded state dictionary for the model and its sub-layers.

Parameters

structured_name_prefix – Prefix to prepend to all tensor names for hierarchical naming.

Returns

Dictionary mapping tensor names to ShardedWeight. The dictionary contains both the current layer’s parameters and all sub-layer parameters.

state_dict ( *args: Any, **kwargs: Any ) Union[dict[str, paddle.Tensor], OrderedDict[str, Tensor]]

state_dict

Get all parameters and persistable buffers of the current layer and its sub-layers, and set them into a dict.

Parameters
  • destination (dict, optional) – If provided, all the parameters and persistable buffers will be set to this dict. Default: None.

  • include_sublayers (bool, optional) – If true, also include the parameters and persistable buffers from sublayers. Default: True.

  • use_hook (bool, optional) – If true, the operations contained in _state_dict_hooks will be appended to the destination. Default: True.

  • keep_vars (bool, optional) – If false, the returned tensors in the state dict are detached from autograd. Default: True.

Returns

a dict contains all the parameters and persistable buffers.

Return type

dict

Examples

>>> import paddle

>>> emb = paddle.nn.Embedding(10, 10)

>>> state_dict = emb.state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
sublayers ( include_self: bool = False ) list[paddle.nn.layer.layers.Layer]

sublayers

Returns a list of sub layers.

Parameters

include_self (bool, optional) – Whether to return self as a sublayer. Default: False.

Returns

list of Layer, a list of sub layers.

Examples

>>> import paddle

>>> class MyLayer(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self._linear = paddle.nn.Linear(1, 1)
...         self._dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         temp = self._linear(input)
...         temp = self._dropout(temp)
...         return temp
>>> mylayer = MyLayer()
>>> print(mylayer.sublayers())
[Linear(in_features=1, out_features=1, dtype=float32), Dropout(p=0.5, axis=None, mode=upscale_in_train, inplace=False)]
to ( device: PlaceLike | None = None, dtype: DTypeLike | None = None, blocking: bool | None = None, non_blocking: bool | None = None ) Self

to

Cast the parameters and buffers of this Layer by the given device, dtype and blocking.

Parameters
  • device (str|paddle.CPUPlace()|paddle.CUDAPlace()|paddle.CUDAPinnedPlace()|paddle.XPUPlace()|None, optional) – The device where the Layer should be stored. If None, the device is the same as the original Tensor. If device is a string, it can be cpu, gpu:x or xpu:x, where x is the index of the GPUs or XPUs. Default: None.

  • dtype (str|numpy.dtype|paddle.dtype|None, optional) – The type of the data. If None, the dtype is the same with the original Tensor. Default: None.

  • blocking (bool|None, optional) – If False and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. If None, the blocking is set True. Default: None.

  • non_blocking (bool|None, optional) – If True and the source is in pinned memory, the copy will be asynchronous with respect to the host. Otherwise, the argument has no effect. If None, the non_blocking is set False. Default: None.

Returns

self

Examples

>>> import paddle
>>> paddle.seed(2023)

>>> linear=paddle.nn.Linear(2, 2)
>>> print(linear.weight)
Parameter containing:
Tensor(shape=[2, 2], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[ 0.89611185,  0.04935038],
 [-0.58883440,  0.99266374]])

>>> linear.to(dtype='float64')
>>> print(linear.weight)
Parameter containing:
Tensor(shape=[2, 2], dtype=float64, place=Place(gpu:0), stop_gradient=False,
[[ 0.89611185,  0.04935038],
 [-0.58883440,  0.99266374]])

>>> linear.to(device='cpu')
>>> print(linear.weight)
Parameter containing:
Tensor(shape=[2, 2], dtype=float64, place=Place(cpu), stop_gradient=False,
[[ 0.89611185,  0.04935038],
 [-0.58883440,  0.99266374]])

>>> linear.to(device=paddle.CUDAPinnedPlace(), blocking=False)
>>> print(linear.weight)
Parameter containing:
Tensor(shape=[2, 2], dtype=float64, place=Place(gpu_pinned), stop_gradient=False,
[[ 0.89611185,  0.04935038],
 [-0.58883440,  0.99266374]])
to_static_state_dict ( destination: Optional[Union[dict[str, paddle.Tensor], OrderedDict[str, Tensor]]] = None, include_sublayers: bool = True, structured_name_prefix: str = '', use_hook: bool = True, keep_vars: bool = True ) Union[dict[str, paddle.Tensor], OrderedDict[str, Tensor]]

to_static_state_dict

Get all parameters and buffers of the current layer and its sub-layers, and set them into a dict.

Parameters
  • destination (dict, optional) – If provided, all the parameters and persistable buffers will be set to this dict. Default: None.

  • include_sublayers (bool, optional) – If true, also include the parameters and persistable buffers from sublayers. Default: True.

  • use_hook (bool, optional) – If true, the operations contained in _state_dict_hooks will be appended to the destination. Default: True.

  • keep_vars (bool, optional) – If false, the returned tensors in the state dict are detached from autograd. Default: True.

Returns

dict, a dict contains all the parameters and persistable buffers.

Examples

>>> import paddle

>>> emb = paddle.nn.Embedding(10, 10)

>>> state_dict = emb.to_static_state_dict()
>>> paddle.save(state_dict, "paddle_dy.pdparams")
train ( mode: bool = True ) Self

train

Sets this Layer and all its sublayers to training mode. This only affects certain modules like Dropout and BatchNorm.

Returns

self

Return type

Layer

Examples

>>> import paddle
>>> paddle.seed(100)

>>> class MyLayer(paddle.nn.Layer):
...     def __init__(self):
...         super().__init__()
...         self._linear = paddle.nn.Linear(1, 1)
...         self._dropout = paddle.nn.Dropout(p=0.5)
...
...     def forward(self, input):
...         temp = self._linear(input)
...         temp = self._dropout(temp)
...         return temp
...
>>> x = paddle.randn([10, 1], 'float32')
>>> mylayer = MyLayer()
>>> mylayer.eval()  # set mylayer._dropout to eval mode
>>> out = mylayer(x)
>>> mylayer.train()  # set mylayer._dropout to train mode
>>> out = mylayer(x)
>>> print(out)
Tensor(shape=[10, 1], dtype=float32, place=Place(cpu), stop_gradient=False,
[[-3.44879317],
 [ 0.        ],
 [ 0.        ],
 [-0.73825276],
 [ 0.        ],
 [ 0.        ],
 [ 0.64444798],
 [-3.22185946],
 [ 0.        ],
 [-0.68077987]])
type ( dst_type: paddle.dtype | str ) Self

type

Casts all parameters and buffers to dst_type.

Parameters

dst_type (str|paddle.dtype) – target data type of the layer. If set str, it can be “bool”, “bfloat16”, “float16”, “float32”, “float64”, “int8”, “int16”, “int32”, “int64”, “uint8”, “complex64”, “complex128”.

Returns

self

Return type

Layer
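
Examples

A minimal sketch:

>>> import paddle

>>> linear = paddle.nn.Linear(2, 2)
>>> linear = linear.type(paddle.float64)
>>> print(linear.weight.dtype)
paddle.float64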

xpu ( device: int | PlaceLike | None = None ) Self

xpu

Move all model parameters and buffers to the XPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing the optimizer if the layer will live on XPU while being optimized.

Parameters

device (int, optional) – if specified, all parameters will be copied to that device.

Returns

self

Return type

Layer

zero_grad ( set_to_none: bool = True ) None

zero_grad

Reset gradients of all model parameters.

Parameters
  • set_to_none (bool) – instead of setting to zero, set the grads to None. Currently, set_to_none=True is not fully supported. Default: True.
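
Examples

A minimal sketch (no output is asserted, since the gradient state after the reset depends on the set_to_none support noted above):

>>> import paddle

>>> linear = paddle.nn.Linear(2, 2)
>>> out = linear(paddle.rand([4, 2]))
>>> out.sum().backward()
>>> linear.zero_grad()  # gradients are reset before the next backward pass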