ImperativeQuantAware
- class paddle.fluid.contrib.slim.quantization.imperative.qat.ImperativeQuantAware ( quantizable_layer_type=['Conv2D', 'Linear', 'Conv2DTranspose', 'ColumnParallelLinear', 'RowParallelLinear'], weight_quantize_type='abs_max', activation_quantize_type='moving_average_abs_max', weight_bits=8, activation_bits=8, moving_rate=0.9, fuse_conv_bn=False, weight_preprocess_layer=None, act_preprocess_layer=None, weight_quantize_layer=None, act_quantize_layer=None, onnx_format=False )
- 
         Applying quantization aware training (QAT) to the dygraph model.

           quantize ( model )
- 
           According to the weights' and activations' quantization types, fake quant ops such as fake_quantize_dequantize_moving_average_abs_max and fake_quantize_dequantize_abs_max are inserted into the model. At the same time, the out_scale values of the outputs are calculated. - Parameters
- 
              model (paddle.nn.Layer) – The model to be quantized.
- Returns
- 
             None 
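The fake quant ops mentioned above simulate quantization while keeping computation in floating point: a tensor is scaled by its maximum absolute value, rounded to a signed integer grid, then mapped back to floats. A minimal pure-Python sketch of the abs_max variant follows; the function name and structure are illustrative, not part of the Paddle API:

```python
def fake_quant_dequant_abs_max(values, bits=8):
    """Simulate abs_max fake quantize-dequantize: scale by the
    maximum absolute value, round onto a signed integer grid of
    the given bit width, then dequantize back to floats."""
    qmax = float(2 ** (bits - 1) - 1)  # 127 for 8 bits
    scale = max(abs(v) for v in values)
    if scale == 0.0:
        return list(values)
    return [round(v / scale * qmax) * scale / qmax for v in values]

weights = [-0.5, 0.1, 0.25, 0.5]
# Each output differs from its input by at most scale / (2 * qmax),
# the half-step of the quantization grid.
print(fake_quant_dequant_abs_max(weights))
```

During QAT the model trains against these rounded values, so the weights learn to tolerate the precision loss of the eventual integer deployment.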
 Examples: .. code-block:: python

    import paddle
    from paddle.fluid.contrib.slim.quantization import ImperativeQuantAware

    class ImperativeModel(paddle.nn.Layer):
        def __init__(self):
            super(ImperativeModel, self).__init__()
            # self.linear_0 skips the quantization.
            self.linear_0 = paddle.nn.Linear(784, 400)
            self.linear_0.skip_quant = True

            # self.linear_1 does not skip the quantization.
            self.linear_1 = paddle.nn.Linear(400, 10)
            self.linear_1.skip_quant = False

        def forward(self, inputs):
            x = self.linear_0(inputs)
            x = self.linear_1(x)
            return x

    model = ImperativeModel()
    imperative_qat = ImperativeQuantAware(
        weight_quantize_type='abs_max',
        activation_quantize_type='moving_average_abs_max')

    # Add the fake quant logic. The original model is rewritten
    # in place. Only one layer (self.linear_1) has the fake quant
    # logic added, since self.linear_0 sets skip_quant = True.
    imperative_qat.quantize(model)
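For activations, the moving_average_abs_max type does not use a single batch's abs_max directly; it tracks a running estimate of the scale across batches, controlled by the moving_rate parameter (0.9 by default). A simplified sketch of one EMA-style update is below; this is an illustration of the idea, not Paddle's exact stateful implementation:

```python
def update_moving_average_scale(scale, batch_abs_max, moving_rate=0.9):
    """Blend the previous scale estimate with the current batch's
    abs_max; higher moving_rate means slower, smoother updates.
    A simplified sketch, not Paddle's exact formulation."""
    return moving_rate * scale + (1 - moving_rate) * batch_abs_max

# Simulate tracking the activation scale over a few batches.
scale = 1.0
for batch_max in [0.8, 1.2, 1.0, 0.9]:
    scale = update_moving_average_scale(scale, batch_max)
print(scale)
```

Because activations vary batch to batch, this smoothed out_scale is more stable than a per-batch maximum and is what gets saved with the quantized model.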
 