PTQ
- class paddle.quantization.PTQ(config: QuantConfig)
- 
         Applies post-training quantization (PTQ) to a model.
- 
           quantize(model: Layer, inplace: bool = False) -> Layer
- 
           Creates a model prepared for post-training quantization. The quantization configuration is propagated through the model, and observers are inserted to collect statistics and compute the quantization parameters.
           Parameters
             - model (Layer) – The model to be quantized.
             - inplace (bool) – Whether to modify the model in place. Default is False.
           Returns: The model prepared for post-training quantization.
           Examples
               >>> from paddle.quantization import PTQ, QuantConfig
               >>> from paddle.quantization.observers import AbsmaxObserver
               >>> from paddle.vision.models import LeNet

               >>> observer = AbsmaxObserver()
               >>> q_config = QuantConfig(activation=observer, weight=observer)
               >>> ptq = PTQ(q_config)
               >>> model = LeNet()
               >>> model.eval()
               >>> quant_model = ptq.quantize(model)
               >>> print(quant_model)
               LeNet(
                 (features): Sequential(
                   (0): QuantedConv2D(
                     (weight_quanter): AbsmaxObserverLayer()
                     (activation_quanter): AbsmaxObserverLayer()
                   )
                   (1): ObserveWrapper(
                     (_observer): AbsmaxObserverLayer()
                     (_observed): ReLU()
                   )
                   (2): ObserveWrapper(
                     (_observer): AbsmaxObserverLayer()
                     (_observed): MaxPool2D(kernel_size=2, stride=2, padding=0)
                   )
                   (3): QuantedConv2D(
                     (weight_quanter): AbsmaxObserverLayer()
                     (activation_quanter): AbsmaxObserverLayer()
                   )
                   (4): ObserveWrapper(
                     (_observer): AbsmaxObserverLayer()
                     (_observed): ReLU()
                   )
                   (5): ObserveWrapper(
                     (_observer): AbsmaxObserverLayer()
                     (_observed): MaxPool2D(kernel_size=2, stride=2, padding=0)
                   )
                 )
                 (fc): Sequential(
                   (0): QuantedLinear(
                     (weight_quanter): AbsmaxObserverLayer()
                     (activation_quanter): AbsmaxObserverLayer()
                   )
                   (1): QuantedLinear(
                     (weight_quanter): AbsmaxObserverLayer()
                     (activation_quanter): AbsmaxObserverLayer()
                   )
                   (2): QuantedLinear(
                     (weight_quanter): AbsmaxObserverLayer()
                     (activation_quanter): AbsmaxObserverLayer()
                   )
                 )
               )
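
           To make the role of the inserted observers more concrete, here is a minimal pure-Python sketch of what an abs-max observer conceptually does during calibration: it tracks the largest absolute value it has seen and derives a scale from it. The class `ToyAbsmaxObserver`, its method names, and the 8-bit symmetric scale formula are illustrative assumptions, not Paddle's actual `AbsmaxObserver` implementation.

```python
class ToyAbsmaxObserver:
    """Hypothetical stand-in for an abs-max observer (not Paddle's class)."""

    def __init__(self, quant_bits=8):
        self.quant_bits = quant_bits
        self.abs_max = 0.0  # largest |x| seen so far

    def observe(self, values):
        """Update the running abs-max from one batch; data passes through unchanged."""
        batch_max = max((abs(v) for v in values), default=0.0)
        self.abs_max = max(self.abs_max, batch_max)
        return values

    def scale(self):
        """Symmetric scale mapping [-abs_max, abs_max] onto the signed integer range."""
        qmax = 2 ** (self.quant_bits - 1) - 1  # 127 for 8 bits
        return self.abs_max / qmax


# Feed a few made-up "calibration batches" through the observer.
observer = ToyAbsmaxObserver()
for batch in ([0.5, -1.0, 0.25], [2.54, -0.1], [1.0, 1.5]):
    observer.observe(batch)

print(observer.abs_max)  # largest magnitude across all batches
print(observer.scale())  # abs_max divided by 127
```

           In an actual PTQ workflow, the model returned by `ptq.quantize` would typically be run on representative calibration data so that its observers see realistic value ranges before conversion.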
- 
           convert(model: Layer, inplace=False, remain_weight=False)
- 
           Converts the quantized model to ONNX style. The converted model can then be saved as an inference model by calling paddle.jit.save.
           Parameters
             - model (Layer) – The quantized model to be converted.
             - inplace (bool, optional) – Whether to modify the model in place. Default is False.
             - remain_weight (bool, optional) – Whether to keep the weights in floating point. Default is False.
           Returns: The converted model.
           Examples
               >>> import paddle
               >>> from paddle.quantization import QAT, QuantConfig
               >>> from paddle.quantization.quanters import FakeQuanterWithAbsMaxObserver
               >>> from paddle.vision.models import LeNet

               >>> quanter = FakeQuanterWithAbsMaxObserver(moving_rate=0.9)
               >>> q_config = QuantConfig(activation=quanter, weight=quanter)
               >>> qat = QAT(q_config)
               >>> model = LeNet()
               >>> quantized_model = qat.quantize(model)
               >>> converted_model = qat.convert(quantized_model)
               >>> dummy_data = paddle.rand([1, 1, 32, 32], dtype="float32")
               >>> paddle.jit.save(converted_model, "./quant_deploy", [dummy_data])
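
           As a rough illustration of what an ONNX-style quantized model computes, the sketch below applies a quantize step followed by a dequantize step to a few values, following the ONNX `QuantizeLinear` and `DequantizeLinear` definitions for signed 8-bit data. It assumes that "ONNX style" here refers to such quantize/dequantize pairs; the helper names `quantize_linear` and `dequantize_linear` are hypothetical and are not part of Paddle.

```python
def quantize_linear(x, scale, zero_point=0, qmin=-128, qmax=127):
    """Map a float to a signed 8-bit integer: round, shift, then saturate."""
    # Python's round() rounds halves to even, as ONNX QuantizeLinear does.
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize_linear(q, scale, zero_point=0):
    """Map a quantized integer back to an approximate float value."""
    return (q - zero_point) * scale


scale = 0.5  # made-up scale; in practice it comes from the observers
values = [0.2, -1.3, 100.0, 0.75]

quantized = [quantize_linear(v, scale) for v in values]
restored = [dequantize_linear(q, scale) for q in quantized]

print(quantized)  # integers in [-128, 127]; 100.0 saturates at 127
print(restored)   # approximations of the original values
```

           The round trip shows the two error sources that calibration tries to balance: rounding error for values inside the representable range, and clipping error for values outside it.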
 
