class paddle.nn.quant.quant_layers.QuantizedMatmul(layer=None, weight_bits=8, activation_bits=8, moving_rate=0.9, weight_quantize_type='abs_max', activation_quantize_type='abs_max', weight_pre_layer=None, act_pre_layer=None, weight_quant_layer=None, act_quant_layer=None)

The computational logic of QuantizedMatmul is the same as that of Matmul. The only difference is that its inputs are fake-quantized before the multiplication.

forward(x, y, transpose_x=False, transpose_y=False, name=None)


Defines the computation performed at every call.

  • x (Tensor) – the first input Tensor of the matmul.

  • y (Tensor) – the second input Tensor of the matmul.

  • transpose_x (bool, optional) – whether to transpose x before multiplication. Default: False.

  • transpose_y (bool, optional) – whether to transpose y before multiplication. Default: False.

  • name (str, optional) – name for the operation. Default: None.
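To illustrate what "fake quantized" means here, the following is a minimal NumPy sketch (not the Paddle implementation) of abs_max fake quantization applied to both matmul inputs: each tensor is scaled by its largest absolute value, rounded to the 8-bit integer grid, and immediately dequantized back to float, so the matmul runs on values that carry quantization error but remain floating point.

```python
import numpy as np

def fake_quant_abs_max(t, bits=8):
    # abs_max scale: the largest absolute value in the tensor
    scale = np.abs(t).max()
    if scale == 0:
        return t
    qmax = (1 << (bits - 1)) - 1  # 127 for 8-bit
    q = np.round(t / scale * qmax)  # quantize to the integer grid
    return q / qmax * scale         # dequantize back to float ("fake" quant)

x = np.random.randn(2, 3).astype("float32")
y = np.random.randn(3, 4).astype("float32")
# Matmul on fake-quantized inputs: same logic as a plain matmul,
# but both operands have passed through quantize/dequantize.
out = fake_quant_abs_max(x) @ fake_quant_abs_max(y)
```

The `weight_quantize_type='abs_max'` and `activation_quantize_type='abs_max'` defaults in the constructor correspond to this per-tensor scaling; other types (e.g. moving-average variants controlled by `moving_rate`) track the scale across steps instead of recomputing it from the current tensor.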