paddle.nn.quant.weight_dequantize(x, scale, algo='weight_only_int8', out_dtype='float16', group_size=-1)

Dequantizes a weight that was quantized with a weight_only or llm.int8 algorithm.

Parameters

  • x (Tensor) – The input Tensor to be dequantized. Its data type is int8.

  • scale (Tensor) – The scale Tensor returned by weight_quantize. Its data type is float32.

  • algo (str) – The quantization algorithm that was applied to x. Must be one of ‘weight_only_int8’, ‘weight_only_int4’, or ‘llm.int8’. Default: ‘weight_only_int8’.

  • out_dtype (str|np.dtype) – The data type of the output Tensor. Must be ‘float16’ or ‘bfloat16’. Default: ‘float16’.


Returns

The dequantized Tensor. Its data type is float16 or bfloat16, and its shape is the transpose of the shape of x.

Return type

out (Tensor)


Examples

>>> import paddle
>>> from paddle.nn.quant import weight_quantize, weight_dequantize

>>> paddle.seed(2023)
>>> x = paddle.rand(shape=[64, 32], dtype=paddle.float16)
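>>> # Quantize the float16 weight to int8; scale is needed again for dequantization.
>>> out, scale = weight_quantize(x, algo='weight_only_int8')
>>> # Recover a float16 approximation of x from the int8 weight and its scale.
>>> x_dequant = weight_dequantize(out, scale)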
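
To make the numeric idea behind the round trip concrete, the following is a minimal NumPy sketch of symmetric per-channel int8 quantization and dequantization. It is a conceptual illustration only, not Paddle's implementation: the helper names `quantize_per_channel_int8` and `dequantize_per_channel_int8` are hypothetical, and the scale convention (maximum absolute value mapped to 127) and the plain transposed storage are assumptions. The actual kernels may use a different memory layout.

```python
import numpy as np


def quantize_per_channel_int8(w):
    """Hypothetical helper: symmetric int8 quantization of an [in, out] weight."""
    w = w.astype(np.float32)
    # Assumed convention: one scale per output channel (column), max |w| maps to 127.
    scale = np.abs(w).max(axis=0) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale)  # guard all-zero channels
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    # Store the quantized weight transposed, i.e. as [out, in].
    return q.T.copy(), scale.astype(np.float32)


def dequantize_per_channel_int8(q, scale, out_dtype=np.float16):
    """Hypothetical helper: rescale each channel and transpose back to [in, out]."""
    w = q.astype(np.float32) * scale[:, None]
    return w.T.astype(out_dtype)


rng = np.random.default_rng(2023)
x = rng.random((64, 32), dtype=np.float32).astype(np.float16)

q, scale = quantize_per_channel_int8(x)            # q: int8, shape (32, 64)
x_dequant = dequantize_per_channel_int8(q, scale)  # float16, shape (64, 32)

# The round trip is lossy: each element is off by at most about half a scale step.
max_err = np.abs(x_dequant.astype(np.float32) - x.astype(np.float32)).max()
print(q.shape, x_dequant.shape, float(max_err))
```

In this sketch the dequantized result has the transposed shape of the quantized input, mirroring the shape relationship described under Returns, and the reconstruction error is bounded by the per-channel scale.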