weight_dequantize

paddle.nn.quant.weight_dequantize(x: Tensor, scale: Tensor, algo: _Algo = 'weight_only_int8', out_dtype: DTypeLike = 'float16', group_size: _GroupSize = -1) → Tensor

Dequantization function for weights quantized with the weight_only and llm.int8 algorithms.

Parameters

x (Tensor) – The input Tensor to be dequantized. The data type is int8.
scale (Tensor) – The scale Tensor, which is the output of weight_quantize. The data type is float32.
algo (str) – The quantization algorithm that was applied to x, must be one of ‘weight_only_int8’, ‘weight_only_int4’ and ‘llm.int8’, default: ‘weight_only_int8’.
out_dtype (str|np.dtype) – [Deprecated][Not used] The output Tensor’s data type, must be one of ‘float16’ and ‘bfloat16’, default: ‘float16’.
group_size (int) – The group size that was used to quantize x. It should match the value passed to weight_quantize, default: -1.

Returns

The dequantized Tensor. The data type is float16 or bfloat16, and the shape is the transpose of the shape of x.

Return type

out (Tensor)

Examples

>>> import paddle
>>> from paddle.nn.quant import weight_quantize, weight_dequantize

>>> paddle.seed(2023)
>>> x = paddle.rand(shape=[64, 32], dtype=paddle.float16)
>>> out, scale = weight_quantize(x, algo='weight_only_int8')
>>> x_dequant = weight_dequantize(out, scale)
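
To make the arithmetic behind the example more concrete, the following is a minimal pure-Python sketch of symmetric per-channel int8 quantization and its inverse. It is an illustration under stated assumptions, not Paddle's kernel: the helper names quantize_per_channel and dequantize_per_channel are hypothetical, the absmax / 127 scale rule is assumed rather than taken from the Paddle source, and any device-specific weight layout is ignored. It does mirror the shape relationship described above, where the dequantized result is the transpose of the int8 input.

```python
def quantize_per_channel(w):
    """Quantize a [rows, cols] float matrix to int8 values, one scale per column.

    Hypothetical helper. Returns (q, scale), where q is stored transposed
    with shape [cols, rows] and scale holds one float per column.
    """
    rows, cols = len(w), len(w[0])
    q, scale = [], []
    for c in range(cols):
        column = [w[r][c] for r in range(rows)]
        absmax = max(abs(v) for v in column)
        # Assumed scale rule: map the largest magnitude in the column to 127.
        s = absmax / 127.0 if absmax > 0 else 1.0
        scale.append(s)
        # Round to the nearest integer and clamp to the symmetric int8 range.
        q.append([max(-127, min(127, round(v / s))) for v in column])
    return q, scale


def dequantize_per_channel(q, scale):
    """Invert quantize_per_channel: multiply by the scale and transpose back.

    Hypothetical helper. Takes q with shape [cols, rows] and returns a
    [rows, cols] float matrix.
    """
    cols, rows = len(q), len(q[0])
    return [[q[c][r] * scale[c] for c in range(cols)] for r in range(rows)]


w = [[0.5, -1.0], [0.25, 2.0], [-0.5, 0.0]]   # shape [3, 2]
q, scale = quantize_per_channel(w)            # q has shape [2, 3]
w_hat = dequantize_per_channel(q, scale)      # back to shape [3, 2]

# Largest element-wise reconstruction error of the round trip.
max_err = max(abs(a - b) for ra, rb in zip(w, w_hat) for a, b in zip(ra, rb))
```

In this sketch the round trip is lossy: each element can be off by up to half of its column's scale, which is why the dequantized weight only approximates the original.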