# Introduction to Data Type Promotion

This article introduces PaddlePaddle's data type promotion mechanism to help you use the framework more effectively.

## Background

Type promotion automatically determines the resulting data type when performing operations (such as `+`, `-`, `*`, `/`) on operands of different data types. This essential function makes calculations between varied types possible.

Type promotion calls fall into two categories, based on the call method:

- Operator overloading: uses operators directly, such as `a + b` or `a - b`.
- Binary API: uses API functions, such as `paddle.add(a, b)`.

In both cases, type promotion is handled automatically, without user intervention.

## Type Promotion Rules

The scope and rules of type promotion differ between Tensor-to-Tensor and Tensor-to-Scalar operations. They are introduced separately below.

```
import paddle

a = paddle.rand([3, 3], dtype='float16')
b = paddle.rand([3, 3], dtype='float32')
print(a + b)  # both a and b are Tensors, so this is treated as Tensor-to-Tensor

a = paddle.rand([3, 3], dtype='float16')
b = 1.0
print(a + b)  # b is a Scalar, so this is treated as Tensor-to-Scalar
```

### Type Promotion Rules in Tensor-to-Tensor

In model training, computations between different data types are usually limited to floating-point types. To help users quickly troubleshoot type-related issues, automatic type promotion between Tensors only supports calculations between floating-point types, and between complex and real numbers. The principle is to return the larger of the two Tensors' data types. Details are shown in the table below:

| +/-/* | bf16 | f16 | f32 | f64 | bool | u8 | i8 | i16 | i32 | i64 | c64 | c128 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bf16 | bf16 | f32 | f32 | f64 | - | - | - | - | - | - | c64 | c128 |
| f16 | f32 | f16 | f32 | f64 | - | - | - | - | - | - | c64 | c128 |
| f32 | f32 | f32 | f32 | f64 | - | - | - | - | - | - | c64 | c128 |
| f64 | f64 | f64 | f64 | f64 | - | - | - | - | - | - | c128 | c128 |
| bool | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| u8 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i8 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i16 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i32 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i64 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| c64 | c64 | c64 | c64 | c128 | c64 | c64 | c64 | c64 | c64 | c64 | c64 | c128 |
| c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 |

Taking `paddle.add(a, b)` as an example: in the table above, the row represents the dtype of `a` and the column represents the dtype of `b`.

Sample Code:

```
import paddle

# Calculation between floating-point types
a = paddle.rand([3, 3], dtype='float16')
b = paddle.rand([3, 3], dtype='float32')
c = a + b  # type promotion occurs automatically, casting 'a' to float32; no additional user action is required
print(c.dtype)  # the dtype of 'c' is float32

a = paddle.rand([3, 3], dtype='bfloat16')
b = paddle.rand([3, 3], dtype='float64')
c = a + b  # type promotion occurs automatically, casting 'a' to float64
print(c.dtype)  # the dtype of 'c' is float64

# Calculation between a complex and a real number
a = paddle.ones([3, 3], dtype='complex64')
b = paddle.rand([3, 3], dtype='float64')
c = a + b  # type promotion occurs automatically, casting both 'a' and 'b' to complex128
print(c.dtype)  # the dtype of 'c' is complex128

# Calculation between complex numbers
a = paddle.ones([3, 3], dtype='complex128')
b = paddle.ones([3, 3], dtype='complex64')
c = a + b  # type promotion occurs automatically, casting 'b' to complex128
print(c.dtype)  # the dtype of 'c' is complex128
```

### Type Promotion Rules in Tensor-to-Scalar

Type promotion between a Tensor and a Scalar supports all types. The principle is to promote towards the Tensor's dtype when the Scalar's broad type matches the Tensor's (both are integers, both are floating-point, etc.). Otherwise, the result follows the Tensor-to-Tensor rules.

A Scalar operand has a default dtype: `int` -> int64, `float` -> float32, `bool` -> bool, `complex` -> complex64. More details are shown in the table below:

| +/-/* | bool | int | float | complex |
| --- | --- | --- | --- | --- |
| bool | bool | i64 | f32 | c64 |
| u8 | u8 | u8 | f32 | c64 |
| i8 | i8 | i8 | f32 | c64 |
| i16 | i16 | i16 | f32 | c64 |
| i32 | i32 | i32 | f32 | c64 |
| i64 | i64 | i64 | f32 | c64 |
| bf16 | bf16 | bf16 | bf16 | c64 |
| f16 | f16 | f16 | f16 | c64 |
| f32 | f32 | f32 | f32 | c64 |
| f64 | f64 | f64 | f64 | c128 |
| c64 | c64 | c64 | c64 | c64 |
| c128 | c128 | c128 | c128 | c128 |

Sample Code:

```
import paddle

# Both the Scalar and the Tensor are floating-point: the result keeps the Tensor's dtype
a = paddle.rand([3, 3], dtype='float16')
b = 1.0
c = a + b  # type promotion occurs automatically, casting 'b' to float16
print(c.dtype)  # the dtype of 'c' is float16

# The Scalar and the Tensor differ in broad type: the result follows the Tensor-to-Tensor rules
a = 1.0
b = paddle.ones([3, 3], dtype='int64')
c = a + b  # type promotion occurs automatically, casting 'b' to float32
print(c.dtype)  # the dtype of 'c' is float32
```

## How to Use Type Promotion

### Supported Cases

```
import paddle

a = paddle.rand([3, 3], dtype='float16')
b = paddle.rand([3, 3], dtype='float32')
c = a + b  # type promotion occurs automatically, casting 'a' to float32; no additional user action is required
print(c.dtype)  # float32

# Consistent with the commutative property
d = b + a
print(d.dtype)  # float32
print(paddle.allclose(c, d))  # Tensor(shape=[], dtype=bool, place=Place(gpu:0), stop_gradient=True, True)

# Same result with the binary API
e = paddle.add(a, b)
print(e.dtype)  # float32
print(paddle.allclose(c, e))  # Tensor(shape=[], dtype=bool, place=Place(gpu:0), stop_gradient=True, True)

# Same result with the static graph
paddle.enable_static()
exe = paddle.static.Executor()
train_program = paddle.static.Program()
startup_program = paddle.static.Program()
with paddle.static.program_guard(train_program, startup_program):
    a = paddle.rand([3, 3], dtype='float16')
    b = paddle.rand([3, 3], dtype='float32')
    f = paddle.add(a, b)
res = exe.run(train_program, fetch_list=[f])
print(res[0].dtype)  # float32
paddle.disable_static()
print(paddle.allclose(c, paddle.to_tensor(res[0])))  # Tensor(shape=[], dtype=bool, place=Place(gpu:0), stop_gradient=True, True)
```

### Unsupported Cases

```
import paddle

a = paddle.ones([3, 3], dtype='int64')
b = paddle.rand([3, 3], dtype='float32')
c = a + b  # automatic type promotion between int and float is not supported, so a TypeError is raised

# For unsupported cases, we suggest performing the type promotion manually
# Method 1: use the astype API
a = a.astype('float32')
a = a.astype(b.dtype)
# Method 2: use the cast API
a = paddle.cast(a, 'float32')
a = paddle.cast(a, b.dtype)
```

## The Scope of Type Promotion

As of Paddle version 2.6, the supported binary APIs and their rules are as follows:

| Number | API | Tensor-to-Tensor | Tensor-to-Scalar |
| --- | --- | --- | --- |
| 1 | add | Common Rule | Common Rule |
| 2 | subtract | Common Rule | Common Rule |
| 3 | multiply | Common Rule | Common Rule |
| 4 | divide | Common Rule | Divide Rule |
| 5 | floor_divide | Common Rule | Common Rule |
| 6 | pow | Common Rule | Common Rule |
| 7 | equal | Logic Rule | Logic Rule |
| 8 | not_equal | Logic Rule | Logic Rule |
| 9 | less_than | Logic Rule | Logic Rule |
| 10 | less_equal | Logic Rule | Logic Rule |
| 11 | greater_than | Logic Rule | Logic Rule |
| 12 | greater_equal | Logic Rule | Logic Rule |
| 13 | logical_and | Logic Rule | Logic Rule |
| 14 | logical_or | Logic Rule | Logic Rule |
| 15 | logical_xor | Logic Rule | Logic Rule |
| 16 | bitwise_and | - | Common Rule |
| 17 | bitwise_or | - | Common Rule |
| 18 | bitwise_xor | - | Common Rule |
| 19 | where | Common Rule | Common Rule |
| 20 | fmax | Common Rule | - |
| 21 | fmin | Common Rule | - |
| 22 | logaddexp | Common Rule | - |
| 23 | maximum | Common Rule | - |
| 24 | minimum | Common Rule | - |
| 25 | remainder (mod) | Common Rule | Common Rule |
| 26 | huber_loss | Common Rule | - |
| 27 | nextafter | Common Rule | - |
| 28 | atan2 | Common Rule | - |
| 29 | poisson_nll_loss | Common Rule | - |
| 30 | l1_loss | Common Rule | - |
| 31 | huber_loss | Common Rule | - |
| 32 | mse_loss | Common Rule | - |

There are two special rules in the table above:

Divide Rule: the divide API never returns a dtype smaller than float; for example, int32 / Scalar returns float32.

```
import paddle

a = paddle.ones([3, 3], dtype='int32')
b = 1
c = a / b
print(c.dtype)  # float32
```

Logic Rule: since complex types cannot be used directly in logical operations, calculations involving complex types are outside the scope of type promotion for logical APIs. Within the supported scope, all results return bool type.

```
import paddle

a = paddle.rand([3, 3], dtype='float32')
b = paddle.rand([3, 3], dtype='float16')
c = a == b
print(c.dtype)  # bool
```

## Summary

While supporting data type promotion, Paddle ensures that calculations obey the commutative property, with consistent results between operator overloading and binary APIs, and between dynamic and static graphs. This article has clarified the rules and scope of data type promotion, summarized the binary APIs that support it in the current version, and introduced usage methods to make PaddlePaddle more convenient to use.