# Introduction to Data Type Promotion

This article introduces PaddlePaddle's data type promotion mechanism to help you use the framework more effectively.

## Background

Type promotion automatically determines the resulting data type when performing operations (such as `+`, `-`, `*`, `/`) on operands of different data types. This essential function makes calculations between varied types possible.

Type promotion calls fall into two categories, based on the call method:

- Operator overloading: uses operators directly, such as `a + b` or `a - b`.
- Binary API: uses API functions, such as `paddle.add(a, b)`.

In both cases, type promotion is handled automatically, without user intervention.

## Type Promotion Rules

The scope and rules of type promotion differ between Tensor-to-Tensor and Tensor-to-Scalar operations. They are introduced separately below.

```
import paddle

a = paddle.rand([3, 3], dtype='float16')
b = paddle.rand([3, 3], dtype='float32')
print(a + b)  # both a and b are Tensors, so this is treated as Tensor-to-Tensor

a = paddle.rand([3, 3], dtype='float16')
b = 1.0
print(a + b)  # b is a Scalar, so this is treated as Tensor-to-Scalar
```

### Type Promotion Rules in Tensor-to-Tensor

In model training, computations between different data types are usually limited to floating-point types. To help users quickly troubleshoot type-related issues, automatic type promotion between Tensors only supports calculations between floating-point types, and between complex and real numbers. The principle is to return the larger of the two Tensors' data types. Details are shown in the table below:

| +/-/* | bf16 | f16 | f32 | f64 | bool | u8 | i8 | i16 | i32 | i64 | c64 | c128 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| bf16 | bf16 | f32 | f32 | f64 | - | - | - | - | - | - | c64 | c128 |
| f16 | f32 | f16 | f32 | f64 | - | - | - | - | - | - | c64 | c128 |
| f32 | f32 | f32 | f32 | f64 | - | - | - | - | - | - | c64 | c128 |
| f64 | f64 | f64 | f64 | f64 | - | - | - | - | - | - | c128 | c128 |
| bool | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| u8 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i8 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i16 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i32 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| i64 | - | - | - | - | - | - | - | - | - | - | c64 | c128 |
| c64 | c64 | c64 | c64 | c128 | c64 | c64 | c64 | c64 | c64 | c64 | c64 | c128 |
| c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 | c128 |

Taking `paddle.add(a, b)` as an example: in the table above, the row represents the dtype of `a` and the column represents the dtype of `b`.

Sample Code:

```
import paddle

# Calculation between floating-point types
a = paddle.rand([3, 3], dtype='float16')
b = paddle.rand([3, 3], dtype='float32')
c = a + b  # type promotion occurs automatically, casting 'a' to float32; no additional user action is required
print(c.dtype)  # the dtype of 'c' is float32

a = paddle.rand([3, 3], dtype='bfloat16')
b = paddle.rand([3, 3], dtype='float64')
c = a + b  # type promotion occurs automatically, casting 'a' to float64
print(c.dtype)  # the dtype of 'c' is float64

# Calculation between a complex and a real number
a = paddle.ones([3, 3], dtype='complex64')
b = paddle.rand([3, 3], dtype='float64')
c = a + b  # type promotion occurs automatically, casting both 'a' and 'b' to complex128
print(c.dtype)  # the dtype of 'c' is complex128

# Calculation between complex numbers
a = paddle.ones([3, 3], dtype='complex128')
b = paddle.ones([3, 3], dtype='complex64')
c = a + b  # type promotion occurs automatically, casting 'b' to complex128
print(c.dtype)  # the dtype of 'c' is complex128
```

### Type Promotion Rules in Tensor-to-Scalar

Type promotion between a Tensor and a Scalar supports all types. The principle is to promote towards the Tensor's dtype when the Scalar's broad type matches the Tensor's (both are integers, both are floating-point, etc.). Otherwise, the result follows the Tensor-to-Tensor rules.

A Scalar operand has a default dtype: `int` -> int64, `float` -> float32, `bool` -> bool, `complex` -> complex64. More details are shown in the table below:

| +/-/* | bool | int | float | complex |
| --- | --- | --- | --- | --- |
| bool | bool | i64 | f32 | c64 |
| u8 | u8 | u8 | f32 | c64 |
| i8 | i8 | i8 | f32 | c64 |
| i16 | i16 | i16 | f32 | c64 |
| i32 | i32 | i32 | f32 | c64 |
| i64 | i64 | i64 | f32 | c64 |
| bf16 | bf16 | bf16 | bf16 | c64 |
| f16 | f16 | f16 | f16 | c64 |
| f32 | f32 | f32 | f32 | c64 |
| f64 | f64 | f64 | f64 | c128 |
| c64 | c64 | c64 | c64 | c64 |
| c128 | c128 | c128 | c128 | c128 |

Sample Code:

```
import paddle

# Both the Scalar and the Tensor are floating-point: the result keeps the Tensor's dtype
a = paddle.rand([3, 3], dtype='float16')
b = 1.0
c = a + b  # type promotion occurs automatically, casting 'b' to float16
print(c.dtype)  # the dtype of 'c' is float16

# The Scalar and the Tensor differ in broad type: the result follows the Tensor-to-Tensor rules
a = 1.0
b = paddle.ones([3, 3], dtype='int64')
c = a + b  # type promotion occurs automatically, casting 'b' to float32
print(c.dtype)  # the dtype of 'c' is float32
```

## How to Use Type Promotion

### Supported Cases

```
import paddle

a = paddle.rand([3, 3], dtype='float16')
b = paddle.rand([3, 3], dtype='float32')
c = a + b  # type promotion occurs automatically, casting 'a' to float32; no additional user action is required
print(c.dtype)  # float32

# Consistent with the commutative property
d = b + a
print(d.dtype)  # float32
print(paddle.allclose(c, d))  # Tensor(shape=[], dtype=bool, place=Place(gpu:0), stop_gradient=True, True)

# Same result with the binary API
e = paddle.add(a, b)
print(e.dtype)  # float32
print(paddle.allclose(c, e))  # Tensor(shape=[], dtype=bool, place=Place(gpu:0), stop_gradient=True, True)

# Same result with the static graph
paddle.enable_static()
exe = paddle.static.Executor()
train_program = paddle.static.Program()
startup_program = paddle.static.Program()
with paddle.static.program_guard(train_program, startup_program):
    a = paddle.rand([3, 3], dtype='float16')
    b = paddle.rand([3, 3], dtype='float32')
    f = paddle.add(a, b)
res = exe.run(train_program, fetch_list=[f])
print(res[0].dtype)  # float32
paddle.disable_static()
print(paddle.allclose(c, paddle.to_tensor(res[0])))  # Tensor(shape=[], dtype=bool, place=Place(gpu:0), stop_gradient=True, True)
```

### Unsupported Cases

```
import paddle

a = paddle.ones([3, 3], dtype='int64')
b = paddle.rand([3, 3], dtype='float32')
c = a + b  # automatic type promotion between int and float is not supported, so a TypeError is raised

# For unsupported cases, we suggest performing the type promotion manually
# Method 1: use the astype API
a = a.astype('float32')
a = a.astype(b.dtype)
# Method 2: use the cast API
a = paddle.cast(a, 'float32')
a = paddle.cast(a, b.dtype)
```

## The Scope of Type Promotion

As of Paddle version 2.6, the supported binary APIs and their rules are as follows:

| Number | API | Tensor-to-Tensor | Tensor-to-Scalar |
| --- | --- | --- | --- |
| 1 | add | Common Rule | Common Rule |
| 2 | subtract | Common Rule | Common Rule |
| 3 | multiply | Common Rule | Common Rule |
| 4 | divide | Common Rule | Divide Rule |
| 5 | floor_divide | Common Rule | Common Rule |
| 6 | pow | Common Rule | Common Rule |
| 7 | equal | Logic Rule | Logic Rule |
| 8 | not_equal | Logic Rule | Logic Rule |
| 9 | less_than | Logic Rule | Logic Rule |
| 10 | less_equal | Logic Rule | Logic Rule |
| 11 | greater_than | Logic Rule | Logic Rule |
| 12 | greater_equal | Logic Rule | Logic Rule |
| 13 | logical_and | Logic Rule | Logic Rule |
| 14 | logical_or | Logic Rule | Logic Rule |
| 15 | logical_xor | Logic Rule | Logic Rule |
| 16 | bitwise_and | - | Common Rule |
| 17 | bitwise_or | - | Common Rule |
| 18 | bitwise_xor | - | Common Rule |
| 19 | where | Common Rule | Common Rule |
| 20 | fmax | Common Rule | - |
| 21 | fmin | Common Rule | - |
| 22 | logaddexp | Common Rule | - |
| 23 | maximum | Common Rule | - |
| 24 | minimum | Common Rule | - |
| 25 | remainder (mod) | Common Rule | Common Rule |
| 26 | huber_loss | Common Rule | - |
| 27 | nextafter | Common Rule | - |
| 28 | atan2 | Common Rule | - |
| 29 | poisson_nll_loss | Common Rule | - |
| 30 | l1_loss | Common Rule | - |
| 31 | huber_loss | Common Rule | - |
| 32 | mse_loss | Common Rule | - |

There are two special rules in the table above:

Divide Rule: the divide API never returns a dtype smaller than float; for example, int32 / Scalar returns float32.

```
import paddle

a = paddle.ones([3, 3], dtype='int32')
b = 1
c = a / b
print(c.dtype)  # float32
```

Logic Rule: since complex types cannot be used directly in logical operations, calculations involving complex types are outside the scope of type promotion for logical APIs. Within the supported scope, all results return bool type.

```
import paddle

a = paddle.rand([3, 3], dtype='float32')
b = paddle.rand([3, 3], dtype='float16')
c = a == b
print(c.dtype)  # bool
```

## Summary

While supporting data type promotion, Paddle ensures that calculations obey the commutative property, with consistent results between operator overloading and binary APIs, and between dynamic and static graphs. This article has clarified the rules and scope of data type promotion, summarized the binary APIs that support it in the current version, and introduced usage methods to make PaddlePaddle more convenient to use.