dropout
- paddle.nn.functional.dropout(x, p=0.5, axis=None, training=True, mode='upscale_in_train', name=None) [source]
         Dropout is a regularization technique for reducing overfitting by preventing neuron co-adaptation during training. The dropout operator randomly sets the outputs of some units to zero, while upscaling the others according to the given dropout probability.

- Parameters
           - x (Tensor) – The input tensor. The data type is float32 or float64. 
- p (float|int, optional) – Probability of setting units to zero. Default 0.5. 
- axis (int|list|tuple, optional) – The axis along which the dropout is performed. Default None. 
- training (bool, optional) – A flag indicating whether it is in the training phase or not. Default True. 
- mode (str, optional) – ['upscale_in_train' (default) | 'downscale_in_infer']; see the sketch after the Returns section for a worked comparison of the two modes.

    - upscale_in_train (default): upscale the output at training time

        - train: out = input * mask / (1.0 - dropout_prob)
        - inference: out = input

    - downscale_in_infer: downscale the output at inference time

        - train: out = input * mask
        - inference: out = input * (1.0 - dropout_prob)

- name (str, optional) – Name for the operation (optional, default is None). For more information, please refer to Name. 
 
- Returns

           A Tensor representing the dropout result, with the same shape and data type as x.
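
The two modes can be checked by hand. Below is a minimal sketch that applies both formulas to a hand-fixed mask (the mask values are an illustrative assumption; in practice the mask is sampled from a Bernoulli distribution):

    import paddle

    x = paddle.to_tensor([[1., 2., 3.], [4., 5., 6.]])
    mask = paddle.to_tensor([[0., 1., 0.], [1., 0., 1.]])  # assumed fixed mask, for illustration only
    p = 0.5

    # mode='upscale_in_train': scale at training time, identity at inference
    train_upscale = x * mask / (1.0 - p)  # [[0., 4., 0.], [8., 0., 12.]]
    infer_upscale = x                     # [[1., 2., 3.], [4., 5., 6.]]

    # mode='downscale_in_infer': no scaling at training time, scale at inference
    train_downscale = x * mask            # [[0., 2., 0.], [4., 0., 6.]]
    infer_downscale = x * (1.0 - p)       # [[0.5, 1., 1.5], [2., 2.5, 3.]]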
Examples

We use p=0.5 in the following description for simplicity.

- When ``axis=None``, this is the commonly used dropout, which drops each element of x randomly.
  Let's see a simple case when x is a 2-d tensor with shape 2*3:

      [[1 2 3]
       [4 5 6]]

  We generate a mask with the same shape as x, which is 2*3. The value of the mask is randomly sampled from a Bernoulli distribution. For example, we may get this mask:

      [[0 1 0]
       [1 0 1]]

  So the output is obtained from the elementwise multiplication of x and the mask:

      [[0 2 0]
       [4 0 6]]

  Using the default setting, i.e. ``mode='upscale_in_train'``, in the training phase the final upscaled output is:

      [[0 4 0 ]
       [8 0 12]]

  and in the test phase, the output is the same as the input:

      [[1 2 3]
       [4 5 6]]

  We can also set ``mode='downscale_in_infer'``; then in the training phase the final output is:

      [[0 2 0]
       [4 0 6]]

  and in the test phase, the scaled output is:

      [[0.5 1.  1.5]
       [2.  2.5 3. ]]
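
  A small runnable check of this behavior, assuming the default ``mode='upscale_in_train'`` (the seed is arbitrary; your sampled mask may differ):

      import paddle

      paddle.seed(2023)  # arbitrary seed, only for reproducibility
      x = paddle.to_tensor([[1., 2., 3.], [4., 5., 6.]])

      y_train = paddle.nn.functional.dropout(x, p=0.5)                 # training phase
      y_test = paddle.nn.functional.dropout(x, p=0.5, training=False)  # test phase

      # every kept element is upscaled by 1/(1-p) = 2; dropped elements are 0
      assert bool(paddle.all(paddle.logical_or(y_train == 0., y_train == 2. * x)))
      # at inference, upscale_in_train returns the input unchanged
      assert bool(paddle.all(y_test == x))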
- When ``axis!=None``, this is useful for dropping whole channels from an image or sequence.

  Let's see the simple case when x is a 2-d tensor with shape 2*3 again:

      [[1 2 3]
       [4 5 6]]

  (1) If ``axis=0``, the dropout is only performed along axis `0`. We generate a mask with shape 2*1; only along axis `0` are the values randomly selected. For example, we may get this mask:

      [[1]
       [0]]

  The output is obtained from the elementwise multiplication of x and the mask. In doing so, the mask is broadcast from 2*1 to 2*3:

      [[1 1 1]
       [0 0 0]]

  and the result after elementwise multiplication is:

      [[1 2 3]
       [0 0 0]]

  Then we can upscale or downscale according to the setting of the other arguments.

  (2) If ``axis=1``, the dropout is only performed along axis `1`. We generate a mask with shape 1*3; only along axis `1` are the values randomly selected. For example, we may get this mask:

      [[1 0 1]]

  During elementwise multiplication, the mask is broadcast from 1*3 to 2*3:

      [[1 0 1]
       [1 0 1]]

  and the result after elementwise multiplication is:

      [[1 0 3]
       [4 0 6]]

  (3) What about ``axis=[0, 1]``? This means the dropout is performed over all axes of x, which is the same as the default setting ``axis=None``.

  (4) You may note that, logically, `axis=None` would mean the dropout is performed along no axis of x: we would generate a mask with shape 1*1, and the whole input would be randomly kept or dropped. For example, we may get this mask:

      [[0]]

  During elementwise multiplication, the mask would be broadcast from 1*1 to 2*3:

      [[0 0 0]
       [0 0 0]]

  and the result after elementwise multiplication would be:

      [[0 0 0]
       [0 0 0]]

  This is actually not what we want, because all elements may be set to zero.

When x is a 4-d tensor with shape NCHW, we can set ``axis=[0,1]`` and the dropout will be performed over channels N and C, while H and W are tied, i.e. ``paddle.nn.functional.dropout(x, p, axis=[0,1])``. Please refer to ``paddle.nn.functional.dropout2d`` for more details. Similarly, when x is a 5-d tensor with shape NCDHW, we can set ``axis=[0,1]`` to perform dropout3d. Please refer to ``paddle.nn.functional.dropout3d`` for more details.

    import paddle

    x = paddle.to_tensor([[1, 2, 3], [4, 5, 6]]).astype(paddle.float32)
    y_train = paddle.nn.functional.dropout(x, 0.5)
    y_test = paddle.nn.functional.dropout(x, 0.5, training=False)
    y_0 = paddle.nn.functional.dropout(x, axis=0)
    y_1 = paddle.nn.functional.dropout(x, axis=1)
    y_01 = paddle.nn.functional.dropout(x, axis=[0, 1])

    print(x)
    # Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
    #        [[1., 2., 3.],
    #         [4., 5., 6.]])
    print(y_train)
    # Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
    #        [[2. , 0. , 6. ],
    #         [8. , 0. , 12.]])
    print(y_test)
    # Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
    #        [[1., 2., 3.],
    #         [4., 5., 6.]])
    print(y_0)
    # Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
    #        [[0. , 0. , 0. ],
    #         [8. , 10., 12.]])
    print(y_1)
    # Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
    #        [[2. , 0. , 6. ],
    #         [8. , 0. , 12.]])
    print(y_01)
    # Tensor(shape=[2, 3], dtype=float32, place=Place(cpu), stop_gradient=True,
    #        [[0. , 0. , 0. ],
    #         [8. , 0. , 12.]])
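
Below is a minimal sketch of the channel-wise case on an NCHW tensor, using ``paddle.nn.functional.dropout2d`` (the seed and shapes here are arbitrary choices for illustration):

    import paddle

    paddle.seed(1)  # arbitrary seed; the dropped channels may differ
    x = paddle.ones([2, 3, 4, 4])  # NCHW

    # dropout2d zeroes whole H*W feature maps, like dropout(x, p, axis=[0, 1])
    y = paddle.nn.functional.dropout2d(x, p=0.5)

    # each (n, c) feature map is either all zeros or all 2.0 (upscaled by 1/(1-p))
    print(y[0, 0])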
