- paddle.vision.ops.psroi_pool(x, boxes, boxes_num, output_size, spatial_scale=1.0, name=None)
Position-sensitive region of interest pooling (also known as PSROIPooling) performs position-sensitive average pooling on the regions of interest specified by the input. It takes regions of nonuniform size and produces fixed-size feature maps.
PSROIPooling was proposed in the R-FCN paper. Please refer to https://arxiv.org/abs/1605.06409 for more details.
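To make "position-sensitive" concrete: each output bin reads from its own dedicated input channel instead of pooling over all channels. The plain-Python sketch below enumerates that mapping for the sizes used in the example on this page. The channel layout `(c * pooled_h + ph) * pooled_w + pw` and the helper name `input_channel_for_bin` are assumptions made for illustration, not taken from this page; check the Paddle kernel source for the layout it actually uses.

```python
def input_channel_for_bin(c, ph, pw, pooled_h, pooled_w):
    """Return the input channel read by output channel c at bin (ph, pw).

    Assumed layout: the pooled_h * pooled_w channels belonging to one
    output channel are stored contiguously, in row-major bin order.
    """
    return (c * pooled_h + ph) * pooled_w + pw


pooled_h = pooled_w = 7
output_channels = 10  # 490 input channels / (7 * 7) bins

# Every (output channel, bin) pair maps to a distinct input channel.
used = {
    input_channel_for_bin(c, ph, pw, pooled_h, pooled_w)
    for c in range(output_channels)
    for ph in range(pooled_h)
    for pw in range(pooled_w)
}
print(len(used))  # 490
```

Under this assumed layout, the 490 input channels are each consumed exactly once, which is why the input channel count must be a multiple of the number of bins.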
x (Tensor) – Input features with shape (N, C, H, W). The data type can be float32 or float64.
boxes (Tensor) – Box coordinates of the ROIs (Regions of Interest) to pool over. It should be a 2-D Tensor with shape (num_rois, 4), given as [[x1, y1, x2, y2], …], where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner.
boxes_num (Tensor) – The number of boxes contained in each image in the batch.
output_size (int|Tuple(int, int)) – The pooled output size (H, W). The data type is int32. If an int is given, H and W are both equal to output_size.
spatial_scale (float, optional) – Multiplicative spatial scale factor that translates ROI coordinates from their input scale to the scale used when pooling. Default: 1.0.
name (str, optional) – The default value is None. Normally users do not need to set this property. For more information, please refer to Name.
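Two of the parameters above are easy to misread, so here is a plain-Python sketch of what they describe. boxes_num assigns consecutive rows of boxes to images, and spatial_scale multiplies box coordinates to move them from input scale to pooling scale. The helper names `batch_index_per_box` and `scale_box` are hypothetical, the stride of 16 is an assumed value, and any rounding the actual kernel applies to scaled coordinates is ignored here.

```python
def batch_index_per_box(boxes_num):
    """Expand per-image box counts into one image index per box row."""
    indices = []
    for image_idx, count in enumerate(boxes_num):
        indices.extend([image_idx] * count)
    return indices


def scale_box(box, spatial_scale):
    """Multiply [x1, y1, x2, y2] by spatial_scale (no rounding applied)."""
    return [coord * spatial_scale for coord in box]


# boxes_num = [1, 2]: the first box belongs to image 0, the next two to image 1.
print(batch_index_per_box([1, 2]))  # [0, 1, 1]

# A box given in image pixels, for a feature map downsampled by 16 (assumed stride).
print(scale_box([16, 80, 128, 160], 1 / 16))  # [1.0, 5.0, 8.0, 10.0]
```

The entries of boxes_num should therefore sum to num_rois, the number of rows in boxes.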
4-D Tensor. The pooled ROIs with shape (num_rois, output_channels, pooled_h, pooled_w). output_channels equals C / (pooled_h * pooled_w), where C is the number of channels of the input.
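The shape rule above can be checked before calling the operator. The following is a minimal sketch in plain Python; the helper `psroi_pool_output_shape` is hypothetical and not part of Paddle, and the divisibility check is this sketch's own guard, not a statement about how Paddle reports the error.

```python
def psroi_pool_output_shape(input_shape, num_rois, output_size):
    """Compute (num_rois, output_channels, pooled_h, pooled_w)."""
    # An int output_size means a square output.
    if isinstance(output_size, int):
        pooled_h = pooled_w = output_size
    else:
        pooled_h, pooled_w = output_size

    _, channels, _, _ = input_shape  # input is (N, C, H, W)
    bins = pooled_h * pooled_w
    if channels % bins != 0:
        raise ValueError(
            f"C={channels} is not divisible by pooled_h * pooled_w = {bins}"
        )
    return (num_rois, channels // bins, pooled_h, pooled_w)


print(psroi_pool_output_shape((2, 490, 28, 28), 3, 7))  # (3, 10, 7, 7)
```

With 490 input channels and a 7 x 7 output, each ROI yields 490 / 49 = 10 output channels.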
import paddle

x = paddle.uniform([2, 490, 28, 28], dtype='float32')
boxes = paddle.to_tensor([[1, 5, 8, 10], [4, 2, 6, 7], [12, 12, 19, 21]], dtype='float32')
boxes_num = paddle.to_tensor([1, 2], dtype='int32')
pool_out = paddle.vision.ops.psroi_pool(x, boxes, boxes_num, 7, 1.0)
print(pool_out.shape)
# [3, 10, 7, 7]