psroi_pool

paddle.vision.ops.psroi_pool(x, boxes, boxes_num, output_size, spatial_scale=1.0, name=None)

Position-sensitive region of interest pooling (also known as PSROIPooling) performs position-sensitive average pooling on the regions of interest specified by boxes. It takes regions of nonuniform size and produces fixed-size feature maps.

PSROIPooling was proposed in the R-FCN paper. Please refer to https://arxiv.org/abs/1605.06409 for more details.
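To make the idea concrete, the following is a simplified sketch of position-sensitive average pooling for a single ROI, written in plain Python on nested lists. It is not Paddle's kernel: the function name psroi_pool_single is hypothetical, and the channel layout and the rounding of bin edges are assumptions chosen for illustration.

```python
import math


def psroi_pool_single(feat, box, pooled_h, pooled_w, spatial_scale=1.0):
    """Position-sensitive average pooling of one ROI (illustrative only).

    feat: nested list indexed as feat[c][y][x], with C = out_c * pooled_h * pooled_w.
    box:  (x1, y1, x2, y2) in input-image coordinates.
    """
    channels, height, width = len(feat), len(feat[0]), len(feat[0][0])
    bins = pooled_h * pooled_w
    assert channels % bins == 0, "C must be divisible by pooled_h * pooled_w"
    out_c = channels // bins

    # Assumed convention: scale the box and treat the end coordinate as exclusive.
    x1, y1, x2, y2 = box
    start_w, start_h = x1 * spatial_scale, y1 * spatial_scale
    end_w, end_h = (x2 + 1) * spatial_scale, (y2 + 1) * spatial_scale
    bin_h = max(end_h - start_h, 0.1) / pooled_h
    bin_w = max(end_w - start_w, 0.1) / pooled_w

    out = [[[0.0] * pooled_w for _ in range(pooled_h)] for _ in range(out_c)]
    for c in range(out_c):
        for ph in range(pooled_h):
            for pw in range(pooled_w):
                # Each output bin reads from its own dedicated input channel
                # (assumed layout: channel index varies fastest over pw, then ph).
                in_c = (c * pooled_h + ph) * pooled_w + pw
                h0 = min(max(math.floor(start_h + ph * bin_h), 0), height)
                h1 = min(max(math.ceil(start_h + (ph + 1) * bin_h), 0), height)
                w0 = min(max(math.floor(start_w + pw * bin_w), 0), width)
                w1 = min(max(math.ceil(start_w + (pw + 1) * bin_w), 0), width)
                vals = [feat[in_c][y][x] for y in range(h0, h1) for x in range(w0, w1)]
                # Empty bins (ROI outside the feature map) pool to zero.
                out[c][ph][pw] = sum(vals) / len(vals) if vals else 0.0
    return out


# Four 4x4 channels, channel k filled with the constant k; pool to 2x2.
feat = [[[float(k)] * 4 for _ in range(4)] for k in range(4)]
print(psroi_pool_single(feat, (0, 0, 3, 3), 2, 2))  # [[[0.0, 1.0], [2.0, 3.0]]]
```

Because every channel in the toy input is constant, each output bin simply reports which input channel it read from, which shows how each spatial bin is tied to its own channel.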

Parameters
  • x (Tensor) – Input features with shape (N, C, H, W). The data type can be float32 or float64.

  • boxes (Tensor) – Box coordinates of ROIs (Regions of Interest) to pool over. It should be a 2-D Tensor with shape (num_rois, 4), given as [[x1, y1, x2, y2], …], where (x1, y1) is the top-left coordinate and (x2, y2) is the bottom-right coordinate.

  • boxes_num (Tensor) – The number of boxes contained in each picture in the batch.

  • output_size (int|Tuple(int, int)) – The pooled output size (H, W). The data type is int32. If int, H and W are both equal to output_size.

  • spatial_scale (float, optional) – Multiplicative spatial scale factor to translate ROI coords from their input scale to the scale used when pooling. Default: 1.0

  • name (str, optional) – The default value is None. Normally there is no need for the user to set this property. For more information, please refer to Name.
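As an illustration of how boxes and boxes_num relate, the sketch below expands per-image box counts into one image index per row of boxes. It assumes that boxes are listed image by image in batch order, so the counts sum to num_rois; the helper name box_batch_indices is hypothetical and not part of Paddle.

```python
def box_batch_indices(boxes_num):
    """Expand per-image box counts into one image index per box."""
    indices = []
    for image_idx, count in enumerate(boxes_num):
        indices.extend([image_idx] * count)
    return indices


# With boxes_num = [1, 2], the first box belongs to image 0
# and the next two boxes belong to image 1.
print(box_batch_indices([1, 2]))  # [0, 1, 1]
```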

Returns

4-D Tensor. The pooled ROIs with shape (num_rois, output_channels, pooled_h, pooled_w), where output_channels equals C / (pooled_h * pooled_w) and C is the number of input channels.
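The shape rule above can be checked without running the operator. The following is a minimal sketch in plain Python; the helper name psroi_pool_output_shape is hypothetical, and the divisibility check reflects the assumption that C must be an exact multiple of pooled_h * pooled_w.

```python
def psroi_pool_output_shape(x_shape, num_rois, output_size):
    """Return the expected output shape for a given input shape and output_size."""
    _, c, _, _ = x_shape
    if isinstance(output_size, int):
        pooled_h = pooled_w = output_size
    else:
        pooled_h, pooled_w = output_size
    bins = pooled_h * pooled_w
    if c % bins != 0:
        raise ValueError("input channels must be divisible by pooled_h * pooled_w")
    return [num_rois, c // bins, pooled_h, pooled_w]


# 490 input channels with a 7 x 7 output give 490 / 49 = 10 output channels.
print(psroi_pool_output_shape([2, 490, 28, 28], 3, 7))  # [3, 10, 7, 7]
```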

Examples

>>> import paddle
>>> x = paddle.uniform([2, 490, 28, 28], dtype='float32')
>>> boxes = paddle.to_tensor([[1, 5, 8, 10], [4, 2, 6, 7], [12, 12, 19, 21]], dtype='float32')
>>> boxes_num = paddle.to_tensor([1, 2], dtype='int32')
>>> pool_out = paddle.vision.ops.psroi_pool(x, boxes, boxes_num, 7, 1.0)
>>> print(pool_out.shape)
[3, 10, 7, 7]