deformable_roi_pooling

paddle.fluid.layers.deformable_roi_pooling(input, rois, trans, no_trans=False, spatial_scale=1.0, group_size=[1, 1], pooled_height=1, pooled_width=1, part_size=None, sample_per_part=1, trans_std=0.1, position_sensitive=False, name=None)

Deformable ROI Pooling Layer

Performs deformable region-of-interest (ROI) pooling on the input. As described in Deformable Convolutional Networks, an offset is obtained for each bin of the ROI so that pooling is performed over the correct region. After deformable_roi_pooling, the batch dimension of the output equals the number of region bounding boxes (ROIs).

The operation has three steps:

  1. Divide each region proposal into equal-sized bins according to pooled_width and pooled_height.

  2. Add the offset to each pixel location in the ROI to get a new location. The value at the new location is computed directly by bilinear interpolation from the four nearest pixels.

  3. Sample several points in each bin and average their values to produce the output.
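
The three steps above can be sketched in plain Python for a single channel and a single ROI. This is a simplified illustration, not the Paddle kernel: the helper names (`bilinear_sample`, `deformable_pool_single_channel`), the per-bin `(dx, dy)` offset format in feature-map pixels, and the placement of sample points at sub-bin centres are all assumptions made for the example.

```python
import math


def bilinear_sample(feature, x, y):
    """Bilinearly interpolate a 2-D list `feature` (rows = y, columns = x) at (x, y)."""
    height, width = len(feature), len(feature[0])
    # Clamp the point so that the four neighbouring pixels stay inside the map.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    dx, dy = x - x0, y - y0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def deformable_pool_single_channel(feature, roi, offsets,
                                   pooled_height, pooled_width,
                                   sample_per_part=1):
    """Pool one channel over one ROI.

    `roi` is (x1, y1, x2, y2) in feature-map coordinates.
    `offsets[ph][pw]` is a hypothetical (dx, dy) shift for each bin.
    """
    x1, y1, x2, y2 = roi
    # Step 1: divide the ROI into pooled_height x pooled_width equal bins.
    bin_w = (x2 - x1) / pooled_width
    bin_h = (y2 - y1) / pooled_height
    output = []
    for ph in range(pooled_height):
        row = []
        for pw in range(pooled_width):
            # Step 2: shift the bin's top-left corner by its offset.
            dx, dy = offsets[ph][pw]
            start_x = x1 + pw * bin_w + dx
            start_y = y1 + ph * bin_h + dy
            # Step 3: average sample_per_part x sample_per_part samples.
            total = 0.0
            for iy in range(sample_per_part):
                for ix in range(sample_per_part):
                    sx = start_x + (ix + 0.5) * bin_w / sample_per_part
                    sy = start_y + (iy + 0.5) * bin_h / sample_per_part
                    total += bilinear_sample(feature, sx, sy)
            row.append(total / (sample_per_part ** 2))
        output.append(row)
    return output


# A 4x4 feature map whose value equals its column index.
feature = [[float(x) for x in range(4)] for _ in range(4)]
no_offsets = [[(0.0, 0.0)] * 2 for _ in range(2)]
pooled = deformable_pool_single_channel(feature, (0.0, 0.0, 2.0, 2.0),
                                        no_offsets, 2, 2, sample_per_part=2)
print(pooled)
```

With all offsets set to zero the sketch reduces to ordinary averaged ROI pooling; a non-zero `(dx, dy)` moves the whole bin before sampling, which is the deformable part.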

Parameters
  • input (Variable) – The input feature map, a tensor with data type float32 and shape [N, C, H, W], where N is the batch size, C is the number of input channels, H is the height of the feature, and W is the width of the feature.

  • rois (Variable) – The ROIs (Regions of Interest) to pool over. It should be a 2-D LoDTensor of shape (num_rois, 4) with data type float32 and LoD level 1, given as [[x1, y1, x2, y2], …], where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner.

  • trans (Variable) – The offsets applied to the features in each ROI during pooling, with data type float32. The format is [N, C, H, W], where N is the number of ROIs, C is the number of channels holding the offset distances in the x and y directions, H is the pooled height, and W is the pooled width.

  • no_trans (bool) – Whether to skip adding offsets during ROI pooling. If True, no offset is added in the operation. Default: False.

  • spatial_scale (float) – The ratio of the input feature map height (or width) to the raw image height (or width), with data type float32. It equals the reciprocal of the total stride of the convolutional layers. Default: 1.0.

  • group_size (list|tuple) – The number of groups into which the input channels are divided, given as a list or tuple of int32 values. For example, the number of input channels may be k1 * k2 * (C + 1), where k1 and k2 are the group width and height and C + 1 is the number of output channels. In (4, 6), 4 is the group height and 6 is the group width. Default: [1, 1].

  • pooled_height (int) – The pooled output height, of type int32. Default: 1.

  • pooled_width (int) – The pooled output width, of type int32. Default: 1.

  • part_size (list|tuple) – The height and width of the offset, given as a list or tuple of int32 values, e.g. (4, 6) for height 4 and width 6. The values always equal pooled_height and pooled_width. Default: None, in which case [pooled_height, pooled_width] is used.

  • sample_per_part (int) – The number of samples in each bin, of type int32. Larger values increase the computational cost. Default: 1.

  • trans_std (float) – The coefficient of the offset, with data type float32. It controls the weight of the offset. Default: 0.1.

  • position_sensitive (bool) – Whether to use deformable PSROI pooling mode. If False, the input channel dimension equals the output channel dimension. If True, the input channel dimension should be the output channel dimension * pooled_height * pooled_width. Default: False.

  • name (str|None) – The name of the layer. Default: None.
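
As a worked example of the spatial_scale parameter, the sketch below derives the scale from a hypothetical backbone with four stride-2 layers and maps an ROI from image coordinates to feature-map coordinates. The helper names are made up for illustration, and the plain multiplication is a simplification: the actual kernel may apply additional rounding or half-pixel adjustments.

```python
def spatial_scale_from_strides(strides):
    """Return the reciprocal of the total stride of the given layers."""
    total_stride = 1
    for stride in strides:
        total_stride *= stride
    return 1.0 / total_stride


def roi_to_feature_coords(roi, spatial_scale):
    """Scale [x1, y1, x2, y2] from image coordinates to feature-map coordinates."""
    return [coord * spatial_scale for coord in roi]


# Hypothetical backbone: four layers with stride 2 give a total stride of 16.
scale = spatial_scale_from_strides([2, 2, 2, 2])
feature_roi = roi_to_feature_coords([32.0, 48.0, 160.0, 208.0], scale)
print(scale)        # 0.0625
print(feature_roi)  # [2.0, 3.0, 10.0, 13.0]
```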

Returns

The output of deformable ROI pooling. If position_sensitive is False, the output channel dimension equals the input channel dimension. If position_sensitive is True, the output channel dimension is the input channel dimension divided by pooled_height and pooled_width.
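
The channel rule given for position_sensitive can be checked with a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the [num_rois, channels, pooled_height, pooled_width] layout is inferred from the note that the batch size becomes the number of ROIs, and the divisibility check is added here for clarity rather than taken from the layer.

```python
def deformable_roi_pooling_output_shape(num_rois, input_channels,
                                        pooled_height, pooled_width,
                                        position_sensitive=False):
    """Return the assumed output shape as [num_rois, channels, height, width]."""
    if position_sensitive:
        bins = pooled_height * pooled_width
        # Assumption: the input channels must split evenly across the bins.
        if input_channels % bins != 0:
            raise ValueError(
                "input_channels must be a multiple of "
                "pooled_height * pooled_width")
        output_channels = input_channels // bins
    else:
        output_channels = input_channels
    return [num_rois, output_channels, pooled_height, pooled_width]


# 192 input channels pooled to 8 x 8 bins, with 5 ROIs (5 is an arbitrary choice).
print(deformable_roi_pooling_output_shape(5, 192, 8, 8, position_sensitive=True))
print(deformable_roi_pooling_output_shape(5, 192, 8, 8, position_sensitive=False))
```

With 192 input channels and 8 x 8 bins, the position-sensitive mode leaves 192 / 64 = 3 output channels, while the default mode keeps all 192.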

Return type

Variable

Examples

# position_sensitive=True
import paddle.fluid as fluid
input = fluid.data(name="input",
                   shape=[2, 192, 64, 64],
                   dtype='float32')
rois = fluid.data(name="rois",
                  shape=[-1, 4],
                  dtype='float32',
                  lod_level=1)
trans = fluid.data(name="trans",
                   shape=[2, 384, 64, 64],
                   dtype='float32')
x = fluid.layers.deformable_roi_pooling(input=input,
                                        rois=rois,
                                        trans=trans,
                                        no_trans=False,
                                        spatial_scale=1.0,
                                        group_size=(1, 1),
                                        pooled_height=8,
                                        pooled_width=8,
                                        part_size=(8, 8),
                                        sample_per_part=4,
                                        trans_std=0.1,
                                        position_sensitive=True)

# position_sensitive=False
import paddle.fluid as fluid
input = fluid.data(name="input",
                   shape=[2, 192, 64, 64],
                   dtype='float32')
rois = fluid.data(name="rois",
                  shape=[-1, 4],
                  dtype='float32',
                  lod_level=1)
trans = fluid.data(name="trans",
                   shape=[2, 384, 64, 64],
                   dtype='float32')
x = fluid.layers.deformable_roi_pooling(input=input,
                                        rois=rois,
                                        trans=trans,
                                        no_trans=False,
                                        spatial_scale=1.0,
                                        group_size=(1, 1),
                                        pooled_height=8,
                                        pooled_width=8,
                                        part_size=(8, 8),
                                        sample_per_part=4,
                                        trans_std=0.1,
                                        position_sensitive=False)