paddle.fluid.layers.nn.roi_align(input, rois, pooled_height=1, pooled_width=1, spatial_scale=1.0, sampling_ratio=-1, rois_num=None, name=None)

RoIAlign Operator

Region of interest align (also known as RoI Align) performs bilinear interpolation on input regions of non-uniform size to obtain fixed-size feature maps (e.g. 7*7).

Each region proposal is divided into equal-sized bins, with pooled_height bins along the height and pooled_width bins along the width. The bin locations keep their original computed values and are not rounded.

In each RoI bin, the values at four regularly sampled locations are computed directly by bilinear interpolation, and the output is the mean of those four values. This avoids the misalignment problem.
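The procedure described above can be sketched in plain Python for a single RoI on a single-channel feature map. This is a minimal illustration, not the operator's actual kernel: the helper names `bilinear` and `roi_align_single` are hypothetical, the RoI is assumed to be already expressed in feature-map coordinates, each bin uses a 2 x 2 grid (four samples), and sample points are clamped to the feature map, which is an assumption about edge handling.

```python
import math

def bilinear(feat, y, x):
    """Bilinearly interpolate the 2-D list `feat` at the real-valued point (y, x)."""
    h, w = len(feat), len(feat[0])
    # Assumption: clamp the point so that its four neighbours always exist.
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    ly, lx = y - y0, x - x0
    top = feat[y0][x0] * (1 - lx) + feat[y0][x1] * lx
    bottom = feat[y1][x0] * (1 - lx) + feat[y1][x1] * lx
    return top * (1 - ly) + bottom * ly

def roi_align_single(feat, roi, pooled_height, pooled_width, samples=2):
    """Pool one RoI (x1, y1, x2, y2) into a pooled_height x pooled_width grid."""
    x1, y1, x2, y2 = roi
    # Bin sizes stay real-valued; nothing is rounded to integer pixels.
    bin_h = (y2 - y1) / pooled_height
    bin_w = (x2 - x1) / pooled_width
    out = []
    for ph in range(pooled_height):
        row = []
        for pw in range(pooled_width):
            total = 0.0
            # Regularly spaced sample points inside the bin.
            for iy in range(samples):
                for ix in range(samples):
                    y = y1 + (ph + (iy + 0.5) / samples) * bin_h
                    x = x1 + (pw + (ix + 0.5) / samples) * bin_w
                    total += bilinear(feat, y, x)
            # The bin's output is the mean of its samples.
            row.append(total / (samples * samples))
        out.append(row)
    return out

# A 4 x 4 feature map whose value at (row, col) is 4 * row + col.
feat = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pooled = roi_align_single(feat, (0.0, 0.0, 3.0, 3.0), 2, 2)
```

Because the example feature map is a linear ramp, each pooled value equals the ramp evaluated at the centre of its bin, which makes the result easy to check by hand.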

  • input (Variable) – The input of ROIAlignOp, a 4-D Tensor in NCHW format, where N is the batch size, C is the number of input channels, H is the height of the feature map, and W is the width of the feature map. The data type is float32 or float64.

  • rois (Variable) – ROIs (Regions of Interest) to pool over. It should be a 2-D LoDTensor of shape (num_rois, 4) with a LoD level of 1. The data type is float32 or float64. Given as [[x1, y1, x2, y2], …], where (x1, y1) is the top-left coordinate and (x2, y2) is the bottom-right coordinate.

  • pooled_height (int32, optional) – The pooled output height. Default: 1

  • pooled_width (int32, optional) – The pooled output width. Default: 1

  • spatial_scale (float32, optional) – Multiplicative spatial scale factor that translates ROI coordinates from their input scale to the scale used when pooling. Default: 1.0

  • sampling_ratio (int32, optional) – Number of sampling points in the interpolation grid. If <= 0, the number of grid points adapts to roi_width and pooled_w, and likewise for height. Default: -1

  • rois_num (Tensor, optional) – The number of RoIs in each image. Default: None

  • name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default.
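How spatial_scale and sampling_ratio interact can be illustrated with a small sketch. The helper name `sampling_grid` is hypothetical, and the adaptive rule shown (the ceiling of the RoI size divided by the pooled size) together with the minimum RoI size of 1 are assumptions borrowed from common RoI Align kernels rather than something this page states.

```python
import math

def sampling_grid(roi, pooled_height, pooled_width,
                  spatial_scale=1.0, sampling_ratio=-1):
    """Return the scaled RoI and the (grid_h, grid_w) sample counts per bin."""
    # Translate the RoI from input-image scale to feature-map scale.
    x1, y1, x2, y2 = [c * spatial_scale for c in roi]
    # Assumption: very small RoIs are treated as at least 1 x 1.
    roi_w = max(x2 - x1, 1.0)
    roi_h = max(y2 - y1, 1.0)
    if sampling_ratio > 0:
        # Fixed number of samples along each axis of a bin.
        grid_h = grid_w = sampling_ratio
    else:
        # Assumed adaptive rule: roughly one sample per feature-map pixel.
        grid_h = math.ceil(roi_h / pooled_height)
        grid_w = math.ceil(roi_w / pooled_width)
    return (x1, y1, x2, y2), (grid_h, grid_w)

# A 64 x 32 RoI in image coordinates, on a feature map at half resolution.
scaled_roi, adaptive_grid = sampling_grid((0, 0, 64, 32), 7, 7, spatial_scale=0.5)
_, fixed_grid = sampling_grid((0, 0, 64, 32), 7, 7, spatial_scale=0.5, sampling_ratio=2)
```

Under these assumptions a wide RoI receives more samples along its width than along its height when sampling_ratio is left at -1, whereas a positive sampling_ratio gives every bin the same square grid.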


Output: The output of ROIAlignOp, a 4-D Tensor with shape (num_rois, channels, pooled_h, pooled_w). The data type is float32 or float64.
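The relationship between rois_num and the output shape can be sketched as follows. The helper names `roi_batch_indices` and `output_shape` are hypothetical, and the sketch assumes that RoIs are stored image by image, in batch order, so that rois_num is enough to recover which image each RoI belongs to.

```python
def roi_batch_indices(rois_num):
    """Expand per-image RoI counts into one batch index per RoI."""
    indices = []
    for image_idx, count in enumerate(rois_num):
        # Assumption: the RoIs of image 0 come first, then image 1, and so on.
        indices.extend([image_idx] * count)
    return indices

def output_shape(rois_num, channels, pooled_height, pooled_width):
    """Shape of the pooled output: one pooled map per RoI and channel."""
    return (sum(rois_num), channels, pooled_height, pooled_width)

# Three images holding 2, 0 and 3 RoIs respectively.
batch_idx = roi_batch_indices([2, 0, 3])
shape = output_shape([2, 0, 3], channels=256, pooled_height=7, pooled_width=7)
```

Note that the first output dimension is the total number of RoIs across the batch, not the batch size N of the input.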

Return type



import paddle.fluid as fluid
import paddle

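# Input feature map in NCHW format; None leaves the batch size unspecified.
x = fluid.data(
    name='data', shape=[None, 256, 32, 32], dtype='float32')
# RoIs given as [x1, y1, x2, y2] rows.
rois = fluid.data(
    name='rois', shape=[None, 4], dtype='float32')
# Number of RoIs in each image of the batch.
rois_num = fluid.data(name='rois_num', shape=[None], dtype='int32')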
align_out = fluid.layers.roi_align(input=x,