generate_proposals

paddle.fluid.layers.generate_proposals(scores, bbox_deltas, im_info, anchors, variances, pre_nms_top_n=6000, post_nms_top_n=1000, nms_thresh=0.5, min_size=0.1, eta=1.0, name=None)

Generate proposals for Faster-RCNN.

This operation proposes RoIs based on each box's probability of being a foreground object; the boxes themselves are computed from the anchors. The bbox_deltas and the objectness scores are outputs of the RPN. The final proposals can be used to train the detection network.

To generate proposals, this operation performs the following steps:

  1. Transpose and reshape scores and bbox_deltas to shapes (H*W*A, 1) and (H*W*A, 4), respectively.

  2. Calculate box locations as proposal candidates.

  3. Clip boxes to the image.

  4. Remove predicted boxes with small area.

  5. Apply NMS to get final proposals as output.
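The five steps above can be sketched in NumPy for a single image. This is a minimal illustration under stated assumptions, not the operator's actual kernel: the helper names nms and propose_single_image are hypothetical, the delta decoding assumes the common Faster-RCNN center/size parameterization scaled by the variances, and any pixel-offset or image-scale handling the real operator applies is not reproduced. The pre_nms_top_n and post_nms_top_n limits described under Parameters are applied before and after NMS.

```python
import numpy as np

def nms(boxes, scores, thresh):
    """Greedy NMS. Returns indices of kept boxes, highest score first."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        # Intersection of the current best box with every remaining box.
        iw = np.clip(np.minimum(boxes[i, 2], boxes[rest, 2])
                     - np.maximum(boxes[i, 0], boxes[rest, 0]), 0, None)
        ih = np.clip(np.minimum(boxes[i, 3], boxes[rest, 3])
                     - np.maximum(boxes[i, 1], boxes[rest, 1]), 0, None)
        inter = iw * ih
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = ((boxes[rest, 2] - boxes[rest, 0])
                  * (boxes[rest, 3] - boxes[rest, 1]))
        iou = inter / (area_i + area_r - inter)
        order = rest[iou <= thresh]  # drop boxes that overlap too much
    return np.array(keep, dtype=np.int64)

def propose_single_image(scores, deltas, im_hw, anchors, variances,
                         pre_nms_top_n=6000, post_nms_top_n=1000,
                         nms_thresh=0.5, min_size=0.1):
    """scores: [A, H, W]; deltas: [4*A, H, W]; anchors, variances: [H, W, A, 4]."""
    # Step 1: move channels last and flatten so that row k matches anchor k.
    s = scores.transpose(1, 2, 0).reshape(-1)        # (H*W*A,)
    d = deltas.transpose(1, 2, 0).reshape(-1, 4)     # (H*W*A, 4)
    a = anchors.reshape(-1, 4)
    v = variances.reshape(-1, 4)
    top = np.argsort(-s)[:pre_nms_top_n]             # keep the best candidates
    s, d, a, v = s[top], d[top], a[top], v[top]

    # Step 2: decode deltas relative to the anchors (assumed parameterization).
    aw, ah = a[:, 2] - a[:, 0], a[:, 3] - a[:, 1]
    cx = v[:, 0] * d[:, 0] * aw + a[:, 0] + 0.5 * aw
    cy = v[:, 1] * d[:, 1] * ah + a[:, 1] + 0.5 * ah
    w = np.exp(v[:, 2] * d[:, 2]) * aw
    h = np.exp(v[:, 3] * d[:, 3]) * ah
    boxes = np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)

    # Step 3: clip boxes to the image.
    im_h, im_w = im_hw
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, im_w)
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, im_h)

    # Step 4: remove boxes whose width or height is below min_size.
    big = ((boxes[:, 2] - boxes[:, 0] >= min_size)
           & (boxes[:, 3] - boxes[:, 1] >= min_size))
    boxes, s = boxes[big], s[big]

    # Step 5: NMS, then keep at most post_nms_top_n proposals.
    keep = nms(boxes, s, nms_thresh)[:post_nms_top_n]
    return boxes[keep], s[keep].reshape(-1, 1)
```

The sketch returns arrays of shape [K, 4] and [K, 1] for the K surviving boxes of one image, mirroring the shapes of rpn_rois and rpn_roi_probs.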

Parameters
  • scores (Variable) – A 4-D Tensor with shape [N, A, H, W] that holds the probability of each box being an object. N is the batch size, A is the number of anchors, and H and W are the height and width of the feature map. The data type must be float32.

  • bbox_deltas (Variable) – A 4-D Tensor with shape [N, 4*A, H, W] that holds the difference between the predicted box location and the anchor location. The data type must be float32.

  • im_info (Variable) – A 2-D Tensor with shape [N, 3] that holds the original image information for each of the N images in the batch. Each row contains the height, the width, and the scale between the original image size and the size of the feature map. The data type must be float32.

  • anchors (Variable) – A 4-D Tensor that holds the anchors with a layout of [H, W, A, 4]. H and W are the height and width of the feature map, and A is the number of anchors at each position. Each anchor is in (xmin, ymin, xmax, ymax) format and is unnormalized. The data type must be float32.

  • variances (Variable) – A 4-D Tensor that holds the expanded variances of the anchors with a layout of [H, W, num_priors, 4]. Each variance is in (xcenter, ycenter, w, h) format. The data type must be float32.

  • pre_nms_top_n (float) – Number of total bboxes to be kept per image before NMS. The data type must be float32. 6000 by default.

  • post_nms_top_n (float) – Number of total bboxes to be kept per image after NMS. The data type must be float32. 1000 by default.

  • nms_thresh (float) – Threshold in NMS. The data type must be float32. 0.5 by default.

  • min_size (float) – Remove predicted boxes with either height or width < min_size. The data type must be float32. 0.1 by default.

  • eta (float) – Used in adaptive NMS. In each iteration, if adaptive_threshold > 0.5, then adaptive_threshold = adaptive_threshold * eta. 1.0 by default.
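The effect of eta can be illustrated with a small pure-Python sketch of adaptive NMS. It is an assumption-laden illustration of the decay rule stated above, not the operator's code: the function names iou and adaptive_nms are hypothetical, and the threshold is assumed to start at nms_thresh and to be updated each time a box is kept.

```python
def iou(a, b):
    """IoU of two boxes in (xmin, ymin, xmax, ymax) format."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def adaptive_nms(boxes, scores, nms_thresh=0.5, eta=1.0):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    threshold = nms_thresh  # assumed starting value of the adaptive threshold
    keep = []
    for i in order:
        # Keep box i only if it does not overlap any kept box too much.
        if all(iou(boxes[i], boxes[j]) <= threshold for j in keep):
            keep.append(i)
            # Decay rule from the eta description; eta = 1.0 changes nothing.
            if threshold > 0.5:
                threshold *= eta
    return keep
```

With the default nms_thresh of 0.5 the condition threshold > 0.5 is never true, so under this reading eta only matters when nms_thresh is set above 0.5.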

Returns

A tuple with format (rpn_rois, rpn_roi_probs).

  • rpn_rois: The generated RoIs. A 2-D Tensor with shape [N, 4], where N is the number of RoIs. The data type is the same as that of scores.

  • rpn_roi_probs: The scores of the generated RoIs. A 2-D Tensor with shape [N, 1], where N is the number of RoIs. The data type is the same as that of scores.

Return type

tuple

Examples

import paddle.fluid as fluid

# N = batch size (None), A = 4 anchors per position, H = W = 5 feature-map cells.
scores = fluid.data(name='scores', shape=[None, 4, 5, 5], dtype='float32')            # [N, A, H, W]
bbox_deltas = fluid.data(name='bbox_deltas', shape=[None, 16, 5, 5], dtype='float32') # [N, 4*A, H, W]
im_info = fluid.data(name='im_info', shape=[None, 3], dtype='float32')                # [N, 3]
# anchors and variances share the layout [H, W, A, 4]; the first dimension is left unspecified.
anchors = fluid.data(name='anchors', shape=[None, 5, 4, 4], dtype='float32')
variances = fluid.data(name='variances', shape=[None, 5, 4, 4], dtype='float32')
rois, roi_probs = fluid.layers.generate_proposals(scores, bbox_deltas,
                                                  im_info, anchors, variances)
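To exercise a graph like this one, each placeholder needs an array in the documented layout. The NumPy sketch below builds dummy inputs keyed by the placeholder names used above. Everything in it is illustrative: make_inputs is a hypothetical helper, the stride of 16 and the square anchor sizes are arbitrary choices, and the variances are simply set to 1.

```python
import numpy as np

def make_inputs(n=2, a=4, h=5, w=5, stride=16, seed=0):
    """Dummy inputs in the documented layouts (all values are made up)."""
    rng = np.random.default_rng(seed)
    scores = rng.random((n, a, h, w), dtype=np.float32)              # [N, A, H, W]
    bbox_deltas = 0.1 * rng.standard_normal(
        (n, 4 * a, h, w)).astype(np.float32)                         # [N, 4*A, H, W]
    # One row per image: height, width, scale.
    im_info = np.tile(
        np.array([h * stride, w * stride, 1.0], dtype=np.float32), (n, 1))

    # Square anchors of A different sizes centered on every feature-map cell.
    half = stride * (np.arange(a, dtype=np.float32) + 1) / 2         # (A,)
    cy, cx = np.meshgrid((np.arange(h) + 0.5) * stride,
                         (np.arange(w) + 0.5) * stride, indexing='ij')
    cx, cy = cx[..., None], cy[..., None]                            # [H, W, 1]
    anchors = np.stack([cx - half, cy - half, cx + half, cy + half],
                       axis=-1).astype(np.float32)                   # [H, W, A, 4]
    variances = np.ones_like(anchors)                                # [H, W, A, 4]

    return {'scores': scores, 'bbox_deltas': bbox_deltas,
            'im_info': im_info, 'anchors': anchors, 'variances': variances}
```

A dictionary in this form could plausibly be passed as the feed argument when running the program with an executor, with rois and roi_probs in the fetch list.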