generate_proposals
- paddle.vision.ops.generate_proposals(scores, bbox_deltas, img_size, anchors, variances, pre_nms_top_n=6000, post_nms_top_n=1000, nms_thresh=0.5, min_size=0.1, eta=1.0, pixel_offset=False, return_rois_num=False, name=None)
- 
         This operation proposes RoIs according to each box's probability of being a foreground object. The RPN output proposals are computed from anchors, bbox_deltas and scores. The final proposals can be used to train a detection network. To generate proposals, this operation performs the following steps:
- Transpose and reshape scores and bbox_deltas to shapes (H * W * A, 1) and (H * W * A, 4), respectively.
- Calculate box locations as proposal candidates.
- Clip boxes to the image.
- Remove predicted boxes with a small area.
- Apply non-maximum suppression (NMS) to get the final proposals as output.
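The last three steps above (clipping, size filtering and NMS) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Paddle's kernel: the helper names `clip_box`, `iou` and `propose` are hypothetical, boxes are assumed to be already decoded (xmin, ymin, xmax, ymax) tuples, the point at which the top-n cut-offs are applied is an assumption, and the handling of eta simply follows the parameter description (the threshold is multiplied by eta while it stays above 0.5).

```python
def clip_box(box, img_h, img_w):
    """Clip an (xmin, ymin, xmax, ymax) box to the image bounds."""
    x1, y1, x2, y2 = box
    return (min(max(x1, 0.0), img_w), min(max(y1, 0.0), img_h),
            min(max(x2, 0.0), img_w), min(max(y2, 0.0), img_h))


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def propose(boxes, scores, img_h, img_w, pre_nms_top_n=6000,
            post_nms_top_n=1000, nms_thresh=0.5, min_size=0.1, eta=1.0):
    # Assumption: keep the highest-scoring candidates first.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_top_n]
    # Clip to the image, then drop boxes narrower or shorter than min_size.
    cands = []
    for i in order:
        b = clip_box(boxes[i], img_h, img_w)
        if b[2] - b[0] >= min_size and b[3] - b[1] >= min_size:
            cands.append((b, scores[i]))
    # Greedy NMS: keep a box only if it overlaps no kept box too much.
    kept, thresh = [], nms_thresh
    for b, s in cands:
        if len(kept) == post_nms_top_n:
            break
        if all(iou(b, k) <= thresh for k, _ in kept):
            kept.append((b, s))
            if eta < 1.0 and thresh > 0.5:
                thresh *= eta  # adaptive threshold, as described for eta
    return kept


# Made-up candidates for one 224 x 224 image.
boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0),
         (50.0, 50.0, 60.0, 60.0), (-5.0, -5.0, 0.01, 0.01),
         (200.0, 200.0, 300.0, 300.0)]
scores = [0.9, 0.8, 0.7, 0.6, 0.5]
kept = propose(boxes, scores, img_h=224.0, img_w=224.0)
for box, score in kept:
    print(box, score)
```

In this toy input the second box is suppressed by the first (their IoU is about 0.68), the fourth is removed by the size filter after clipping, and the last one survives in clipped form.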
 - Parameters
- 
           - scores (Tensor) – A 4-D Tensor with shape [N, A, H, W] representing the probability of each box being an object. N is the batch size, A is the number of anchors, and H and W are the height and width of the feature map. The data type must be float32. 
- bbox_deltas (Tensor) – A 4-D Tensor with shape [N, 4*A, H, W] representing the differences between the predicted box locations and the anchor locations. The data type must be float32. 
- img_size (Tensor) – A 2-D Tensor with shape [N, 2] holding the original image size (height and width of the input) for each of the N images in the batch. The data type can be float32 or float64. 
- anchors (Tensor) – A 4-D Tensor representing the anchors, with layout [H, W, A, 4]. H and W are the height and width of the feature map, and A is the number of boxes at each position. Each anchor is in (xmin, ymin, xmax, ymax) format and unnormalized. The data type must be float32. 
- variances (Tensor) – A 4-D Tensor holding the expanded variances of the anchors, with layout [H, W, num_priors, 4], where num_priors corresponds to A. Each variance is in (xcenter, ycenter, w, h) format. The data type must be float32. 
- pre_nms_top_n (int, optional) – Total number of bboxes to keep per image before NMS. 6000 by default. 
- post_nms_top_n (int, optional) – Total number of bboxes to keep per image after NMS. 1000 by default. 
- nms_thresh (float, optional) – Overlap threshold used in NMS. The data type must be float32. 0.5 by default. 
- min_size (float, optional) – Remove predicted boxes with either height or width less than this value. 0.1 by default. 
- eta (float, optional) – Used in adaptive NMS. It only takes effect while the adaptive threshold is greater than 0.5, in which case adaptive_threshold = adaptive_threshold * eta is applied in each iteration. 1.0 by default. 
- pixel_offset (bool, optional) – Whether to apply a pixel offset. If True, the offset of img_size will be 1. False by default. 
- return_rois_num (bool, optional) – Whether to return rpn_rois_num. If True, a 1-D Tensor with shape [N, ] containing the number of RoIs for each image in the batch is returned. False by default. 
- name (str, optional) – For detailed information, please refer to Name. Usually there is no need to set name; it is None by default. 
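To see how the scores, bbox_deltas and anchors layouts line up after the transpose-and-reshape step, the index arithmetic can be written out explicitly. The sketch below assumes the channel axis is moved last (NCHW to NHWC) before flattening, which is what makes the row order agree with the [H, W, A, 4] anchor layout. `row_index` and `delta_channel` are hypothetical helper names used only for illustration, not Paddle functions.

```python
def row_index(h, w, a, W, A):
    """Row of the (H * W * A, ...) matrix for cell (h, w) and anchor a."""
    return (h * W + w) * A + a


def delta_channel(a, k):
    """Channel of bbox_deltas [N, 4*A, H, W] holding component k (0..3) of anchor a."""
    return 4 * a + k


H, W, A = 5, 5, 4

# Enumerate (h, w, a) in the same order as an [H, W, A, 4] tensor flattened to rows.
cells = [(h, w, a) for h in range(H) for w in range(W) for a in range(A)]
assert all(row_index(h, w, a, W, A) == i for i, (h, w, a) in enumerate(cells))

# Anchor a=2 at feature-map cell (h=1, w=3):
r = row_index(1, 3, 2, W, A)                     # shared row in scores, deltas, anchors
chans = [delta_channel(2, k) for k in range(4)]  # its four delta channels
print(r, chans)
```

Under this assumption, one row of the flattened (H * W * A, 4) delta matrix always pairs with the anchor stored at the same (h, w, a) position, so no extra permutation of anchors or variances is needed.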
 
- Returns
- 
           - rpn_rois (Tensor): The generated RoIs. 2-D Tensor with shape [N, 4], where N is the number of RoIs. The data type is the same as scores.
- rpn_roi_probs (Tensor): The scores of the generated RoIs. 2-D Tensor with shape [N, 1], where N is the number of RoIs. The data type is the same as scores.
- rpn_rois_num (Tensor): The number of RoIs for each image in the batch. 1-D Tensor with shape [B,], where B is the batch size. Its sum equals the total number of RoIs N.
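Because rpn_rois stacks the RoIs of all images along its first axis, rpn_rois_num is what lets a caller recover the per-image groups. A minimal sketch of that bookkeeping follows, with plain Python lists standing in for the tensors. The values are made up for illustration and `split_by_counts` is a hypothetical helper, not part of Paddle.

```python
def split_by_counts(rows, counts):
    """Split a flat list of rows into consecutive groups of the given sizes."""
    if sum(counts) != len(rows):
        raise ValueError("counts must sum to the number of rows")
    groups, start = [], 0
    for c in counts:
        groups.append(rows[start:start + c])
        start += c
    return groups


# Stand-ins for rpn_rois (N = 5 RoIs) and rpn_rois_num (B = 2 images).
rois = [[0, 0, 10, 10], [5, 5, 20, 20], [1, 1, 8, 8],
        [30, 30, 60, 60], [2, 2, 4, 4]]
rois_num = [3, 2]

per_image = split_by_counts(rois, rois_num)
print([len(group) for group in per_image])  # [3, 2]
```

The same counts apply to rpn_roi_probs, since both outputs share the first dimension N.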
- Return type
- 
           
           - rpn_rois (Tensor), rpn_roi_probs (Tensor), rpn_rois_num (Tensor) 
 
 Examples

 import paddle

 # N=2 images, A=4 anchors per position, 5x5 feature map
 scores = paddle.rand((2, 4, 5, 5), dtype=paddle.float32)
 bbox_deltas = paddle.rand((2, 16, 5, 5), dtype=paddle.float32)
 img_size = paddle.to_tensor([[224.0, 224.0], [224.0, 224.0]])
 # anchors and variances follow the [H, W, A, 4] layout from the parameter list
 anchors = paddle.rand((5, 5, 4, 4), dtype=paddle.float32)
 variances = paddle.rand((5, 5, 4, 4), dtype=paddle.float32)
 rois, roi_probs, roi_nums = paddle.vision.ops.generate_proposals(
     scores, bbox_deltas, img_size, anchors, variances, return_rois_num=True)
 print(rois, roi_probs, roi_nums)
