multiclass_nms

paddle.fluid.layers.multiclass_nms(bboxes, scores, score_threshold, nms_top_k, keep_top_k, nms_threshold=0.3, normalized=True, nms_eta=1.0, background_label=0, name=None)[source]

Multiclass NMS

This operator is to do multi-class non maximum suppression (NMS) on boxes and scores.

In the NMS step, this operator greedily selects a subset of detection bounding boxes that have high scores larger than score_threshold, if providing this threshold, then selects the largest nms_top_k confidences scores if nms_top_k is larger than -1. Then this operator pruns away boxes that have high IOU (intersection over union) overlap with already selected boxes by adaptive threshold NMS based on parameters of nms_threshold and nms_eta. Aftern NMS step, at most keep_top_k number of total bboxes are to be kept per image if keep_top_k is larger than -1.

See below for an example:

if:
    box1.data = (2.0, 3.0, 7.0, 5.0) format is (xmin, ymin, xmax, ymax)
    box1.scores = (0.7, 0.2, 0.4)  which is (label0.score=0.7, label1.score=0.2, label2.cores=0.4)

    box2.data = (3.0, 4.0, 8.0, 5.0)
    box2.score = (0.3, 0.3, 0.1)

    nms_threshold = 0.3
    background_label = 0
    score_threshold = 0


Then:
    iou = 4/11 > 0.3
    out.data = [[1, 0.3, 3.0, 4.0, 8.0, 5.0],
                 [2, 0.4, 2.0, 3.0, 7.0, 5.0]]

    Out format is (label, confidence, xmin, ymin, xmax, ymax)

System Message: WARNING/2 (/usr/local/lib/python2.7/dist-packages/paddle/fluid/layers/detection.py:docstring of paddle.fluid.layers.multiclass_nms, line 37)

Explicit markup ends without a blank line; unexpected unindent.

Parameters
  • bboxes (Variable) – Two types of bboxes are supported: 1. (Tensor) A 3-D Tensor with shape [N, M, 4 or 8 16 24 32] represents the predicted locations of M bounding bboxes, N is the batch size. Each bounding box has four coordinate values and the layout is [xmin, ymin, xmax, ymax], when box size equals to 4. The data type is float32 or float64. 2. (LoDTensor) A 3-D Tensor with shape [M, C, 4] M is the number of bounding boxes, C is the class number. The data type is float32 or float64.

  • scores (Variable) – Two types of scores are supported: 1. (Tensor) A 3-D Tensor with shape [N, C, M] represents the predicted confidence predictions. N is the batch size, C is the class number, M is number of bounding boxes. For each category there are total M scores which corresponding M bounding boxes. Please note, M is equal to the 2nd dimension of BBoxes.The data type is float32 or float64. 2. (LoDTensor) A 2-D LoDTensor with shape [M, C]. M is the number of bbox, C is the class number. In this case, input BBoxes should be the second case with shape [M, C, 4].The data type is float32 or float64.

  • background_label (int) – The index of background label, the background label will be ignored. If set to -1, then all categories will be considered. Default: 0

  • score_threshold (float) – Threshold to filter out bounding boxes with low confidence score. If not provided, consider all boxes.

  • nms_top_k (int) – Maximum number of detections to be kept according to the confidences after the filtering detections based on score_threshold.

  • nms_threshold (float) – The threshold to be used in NMS. Default: 0.3

  • nms_eta (float) – The threshold to be used in NMS. Default: 1.0

  • keep_top_k (int) – Number of total bboxes to be kept per image after NMS step. -1 means keeping all bboxes after NMS step.

  • normalized (bool) – Whether detections are normalized. Default: True

  • name (str) – Name of the multiclass nms op. Default: None.

Returns

A 2-D LoDTensor with shape [No, 6] represents the detections.

Each row has 6 values: [label, confidence, xmin, ymin, xmax, ymax] or A 2-D LoDTensor with shape [No, 10] represents the detections. Each row has 10 values: [label, confidence, x1, y1, x2, y2, x3, y3, x4, y4]. No is the total number of detections. If there is no detected boxes for all images, lod will be set to {1} and Out only contains one value which is -1. (After version 1.3, when no boxes detected, the lod is changed from {0} to {1})

Return type

Variable

Examples

import paddle.fluid as fluid
boxes = fluid.data(name='bboxes', shape=[None,81, 4],
                          dtype='float32', lod_level=1)
scores = fluid.data(name='scores', shape=[None,81],
                          dtype='float32', lod_level=1)
out = fluid.layers.multiclass_nms(bboxes=boxes,
                                  scores=scores,
                                  background_label=0,
                                  score_threshold=0.5,
                                  nms_top_k=400,
                                  nms_threshold=0.3,
                                  keep_top_k=200,
                                  normalized=False)