ssd_loss(location, confidence, gt_box, gt_label, prior_box, prior_box_var=None, background_label=0, overlap_threshold=0.5, neg_pos_ratio=3.0, neg_overlap=0.5, loc_loss_weight=1.0, conf_loss_weight=1.0, match_type='per_prediction', mining_type='max_negative', normalize=True, sample_size=None)
Multi-box loss layer for the SSD object detection algorithm.
This layer computes the SSD detection loss from the location offset predictions, confidence predictions, prior boxes, ground-truth bounding boxes and labels, and the type of hard example mining. The returned loss is a weighted sum of the localization loss (regression loss) and the confidence loss (classification loss), computed in the following steps:
1. Find matched bounding boxes with the bipartite matching algorithm.
  1.1. Compute the IOU similarity between ground-truth boxes and prior boxes.
  1.2. Compute the matched bounding boxes with the bipartite matching algorithm.
2. Compute confidence for mining hard examples.
  2.1. Get the target label based on the matched indices.
  2.2. Compute the confidence loss.
3. Apply hard example mining to get the negative example indices and update the matched indices.
4. Assign classification and regression targets.
  4.1. Encode the bboxes according to the prior boxes.
  4.2. Assign regression targets.
  4.3. Assign classification targets.
5. Compute the overall objective loss.
  5.1. Compute the confidence loss.
  5.2. Compute the localization loss.
  5.3. Compute the overall weighted loss.
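Step 1 can be sketched in plain Python as follows. This is a minimal illustration, not the layer's actual implementation: the helper names iou and match_priors are hypothetical, boxes are assumed to be [xmin, ymin, xmax, ymax] lists, and the greedy pairing and tie-breaking shown here are assumptions about how a bipartite match followed by a 'per_prediction' pass might behave.

```python
def iou(a, b):
    """IoU of two boxes given as [xmin, ymin, xmax, ymax]."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_priors(gt_boxes, prior_boxes, overlap_threshold=0.5,
                 match_type="per_prediction"):
    """Return, for each prior box, the index of its matched gt box or -1."""
    # Step 1.1: IoU similarity, one row per ground-truth box.
    sim = [[iou(g, p) for p in prior_boxes] for g in gt_boxes]
    matched = [-1] * len(prior_boxes)
    free_gt = set(range(len(gt_boxes)))

    # Step 1.2 (assumed greedy form): repeatedly take the best remaining
    # (gt, prior) pair so that each gt box gets at most one prior.
    while free_gt:
        best = max(
            ((sim[g][p], g, p)
             for g in free_gt
             for p in range(len(prior_boxes))
             if matched[p] == -1),
            default=None,
        )
        if best is None or best[0] <= 0.0:
            break
        _, g, p = best
        matched[p] = g
        free_gt.remove(g)

    # Extra matches for 'per_prediction': every still-unmatched prior is
    # assigned to its best gt box when the IoU reaches overlap_threshold.
    if match_type == "per_prediction" and gt_boxes:
        for p in range(len(prior_boxes)):
            if matched[p] == -1:
                g = max(range(len(gt_boxes)), key=lambda i: sim[i][p])
                if sim[g][p] >= overlap_threshold:
                    matched[p] = g
    return matched
```

With one ground-truth box and three priors, the exact-overlap prior is matched in the bipartite pass, a prior with IoU 0.75 is added only under 'per_prediction', and a disjoint prior stays unmatched (-1).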
location (Variable) – The location predictions, a 3D Tensor with shape [N, Np, 4], where N is the batch size and Np is the total number of predictions for each instance. 4 is the number of coordinate values, laid out as [xmin, ymin, xmax, ymax]. The data type is float32 or float64.
confidence (Variable) – The confidence predictions, a 3D Tensor with shape [N, Np, C], where N and Np are the same as in location and C is the number of classes. The data type is float32 or float64.
gt_box (Variable) – The ground-truth bounding boxes (bboxes), a 2D LoDTensor with shape [Ng, 4], where Ng is the total number of ground-truth bboxes in the mini-batch input. The data type is float32 or float64.
gt_label (Variable) – The ground-truth labels, a 2D LoDTensor with shape [Ng, 1], where Ng is the total number of ground-truth bboxes in the mini-batch input and 1 is the number of class labels for each bbox. The data type is float32 or float64.
prior_box (Variable) – The prior boxes, a 2D Tensor with shape [Np, 4]. Np and 4 are the same as in location. The data type is float32 or float64.
prior_box_var (Variable) – The variances of the prior boxes, a 2D Tensor with shape [Np, 4]. Np and 4 are the same as in prior_box. None by default.
background_label (int) – The index of background label, 0 by default.
overlap_threshold (float) – If match_type is ‘per_prediction’, use ‘overlap_threshold’ to determine the extra matching bboxes when finding matched boxes. 0.5 by default.
neg_pos_ratio (float) – The ratio of the negative boxes to the positive boxes, used only when mining_type is ‘max_negative’, 3.0 by default.
neg_overlap (float) – The upper bound on overlap for unmatched predictions to count as negatives. Used only when mining_type is ‘max_negative’, 0.5 by default.
loc_loss_weight (float) – Weight for localization loss, 1.0 by default.
conf_loss_weight (float) – Weight for confidence loss, 1.0 by default.
match_type (str) – The type of matching method during training, either ‘bipartite’ or ‘per_prediction’, ‘per_prediction’ by default.
mining_type (str) – The hard example mining type, either ‘hard_example’ or ‘max_negative’, ‘max_negative’ by default. Only ‘max_negative’ is currently supported.
normalize (bool) – Whether to normalize the SSD loss by the total number of output locations, True by default.
sample_size (int) – The maximum number of negative boxes to sample, used only when mining_type is ‘hard_example’. None by default.
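To illustrate how neg_pos_ratio and neg_overlap interact under ‘max_negative’ mining, the sketch below picks negative examples for a single image. It is a hedged approximation, not the layer's code: the function mine_max_negative and its list inputs (matched indices, each prior's best IoU, per-prior confidence loss) are hypothetical, and details such as tie-breaking may differ in the real operator.

```python
def mine_max_negative(matched, max_overlap, conf_loss,
                      neg_pos_ratio=3.0, neg_overlap=0.5):
    """Return sorted indices of priors kept as hard negatives.

    matched:     per-prior matched gt index, -1 when unmatched
    max_overlap: per-prior best IoU against any gt box
    conf_loss:   per-prior confidence loss used to rank negatives
    """
    num_pos = sum(1 for m in matched if m != -1)

    # Only unmatched priors whose best IoU is below neg_overlap qualify.
    candidates = [
        i for i, m in enumerate(matched)
        if m == -1 and max_overlap[i] < neg_overlap
    ]

    # Keep at most neg_pos_ratio negatives per positive.
    num_neg = min(int(num_pos * neg_pos_ratio), len(candidates))

    # The hardest negatives are those with the largest confidence loss.
    candidates.sort(key=lambda i: conf_loss[i], reverse=True)
    return sorted(candidates[:num_neg])
```

For example, with one positive and neg_pos_ratio=3.0, at most three negatives are kept, and an unmatched prior whose best IoU is 0.6 is skipped because it is not below neg_overlap=0.5.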
Returns the weighted sum of the localization loss and confidence loss, with shape [N * Np, 1], where N and Np are the same as in location. The data type is float32 or float64.
- Return type
ValueError – If mining_type is ‘hard_example’; only ‘max_negative’ is currently supported.
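The weighted sum behind the return value can be sketched as below. This is an assumption-laden illustration rather than the layer's implementation: combine_losses is a hypothetical helper, the per-prediction losses are plain Python lists, and the normalizer is passed in by the caller because this sketch does not attempt to reproduce how the layer derives it when normalize is True.

```python
def combine_losses(loc_loss, conf_loss, loc_loss_weight=1.0,
                   conf_loss_weight=1.0, normalizer=None):
    """Combine per-prediction losses into a column of shape [len, 1].

    loc_loss, conf_loss: flat lists with one entry per prediction
    normalizer:          optional positive number to divide by
                         (hypothetical stand-in for normalize=True)
    """
    total = [
        conf_loss_weight * c + loc_loss_weight * l
        for l, c in zip(loc_loss, conf_loss)
    ]
    if normalizer:
        total = [t / normalizer for t in total]
    # One row per prediction, one column, mirroring the [N * Np, 1] shape.
    return [[t] for t in total]
```

Calling combine_losses([0.5, 0.0], [1.0, 2.0], loc_loss_weight=2.0) gives one row per prediction, each holding conf_loss_weight * conf + loc_loss_weight * loc.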
import paddle.fluid as fluid
pb = fluid.data(
    name='prior_box',
    shape=[10, 4],
    dtype='float32')
pbv = fluid.data(
    name='prior_box_var',
    shape=[10, 4],
    dtype='float32')
loc = fluid.data(name='target_box', shape=[10, 4], dtype='float32')
scores = fluid.data(name='scores', shape=[10, 21], dtype='float32')
gt_box = fluid.data(
    name='gt_box', shape=, lod_level=1, dtype='float32')
gt_label = fluid.data(
    name='gt_label', shape=, lod_level=1, dtype='float32')
loss = fluid.layers.ssd_loss(loc, scores, gt_box, gt_label, pb, pbv)