gts, dets, recall, ap, loss

Published by onesixx

gts: ground truth; the number of bboxes of this class in the dataset.
dets: the detected bboxes used in the evaluation.
recall: of everything that should be detected (all GT), the fraction that was correctly detected (TP); a common detection metric.
ap: average precision, a common detection metric.
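To make recall and ap concrete, here is a minimal sketch of how both can be computed for one class from a list of scored detections. It assumes each detection has already been marked as TP or FP at some IoU threshold, and it assumes all-point (area under the precision-recall curve) interpolation; the function name and the toy inputs are made up for illustration and are not taken from mmdet.

```python
def average_precision(scores, is_tp, num_gts):
    """Return (recall, ap) for one class.

    scores  : confidence score of each detection
    is_tp   : True if that detection matched a ground-truth box
    num_gts : number of ground-truth boxes of this class
    """
    if num_gts == 0 or not scores:
        return 0.0, 0.0

    # Visit detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    tp = 0
    recalls, precisions = [], []
    for rank, i in enumerate(order, start=1):
        if is_tp[i]:
            tp += 1
        recalls.append(tp / num_gts)   # TP / all GT
        precisions.append(tp / rank)   # TP / detections seen so far

    # Precision envelope: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])

    # Area under the enveloped precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r

    return recalls[-1], ap


# Toy example: 4 detections, 2 ground-truth boxes.
recall, ap = average_precision(
    scores=[0.9, 0.8, 0.7, 0.6],
    is_tp=[True, False, True, False],
    num_gts=2,
)
print(f"recall={recall:.3f} ap={ap:.3f}")
```

Note that recall only counts how many GT boxes were found, while ap also penalises false positives that rank above true positives.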
2022-02-27 12:44:37,299 - mmdet - INFO - load checkpoint from local path: ../checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
2022-02-27 12:44:37,387 - mmdet - WARNING - The model and loaded state dict do not match exactly

size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([81, 1024])  from checkpoint, the shape in current model is torch.Size([4, 1024]).
size mismatch for roi_head.bbox_head.fc_cls.bias:   copying a param with shape torch.Size([81])        from checkpoint, the shape in current model is torch.Size([4]).
size mismatch for roi_head.bbox_head.fc_reg.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([12, 1024]).
size mismatch for roi_head.bbox_head.fc_reg.bias:   copying a param with shape torch.Size([320])       from checkpoint, the shape in current model is torch.Size([12]).
2022-02-27 12:44:37,389 - mmdet - INFO - Start running, host: oschung_skcc@SKCCBMS20GS01, work_dir: /home/oschung_skcc/git/mmdetection/my/tutorial_exps
2022-02-27 12:44:37,390 - mmdet - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH   ) StepLrUpdaterHook                  
(NORMAL      ) CheckpointHook                     
(LOW         ) EvalHook                           
(VERY_LOW    ) TextLoggerHook                     
 -------------------- 
before_train_epoch:
(VERY_HIGH   ) StepLrUpdaterHook                  
(NORMAL      ) NumClassCheckHook                  
(LOW         ) IterTimerHook
2022-02-27 12:45:27,667 - mmdet - INFO - Epoch [11][10/25]  lr: 2.500e-04, eta: 0:00:06, time: 0.331, data_time: 0.240, memory: 2213, loss_rpn_cls: 0.0023, loss_rpn_bbox: 0.0079, loss_cls: 0.0746, acc: 97.0996, loss_bbox: 0.1330, loss: 0.2179
2022-02-27 12:45:28,485 - mmdet - INFO - Epoch [11][20/25]  lr: 2.500e-04, eta: 0:00:05, time: 0.082, data_time: 0.007, memory: 2213, loss_rpn_cls: 0.0019, loss_rpn_bbox: 0.0088, loss_cls: 0.0685, acc: 97.3535, loss_bbox: 0.1330, loss: 0.2122
2022-02-27 12:45:32,196 - mmdet - INFO - Epoch [12][10/25]  lr: 2.500e-05, eta: 0:00:02, time: 0.318, data_time: 0.233, memory: 2213, loss_rpn_cls: 0.0015, loss_rpn_bbox: 0.0065, loss_cls: 0.0614, acc: 97.6074, loss_bbox: 0.1177, loss: 0.1870
2022-02-27 12:45:33,052 - mmdet - INFO - Epoch [12][20/25]  lr: 2.500e-05, eta: 0:00:00, time: 0.086, data_time: 0.007, memory: 2213, loss_rpn_cls: 0.0020, loss_rpn_bbox: 0.0064, loss_cls: 0.0602, acc: 97.5000, loss_bbox: 0.1009, loss: 0.1695
2022-02-27 12:45:33,625 - mmdet - INFO - Saving checkpoint at 12 epochs
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 25/25, 28.7 task/s, elapsed: 1s, ETA:     0s
---------------iou_thr: 0.5---------------
2022-02-27 12:45:36,014 - mmdet - INFO - 
+------------+-----+------+--------+-------+
| class      | gts | dets | recall | ap    |
+------------+-----+------+--------+-------+
| Car        | 62  | 148  | 0.919  | 0.826 |
| Pedestrian | 13  | 55   | 0.846  | 0.645 |
| Cyclist    | 7   | 61   | 0.571  | 0.094 |
+------------+-----+------+--------+-------+
| mAP        |     |      |        | 0.522 |
+------------+-----+------+--------+-------+
2022-02-27 12:45:36,105 - mmdet - INFO - Epoch(val) [12][25]    AP50: 0.5220, mAP: 0.5218
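The table above can be sanity-checked with a few lines of Python. This is only a sketch: the rows are copied from the table, but the TP counts are not in the log, so they are inferred here by rounding recall * gts, which is an assumption.

```python
# Rows copied from the iou_thr=0.5 table: (class, gts, dets, recall, ap)
ROWS = [
    ("Car", 62, 148, 0.919, 0.826),
    ("Pedestrian", 13, 55, 0.846, 0.645),
    ("Cyclist", 7, 61, 0.571, 0.094),
]


def implied_true_positives(gts, recall):
    # recall = TP / gts, so TP is roughly recall * gts (inferred, not logged).
    return round(recall * gts)


for name, gts, dets, recall, ap in ROWS:
    tp = implied_true_positives(gts, recall)
    print(f"{name}: about {tp} of {gts} gt boxes found among {dets} dets")

# mAP is the plain mean of the per-class ap values.
mean_ap = sum(row[4] for row in ROWS) / len(ROWS)
print(f"mAP from rounded table values: {mean_ap:.3f}")
```

The mean of the three rounded ap values agrees with the mAP row of the table. The log line prints 0.5218 because it averages the unrounded per-class values.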

loss

model = dict(
    type='MaskRCNN',      # The name of detector
    backbone=dict(
        type='ResNet',   # The type of the backbone
        num_stages=4,    # Number of stages of the backbone.
        out_indices=(0, 1, 2, 3),  # The index of output feature maps produced in each stages
        frozen_stages=1, # The weights in the first 1 stage are frozen
        ...),  
    neck=dict(...),  
    rpn_head=dict(
        type='RPNHead',  # The type of RPN head is 'RPNHead', we also support 'GARPNHead', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/dense_heads/rpn_head.py#L12 for more details.
        in_channels=256,  # The input channels of each input feature map, this is consistent with the output channels of neck
        feat_channels=256,  # Feature channels of convolutional layers in the head.
        anchor_generator=dict(...),
        bbox_coder=dict(...), 
        loss_cls=dict(               # Config of loss function for the classification branch
            type='CrossEntropyLoss', 
            use_sigmoid=True,        # RPN usually perform two-class classification, so it usually uses sigmoid function.
            loss_weight=1.0),        # Loss weight of the classification branch.
        loss_bbox=dict(        # Config of loss function for the regression branch.
            type='L1Loss',     
            loss_weight=1.0)), # Loss weight of the regression branch.
    roi_head=dict(  # RoIHead encapsulates the second stage of two-stage/cascade detectors.
        type='StandardRoIHead',  # Type of the RoI head. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/standard_roi_head.py#L10 for implementation.
        bbox_roi_extractor=dict(       # RoI feature extractor for bbox regression.
            type='SingleRoIExtractor', 
        ...),
        bbox_head=dict(  # Config of box head in the RoIHead.
            type='Shared2FCBBoxHead',  
            ...
            loss_cls=dict(               # Config of loss function for the classification branch
                type='CrossEntropyLoss', # Type of loss for classification branch, we also support FocalLoss etc.
                use_sigmoid=False,       # Whether to use sigmoid.
                loss_weight=1.0),        # Loss weight of the classification branch.
            loss_bbox=dict(        # Config of loss function for the regression branch.
                type='L1Loss',     # Type of loss, we also support many IoU Losses and smooth L1-loss, etc.
                loss_weight=1.0)), # Loss weight of the regression branch.
        
        mask_roi_extractor=dict( # RoI feature extractor for mask generation.
            type='SingleRoIExtractor',  
            ...),
        mask_head=dict(  # Mask prediction head
            type='FCNMaskHead', 
            ...
            loss_mask=dict(              # Config of loss function for the mask branch.
                type='CrossEntropyLoss', 
                use_mask=True,           # Whether to only train the mask in the correct class.
                loss_weight=1.0))),      # Loss weight of mask branch.
    train_cfg=dict(...),
    test_cfg=dict(...))

dataset_type = 'CocoDataset'  # Dataset type, this will be used to define the dataset
...

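When training a detector for multiple object classes, you will see two main kinds of loss metrics.

Loss type: description
loss_bbox: a loss that measures how closely the predicted bboxes match the gt (usually a regression loss such as L1 or smooth L1).
loss_cls: the classification loss. Each bbox is classified as an object class or as "background", and this loss measures how accurate that classification is for each predicted bbox (it is usually a cross entropy loss).
loss_mask: the mask loss, reported only by models with a mask head (such as Mask R-CNN).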
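As a rough illustration of the two loss types, here is a minimal pure-Python sketch of an L1 regression loss and a cross entropy classification loss for a single box. The mean reduction and the toy numbers are assumptions for this example; real detectors compute these on encoded box targets over whole batches.

```python
import math


def l1_loss(pred, target):
    # Mean absolute difference between predicted and target box values.
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def cross_entropy(logits, label):
    # Negative log-softmax of the correct class, computed stably.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


# Hypothetical box in [tl_x, tl_y, br_x, br_y] form.
pred_box = [10.0, 12.0, 50.0, 60.0]
gt_box = [12.0, 12.0, 48.0, 64.0]
print("loss_bbox (L1):", l1_loss(pred_box, gt_box))

# Hypothetical logits for 3 object classes + background (last index).
logits = [2.0, 0.5, 0.1, -1.0]
print("loss_cls (CE):", cross_entropy(logits, label=0))
```

A perfect box gives an L1 loss of 0, and a confident correct classification drives the cross entropy toward 0.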

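When the loss is 0

When a detector is trained, the model predicts many bboxes per image.
Most of the predicted bboxes will be background that belongs to no class.
The loss function ties each predicted bbox to the gt bboxes in the image.

If a predicted bbox overlaps a gt box, loss_bbox and loss_cls can both be computed,
and together they measure how well the model predicts that gt box.
Predictions that overlap a gt box only partially, judged against a set threshold, are discarded.

If instead a predicted bbox does not overlap any gt box, there is no loss_bbox for it, and loss_cls is computed against the "background" class.
So when no predicted bbox overlaps a gt box, there is nothing to regress and loss_bbox can come out as 0.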
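The three cases above (overlapping, partially overlapping, not overlapping) can be sketched with a simple IoU-based assignment. The thresholds 0.7 and 0.3 and the function names are hypothetical values picked for this example; actual detectors take them from their training config.

```python
def iou(box_a, box_b):
    """IoU of two boxes in [tl_x, tl_y, br_x, br_y] format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def assign(pred_box, gt_boxes, pos_thr=0.7, neg_thr=0.3):
    """Label one predicted box (thresholds are assumed, not from a config)."""
    best = max((iou(pred_box, gt) for gt in gt_boxes), default=0.0)
    if best >= pos_thr:
        return "positive"    # contributes to loss_bbox and loss_cls
    if best < neg_thr:
        return "background"  # contributes to loss_cls only
    return "ignored"         # partial overlap: discarded


gt_boxes = [(0, 0, 10, 10)]
for pred in [(0, 0, 10, 10), (0, 0, 10, 5), (20, 20, 30, 30)]:
    print(pred, "->", assign(pred, gt_boxes))
```

If every prediction in a batch ends up as "background" or "ignored", no box contributes to the regression term.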

https://stackoverflow.com/questions/70169219/what-is-total-loss-loss-cls-etc

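As training proceeds, a set of metrics is printed at every iteration.

The loss values are the most important of these; a few other fields are logged alongside them.

total_loss: the weighted sum of the individual losses below, computed during the iteration (by default each weight is 1).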

loss_cls: classification loss in the ROI head. Measures the loss for box classification, i.e., how good the model is at labelling a predicted box with the correct class.
loss_box_reg: localisation loss in the ROI head. Measures the loss for box localisation (predicted location vs. true location).
For more details on loss_cls and loss_box_reg, take a look at the Fast R-CNN paper and the code.

loss_rpn_cls: classification loss in the RPN. Measures the "objectness" loss, i.e., how good the RPN (Region Proposal Network) is at labelling the anchor boxes as foreground or background.
loss_rpn_loc: localisation loss in the RPN. Measures the loss for localisation of the predicted regions in the RPN.
For more details on loss_rpn_cls and loss_rpn_loc, take a look at the Faster R-CNN paper and the code.

loss_mask: mask loss in the mask head. Measures how "correct" the predicted binary masks are.
For more details on loss_mask, take a look at the Mask R-CNN paper and the code.

time: time taken by the iteration.
data_time: time taken by the dataloader in that iteration.
lr: the learning rate in that iteration.
max_mem: maximum GPU memory occupied by tensors in bytes.
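The weighted-sum relationship can be checked against one of the training log lines shown earlier in this post. Those lines use the names loss_rpn_bbox, loss_bbox and loss where the list above says loss_rpn_loc, loss_box_reg and total_loss. The sketch below parses a log line copied from above and sums its loss components; the regular expression and helper name are written for this example only, and all weights are assumed to be 1.

```python
import re

# One training log line copied from the output above.
LOG_LINE = (
    "2022-02-27 12:45:28,485 - mmdet - INFO - Epoch [11][20/25]  "
    "lr: 2.500e-04, eta: 0:00:05, time: 0.082, data_time: 0.007, "
    "memory: 2213, loss_rpn_cls: 0.0019, loss_rpn_bbox: 0.0088, "
    "loss_cls: 0.0685, acc: 97.3535, loss_bbox: 0.1330, loss: 0.2122"
)


def parse_losses(line):
    # Collect every "loss...: number" pair; acc, lr, time etc. are skipped.
    pairs = re.findall(r"(loss\w*): ([0-9.eE+-]+)", line)
    return {name: float(value) for name, value in pairs}


losses = parse_losses(LOG_LINE)
components = {k: v for k, v in losses.items() if k != "loss"}
total = sum(components.values())

print("components:", components)
print(f"sum of components: {total:.4f}, logged loss: {losses['loss']:.4f}")
```

Because the logged values are rounded to four decimals, the sum of the components can differ from the logged total in the last digit on some lines.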

s0, s1, s2

These prefixes refer to the stages of Cascade R-CNN.

https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/cascade_roi_head.py#L246

    def forward_train(self, 
        x, img_metas, proposal_list,
        gt_bboxes, gt_labels, gt_bboxes_ignore=None, gt_masks=None):
        """
        Args:
            x (list[Tensor]): list of multi-level img features.
            img_metas (list[dict]): list of image info dict where each dict
                has: 'img_shape', 'scale_factor', 'flip', and may also contain
                'filename', 'ori_shape', 'pad_shape', and 'img_norm_cfg'.
                For details on the values of these keys see
                `mmdet/datasets/pipelines/formatting.py:Collect`.
            proposal_list (list[Tensors]): list of region proposals.
            
            gt_bboxes (list[Tensor]): Ground truth bboxes for each image with
                shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
            gt_labels (list[Tensor]): class indices corresponding to each box
            gt_bboxes_ignore (None | list[Tensor]): specify which bounding
                boxes can be ignored when computing the loss.
            gt_masks (None | Tensor) : true segmentation masks for each box
                used if the architecture supports a segmentation task.
     
        Returns:
            dict[str, Tensor]: a dictionary of loss components
        """
        losses = dict()
        for i in range(self.num_stages):
            self.current_stage = i
            ...

            # bbox head forward and loss
            bbox_results = self._bbox_forward_train(i, x, sampling_results,
                                                    gt_bboxes, gt_labels,
                                                    rcnn_train_cfg)

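            # Prefix every entry with its stage index (s0, s1, ...).
            # lw, the loss weight of this stage, is set in the elided code above;
            # only entries whose name contains 'loss' are scaled by it.
            for name, value in bbox_results['loss_bbox'].items():
                losses[f's{i}.{name}'] = (value*lw if 'loss' in name else value)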
                
            ...
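To see where names like s0.loss_cls come from, here is a standalone sketch of the same naming step outside of mmdet. The helper name, the per-stage values and the stage weights are hypothetical; only the key format and the rule "scale entries whose name contains 'loss'" follow the excerpt above.

```python
def collect_stage_losses(per_stage_losses, stage_loss_weights):
    """Merge per-stage loss dicts into one dict with 's{i}.' prefixes."""
    losses = {}
    for i, (stage_losses, lw) in enumerate(
        zip(per_stage_losses, stage_loss_weights)
    ):
        for name, value in stage_losses.items():
            # Losses are scaled by the stage weight; metrics like acc are not.
            losses[f"s{i}.{name}"] = value * lw if "loss" in name else value
    return losses


# Hypothetical outputs of two stages and hypothetical stage weights.
per_stage = [
    {"loss_cls": 0.5, "acc": 90.0, "loss_bbox": 0.25},
    {"loss_cls": 0.5, "acc": 92.0, "loss_bbox": 0.25},
]
merged = collect_stage_losses(per_stage, stage_loss_weights=[1.0, 0.5])

for key, value in merged.items():
    print(key, value)
```

Each stage therefore contributes its own loss_cls, acc and loss_bbox entries to the training log, distinguished only by the s0, s1, s2 prefix.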

mAP

