gts, dets, recall, ap LOSS
gts | number of ground-truth bboxes of this class in the dataset | ground truth |
dets | number of detected bboxes of this class used in the evaluation | detected bboxes |
recall | fraction of all ground-truth boxes that were correctly detected, TP / (all GT); a common detection metric | recall |
ap | area under the precision-recall curve for this class; a common detection metric | average precision |
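To make the recall and ap columns concrete, here is a minimal sketch of how both can be derived from a ranked list of detections. It assumes each detection has already been marked as TP or FP at a fixed IoU threshold, and it uses the all-point (area under the curve) form of AP. The function name `recall_and_ap` and its arguments are hypothetical, not part of mmdet.

```python
def recall_and_ap(tp_flags, num_gts):
    """Return (recall, ap) for one class at one IoU threshold.

    tp_flags: one bool per detection, sorted by descending confidence;
              True means the detection matched a ground-truth box (TP).
    num_gts:  number of ground-truth boxes of this class (the "gts" column).
    """
    if num_gts == 0 or not tp_flags:
        return 0.0, 0.0

    # Cumulative recall and precision after each detection in the ranking.
    tp = 0
    recalls, precisions = [], []
    for rank, is_tp in enumerate(tp_flags, start=1):
        tp += int(is_tp)
        recalls.append(tp / num_gts)
        precisions.append(tp / rank)

    # Precision envelope: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r

    return recalls[-1], ap


# Four detections for a class with three ground-truth boxes.
recall, ap = recall_and_ap([True, False, True, False], num_gts=3)
print(f"recall={recall:.3f}, ap={ap:.3f}")
```

The final recall only depends on how many ground-truth boxes were found, while ap also rewards ranking the true positives ahead of the false positives.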
    2022-02-27 12:44:37,299 - mmdet - INFO - load checkpoint from local path: ../checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
    2022-02-27 12:44:37,387 - mmdet - WARNING - The model and loaded state dict do not match exactly

    size mismatch for roi_head.bbox_head.fc_cls.weight: copying a param with shape torch.Size([81, 1024]) from checkpoint, the shape in current model is torch.Size([4, 1024]).
    size mismatch for roi_head.bbox_head.fc_cls.bias: copying a param with shape torch.Size([81]) from checkpoint, the shape in current model is torch.Size([4]).
    size mismatch for roi_head.bbox_head.fc_reg.weight: copying a param with shape torch.Size([320, 1024]) from checkpoint, the shape in current model is torch.Size([12, 1024]).
    size mismatch for roi_head.bbox_head.fc_reg.bias: copying a param with shape torch.Size([320]) from checkpoint, the shape in current model is torch.Size([12]).
    2022-02-27 12:44:37,389 - mmdet - INFO - Start running, host: oschung_skcc@SKCCBMS20GS01, work_dir: /home/oschung_skcc/git/mmdetection/my/tutorial_exps
    2022-02-27 12:44:37,390 - mmdet - INFO - Hooks will be executed in the following order:
    before_run:
    (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) CheckpointHook
    (LOW ) EvalHook
    (VERY_LOW ) TextLoggerHook
    --------------------
    before_train_epoch:
    (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) NumClassCheckHook
    (LOW ) IterTimerHook
    2022-02-27 12:45:27,667 - mmdet - INFO - Epoch [11][10/25] lr: 2.500e-04, eta: 0:00:06, time: 0.331, data_time: 0.240, memory: 2213, loss_rpn_cls: 0.0023, loss_rpn_bbox: 0.0079, loss_cls: 0.0746, acc: 97.0996, loss_bbox: 0.1330, loss: 0.2179
    2022-02-27 12:45:28,485 - mmdet - INFO - Epoch [11][20/25] lr: 2.500e-04, eta: 0:00:05, time: 0.082, data_time: 0.007, memory: 2213, loss_rpn_cls: 0.0019, loss_rpn_bbox: 0.0088, loss_cls: 0.0685, acc: 97.3535, loss_bbox: 0.1330, loss: 0.2122
    2022-02-27 12:45:32,196 - mmdet - INFO - Epoch [12][10/25] lr: 2.500e-05, eta: 0:00:02, time: 0.318, data_time: 0.233, memory: 2213, loss_rpn_cls: 0.0015, loss_rpn_bbox: 0.0065, loss_cls: 0.0614, acc: 97.6074, loss_bbox: 0.1177, loss: 0.1870
    2022-02-27 12:45:33,052 - mmdet - INFO - Epoch [12][20/25] lr: 2.500e-05, eta: 0:00:00, time: 0.086, data_time: 0.007, memory: 2213, loss_rpn_cls: 0.0020, loss_rpn_bbox: 0.0064, loss_cls: 0.0602, acc: 97.5000, loss_bbox: 0.1009, loss: 0.1695
    2022-02-27 12:45:33,625 - mmdet - INFO - Saving checkpoint at 12 epochs
    [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 25/25, 28.7 task/s, elapsed: 1s, ETA: 0s
    ---------------iou_thr: 0.5---------------
    2022-02-27 12:45:36,014 - mmdet - INFO -
    +------------+-----+------+--------+-------+
    | class      | gts | dets | recall | ap    |
    +------------+-----+------+--------+-------+
    | Car        | 62  | 148  | 0.919  | 0.826 |
    | Pedestrian | 13  | 55   | 0.846  | 0.645 |
    | Cyclist    | 7   | 61   | 0.571  | 0.094 |
    +------------+-----+------+--------+-------+
    | mAP        |     |      |        | 0.522 |
    +------------+-----+------+--------+-------+
    2022-02-27 12:45:36,105 - mmdet - INFO - Epoch(val) [12][25] AP50: 0.5220, mAP: 0.5218
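As a sanity check on the evaluation table in the log, the sketch below recomputes two things from the printed numbers: the approximate number of true positives per class (gts × recall) and the mAP as the plain mean of the per-class ap values. The helper names are hypothetical, and the inputs are the rounded values from the table, so results are only accurate to about three decimals.

```python
# (gts, dets, recall, ap) per class, copied from the evaluation table in the log.
results = {
    "Car": (62, 148, 0.919, 0.826),
    "Pedestrian": (13, 55, 0.846, 0.645),
    "Cyclist": (7, 61, 0.571, 0.094),
}


def estimated_true_positives(gts, recall):
    """recall = TP / gts, so TP is roughly gts * recall (rounded to an integer)."""
    return round(gts * recall)


def mean_ap(table):
    """mAP is the unweighted mean of the per-class AP values."""
    aps = [ap for (_, _, _, ap) in table.values()]
    return sum(aps) / len(aps)


for name, (gts, dets, recall, ap) in results.items():
    tp = estimated_true_positives(gts, recall)
    print(f"{name}: about {tp} of {gts} gts were found by {dets} detections")

print(f"mAP = {mean_ap(results):.3f}")
```

The mean of the three rounded ap values reproduces the 0.522 in the mAP row. The 0.5218 in the last log line is presumably the same quantity computed from the unrounded per-class values.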
loss
    model = dict(
        type='MaskRCNN',  # The name of detector
        backbone=dict(
            type='ResNet',  # The type of the backbone
            num_stages=4,  # Number of stages of the backbone.
            out_indices=(0, 1, 2, 3),  # The index of output feature maps produced in each stage
            frozen_stages=1,  # The weights in the first 1 stage are frozen
            ...),
        neck=dict(...),
        rpn_head=dict(
            type='RPNHead',  # The type of RPN head is 'RPNHead', we also support 'GARPNHead', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/dense_heads/rpn_head.py#L12 for more details.
            in_channels=256,  # The input channels of each input feature map, this is consistent with the output channels of neck
            feat_channels=256,  # Feature channels of convolutional layers in the head.
            anchor_generator=dict(...),
            bbox_coder=dict(...),
            loss_cls=dict(  # Config of loss function for the classification branch
                type='CrossEntropyLoss',
                use_sigmoid=True,  # RPN usually performs two-class classification, so it usually uses the sigmoid function.
                loss_weight=1.0),  # Loss weight of the classification branch.
            loss_bbox=dict(  # Config of loss function for the regression branch.
                type='L1Loss',
                loss_weight=1.0)),  # Loss weight of the regression branch.
        roi_head=dict(  # RoIHead encapsulates the second stage of two-stage/cascade detectors.
            type='StandardRoIHead',  # Type of the RoI head. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/standard_roi_head.py#L10 for implementation.
            bbox_roi_extractor=dict(  # RoI feature extractor for bbox regression.
                type='SingleRoIExtractor',
                ...),
            bbox_head=dict(  # Config of box head in the RoIHead.
                type='Shared2FCBBoxHead',
                ...
                loss_cls=dict(  # Config of loss function for the classification branch
                    type='CrossEntropyLoss',  # Type of loss for classification branch, we also support FocalLoss etc.
                    use_sigmoid=False,  # Whether to use sigmoid.
                    loss_weight=1.0),  # Loss weight of the classification branch.
                loss_bbox=dict(  # Config of loss function for the regression branch.
                    type='L1Loss',  # Type of loss, we also support many IoU Losses and smooth L1-loss, etc.
                    loss_weight=1.0)),  # Loss weight of the regression branch.
            mask_roi_extractor=dict(  # RoI feature extractor for mask generation.
                type='SingleRoIExtractor',
                ...),
            mask_head=dict(  # Mask prediction head
                type='FCNMaskHead',
                ...
                loss_mask=dict(  # Config of loss function for the mask branch.
                    type='CrossEntropyLoss',
                    use_mask=True,  # Whether to only train the mask in the correct class.
                    loss_weight=1.0))),  # Loss weight of mask branch.
        train_cfg=dict(...),
        test_cfg=dict(...))
    dataset_type = 'CocoDataset'  # Dataset type, this will be used to define the dataset
    ...
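When training a detector for multiple object classes, you will mainly see two kinds of loss metrics, plus loss_mask when the model has a mask head.
Loss type | Description |
loss_bbox | Measures how closely the predicted bboxes match the gt (usually a regression loss such as L1 or smooth L1) |
loss_cls | Classification loss. Each bbox is classified as an object class or “background”; this loss measures the classification accuracy of each predicted bbox (usually a cross-entropy loss) |
loss_mask | Measures how accurate the predicted binary masks are (only for models with a mask head, e.g. Mask R-CNN) |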
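For reference, the loss types named in the table can be written out in a few lines of plain Python. This is a simplified sketch for a single box and a single classification, not mmdet's implementation: the real losses operate on batched tensors and encoded box deltas, and the `beta` default of 1.0 is an assumption made for illustration.

```python
import math


def l1_loss(pred, target):
    """Mean absolute difference between predicted and target box coordinates."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def smooth_l1_loss(pred, target, beta=1.0):
    """Quadratic for small errors (< beta), linear for large ones."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        if diff < beta:
            total += 0.5 * diff * diff / beta
        else:
            total += diff - 0.5 * beta
    return total / len(pred)


def cross_entropy_loss(logits, label):
    """-log(softmax(logits)[label]), computed in a numerically stable way."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Regression: one coordinate is off by 0.5, the other by 2.0.
print(l1_loss([0.0, 2.0], [0.5, 0.0]))
print(smooth_l1_loss([0.0, 2.0], [0.5, 0.0]))
# Classification: 3 object classes + background, true label is index 0.
print(cross_entropy_loss([2.0, 0.5, 0.1, -1.0], label=0))
```

Smooth L1 penalises the small error less than L1 does, which makes it less sensitive to noise near the target while still growing linearly for outliers.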
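When the loss is 0
When training a detector, the model predicts multiple bboxes per image.
Most of the predicted bboxes will be background, belonging to no class.
The loss is computed by associating each predicted bbox with the gt bboxes in the image.
If a predicted bbox overlaps a gt box, both loss_bbox and loss_cls can be computed for it,
and together they measure how well the model predicts the gt boxes.
Predictions that overlap a gt only partially, judged against a threshold, are discarded.
If a predicted bbox does not overlap any gt, it has no loss_bbox, and its loss_cls is computed against the “background” class.
Consequently, loss_bbox can be exactly 0 when no predicted bbox in the batch is matched to a gt.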
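The matching rule described above can be sketched as an IoU-based assignment. The thresholds `pos_iou_thr=0.5` and `neg_iou_thr=0.3` and the function names are assumptions chosen for illustration; actual values depend on the detector's train_cfg.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign(pred_boxes, gt_boxes, pos_iou_thr=0.5, neg_iou_thr=0.3):
    """Label each predicted box.

    Returns a list with one entry per prediction:
      gt index -> positive: contributes to loss_bbox and loss_cls
      -1       -> background: contributes to loss_cls only
      None     -> partial overlap: discarded, contributes to no loss
    """
    assignments = []
    for pred in pred_boxes:
        ious = [iou(pred, gt) for gt in gt_boxes]
        best = max(ious, default=0.0)
        if best >= pos_iou_thr:
            assignments.append(ious.index(best))
        elif best < neg_iou_thr:
            assignments.append(-1)
        else:
            assignments.append(None)
    return assignments


gts = [[0, 0, 10, 10]]
preds = [[0, 0, 10, 10], [0, 0, 10, 4], [20, 20, 30, 30]]
print(assign(preds, gts))
```

If every entry comes back as -1 or None, there is no positive sample to regress, which is the situation in which loss_bbox would be 0.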
https://stackoverflow.com/questions/70169219/what-is-total-loss-loss-cls-etc
As training proceeds, several metrics are logged at every iteration:
the loss values, which matter most, and a few others.
total_loss
: weighted sum of the individual losses listed below, computed for the iteration (each weight defaults to 1)
For details on loss_cls and loss_box_reg, see the Fast R-CNN paper and the code.
loss_cls | Classification loss in the ROI head | Measures the loss for box classification, i.e., how good the model is at labelling a predicted box with the correct class. |
loss_box_reg | Localisation loss in the ROI head | Measures the loss for box localisation (predicted location vs. true location). |
For details on loss_rpn_cls and loss_rpn_loc, see the Faster R-CNN paper and the code.
loss_rpn_cls | Classification loss in the RPN | Measures the “objectness” loss, i.e., how good the RPN (Region Proposal Network) is at labelling the anchor boxes as foreground or background. |
loss_rpn_loc | Localisation loss in the RPN | Measures the loss for localisation of the predicted regions in the RPN. |
For details on loss_mask, see the Mask R-CNN paper and the code.
loss_mask | Mask loss in the mask head | Measures how “correct” the predicted binary masks are. |
time | Time taken by the iteration. |
data_time | Time taken by the dataloader in that iteration. |
lr | The learning rate in that iteration. |
max_mem | Maximum GPU memory occupied by tensors in bytes. |
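The relationship between the total and its components can be checked against the first Epoch [11] entry of the mmdet log at the top of these notes (mmdet names the entries loss_rpn_bbox and loss instead of loss_rpn_loc and total_loss). The sketch below assumes the total is the sum of every logged entry whose key contains "loss", so acc is excluded. The `weights` argument is a hypothetical addition to show the weighted form, with every weight defaulting to 1.

```python
# Values copied from the log line "Epoch [11][10/25]" (its logged total is 0.2179).
log_entry = {
    "loss_rpn_cls": 0.0023,
    "loss_rpn_bbox": 0.0079,
    "loss_cls": 0.0746,
    "acc": 97.0996,
    "loss_bbox": 0.1330,
}


def total_loss(log_vars, weights=None):
    """Weighted sum of all entries whose key contains 'loss'.

    weights: optional dict of per-loss weights; missing keys default to 1.0.
    """
    weights = weights or {}
    return sum(
        weights.get(key, 1.0) * value
        for key, value in log_vars.items()
        if "loss" in key
    )


print(f"{total_loss(log_entry):.4f}")
```

The result matches the logged total up to the rounding of the four-decimal components.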
s0, s1, s2
These prefixes denote the stages of Cascade R-CNN.
    def forward_train(self,
                      x,
                      img_metas,
                      proposal_list,
                      gt_bboxes,
                      gt_labels,
                      gt_bboxes_ignore=None,
                      gt_masks=None):
        """
        Args:
            x (list[Tensor]): list of multi-level img features.
            img_metas (list[dict]): list of image info dict where each dict
                has: 'img_shape', 'scale_factor', 'flip', and may also contain
                'filename', 'ori_shape', 'pad_shape', and 'img_norm_cfg'.
                For details on the values of these keys see
                `mmdet/datasets/pipelines/formatting.py:Collect`.
            proposals (list[Tensors]): list of region proposals.
            gt_bboxes (list[Tensor]): Ground truth bboxes for each image with
                shape (num_gts, 4) in [tl_x, tl_y, br_x, br_y] format.
            gt_labels (list[Tensor]): class indices corresponding to each box
            gt_bboxes_ignore (None | list[Tensor]): specify which bounding
                boxes can be ignored when computing the loss.
            gt_masks (None | Tensor) : true segmentation masks for each box
                used if the architecture supports a segmentation task.

        Returns:
            dict[str, Tensor]: a dictionary of loss components
        """
        losses = dict()
        for i in range(self.num_stages):
            self.current_stage = i
            ...
            # bbox head forward and loss
            bbox_results = self._bbox_forward_train(i, x, sampling_results,
                                                    gt_bboxes, gt_labels,
                                                    rcnn_train_cfg)

            for name, value in bbox_results['loss_bbox'].items():
                losses[f's{i}.{name}'] = (
                    value * lw if 'loss' in name else value)
            ...
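The key-naming step at the end of the excerpt can be reproduced in isolation. The sketch below assumes `lw` is a per-stage loss weight taken from a list such as `[1.0, 0.5, 0.25]`. The function name `merge_stage_losses` and all numeric values are made up for illustration and do not come from a real training run.

```python
def merge_stage_losses(per_stage_results, stage_loss_weights):
    """Prefix each stage's entries with s{i} and scale only the loss entries.

    per_stage_results:  list of dicts, one per cascade stage.
    stage_loss_weights: one weight per stage (plays the role of lw in the excerpt).
    """
    losses = {}
    for i, (stage_result, lw) in enumerate(zip(per_stage_results, stage_loss_weights)):
        for name, value in stage_result.items():
            # Entries such as 'acc' are metrics, not losses, so they are not scaled.
            losses[f"s{i}.{name}"] = value * lw if "loss" in name else value
    return losses


# Made-up per-stage values for a three-stage cascade.
per_stage = [
    {"loss_cls": 0.20, "acc": 95.0, "loss_bbox": 0.10},
    {"loss_cls": 0.20, "acc": 96.0, "loss_bbox": 0.10},
    {"loss_cls": 0.20, "acc": 97.0, "loss_bbox": 0.10},
]
merged = merge_stage_losses(per_stage, [1.0, 0.5, 0.25])
for key, value in merged.items():
    print(key, value)
```

This is why a Cascade R-CNN log shows entries such as s0.loss_cls, s1.loss_cls, and s2.loss_cls instead of a single loss_cls.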