mmdet train
DataSet
Data collection
Existing datasets: download
- coco : https://cocodataset.org/#home
- pascal voc
New dataset: build
1. Image collection
- Collect directly
- Extract images from video (using ffmpeg)
ffmpeg -i example.mp4 -vf fps=<rate> Afolder/ex_detect_%4d.jpg
- Collect from Google
- Generate virtual (synthetic) data with Unity
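The ffmpeg step above can be scripted when several videos need to be processed. The sketch below only builds the argument list. The helper name `build_ffmpeg_cmd`, the `ex_detect` prefix default and the sampling rate of 1 frame per second are assumptions for illustration, since the notes do not record the rate that was actually used.

```python
import shlex


def build_ffmpeg_cmd(video, out_dir, fps, prefix="ex_detect"):
    """Return the ffmpeg argument list that samples `fps` frames per second."""
    # %4d is expanded by ffmpeg into a running frame counter.
    pattern = f"{out_dir}/{prefix}_%4d.jpg"
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps}", pattern]


# Hypothetical rate: one frame per second.
cmd = build_ffmpeg_cmd("example.mp4", "Afolder", fps=1)
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, check=True)`, provided ffmpeg is installed and the output folder exists.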
2. Annotation (using CVAT)
- Bbox
- Polygon : segmentation
- Key-point (top-down, bottom-up)
3. Build the dataset (link images to annotations)
- COCO dataset
Export from CVAT in COCO format
(For keypoints, export in COCO format does not work: export in CVAT (XML) format, then convert to a COCO dataset with Datumaro)
- Custom dataset
Register a custom dataset in mmdetection, then use it
Dataset split (train / valid & test)
Prepare the train / valid & test datasets: leave the images where they are and prepare a separate annotation file for each dataset
(for a COCO dataset, for example, the IDs must be renumbered after the annotations are split)
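A minimal sketch of the split described above, assuming a COCO-style dict with `images`, `annotations` and `categories` keys. The function name `split_coco`, the 60/40 ratio and the toy data are illustrative and not part of the original workflow. Images stay in place; only the annotation dict is split, and image and annotation ids are renumbered from 1 in each part.

```python
import copy
import random


def split_coco(coco, val_ratio=0.2, seed=0):
    """Split one COCO dict into (train, val) dicts with renumbered ids."""
    images = list(coco["images"])
    random.Random(seed).shuffle(images)
    n_val = int(len(images) * val_ratio)
    parts = {"val": images[:n_val], "train": images[n_val:]}

    result = {}
    for name, imgs in parts.items():
        # Map old image ids to new consecutive ids starting at 1.
        id_map = {img["id"]: new_id for new_id, img in enumerate(imgs, start=1)}
        new_images = [dict(img, id=id_map[img["id"]]) for img in imgs]
        new_anns = []
        for ann in coco["annotations"]:
            if ann["image_id"] in id_map:
                # Renumber the annotation id and point it at the new image id.
                new_anns.append(
                    dict(ann, id=len(new_anns) + 1, image_id=id_map[ann["image_id"]])
                )
        result[name] = {
            "images": new_images,
            "annotations": new_anns,
            "categories": copy.deepcopy(coco["categories"]),
        }
    return result["train"], result["val"]


# Toy data: 5 images with 2 boxes each (hypothetical file names).
coco = {
    "images": [{"id": 10 + i, "file_name": f"img_{i}.jpg"} for i in range(5)],
    "annotations": [
        {"id": 100 + k, "image_id": 10 + k // 2, "category_id": 1, "bbox": [0, 0, 5, 5]}
        for k in range(10)
    ],
    "categories": [{"id": 1, "name": "TRAY_A_1"}],
}
train, val = split_coco(coco, val_ratio=0.4, seed=0)
print(len(train["images"]), len(val["images"]))
```

Each returned dict can be written out with `json.dump` and referenced as `ann_file` in the config.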
In the config, decide the directory structure of the trn, val, and tst sources
For a custom dataset
Register MyCustomDataset (by modifying load_annotations)
Create the dataset
- datasets = [build_dataset(cfg.data.train)] # in /tools/train.py
Convert to a COCO dataset
Convert to COCO whenever possible
Model
Model definition
The model zoo of open-mmlab/mmdetection
Model-zoo example
ex) faster_rcnn
https://comlini8-8.tistory.com/86
MMDet models are divided into 5 components
Backbone | FCN network for extracting feature maps | (ex. ResNet, MobileNet) |
neck | Component connecting the backbone and the head | (ex. FPN, PAFPN) |
head | Component for a specific task | (ex. bbox prediction, mask prediction) |
roi extractor | Part that extracts RoI features from the feature maps | (ex. RoI Align) |
loss | Component of the head that computes the loss | (ex. FocalLoss, L1Loss, GHMLoss) |
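To show where the five components sit in a config, here is a heavily abridged Faster R-CNN style `model` dict. The type names follow the mmdetection 2.x base config as far as I recall it. Most fields (anchor generator, coders, train/test settings) are left out, so treat it as an orientation sketch and not a usable config.

```python
# Abridged sketch: only the fields that show where each component sits.
model = dict(
    type="FasterRCNN",
    backbone=dict(type="ResNet", depth=50),                    # backbone
    neck=dict(type="FPN", in_channels=[256, 512, 1024, 2048],  # neck
              out_channels=256, num_outs=5),
    rpn_head=dict(type="RPNHead", in_channels=256),            # dense head
    roi_head=dict(
        type="StandardRoIHead",
        bbox_roi_extractor=dict(                               # roi extractor
            type="SingleRoIExtractor",
            roi_layer=dict(type="RoIAlign", output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(                                        # head
            type="Shared2FCBBoxHead",
            num_classes=4,
            loss_cls=dict(type="CrossEntropyLoss"),            # loss
            loss_bbox=dict(type="L1Loss"))),
)

components = {
    "backbone": model["backbone"]["type"],
    "neck": model["neck"]["type"],
    "head": model["roi_head"]["bbox_head"]["type"],
    "roi extractor": model["roi_head"]["bbox_roi_extractor"]["type"],
    "loss": model["roi_head"]["bbox_head"]["loss_bbox"]["type"],
}
for name, type_name in components.items():
    print(f"{name}: {type_name}")
```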
Prepare the checkpoint file
ConvNeXt (CVPR'2022) in mmdet.
Download the pretrained model from the Model Zoo.
wget -O checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
configuration – Config
Get an existing config
Create the initial config with sixxconfigs/makeConfig.py
import os
from mmcv import Config

os.chdir('/home/oschung_skcc/my/git/mmdetection')
config_file = 'configs/convnext/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco.py'
out_config = 'sixxconfigs/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco_sixx.py'

# Load the config with every _base_ file merged in
cfg = Config.fromfile(config_file)
# print(cfg.pretty_text)

# Write the fully expanded config to a single file
try:
    with open(out_config, 'w') as f:
        f.write(cfg.pretty_text)
except FileNotFoundError:
    print("The 'sixxconfigs' directory does not exist")
import argparse
import logging
import os

from mmcv import Config

parser = argparse.ArgumentParser(description="")
parser.add_argument("-i", "--fromconfig", default='', type=str, metavar="PATH",
                    help="path of the config to read")
parser.add_argument("-o", "--toconfig", default='', type=str, metavar="PATH",
                    help="path of the config to write")


def print_info(message: str):
    logging.info(message)


def main():
    # Without this call, INFO-level messages are not shown
    logging.basicConfig(level=logging.INFO)
    print_info("Starting...")
    args = parser.parse_args()
    if not args.fromconfig:
        print("Warning!", "Nothing to set. Please specify a path!")
        print_info("Exiting...")
        return
    config_file = args.fromconfig
    if not args.toconfig:
        # Default: same file name under sixxconfigs/
        out_config = 'sixxconfigs/' + os.path.basename(args.fromconfig)
    else:
        out_config = args.toconfig
    print(out_config)

    os.chdir('/home/oschung_skcc/my/git/mmdetection')
    # config_file = 'configs/convnext/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco.py'
    # out_config = 'sixxconfigs/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco_sixx.py'
    cfg = Config.fromfile(config_file)
    # print(cfg.pretty_text)
    try:
        with open(out_config, 'w') as f:
            f.write(cfg.pretty_text)
    except FileNotFoundError:
        print(f"The directory for '{out_config}' does not exist")
    print_info("... End")


if __name__ == "__main__":
    main()
Create an expanded config for easy editing
$ python sixxtools/makeConfig_sixx.py \
    --fromconfig configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    --toconfig sixxconfigs/faster_rcnn_r50_fpn_1x_coco_001.py
$ python sixxtools/misc/print_config.py \
    configs/convnext/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco.py
Config top-level categories and main settings
Config category | Description |
dataset | Dataset type (CustomDataset, CocoDataset, etc.), train/val/test dataset types, data_root, and the main parameters of the train/val/test datasets (type, ann_file, img_prefix, pipeline, etc.) |
model | Detailed settings for each main part of the object detection model: backbone, neck, dense head, roi extractor, roi head (num_classes=4) |
scheduler | Optimizer type (SGD, Adam, RMSprop, etc.), initial learning rate, the learning-rate policy applied dynamically during training (step, cyclic, cosine annealing, etc.), and the number of training epochs |
runtime | Mostly hook (callback) settings: the epoch interval for writing checkpoint files and log files during training |
Edit the config
Start from an existing config and create the config file to use for training
sixxconfigs/faster_rcnn_r50_fpn_1x_coco_sixx.py, then edit it
- change num_classes=4 (in model)
- check dataset_type = 'CocoDataset'
- change data_root = 'data/msc_pilot2/'
- add classes = ['TRAY_A_1', 'TRAY_A_2', 'TRAY_A_3', 'TRAY_B_1']
- …
gpu
- samples_per_gpu
- workers_per_gpu
data
- train / val / test
– change ann_file
– add classes
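Models with several heads (a cascade model has three bbox heads plus a mask head) repeat `num_classes` in many places, and missing one is an easy mistake. The sketch below walks a nested plain dict and sets every occurrence. `set_num_classes` is a hypothetical helper, not an mmdetection function, and the toy `cfg` stands in for a loaded config.

```python
def set_num_classes(node, num_classes):
    """Recursively set every 'num_classes' key; return how many were changed."""
    changed = 0
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "num_classes":
                node[key] = num_classes
                changed += 1
            else:
                changed += set_num_classes(value, num_classes)
    elif isinstance(node, (list, tuple)):
        for item in node:
            changed += set_num_classes(item, num_classes)
    return changed


# Toy stand-in for a cascade-style config (COCO default of 80 classes).
cfg = {
    "model": {
        "roi_head": {
            "bbox_head": [{"num_classes": 80}, {"num_classes": 80}, {"num_classes": 80}],
            "mask_head": {"num_classes": 80},
        }
    },
    "data": {"train": {}, "val": {}, "test": {}},
}
classes = ("TRAY_A_1", "TRAY_A_2", "TRAY_A_3", "TRAY_B_1")

n_changed = set_num_classes(cfg["model"], len(classes))
for split in ("train", "val", "test"):
    cfg["data"][split]["classes"] = classes
print(n_changed)
```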
Config edit example
model = dict(
    roi_head=dict(
        bbox_head=dict(
            num_classes=4)))

dataset_type = 'CocoDataset'
# data_root = 'data/coco/'
data_root = 'data/msc_pilot2/'
classes = ('Car', 'Truck', 'Pedestrian', 'Cyclist')
data = dict(
    train=dict(
        type='CocoDataset',
        ann_file='data/kitti_tiny/anno_cc.json',
        # img_prefix='data/kitti_tiny/training/image_2',
        classes=classes),
    val=dict(
        type='CocoDataset',
        ann_file='data/kitti_tiny/anno_cc_val.json',
        # img_prefix='data/kitti_tiny/training/image_2',
        classes=classes),
    test=dict(
        type='CocoDataset',
        ann_file='data/kitti_tiny/anno_cc_val.json',
        # img_prefix='data/kitti_tiny/training/image_2',
        classes=classes))
W&B setup example
# log_config = dict(interval=1, hooks=[dict(type='TextLoggerHook')])
log_config = dict(
    interval=10,  # 500
    hooks=[
        dict(type='TextLoggerHook', interval=500),
        dict(type='WandbLoggerHook', interval=1000,
             init_kwargs=dict(
                 project='faster_rcnn_r50_fpn_1x',
                 # entity='ENTITY name',
                 name='sixx_tray'))
    ])
# workflow = [('train', 1)]
# To run both train and validation in every epoch:
workflow = [('train', 1), ('val', 1)]
transfer learning
load_from = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
# While training for 200 epochs, write a .pth file every 50 and a log entry every 100
# Evaluation runs every 200.
# Note: 'mIoU' is an mmsegmentation metric; CocoDataset in mmdet expects 'bbox' / 'segm'.
evaluation = dict(interval=200, metric='mIoU')  # 'mAP'
runner = dict(type='EpochBasedRunner', max_epochs=400)
checkpoint_config = dict(interval=50)
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
# Learning-rate parameters
optimizer = dict(type='SGD', lr=0.02/8, momentum=0.9, weight_decay=0.0001)
lr_config = dict(
    policy='step',
    warmup=None,
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8, 11])
# Resume training from the most recent checkpoint
resume_from = 'work_dirs/sixx_faster_rcnn_r50_fpn_1x_coco/latest.pth'
Example 2>
model = dict(
    roi_head=dict(
        bbox_head=[
            dict(
                num_classes=4),
            dict(
                num_classes=4),
            dict(
                num_classes=4),
        ],
        # ...
        mask_head=dict(
            num_classes=4)))

dataset_type = 'CocoDataset'
data_root = 'data/kitti_tiny/'
classes = ('Car', 'Truck', 'Pedestrian', 'Cyclist')
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type='CocoDataset',
        ann_file='data/kitti_tiny/anno_cc.json',
        img_prefix='data/kitti_tiny/training/image_2',
        classes=classes),
    val=dict(
        type='CocoDataset',
        ann_file='data/kitti_tiny/anno_cc_val.json',
        img_prefix='data/kitti_tiny/training/image_2',
        classes=classes),
    test=dict(
        type='CocoDataset',
        ann_file='data/kitti_tiny/anno_cc_val.json',
        img_prefix='data/kitti_tiny/training/image_2',
        classes=classes))

evaluation = dict(metric=['bbox', 'segm'], save_best='auto', interval=50)
runner = dict(type='EpochBasedRunner', max_epochs=10000)
checkpoint_config = dict(interval=500)
# workflow = [('train', 1)]
# To run both train and validation in every epoch:
workflow = [('train', 1), ('val', 1)]
# load_from = 'checkpoints/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco_20220510_201004-3d24f5a4.pth'
load_from = 'https://download.openmmlab.com/mmclassification/v0/convnext/downstream/convnext-small_3rdparty_32xb128-noema_in1k_20220301-303e75e3.pth'
runner: max_epochs is the desired number of epochs (46)
Batch_size
step: 1473 / 46 ≈ 32 … = iterations
Evaluation: once every 50
checkpoint: once every 1
log_config: every 1
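The arithmetic in the note above can be checked quickly. Reading it as 1473 total steps over 46 epochs gives about 32 iterations per epoch. The second part estimates iterations per epoch from the dataset size. The helper `iters_per_epoch` and the figure of 256 images on 2 GPUs are hypothetical, chosen only to reproduce the same 32.

```python
import math


def iters_per_epoch(num_images, samples_per_gpu, num_gpus=1):
    """Approximate iterations per epoch: images / effective batch size, rounded up."""
    return math.ceil(num_images / (samples_per_gpu * num_gpus))


total_steps, epochs = 1473, 46
print(round(total_steps / epochs))  # iterations per epoch seen in the log

# Hypothetical dataset: 256 images, samples_per_gpu=4, 2 GPUs.
print(iters_per_epoch(num_images=256, samples_per_gpu=4, num_gpus=2))
```

Since `log_config`, `checkpoint_config` and `evaluation` intervals are counted in iterations or epochs, knowing this number helps pick intervals that actually trigger during a run.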
CUDA_VISIBLE_DEVICES=2,3 PORT=29506 sixxtools/dist_train.sh "sixxconfigs/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco_sixx.py" 2
work_dir = './work_dirs/cascade_mask_rcnn_convnext-s_p4_w7_fpn_giou_4conv1f_fp16_ms-crop_3x_coco_sixx'
auto_resume = False
gpu_ids = range(0, 2)
model
- num_classes
dataset_type
data_root
classes
data
- samples_per_gpu
- workers_per_gpu
- train / val / test
– ann_file
– classes
load_from
evaluation
- save_best='auto', interval=50
checkpoint_config
optimizer lr settings
lr_config = dict(
    policy='step',        # which scheduler to use
    warmup='linear',      # whether (and how) to warm up
    warmup_iters=500,     # how many warmup iterations
    warmup_ratio=0.001,
    step=[8, 11])         # at which epochs the lr steps down
runner (_1x means 12 epochs, _2x means 24 epochs, _20e means 20 epochs)
- max_epochs=10000
auto_resume
gpu_ids
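To see what `policy='step'` with `step=[8, 11]` and linear warmup does to the learning rate, the sketch below recomputes the values by hand. It mirrors how I understand the mmcv 1.x step and warmup hooks to work. The decay factor `gamma=0.1` is assumed to be the default, and both function names are illustrative, not mmcv APIs.

```python
def step_lr(base_lr, epoch, steps, gamma=0.1):
    """Learning rate for a 0-based epoch under a step policy."""
    # Each milestone that has been reached multiplies the lr by gamma.
    passed = sum(1 for s in steps if epoch >= s)
    return base_lr * gamma ** passed


def linear_warmup_lr(regular_lr, cur_iter, warmup_iters, warmup_ratio):
    """Linear warmup from regular_lr * warmup_ratio up to regular_lr."""
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return regular_lr * (1 - k)


base_lr = 0.02 / 8  # value used in the optimizer setting of these notes
for epoch in (0, 7, 8, 10, 11):
    print(epoch, step_lr(base_lr, epoch, steps=[8, 11]))

for it in (0, 250, 500):
    print(it, linear_warmup_lr(base_lr, it, warmup_iters=500, warmup_ratio=0.001))
```

With a 12-epoch (`_1x`) schedule this gives the full rate for epochs 0 to 7, a tenth of it for epochs 8 to 10, and a hundredth for the last epoch.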
https://onesixx.com/mmdet-log/
GPU usage monitoring (nvidia-smi, nvitop, gpustat)
watch -d -n 0.5 nvidia-smi
$ conda update -n base -c defaults conda
# https://anaconda.org/conda-forge/nvitop
# https://github.com/XuehaiPan/nvitop
$ conda install -c conda-forge nvitop
$ nvitop
# https://anaconda.org/conda-forge/gpustat
$ conda install -c conda-forge gpustat
$ gpustat
Run training
$ python sixxtools/train.py "sixxconfigs/faster_rcnn_r50_fpn_1x_coco_sixx.py"
~/my/git/mmdetection/tools ==> sixxtools
$ python sixxtools/train.py \
    "sixxconfigs/cascade_rcnn_r50_fpn_1x_coco.py" \
    --work-dir "work_dirs/ttt"
A working folder is created under work_dirs
Check the modified cfg
epoch_69.pth (a PyTorch model) is created.
Editing the config for model experiments
https://pebpung.github.io/wandb/2021/10/06/WandB-1.html
Editing the config directly, by intuition, is inefficient.
Using multiple GPUs
$ CUDA_VISIBLE_DEVICES=2,3 PORT=29506 sixxtools/dist_train.sh work_dirs/sixx_faster_rcnn_r50_fpn_1x_coco.py 2
$ CUDA_VISIBLE_DEVICES=2,3,4,5,6,7 PORT=29506 sixxtools/dist_train.sh sixxconfigs/faster_rcnn_r50_fpn_1x_coco_sixx.py 6
Restrict the GPUs to use with CUDA_VISIBLE_DEVICES,
assign a separate port,
then run
CUDA_VISIBLE_DEVICES=2,3 python train.py
Reference
#!/usr/bin/env bash

CONFIG=$1
GPUS=$2
NNODES=${NNODES:-1}
NODE_RANK=${NODE_RANK:-0}
PORT=${PORT:-29501}
MASTER_ADDR=${MASTER_ADDR:-"127.0.0.1"}

PYTHONPATH="$(dirname $0)/..":$PYTHONPATH \
python -m torch.distributed.launch \
    --nnodes=$NNODES \
    --node_rank=$NODE_RANK \
    --master_addr=$MASTER_ADDR \
    --nproc_per_node=$GPUS \
    --master_port=$PORT \
    $(dirname "$0")/train.py \
    $CONFIG \
    --seed 0 \
    --launcher pytorch ${@:3}
gpu_ids = range(1,3)
$ watch -d -n0.5 nvidia-smi
~/my/git/mmdetection$ bash sixx/dist_train.sh work_dirs/sixx_faster_rcnn_r50_fpn_1x_coco/sixx_faster_rcnn_r50_fpn_1x_coco.py 3
Training again (work_dirs)
sixx/dist_train.sh: line 2: $'\r': command not found
If this error occurs, convert every line ending from CRLF (carriage return + newline, \r\n) to LF (newline, \n).
sed -i -e 's/\r$//' ./sixx/dist_train.sh
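Where `sed` is not available (for example when preparing the script on Windows before copying it to the server), the same fix can be sketched in Python. `crlf_to_lf` and `fix_file` are illustrative names. Unlike the `sed` expression, this version only rewrites `\r\n` pairs, so a lone `\r` at the very end of a file without a final newline would be left alone.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace Windows line endings (CRLF) with Unix line endings (LF)."""
    return data.replace(b"\r\n", b"\n")


def fix_file(path) -> bool:
    """Rewrite `path` in place if it contains CRLF; return True if it changed."""
    with open(path, "rb") as f:
        original = f.read()
    fixed = crlf_to_lf(original)
    if fixed != original:
        with open(path, "wb") as f:
            f.write(fixed)
    return fixed != original


sample = b"#!/usr/bin/env bash\r\nCONFIG=$1\r\n"
print(crlf_to_lf(sample))
```

Calling `fix_file("./sixx/dist_train.sh")` would then have the same effect as the `sed` command for a script saved with CRLF endings.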
https://github.com/open-mmlab/mmdetection/issues/334
$ bash tools/dist_train.sh configs/skeleton/posec3d/slowonly_r50_u48_240e_ntu120_xsub_keypoints.py 1 --work-dir work_dirs/slowonly_r50_u48_240e_ntu120_xsub_keypoints --validate --test-best --seed 0 --deterministic
https://github.com/facebookresearch/maskrcnn-benchmark
export NGPUS=2
CUDA_VISIBLE_DEVICES=2,3 python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train.py configs/faster_rcnn_r101_fpn_1x.py --gpus 2
https://artiiicy.tistory.com/61
Restricting the GPUs visible to CUDA with "CUDA_VISIBLE_DEVICES"
CUDA always starts from GPU 0 (torch.cuda.current_device()). With CUDA_VISIBLE_DEVICES=2,3, CUDA can see only GPUs 2 and 3, so allocating on GPU 0 means using physical GPU 2.
For multi-GPU use, however, the model must be wrapped in nn.DataParallel().
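The index remapping described above can be made explicit with a small helper. `visible_to_physical` is a hypothetical function for illustration. It assumes the variable holds plain numeric ids and that devices are ordered by PCI bus id, so that the numbers match what `nvidia-smi` shows.

```python
def visible_to_physical(cuda_visible_devices: str) -> dict:
    """Map the logical index seen by the process to the physical GPU id."""
    ids = [int(x) for x in cuda_visible_devices.split(",") if x.strip()]
    return dict(enumerate(ids))


mapping = visible_to_physical("2,3")
print(mapping)
# Inside the process, 'cuda:0' refers to physical GPU mapping[0], and so on.
```

This is also why `gpu_ids = range(0, 2)` in the config is correct when `CUDA_VISIBLE_DEVICES=2,3` is set: the process only ever sees logical devices 0 and 1.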
1-2) Running inside a Python script, such as a Jupyter notebook "~.ipynb" file
When running inside a Python script such as "~.ipynb", the environment can be set with os.environ[] as follows.
import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"  # Arrange GPU devices starting from 0
os.environ["CUDA_VISIBLE_DEVICES"] = "2,3"  # Set the GPUs 2 and 3 to use
$ python sixx/train.py work_dirs/sixx_faster_rcnn_r50_fpn_1x_coco.py
$ python sixxtools/train.py sixxconfigs/cascade_rcnn_r50_fpn_1x_coco.py