MMDetection2 Environment Setup and Model Building

Author: 晓智AI | Published 2021-04-23 22:03

    Research background

    Detection tasks

    Project code

    mmdetection
    Docs for reference

    An introduction to four major open-source object detection toolkits (Detectron2/MMDetection/darknet/SimpleDet): Link

    Hardware Configuration

    $ uname -a
    Linux localhost.localdomain 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
    
    $ cat /proc/cpuinfo
    processor       : 1
    vendor_id       : GenuineIntel
    cpu family      : 6
    model           : 85
    model name      : Intel(R) Xeon(R) Gold 5120 CPU @ 2.20GHz
    stepping        : 4
    microcode       : 0x2000069
    cpu MHz         : 999.963
    cache size      : 19712 KB
    physical id     : 0
    siblings        : 28
    core id         : 1
    cpu cores       : 14
    apicid          : 2
    initial apicid  : 2
    fpu             : yes
    fpu_exception   : yes
    cpuid level     : 22
    wp              : yes
    flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts hwp hwp_act_window hwp_epp hwp_pkg_req pku ospke spec_ctrl intel_stibp flush_l1d
    bogomips        : 4400.00
    clflush size    : 64
    cache_alignment : 64
    address sizes   : 46 bits physical, 48 bits virtual
    power management:
    
    $ lshw -c video
      *-display
           description: VGA compatible controller
           product: TU102 [GeForce RTX 2080 Ti]
           vendor: NVIDIA Corporation
           physical id: 0
           bus info: pci@0000:db:00.0
           version: a1
           width: 64 bits
           clock: 33MHz
           capabilities: vga_controller bus_master cap_list rom
           configuration: driver=nvidia latency=0
           resources: iomemory:bff0-bfef iomemory:bff0-bfef irq:303 memory:f8000000-f8ffffff memory:bffc0000000-bffcfffffff memory:bffd0000000-bffd1ffffff ioport:e000(size=128) memory:f9000000-f907ffff
    WARNING: output may be incomplete or inaccurate, you should run this program as super-user.
    

    Environment Configuration

    Anaconda environment file environment.yaml

    name: py37pt15
    channels:
      - pytorch
      - psi4
      - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
      - https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
      - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main/
      - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
      - defaults
    dependencies:
      - _libgcc_mutex=0.1=main
      - blas=1.0=mkl
      - ca-certificates=2021.4.13=h06a4308_1
      - certifi=2020.12.5=py37h06a4308_0
      - cloog=0.18.0=0
      - cudatoolkit=10.1.243=h6bb024c_0
      - cudnn=7.6.5=cuda10.1_0
      - cython=0.29.23=py37h2531618_0
      - freetype=2.10.4=h5ab3b9f_0
      - gcc-5=5.2.0=1
      - gmp=6.2.1=h2531618_2
      - intel-openmp=2020.2=254
      - isl=0.12.2=0
      - jpeg=9b=h024ee3a_2
      - lcms2=2.12=h3be6417_0
      - ld_impl_linux-64=2.33.1=h53a641e_7
      - libffi=3.3=he6710b0_2
      - libgcc=7.2.0=h69d50b8_2
      - libgcc-ng=9.1.0=hdf63c60_0
      - libpng=1.6.37=hbc83047_0
      - libstdcxx-ng=9.1.0=hdf63c60_0
      - libtiff=4.1.0=h2733197_1
      - lz4-c=1.9.3=h2531618_0
      - mkl=2020.2=256
      - mkl-service=2.3.0=py37he8ac12f_0
      - mkl_fft=1.3.0=py37h54f3939_0
      - mkl_random=1.1.1=py37h0573a6f_0
      - mpc=1.1.0=h10f8cd9_1
      - mpfr=4.0.2=hb69a4c5_1
      - ncurses=6.2=he6710b0_1
      - ninja=1.10.2=hff7bd54_1
      - numpy=1.19.2=py37h54aff64_0
      - numpy-base=1.19.2=py37hfa32c7d_0
      - olefile=0.46=py37_0
      - openssl=1.1.1k=h27cfd23_0
      - pillow=8.2.0=py37he98fc37_0
      - pip=21.0.1=py37h06a4308_0
      - python=3.7.10=hdb3f193_0
      - pytorch=1.5.0=py3.7_cuda10.1.243_cudnn7.6.3_0
      - readline=8.1=h27cfd23_0
      - setuptools=52.0.0=py37h06a4308_0
      - six=1.15.0=py37h06a4308_0
      - sqlite=3.35.4=hdfb4753_0
      - tk=8.6.10=hbc83047_0
      - torchvision=0.6.0=py37_cu101
      - wheel=0.36.2=pyhd3eb1b0_0
      - xz=5.2.5=h7b6447c_0
      - zlib=1.2.11=h7b6447c_3
      - zstd=1.4.9=haebb681_0
      - pip:
        - addict==2.4.0
        - cycler==0.10.0
        - kiwisolver==1.3.1
        - matplotlib==3.4.1
        - mmcv-full==1.3.1
        - mmpycocotools==12.0.3
        - opencv-python==4.5.1.48
        - pyparsing==2.4.7
        - python-dateutil==2.8.1
        - pyyaml==5.4.1
        - terminaltables==3.1.0
        - yapf==0.31.0
    prefix: /home/intern2/anaconda3/envs/py37pt15
    

    Installation Steps

    conda create -n py37pt15 python=3.7 -y
    conda activate py37pt15 
    
    # CUDA 10.1
    conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
    conda install cudnn
    
    pip install mmdet
    
    # mmcv-full: the PyTorch, CUDA, and mmcv versions must be consistent here
    pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.5.0/index.html
    # or
    git clone https://github.com/open-mmlab/mmcv.git
    cd mmcv
    MMCV_WITH_OPS=1 pip install -e .  # package mmcv-full will be installed after this step
    cd ..
    
    # Install MMDetection (clone the repo first; the following commands run from its root)
    git clone https://github.com/open-mmlab/mmdetection.git
    cd mmdetection
    pip install -r requirements/build.txt
    pip install -v -e .  # or "python setup.py develop"
    
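
    To confirm that the PyTorch/CUDA/mmcv combination is actually consistent, a quick check such as the following can be run (a minimal sketch; the expected values match the environment above):

    # sanity_check.py -- verify that the installed versions agree
    import torch
    import mmcv

    print('PyTorch:', torch.__version__)                    # expected here: 1.5.0
    print('CUDA used to build torch:', torch.version.cuda)  # expected here: 10.1
    print('mmcv:', mmcv.__version__)                        # expected here: 1.3.1
    print('CUDA available:', torch.cuda.is_available())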

    Environment Verification

    $ python mmdet/utils/collect_env.py
    which: no hipcc in (/home/intern2/anaconda3/envs/py37pt15/bin:/home/intern2/anaconda3/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/intern2/.local/bin:/home/intern2/bin:/home/intern2/.local/bin:/home/intern2/bin)
    sys.platform: linux
    Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
    CUDA available: True
    GPU 0,1,2,3,4,5,6,7: GeForce RTX 2080 Ti
    CUDA_HOME: /usr/local/cuda
    NVCC: Cuda compilation tools, release 10.2, V10.2.89
    GCC: gcc (GCC) 5.2.0
    PyTorch: 1.5.0
    PyTorch compiling details: PyTorch built with:
      - GCC 7.3
      - C++ Version: 201402
      - Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 10.1
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
      - CuDNN 7.6.3
      - Magma 2.5.2
      - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 
    
    TorchVision: 0.6.0a0+82fd1c8
    OpenCV: 4.5.1
    MMCV: 1.3.1
    MMCV Compiler: GCC 7.3
    MMCV CUDA Compiler: 10.1
    MMDetection: 2.11.0+5ebba9a
    

    Demo Verification

    Verification.py

    import os

    from mmdet.apis import init_detector, inference_detector

    # If "which: no nvcc" occurs, run:
    # export PATH=$PATH:/usr/local/cuda-10.2/bin

    # Note: a bare CUDA_VISIBLE_DEVICES="0" only creates a Python string;
    # the environment variable must be set via os.environ (before CUDA initializes).
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'
    
    config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
    # download the checkpoint from model zoo and put it in `checkpoints/`
    # url: http://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
    checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
    device = 'cuda:0'
    # init a detector
    model = init_detector(config_file, checkpoint_file, device=device)
    # inference the demo image
    # inference_detector(model, 'demo/demo.jpg')
    # inference_detector(model, '/data/jay/lhz/Methods/mmdetection/demo/demo.jpg')
    img = 'demo/demo.jpg'
    result = inference_detector(model, img)
    model.show_result(img, result, out_file='./result.jpg')
    

    Running the Demo

    $ python Verification.py 
    which: no hipcc in (/home/intern2/anaconda3/envs/py37pt15/bin:/home/intern2/anaconda3/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/intern2/.local/bin:/home/intern2/bin:/home/intern2/.local/bin:/home/intern2/bin)
    Use load_from_local loader
    /data/jay/lhz/Methods/mmdetection/mmdet/datasets/utils.py:68: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
      'data pipeline in your config file.', UserWarning)
    

    If running the script produces a result.jpg file in the repository root, the environment is set up correctly.

    Quick Tutorial

    1. Configuration files (config)

    Key features: modular design and inheritance.
    config/_base_ contains the base configs for dataset, model, schedules, and default_runtime.
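
    A derived config typically inherits these _base_ files and overrides only what changes. A minimal sketch (the _base_ paths follow the mmdetection repo layout; the override value is illustrative):

    # A derived config: inherit the four _base_ components, then override fields.
    _base_ = [
        '../_base_/models/mask_rcnn_r50_fpn.py',
        '../_base_/datasets/coco_instance.py',
        '../_base_/schedules/schedule_1x.py',
        '../_base_/default_runtime.py'
    ]
    # Only the differences from the base configs need to be written out:
    optimizer = dict(lr=0.01)  # illustrative override of the inherited 0.02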

    Config file naming convention (a worked example follows the list):

    {model}_[model setting]_{backbone}_{neck}_[norm setting]_[misc]_[gpu x batch_per_gpu]_{schedule}_{dataset}

    {model}: model type, e.g. faster_rcnn, mask_rcnn.
    [model setting]: settings specific to some models, e.g. without_semantic for htc, moment for reppoints.
    {backbone}: backbone type, e.g. r50 (ResNet-50), x101 (ResNeXt-101).
    {neck}: neck type, e.g. fpn, pafpn, nasfpn, c4.
    [norm_setting]: bn (Batch Normalization) unless otherwise specified; other norm layer types are gn (Group Normalization) and syncbn (Synchronized Batch Normalization). gn-head/gn-neck means GN is applied only to the head/neck, while gn-all means GN is applied to the whole model, i.e. backbone, neck, and head.
    [misc]: miscellaneous settings/plugins of the model, e.g. dconv, gcb, attention, albu, mstrain.
    [gpu x batch_per_gpu]: GPUs and samples per GPU; 8x2 is the default.
    {schedule}: training schedule; options are 1x, 2x, 20e, etc. 1x and 2x denote 12 and 24 epochs respectively, and 20e, adopted in cascade models, denotes 20 epochs. For 1x/2x the initial learning rate decays by a factor of 10 at epochs 8/16 and 11/22; for 20e it decays at epochs 16 and 19.
    {dataset}: dataset, e.g. coco, cityscapes, voc_0712, wider_face.
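
    For example, the faster_rcnn_r50_fpn_1x_coco.py config used in the demo above decodes as: model faster_rcnn, backbone r50 (ResNet-50), neck fpn, schedule 1x (12 epochs, learning rate decayed at epochs 8 and 11), dataset coco; the optional fields are absent, so the defaults (bn, 8x2) apply.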

    The annotated config of Mask R-CNN with a ResNet-50 backbone and FPN is as follows:

    model = dict(
        type='MaskRCNN',  # The name of detector
        pretrained=
        'torchvision://resnet50',  # The ImageNet pretrained backbone to be loaded
        backbone=dict(  # The config of backbone
            type='ResNet',  # The type of the backbone, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/backbones/resnet.py#L288 for more details.
            depth=50,  # The depth of backbone, usually it is 50 or 101 for ResNet and ResNext backbones.
            num_stages=4,  # Number of stages of the backbone.
            out_indices=(0, 1, 2, 3),  # The index of output feature maps produced in each stage
            frozen_stages=1,  # The weights in the first stage are frozen
            norm_cfg=dict(  # The config of normalization layers.
                type='BN',  # Type of norm layer, usually it is BN or GN
                requires_grad=True),  # Whether to train the gamma and beta in BN
            norm_eval=True,  # Whether to freeze the statistics in BN
            style='pytorch'),  # The style of backbone, 'pytorch' means that stride 2 layers are in 3x3 conv, 'caffe' means stride 2 layers are in 1x1 convs.
        neck=dict(
            type='FPN',  # The neck of detector is FPN. We also support 'NASFPN', 'PAFPN', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/necks/fpn.py#L10 for more details.
            in_channels=[256, 512, 1024, 2048],  # The input channels, this is consistent with the output channels of backbone
            out_channels=256,  # The output channels of each level of the pyramid feature map
            num_outs=5),  # The number of output scales
        rpn_head=dict(
            type='RPNHead',  # The type of RPN head is 'RPNHead', we also support 'GARPNHead', etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/dense_heads/rpn_head.py#L12 for more details.
            in_channels=256,  # The input channels of each input feature map, this is consistent with the output channels of neck
            feat_channels=256,  # Feature channels of convolutional layers in the head.
            anchor_generator=dict(  # The config of anchor generator
                type='AnchorGenerator',  # Most of methods use AnchorGenerator, SSD Detectors uses `SSDAnchorGenerator`. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/anchor/anchor_generator.py#L10 for more details
                scales=[8],  # Basic scale of the anchor, the area of the anchor in one position of a feature map will be scale * base_sizes
                ratios=[0.5, 1.0, 2.0],  # The ratio between height and width.
                strides=[4, 8, 16, 32, 64]),  # The strides of the anchor generator. This is consistent with the FPN feature strides. The strides will be taken as base_sizes if base_sizes is not set.
            bbox_coder=dict(  # Config of box coder to encode and decode the boxes during training and testing
                type='DeltaXYWHBBoxCoder',  # Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most of methods. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/coder/delta_xywh_bbox_coder.py#L9 for more details.
                target_means=[0.0, 0.0, 0.0, 0.0],  # The target means used to encode and decode boxes
                target_stds=[1.0, 1.0, 1.0, 1.0]),  # The standard variance used to encode and decode boxes
            loss_cls=dict(  # Config of loss function for the classification branch
                type='CrossEntropyLoss',  # Type of loss for classification branch, we also support FocalLoss etc.
                use_sigmoid=True,  # RPN usually perform two-class classification, so it usually uses sigmoid function.
                loss_weight=1.0),  # Loss weight of the classification branch.
            loss_bbox=dict(  # Config of loss function for the regression branch.
                type='L1Loss',  # Type of loss, we also support many IoU Losses and smooth L1-loss, etc. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/losses/smooth_l1_loss.py#L56 for implementation.
                loss_weight=1.0)),  # Loss weight of the regression branch.
        roi_head=dict(  # RoIHead encapsulates the second stage of two-stage/cascade detectors.
            type='StandardRoIHead',  # Type of the RoI head. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/standard_roi_head.py#L10 for implementation.
            bbox_roi_extractor=dict(  # RoI feature extractor for bbox regression.
                type='SingleRoIExtractor',  # Type of the RoI feature extractor, most of methods uses SingleRoIExtractor. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/roi_extractors/single_level.py#L10 for details.
                roi_layer=dict(  # Config of RoI Layer
                    type='RoIAlign',  # Type of RoI Layer, DeformRoIPoolingPack and ModulatedDeformRoIPoolingPack are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/ops/roi_align/roi_align.py#L79 for details.
                    output_size=7,  # The output size of feature maps.
                    sampling_ratio=0),  # Sampling ratio when extracting the RoI features. 0 means adaptive ratio.
                out_channels=256,  # output channels of the extracted feature.
                featmap_strides=[4, 8, 16, 32]),  # Strides of multi-scale feature maps. It should be consistent to the architecture of the backbone.
            bbox_head=dict(  # Config of box head in the RoIHead.
                type='Shared2FCBBoxHead',  # Type of the bbox head, Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/bbox_heads/convfc_bbox_head.py#L177 for implementation details.
                in_channels=256,  # Input channels for bbox head. This is consistent with the out_channels in roi_extractor
                fc_out_channels=1024,  # Output feature channels of FC layers.
                roi_feat_size=7,  # Size of RoI features
                num_classes=80,  # Number of classes for classification
                bbox_coder=dict(  # Box coder used in the second stage.
                    type='DeltaXYWHBBoxCoder',  # Type of box coder. 'DeltaXYWHBBoxCoder' is applied for most of methods.
                    target_means=[0.0, 0.0, 0.0, 0.0],  # Means used to encode and decode box
                    target_stds=[0.1, 0.1, 0.2, 0.2]),  # Standard variance for encoding and decoding. It is smaller since the boxes are more accurate. [0.1, 0.1, 0.2, 0.2] is a conventional setting.
                reg_class_agnostic=False,  # Whether the regression is class agnostic.
                loss_cls=dict(  # Config of loss function for the classification branch
                    type='CrossEntropyLoss',  # Type of loss for classification branch, we also support FocalLoss etc.
                    use_sigmoid=False,  # Whether to use sigmoid.
                    loss_weight=1.0),  # Loss weight of the classification branch.
                loss_bbox=dict(  # Config of loss function for the regression branch.
                    type='L1Loss',  # Type of loss, we also support many IoU Losses and smooth L1-loss, etc.
                    loss_weight=1.0)),  # Loss weight of the regression branch.
        mask_roi_extractor=dict(  # RoI feature extractor for mask prediction.
                type='SingleRoIExtractor',  # Type of the RoI feature extractor, most of methods uses SingleRoIExtractor.
                roi_layer=dict(  # Config of RoI Layer that extracts features for instance segmentation
                    type='RoIAlign',  # Type of RoI Layer, DeformRoIPoolingPack and ModulatedDeformRoIPoolingPack are also supported
                    output_size=14,  # The output size of feature maps.
                    sampling_ratio=0),  # Sampling ratio when extracting the RoI features.
                out_channels=256,  # Output channels of the extracted feature.
                featmap_strides=[4, 8, 16, 32]),  # Strides of multi-scale feature maps.
            mask_head=dict(  # Mask prediction head
                type='FCNMaskHead',  # Type of mask head, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/models/roi_heads/mask_heads/fcn_mask_head.py#L21 for implementation details.
                num_convs=4,  # Number of convolutional layers in mask head.
                in_channels=256,  # Input channels, should be consistent with the output channels of mask roi extractor.
                conv_out_channels=256,  # Output channels of the convolutional layer.
                num_classes=80,  # Number of class to be segmented.
                loss_mask=dict(  # Config of loss function for the mask branch.
                    type='CrossEntropyLoss',  # Type of loss used for segmentation
                    use_mask=True,  # Whether to only train the mask in the correct class.
                    loss_weight=1.0))))  # Loss weight of mask branch.
        train_cfg = dict(  # Config of training hyperparameters for rpn and rcnn
            rpn=dict(  # Training config of rpn
                assigner=dict(  # Config of assigner
                    type='MaxIoUAssigner',  # Type of assigner, MaxIoUAssigner is used for many common detectors. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10 for more details.
                    pos_iou_thr=0.7,  # IoU >= threshold 0.7 will be taken as positive samples
                    neg_iou_thr=0.3,  # IoU < threshold 0.3 will be taken as negative samples
                    min_pos_iou=0.3,  # The minimal IoU threshold to take boxes as positive samples
                    match_low_quality=True,  # Whether to match the boxes under low quality (see API doc for more details).
                    ignore_iof_thr=-1),  # IoF threshold for ignoring bboxes
                sampler=dict(  # Config of positive/negative sampler
                    type='RandomSampler',  # Type of sampler, PseudoSampler and other samplers are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8 for implementation details.
                    num=256,  # Number of samples
                    pos_fraction=0.5,  # The ratio of positive samples in the total samples.
                    neg_pos_ub=-1,  # The upper bound of negative samples based on the number of positive samples.
                    add_gt_as_proposals=False),  # Whether add GT as proposals after sampling.
                allowed_border=-1,  # The border allowed after padding for valid anchors.
                pos_weight=-1,  # The weight of positive samples during training.
                debug=False),  # Whether to set the debug mode
            rpn_proposal=dict(  # The config to generate proposals during training
                nms_across_levels=False,  # Whether to do NMS for boxes across levels. Only work in `GARPNHead`, naive rpn does not support do nms cross levels.
                nms_pre=2000,  # The number of boxes before NMS
                nms_post=1000,  # The number of boxes to be kept by NMS, Only work in `GARPNHead`.
                max_per_img=1000,  # The number of boxes to be kept after NMS.
                nms=dict( # Config of nms
                    type='nms',  #Type of nms
                    iou_threshold=0.7 # NMS threshold
                    ),
                min_bbox_size=0),  # The allowed minimal box size
            rcnn=dict(  # The config for the roi heads.
                assigner=dict(  # Config of assigner for second stage, this is different for that in rpn
                    type='MaxIoUAssigner',  # Type of assigner, MaxIoUAssigner is used for all roi_heads for now. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/assigners/max_iou_assigner.py#L10 for more details.
                    pos_iou_thr=0.5,  # IoU >= threshold 0.5 will be taken as positive samples
                    neg_iou_thr=0.5,  # IoU < threshold 0.5 will be taken as negative samples
                    min_pos_iou=0.5,  # The minimal IoU threshold to take boxes as positive samples
                    match_low_quality=False,  # Whether to match the boxes under low quality (see API doc for more details).
                    ignore_iof_thr=-1),  # IoF threshold for ignoring bboxes
                sampler=dict(
                    type='RandomSampler',  # Type of sampler, PseudoSampler and other samplers are also supported. Refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/bbox/samplers/random_sampler.py#L8 for implementation details.
                    num=512,  # Number of samples
                    pos_fraction=0.25,  # The ratio of positive samples in the total samples.
                    neg_pos_ub=-1,  # The upper bound of negative samples based on the number of positive samples.
                    add_gt_as_proposals=True
                ),  # Whether add GT as proposals after sampling.
                mask_size=28,  # Size of mask
                pos_weight=-1,  # The weight of positive samples during training.
                debug=False))  # Whether to set the debug mode
        test_cfg = dict(  # Config for testing hyperparameters for rpn and rcnn
            rpn=dict(  # The config to generate proposals during testing
                nms_across_levels=False,  # Whether to do NMS for boxes across levels. Only work in `GARPNHead`, naive rpn does not support do nms cross levels.
                nms_pre=1000,  # The number of boxes before NMS
                nms_post=1000,  # The number of boxes to be kept by NMS, Only work in `GARPNHead`.
                max_per_img=1000,  # The number of boxes to be kept after NMS.
                nms=dict( # Config of nms
                    type='nms',  #Type of nms
                    iou_threshold=0.7 # NMS threshold
                    ),
                min_bbox_size=0),  # The allowed minimal box size
            rcnn=dict(  # The config for the roi heads.
                score_thr=0.05,  # Threshold to filter out boxes
                nms=dict(  # Config of nms in the second stage
                    type='nms',  # Type of nms
                    iou_thr=0.5),  # NMS threshold
                max_per_img=100,  # Max number of detections of each image
                mask_thr_binary=0.5))  # Threshold of mask prediction
    dataset_type = 'CocoDataset'  # Dataset type, this will be used to define the dataset
    data_root = 'data/coco/'  # Root path of data
    img_norm_cfg = dict(  # Image normalization config to normalize the input images
        mean=[123.675, 116.28, 103.53],  # Mean values used when pre-training the backbone models
        std=[58.395, 57.12, 57.375],  # Standard variance used when pre-training the backbone models
        to_rgb=True
    )  # The channel order of images used when pre-training the backbone models
    train_pipeline = [  # Training pipeline
        dict(type='LoadImageFromFile'),  # First pipeline to load images from file path
        dict(
            type='LoadAnnotations',  # Second pipeline to load annotations for current image
            with_bbox=True,  # Whether to use bounding box, True for detection
            with_mask=True,  # Whether to use instance mask, True for instance segmentation
            poly2mask=False),  # Whether to convert the polygon mask to instance mask, set False for acceleration and to save memory
        dict(
            type='Resize',  # Augmentation pipeline that resize the images and their annotations
            img_scale=(1333, 800),  # The largest scale of image
            keep_ratio=True
        ),  # whether to keep the ratio between height and width.
        dict(
            type='RandomFlip',  # Augmentation pipeline that flip the images and their annotations
            flip_ratio=0.5),  # The ratio or probability to flip
        dict(
            type='Normalize',  # Augmentation pipeline that normalize the input images
            mean=[123.675, 116.28, 103.53],  # These keys are the same of img_norm_cfg since the
            std=[58.395, 57.12, 57.375],  # keys of img_norm_cfg are used here as arguments
            to_rgb=True),
        dict(
            type='Pad',  # Padding config
            size_divisor=32),  # The number the padded images should be divisible
        dict(type='DefaultFormatBundle'),  # Default format bundle to gather data in the pipeline
        dict(
            type='Collect',  # Pipeline that decides which keys in the data should be passed to the detector
            keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),  # First pipeline to load images from file path
        dict(
            type='MultiScaleFlipAug',  # An encapsulation that encapsulates the testing augmentations
            img_scale=(1333, 800),  # Decides the largest scale for testing, used for the Resize pipeline
            flip=False,  # Whether to flip images during testing
            transforms=[
                dict(type='Resize',  # Use resize augmentation
                     keep_ratio=True),  # Whether to keep the ratio between height and width, the img_scale set here will be suppressed by the img_scale set above.
                dict(type='RandomFlip'),  # Although RandomFlip is added in the pipeline, it is not used because flip=False
                dict(
                    type='Normalize',  # Normalization config, the values are from img_norm_cfg
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(
                    type='Pad',  # Padding config to pad images divisible by 32.
                    size_divisor=32),
                dict(
                    type='ImageToTensor',  # convert image to tensor
                    keys=['img']),
                dict(
                    type='Collect',  # Collect pipeline that collect necessary keys for testing.
                    keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=2,  # Batch size of a single GPU
        workers_per_gpu=2,  # Worker to pre-fetch data for each single GPU
        train=dict(  # Train dataset config
            type='CocoDataset',  # Type of dataset, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/datasets/coco.py#L19 for details.
            ann_file='data/coco/annotations/instances_train2017.json',  # Path of annotation file
            img_prefix='data/coco/train2017/',  # Prefix of image path
            pipeline=[  # pipeline, this is passed by the train_pipeline created before.
                dict(type='LoadImageFromFile'),
                dict(
                    type='LoadAnnotations',
                    with_bbox=True,
                    with_mask=True,
                    poly2mask=False),
                dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
                dict(type='RandomFlip', flip_ratio=0.5),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(
                    type='Collect',
                    keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
            ]),
        val=dict(  # Validation dataset config
            type='CocoDataset',
            ann_file='data/coco/annotations/instances_val2017.json',
            img_prefix='data/coco/val2017/',
            pipeline=[  # Pipeline is passed by test_pipeline created before
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1333, 800),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]),
        test=dict(  # Test dataset config, modify the ann_file for test-dev/test submission
            type='CocoDataset',
            ann_file='data/coco/annotations/instances_val2017.json',
            img_prefix='data/coco/val2017/',
            pipeline=[  # Pipeline is passed by test_pipeline created before
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1333, 800),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ],
            samples_per_gpu=2  # Batch size of a single GPU used in testing
            ))
    evaluation = dict(  # The config to build the evaluation hook, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/evaluation/eval_hooks.py#L7 for more details.
        interval=1,  # Evaluation interval
        metric=['bbox', 'segm'])  # Metrics used during evaluation
    optimizer = dict(  # Config used to build optimizer, support all the optimizers in PyTorch whose arguments are also the same as those in PyTorch
        type='SGD',  # Type of optimizers, refer to https://github.com/open-mmlab/mmdetection/blob/master/mmdet/core/optimizer/default_constructor.py#L13 for more details
        lr=0.02,  # Learning rate of optimizers, see detailed usage of the parameters in the PyTorch documentation
        momentum=0.9,  # Momentum
        weight_decay=0.0001)  # Weight decay of SGD
    optimizer_config = dict(  # Config used to build the optimizer hook, refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/optimizer.py#L8 for implementation details.
        grad_clip=None)  # Most of the methods do not use gradient clip
    lr_config = dict(  # Learning rate scheduler config used to register LrUpdater hook
        policy='step',  # The policy of scheduler, also support CosineAnnealing, Cyclic, etc. Refer to details of supported LrUpdater from https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/lr_updater.py#L9.
        warmup='linear',  # The warmup policy, also support `exp` and `constant`.
        warmup_iters=500,  # The number of iterations for warmup
        warmup_ratio=
        0.001,  # The ratio of the starting learning rate used for warmup
        step=[8, 11])  # Steps to decay the learning rate
    runner = dict(type='EpochBasedRunner', max_epochs=12) # Runner that runs the workflow in total max_epochs
    checkpoint_config = dict(  # Config to set the checkpoint hook, Refer to https://github.com/open-mmlab/mmcv/blob/master/mmcv/runner/hooks/checkpoint.py for implementation.
        interval=1)  # The save interval is 1
    log_config = dict(  # config to register logger hook
        interval=50,  # Interval to print the log
        hooks=[
            # dict(type='TensorboardLoggerHook')  # The Tensorboard logger is also supported
            dict(type='TextLoggerHook')
        ])  # The logger used to record the training process.
    dist_params = dict(backend='nccl')  # Parameters to setup distributed training, the port can also be set.
    log_level = 'INFO'  # The level of logging.
    load_from = None  # load models as a pre-trained model from a given path. This will not resume training.
    resume_from = None  # Resume checkpoints from a given path, the training will be resumed from the epoch when the checkpoint was saved.
    workflow = [('train', 1)]  # Workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once, training the model for the runner's max_epochs (12 epochs here).
    work_dir = 'work_dir'  # Directory to save the model checkpoints and logs for the current experiments.
    
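
    Such a config can be loaded and inspected programmatically with mmcv's Config class; a minimal sketch, run from the mmdetection repo root:

    # Load a config and access the merged fields
    from mmcv import Config

    cfg = Config.fromfile('configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py')
    print(cfg.pretty_text)           # the fully merged config, like the dump above
    print(cfg.model.backbone.depth)  # fields are attribute-accessible -> 50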

    2. Customizing dataset pipelines
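
    The documented pattern is to register a transform in the PIPELINES registry and then reference it by type name; a minimal sketch with a hypothetical MyTransform:

    # my_pipeline.py -- a sketch of a custom data pipeline transform (hypothetical)
    from mmdet.datasets import PIPELINES

    @PIPELINES.register_module()
    class MyTransform:
        """Pipeline transforms receive and return the `results` dict."""

        def __call__(self, results):
            results['dummy'] = True  # add or modify any field needed downstream
            return results

    It is then enabled by inserting dict(type='MyTransform') into train_pipeline.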

    3. Customizing models (Model)
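
    New model components are added through the model registries (BACKBONES, NECKS, HEADS, LOSSES, DETECTORS). A toy sketch of a hypothetical backbone; a real one would return one feature map per FPN level:

    # my_backbone.py -- a toy custom backbone (hypothetical)
    import torch.nn as nn
    from mmdet.models.builder import BACKBONES

    @BACKBONES.register_module()
    class TinyBackbone(nn.Module):
        """Returns a single feature map; mmdet expects a tuple of maps."""

        def __init__(self, out_channels=256):
            super().__init__()
            self.stem = nn.Conv2d(3, out_channels, kernel_size=3, stride=4, padding=1)

        def forward(self, x):
            return (self.stem(x),)

        def init_weights(self, pretrained=None):
            pass  # mmdet 2.x calls this; load pretrained weights here if any

    It is then selected via backbone=dict(type='TinyBackbone', out_channels=256) in the model config, with the neck's in_channels adjusted to match.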

    4. Customizing runtime settings
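
    Runtime behaviour (optimizer, learning-rate policy, hooks) is changed purely in the config. For example, assuming a switch from the default SGD + step schedule to Adam + cosine annealing, both of which mmcv already supports:

    # In a config file: swap the optimizer and the learning-rate policy
    optimizer = dict(type='Adam', lr=0.0003, weight_decay=0.0001)
    lr_config = dict(
        policy='CosineAnnealing',  # replaces policy='step'
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=0.001,
        min_lr=1e-6)  # CosineAnnealing requires min_lr or min_lr_ratio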

    5. Customizing losses (Loss)
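
    Custom losses follow the same registry pattern as backbones. A toy sketch of a hypothetical MyL1Loss; real mmdet losses also apply the weight and avg_factor arguments passed by the heads:

    # my_loss.py -- a toy custom regression loss (hypothetical)
    import torch
    import torch.nn as nn
    from mmdet.models.builder import LOSSES

    @LOSSES.register_module()
    class MyL1Loss(nn.Module):

        def __init__(self, loss_weight=1.0):
            super().__init__()
            self.loss_weight = loss_weight

        def forward(self, pred, target, weight=None, avg_factor=None, **kwargs):
            # a real loss would apply `weight` element-wise and normalize by `avg_factor`
            return self.loss_weight * torch.abs(pred - target).mean()

    It would then be referenced in a head as loss_bbox=dict(type='MyL1Loss', loss_weight=1.0).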

    6. Fine-tuning models
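
    The usual recipe from the official fine-tuning tutorial: inherit the trained config, change num_classes to match the new dataset, initialize from a released checkpoint via load_from, and lower the learning rate. A sketch assuming a hypothetical 2-class dataset:

    # A fine-tuning config sketch (hypothetical 2-class dataset)
    _base_ = './mask_rcnn_r50_fpn_1x_coco.py'  # path relative to this config file

    model = dict(
        roi_head=dict(
            bbox_head=dict(num_classes=2),
            mask_head=dict(num_classes=2)))

    # COCO-trained weights; head weights whose shapes no longer match are skipped
    load_from = 'checkpoints/mask_rcnn_r50_fpn_1x_coco.pth'  # hypothetical local path
    optimizer = dict(lr=0.0025)  # 1/8 of the default 0.02, e.g. for single-GPU training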

    Troubleshooting

    1. `CXXABI_1.3.11' not found

    Problem:
    ImportError: /home/intern2/anaconda3/envs/py37pt15/lib/python3.7/site-packages/torch/lib/../../../.././libstdc++.so.6: version `CXXABI_1.3.11' not found (required by /home/intern2/anaconda3/envs/py37pt15/lib/python3.7/site-packages/torch/lib/libtorch_python.so)

    $ strings /home/intern2/anaconda3/envs/py37pt15/lib/libstdc++.so.6 | grep CXXABI
    CXXABI_1.3
    CXXABI_1.3.1
    CXXABI_1.3.2
    CXXABI_1.3.3
    CXXABI_1.3.4
    CXXABI_1.3.5
    CXXABI_1.3.6
    CXXABI_1.3.7
    CXXABI_1.3.8
    CXXABI_1.3.9
    CXXABI_TM_1
    CXXABI_FLOAT128
    
    $ find ~/ -name libstdc++.so.6
    
    $ strings /home/intern2/anaconda3/lib/libstdc++.so.6 | grep CXXABI
    CXXABI_1.3
    CXXABI_1.3.1
    CXXABI_1.3.2
    CXXABI_1.3.3
    CXXABI_1.3.4
    CXXABI_1.3.5
    CXXABI_1.3.6
    CXXABI_1.3.7
    CXXABI_1.3.8
    CXXABI_1.3.9
    CXXABI_1.3.10
    CXXABI_1.3.11
    CXXABI_1.3.12
    CXXABI_TM_1
    CXXABI_FLOAT128
    CXXABI_1.3
    CXXABI_1.3.11
    CXXABI_1.3.2
    CXXABI_1.3.6
    CXXABI_FLOAT128
    CXXABI_1.3.12
    CXXABI_1.3.9
    CXXABI_1.3.1
    CXXABI_1.3.5
    CXXABI_1.3.8
    CXXABI_1.3.4
    CXXABI_TM_1
    CXXABI_1.3.7
    CXXABI_1.3.10
    CXXABI_1.3.3
    
    # The env's own libstdc++ lacks CXXABI_1.3.11 while the base Anaconda copy (above) has it,
    # so back up the old library and symlink the newer one in its place:
    $ mv /home/intern2/anaconda3/envs/py37pt15/lib/libstdc++.so.6 /home/intern2/anaconda3/envs/py37pt15/lib/libstdc++.so.6.bak
    
    $ ln -s /home/intern2/anaconda3/lib/libstdc++.so.6 /home/intern2/anaconda3/envs/py37pt15/lib/libstdc++.so.6
    
    2. GCC version too low

    Solution:

    conda install -c psi4 gcc-5  # note: the channel is psi4, as listed in environment.yaml above
    

    This upgrades GCC to 5.2.

    3. nvcc not found

    Problem:
    $ nvcc -V
    -bash: nvcc: command not found
    

    Solution (adjust the cuda-X.Y directories below to the locally installed toolkit; on this machine it is cuda-10.2):

    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-9.0/lib64
    export PATH=$PATH:/usr/local/cuda-9.0/bin
    export CUDA_HOME=$CUDA_HOME:/usr/local/cuda-9.0
    
    4. which: no hipcc

    Problem:

    which: no hipcc in (/home/intern2/anaconda3/envs/py37pt15/bin:/home/intern2/anaconda3/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/home/intern2/.local/bin:/home/intern2/bin:/home/intern2/.local/bin:/home/intern2/bin)
    

    Solution: the warning comes from PyTorch's ROCm detection probing for hipcc with `which`, and is harmless on a CUDA machine. Upstream silenced it by redirecting the probe's stderr (the PyTorch commit linked below):

    # torch/utils/cpp_extension.py -- upstream fix:
    -            hipcc = subprocess.check_output(
    -                ['which', 'hipcc']).decode().rstrip('\r\n')
    +            hipcc = subprocess.check_output(
    +                ['which', 'hipcc'], stderr=subprocess.DEVNULL).decode().rstrip('\r\n')
                 # this will be either <ROCM_HOME>/hip/bin/hipcc or <ROCM_HOME>/bin/hipcc
                 rocm_home = os.path.dirname(os.path.dirname(hipcc))
                 if os.path.basename(rocm_home) == 'hip':
    

    https://github.com/open-mmlab/mmcv/issues/274
    https://github.com/pytorch/pytorch/commit/048e19749b77df6a7aa327471ad919d420b22792

    5. libcudart.so.9.2

    ImportError: libcudart.so.9.2: cannot open shared object file: No such file or directory
    Solution:

    export LD_LIBRARY_PATH=/usr/local/cuda-9.2/lib64:$LD_LIBRARY_PATH
    export PATH=/usr/local/cuda-9.2/bin:$PATH
    

    Essentially the same as problem 3 (nvcc not found).

    6. mmcv._ext

    ModuleNotFoundError: No module named 'mmcv._ext'
    Solution:
    An environment problem: the lite mmcv (without compiled ops) is installed, or mmcv-full was built for a mismatched PyTorch/CUDA combination. Uninstall and reinstall the matching mmcv-full.
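
    For example, a reinstall consistent with the versions used above (PyTorch 1.5.0, CUDA 10.1):

    pip uninstall -y mmcv mmcv-full
    pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.5.0/index.html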
