object_detection API source code reading notes (4-mode

Author: yanghedada | Published: 2018-10-06 12:00

    The model_builder.py file

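    The previous post noted that the model's loss originates from the model built in object_detection\builders\model_builder.py.
    Note: this post walks through the Faster R-CNN model step by step; the other models work similarly.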

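    Overview:

    • This file gathers the feature extractors for all models, together with the builders for the various components.
    • The DetectionModel subclasses are all imported here. They implement the basic functionality of the SSD, Faster R-CNN, and R-FCN families, and a different feature_extractor is built depending on the underlying classification network.
      The imports are as follows: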
    #model_builder.py
    from object_detection.builders import anchor_generator_builder
    from object_detection.builders import box_coder_builder
    from object_detection.builders import box_predictor_builder
    from object_detection.builders import hyperparams_builder
    from object_detection.builders import image_resizer_builder
    from object_detection.builders import losses_builder
    from object_detection.builders import matcher_builder
    from object_detection.builders import post_processing_builder
    from object_detection.builders import region_similarity_calculator_builder as sim_calc
    from object_detection.core import box_predictor
    from object_detection.meta_architectures import faster_rcnn_meta_arch
    from object_detection.meta_architectures import rfcn_meta_arch
    from object_detection.meta_architectures import ssd_meta_arch
    from object_detection.models import faster_rcnn_inception_resnet_v2_feature_extractor as frcnn_inc_res
    from object_detection.models import faster_rcnn_inception_v2_feature_extractor as frcnn_inc_v2
    from object_detection.models import faster_rcnn_nas_feature_extractor as frcnn_nas
    from object_detection.models import faster_rcnn_resnet_v1_feature_extractor as frcnn_resnet_v1
    from object_detection.models.embedded_ssd_mobilenet_v1_feature_extractor import EmbeddedSSDMobileNetV1FeatureExtractor
    from object_detection.models.ssd_inception_v2_feature_extractor import SSDInceptionV2FeatureExtractor
    from object_detection.models.ssd_inception_v3_feature_extractor import SSDInceptionV3FeatureExtractor
    from object_detection.models.ssd_mobilenet_v1_feature_extractor import SSDMobileNetV1FeatureExtractor
    from object_detection.protos import model_pb2
    

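    The module implements the following functions:
    build: the main entry point, called by other modules. (It is the only function meant for external use; the _build*_model functions are internal.)
    This .py file mainly implements two model families, SSD and R-CNN.
    The implementation is therefore split into an SSD part and an R-CNN part.

    • _build_ssd_model: creates SSD-family models.
    • _build_faster_rcnn_model: creates R-CNN-family models.
    • _build_ssd_feature_extractor: builds the feature extractor that produces the feature maps for SSD.
    • _build_faster_rcnn_feature_extractor: builds the feature extractor that produces the feature maps for R-CNN.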

    The build function

    Flow:
    Read the model.proto configuration to determine whether an ssd model or an r-cnn model was selected.
    Depending on the selected model, call _build_ssd_model or _build_faster_rcnn_model.

    def build(model_config, is_training):
      """Builds a DetectionModel based on the model config.
    
      Args:
        model_config: A model.proto object containing the config for the desired
          DetectionModel.
        is_training: True if this model is being built for training purposes.
    
      Returns:
        DetectionModel based on the config.
    
      Raises:
        ValueError: On invalid meta architecture or model.
      """
      if not isinstance(model_config, model_pb2.DetectionModel):
        raise ValueError('model_config not of type model_pb2.DetectionModel.')
      meta_architecture = model_config.WhichOneof('model')
      if meta_architecture == 'ssd':
        return _build_ssd_model(model_config.ssd, is_training)
      if meta_architecture == 'faster_rcnn':
        return _build_faster_rcnn_model(model_config.faster_rcnn, is_training)
      raise ValueError('Unknown meta architecture: {}'.format(meta_architecture))
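
    The dispatch in build can be tried out without TensorFlow or protobuf. The following is only a minimal sketch: FakeDetectionModelConfig and build_sketch are hypothetical stand-ins invented for illustration, and only the WhichOneof('model') call mirrors what the real code relies on.

```python
# Minimal sketch of the dispatch in build(). Every name here is a hypothetical
# stand-in for illustration; none of it is part of object_detection.
class FakeDetectionModelConfig:
    """Mimics a message whose `model` oneof holds either ssd or faster_rcnn."""

    def __init__(self, field_name=None, sub_config=None):
        self._set_field = field_name
        if field_name is not None:
            setattr(self, field_name, sub_config)

    def WhichOneof(self, oneof_name):
        # Like protobuf: the name of the field that is set, or None.
        return self._set_field


def build_sketch(model_config, is_training):
    # One builder per meta-architecture; the lambdas stand in for
    # _build_ssd_model and _build_faster_rcnn_model.
    builders = {
        'ssd': lambda cfg, training: ('ssd', cfg, training),
        'faster_rcnn': lambda cfg, training: ('faster_rcnn', cfg, training),
    }
    meta_architecture = model_config.WhichOneof('model')
    if meta_architecture not in builders:
        raise ValueError(
            'Unknown meta architecture: {}'.format(meta_architecture))
    sub_config = getattr(model_config, meta_architecture)
    return builders[meta_architecture](sub_config, is_training)


config = FakeDetectionModelConfig('faster_rcnn', {'num_classes': 90})
result = build_sketch(config, is_training=True)
```

    Because the choice is driven entirely by which field of the oneof is set, the config file alone decides which private builder runs.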
    

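    If _build_faster_rcnn_model() is selected here, execution moves on to the two internal functions related to faster_rcnn.

    • 1. _build_faster_rcnn_model: creates R-CNN-family models.
    • 2. _build_faster_rcnn_feature_extractor: builds the feature extractor that produces the feature maps for R-CNN.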

    _build_faster_rcnn_feature_extractor():

    Main purpose: return the feature extractor that matches the configuration.
    This is where the feature extractor classes imported at the top of the file are used:

    def _build_faster_rcnn_feature_extractor(
        feature_extractor_config, is_training, reuse_weights=None):
      feature_type = feature_extractor_config.type
      first_stage_features_stride = (
          feature_extractor_config.first_stage_features_stride)
      batch_norm_trainable = feature_extractor_config.batch_norm_trainable
    
      if feature_type not in FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP:
        raise ValueError('Unknown Faster R-CNN feature_extractor: {}'.format(
            feature_type))
      feature_extractor_class = FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP[
          feature_type]
      return feature_extractor_class(
          is_training, first_stage_features_stride,
          batch_norm_trainable, reuse_weights)
    

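    The function simply looks up the matching feature extractor class in the dictionary FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP. This is the "registration" step mentioned in the configuration documentation.
    The dictionary is defined near the top of the file: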

    # A map of names to Faster R-CNN feature extractors.
    FASTER_RCNN_FEATURE_EXTRACTOR_CLASS_MAP = {
        'faster_rcnn_nas':
        frcnn_nas.FasterRCNNNASFeatureExtractor,
        'faster_rcnn_inception_resnet_v2':
        frcnn_inc_res.FasterRCNNInceptionResnetV2FeatureExtractor,
        'faster_rcnn_inception_v2':
        frcnn_inc_v2.FasterRCNNInceptionV2FeatureExtractor,
        'faster_rcnn_resnet50':
        frcnn_resnet_v1.FasterRCNNResnet50FeatureExtractor,
        'faster_rcnn_resnet101':
        frcnn_resnet_v1.FasterRCNNResnet101FeatureExtractor,
        'faster_rcnn_resnet152':
        frcnn_resnet_v1.FasterRCNNResnet152FeatureExtractor,
    }
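
    Registering a custom backbone amounts to adding one more entry to such a map. The sketch below is a standalone illustration under stated assumptions: FakeResnet50Extractor, MyCustomExtractor, the key 'my_custom_backbone', and build_feature_extractor are all hypothetical names, and the constructor only mimics the positional arguments seen in _build_faster_rcnn_feature_extractor.

```python
# Sketch of the name-to-class registry pattern. All classes and keys here are
# hypothetical; they only imitate the shape of the real map.
class FakeResnet50Extractor:
    def __init__(self, is_training, first_stage_features_stride,
                 batch_norm_trainable=False, reuse_weights=None):
        self.is_training = is_training
        self.first_stage_features_stride = first_stage_features_stride
        self.batch_norm_trainable = batch_norm_trainable
        self.reuse_weights = reuse_weights


class MyCustomExtractor(FakeResnet50Extractor):
    """A made-up extractor that a user might want to plug in."""


CLASS_MAP = {'faster_rcnn_resnet50': FakeResnet50Extractor}
# "Registration" is just a new dictionary entry keyed by the config's type.
CLASS_MAP['my_custom_backbone'] = MyCustomExtractor


def build_feature_extractor(feature_type, is_training, stride=16):
    if feature_type not in CLASS_MAP:
        raise ValueError(
            'Unknown Faster R-CNN feature_extractor: {}'.format(feature_type))
    return CLASS_MAP[feature_type](is_training, stride)


extractor = build_feature_extractor('my_custom_backbone', is_training=False)
```

    The type string in the config must match a key of the map exactly; an unregistered name fails with ValueError, just as in the real function.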
    

    _build_faster_rcnn_model():

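    Next comes the big boss. This function supports both Faster R-CNN and R-FCN. Its docstring says: "Builds R-FCN model if the second_stage_box_predictor in the config is of type rfcn_box_predictor else builds a Faster R-CNN model."
    In other words, the two models are told apart by second_stage_box_predictor. The return value is either rfcn_meta_arch.RFCNMetaArch() or faster_rcnn_meta_arch.FasterRCNNMetaArch(). These are the focus of the next step of the analysis, because this is where the loss is produced.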

    def _build_faster_rcnn_model(frcnn_config, is_training):
      num_classes = frcnn_config.num_classes
      image_resizer_fn = image_resizer_builder.build(frcnn_config.image_resizer)
    
      feature_extractor = _build_faster_rcnn_feature_extractor(
          frcnn_config.feature_extractor, is_training)
    
      first_stage_only = frcnn_config.first_stage_only
      first_stage_anchor_generator = anchor_generator_builder.build(
          frcnn_config.first_stage_anchor_generator)
    
      first_stage_atrous_rate = frcnn_config.first_stage_atrous_rate
      first_stage_box_predictor_arg_scope = hyperparams_builder.build(
          frcnn_config.first_stage_box_predictor_conv_hyperparams, is_training)
      first_stage_box_predictor_kernel_size = (
          frcnn_config.first_stage_box_predictor_kernel_size)
      first_stage_box_predictor_depth = frcnn_config.first_stage_box_predictor_depth
      first_stage_minibatch_size = frcnn_config.first_stage_minibatch_size
      first_stage_positive_balance_fraction = (
          frcnn_config.first_stage_positive_balance_fraction)
      first_stage_nms_score_threshold = frcnn_config.first_stage_nms_score_threshold
      first_stage_nms_iou_threshold = frcnn_config.first_stage_nms_iou_threshold
      first_stage_max_proposals = frcnn_config.first_stage_max_proposals
      first_stage_loc_loss_weight = (
          frcnn_config.first_stage_localization_loss_weight)
      first_stage_obj_loss_weight = frcnn_config.first_stage_objectness_loss_weight
    
      initial_crop_size = frcnn_config.initial_crop_size
      maxpool_kernel_size = frcnn_config.maxpool_kernel_size
      maxpool_stride = frcnn_config.maxpool_stride
    
      second_stage_box_predictor = box_predictor_builder.build(
          hyperparams_builder.build,
          frcnn_config.second_stage_box_predictor,
          is_training=is_training,
          num_classes=num_classes)
      second_stage_batch_size = frcnn_config.second_stage_batch_size
      second_stage_balance_fraction = frcnn_config.second_stage_balance_fraction
      (second_stage_non_max_suppression_fn, second_stage_score_conversion_fn
      ) = post_processing_builder.build(frcnn_config.second_stage_post_processing)
      second_stage_localization_loss_weight = (
          frcnn_config.second_stage_localization_loss_weight)
      second_stage_classification_loss = (
          losses_builder.build_faster_rcnn_classification_loss(
              frcnn_config.second_stage_classification_loss))
      second_stage_classification_loss_weight = (
          frcnn_config.second_stage_classification_loss_weight)
      second_stage_mask_prediction_loss_weight = (
          frcnn_config.second_stage_mask_prediction_loss_weight)
    
      hard_example_miner = None
      if frcnn_config.HasField('hard_example_miner'):
        hard_example_miner = losses_builder.build_hard_example_miner(
            frcnn_config.hard_example_miner,
            second_stage_classification_loss_weight,
            second_stage_localization_loss_weight)
    
      common_kwargs = {
          'is_training': is_training,
          'num_classes': num_classes,
          'image_resizer_fn': image_resizer_fn,
          'feature_extractor': feature_extractor,
          'first_stage_only': first_stage_only,
          'first_stage_anchor_generator': first_stage_anchor_generator,
          'first_stage_atrous_rate': first_stage_atrous_rate,
          'first_stage_box_predictor_arg_scope':
          first_stage_box_predictor_arg_scope,
          'first_stage_box_predictor_kernel_size':
          first_stage_box_predictor_kernel_size,
          'first_stage_box_predictor_depth': first_stage_box_predictor_depth,
          'first_stage_minibatch_size': first_stage_minibatch_size,
          'first_stage_positive_balance_fraction':
          first_stage_positive_balance_fraction,
          'first_stage_nms_score_threshold': first_stage_nms_score_threshold,
          'first_stage_nms_iou_threshold': first_stage_nms_iou_threshold,
          'first_stage_max_proposals': first_stage_max_proposals,
          'first_stage_localization_loss_weight': first_stage_loc_loss_weight,
          'first_stage_objectness_loss_weight': first_stage_obj_loss_weight,
          'second_stage_batch_size': second_stage_batch_size,
          'second_stage_balance_fraction': second_stage_balance_fraction,
          'second_stage_non_max_suppression_fn':
          second_stage_non_max_suppression_fn,
          'second_stage_score_conversion_fn': second_stage_score_conversion_fn,
          'second_stage_localization_loss_weight':
          second_stage_localization_loss_weight,
          'second_stage_classification_loss':
          second_stage_classification_loss,
          'second_stage_classification_loss_weight':
          second_stage_classification_loss_weight,
          'hard_example_miner': hard_example_miner}
    
      if isinstance(second_stage_box_predictor, box_predictor.RfcnBoxPredictor):
        return rfcn_meta_arch.RFCNMetaArch(
            second_stage_rfcn_box_predictor=second_stage_box_predictor,
            **common_kwargs)
      else:
        return faster_rcnn_meta_arch.FasterRCNNMetaArch(
            initial_crop_size=initial_crop_size,
            maxpool_kernel_size=maxpool_kernel_size,
            maxpool_stride=maxpool_stride,
            second_stage_mask_rcnn_box_predictor=second_stage_box_predictor,
            second_stage_mask_prediction_loss_weight=(
                second_stage_mask_prediction_loss_weight),
            **common_kwargs)
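
    The tail of the function follows a common pattern: collect the shared constructor arguments once, then choose the class by the type of the box predictor. Below is a stripped-down sketch of that pattern. The classes are hypothetical placeholders, not the real meta-architectures, and the default initial_crop_size of 14 is an arbitrary illustrative value.

```python
# Sketch of "shared kwargs + isinstance dispatch". All classes are
# hypothetical placeholders for illustration only.
class BoxPredictor:
    """Placeholder base class for second-stage box predictors."""


class RfcnBoxPredictor(BoxPredictor):
    """Placeholder for the predictor type that selects the R-FCN branch."""


class MaskRCNNBoxPredictor(BoxPredictor):
    """Placeholder for the predictor type used by Faster R-CNN."""


class RFCNMetaArchSketch:
    def __init__(self, second_stage_rfcn_box_predictor, **common_kwargs):
        self.box_predictor = second_stage_rfcn_box_predictor
        self.settings = common_kwargs


class FasterRCNNMetaArchSketch:
    def __init__(self, initial_crop_size,
                 second_stage_mask_rcnn_box_predictor, **common_kwargs):
        self.initial_crop_size = initial_crop_size
        self.box_predictor = second_stage_mask_rcnn_box_predictor
        self.settings = common_kwargs


def build_two_stage_model(box_predictor, num_classes, is_training,
                          initial_crop_size=14):
    # Arguments both architectures accept are gathered once.
    common_kwargs = {'is_training': is_training, 'num_classes': num_classes}
    if isinstance(box_predictor, RfcnBoxPredictor):
        return RFCNMetaArchSketch(
            second_stage_rfcn_box_predictor=box_predictor, **common_kwargs)
    # Only the Faster R-CNN branch needs the crop-related arguments.
    return FasterRCNNMetaArchSketch(
        initial_crop_size=initial_crop_size,
        second_stage_mask_rcnn_box_predictor=box_predictor,
        **common_kwargs)


rfcn_model = build_two_stage_model(RfcnBoxPredictor(), 90, True)
frcnn_model = build_two_stage_model(MaskRCNNBoxPredictor(), 90, True)
```

    Keeping the shared arguments in one dictionary means the two branches differ only in the handful of arguments that are specific to each architecture.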
    
    

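    Earlier posts laid out the following chain:
    1. model.py -> faster_rcnn_meta_arch.py -> faster_rcnn_inception_v2_feature_extractor.py
    2. inception_resnet_v2 -> faster_rcnn_inception_v2_feature_extractor.py -> model_builder.py
    3. model_builder.py -> train.py
    This shows where model_builder.py sits in the chain.
    In other words, faster_rcnn_meta_arch is only the tool, while the material (the image) is in inception_resnet_v2. So far we have only reached model_builder.py, and the next step is to dig deeper.
    As a preview, the docstring of faster_rcnn_meta_arch.py contains a passage that explains the FasterRCNNMetaArch class. It reads roughly as follows: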

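    The Faster R-CNN meta-architecture supports two modes:
    first_stage_only=True and first_stage_only=False.
    
    In the former setting, all user-facing methods (e.g., predict, postprocess,
    loss) can be used as if the model consisted only of the RPN, returning
    class-agnostic proposals (these can be thought of as approximate detections with no associated class information). In the latter setting, proposals are computed and then passed through a second-stage "box classifier" to yield (multi-class) detections.
    An implementation of a Faster R-CNN model must define a new FasterRCNNFeatureExtractor, because
    the core methods of the FasterRCNNFeatureExtractor class are abstract (look up "abstract method" if the term is unfamiliar).
    The new extractor overrides three methods:
     `preprocess`, `_extract_proposal_features` (the first stage of the model), and `_extract_box_classifier_features` (the second stage of the model). Optionally, the `restore_fn` method can also be overridden.
    preprocess: preprocessing;
    _extract_proposal_features: first stage of the model, extracts the features used to generate proposals;
    _extract_box_classifier_features: second stage of the model, extracts the features used to classify the boxes.
    A few important notes:
    + Batching conventions:
    1. Batched inference and training are supported; all images within a batch must have the same resolution.
    2. The batch size is determined dynamically from the shape of the input tensors (rather than being specified directly in the model constructor).
    + A complication is that, because of non-max suppression, the first-stage RPN (region proposal network)
    is not guaranteed to return the same number of proposals for each image. For this reason, the proposals of each image in a batch are padded to a maximum number.
    + This self.max_num_proposals property is set to the `first_stage_max_proposals`
    parameter at inference time, and to second_stage_batch_size at training time, because the batch sent through the box classifier is
    subsampled during training.
    + For the second stage, the proposals of all images in the batch are arranged along a single batch dimension. For example, the input to _extract_box_classifier_features is a tensor of shape
    [total_num_proposals, crop_height, crop_width, depth], where total_num_proposals is batch_size * self.max_num_proposals.
    (Note that, per the comment above, a subset of these entries is zero padding.)
    + Coordinate representations:
    Following the API (see the model.DetectionModel definition), the outputs after postprocessing are always normalized boxes, but internally they are sometimes converted to absolute coordinates, e.g. for loss computation. In particular, anchors and proposal_boxes are both represented in absolute coordinates.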
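
    The batching and coordinate notes above can be illustrated with plain Python lists. This is only a sketch of the idea, not the library's tensor code: pad_proposals, flatten_batch, and to_absolute are hypothetical helpers, and the [ymin, xmin, ymax, xmax] box order is an assumption made for the example.

```python
# Sketch of padding proposals per image, flattening them along one batch
# dimension, and converting normalized boxes to absolute coordinates.
# All helper names are hypothetical.
def pad_proposals(proposals_per_image, max_num_proposals):
    """Pads (or truncates) each image's box list to max_num_proposals."""
    padded, num_valid = [], []
    for boxes in proposals_per_image:
        kept = list(boxes[:max_num_proposals])
        num_valid.append(len(kept))
        # Zero boxes fill the slots that NMS left empty.
        padding = [[0.0, 0.0, 0.0, 0.0]
                   for _ in range(max_num_proposals - len(kept))]
        padded.append(kept + padding)
    return padded, num_valid


def flatten_batch(padded):
    """Arranges all images' proposals along a single leading dimension."""
    return [box for image_boxes in padded for box in image_boxes]


def to_absolute(box, image_height, image_width):
    """Converts a normalized [ymin, xmin, ymax, xmax] box to pixels."""
    ymin, xmin, ymax, xmax = box
    return [ymin * image_height, xmin * image_width,
            ymax * image_height, xmax * image_width]


# Two images: NMS kept two proposals for the first and one for the second.
batch = [
    [[0.1, 0.1, 0.5, 0.5], [0.2, 0.2, 0.9, 0.9]],
    [[0.5, 0.25, 1.0, 0.75]],
]
padded, num_valid = pad_proposals(batch, max_num_proposals=3)
flat = flatten_batch(padded)  # length is batch_size * max_num_proposals
absolute_box = to_absolute(batch[1][0], image_height=200, image_width=400)
```

    Tracking num_valid alongside the padded boxes is what lets later stages ignore the zero-padded entries.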
    

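    That concludes the walkthrough of model_builder.py.
    To recap, model_builder.py uses the FasterRCNNMetaArch class from faster_rcnn_meta_arch.py. This class is the final framework of the whole Faster R-CNN model: both stages and the entire flow live here (preprocess, predict, postprocess, loss, restore_map).

    faster_rcnn_meta_arch.py actually contains two classes, FasterRCNNFeatureExtractor and FasterRCNNMetaArch.
    The methods of FasterRCNNFeatureExtractor are given concrete implementations in files such as faster_rcnn_inception_v2_feature_extractor.py, because faster_rcnn_meta_arch.py only defines the skeleton. model_builder.py brings in FasterRCNNMetaArch to build the complete Faster R-CNN graph. FasterRCNNMetaArch is essentially the Faster R-CNN flowchart, written as a class.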
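
    The split between a skeleton base class and concrete per-backbone subclasses can be sketched with Python's abc module. FeatureExtractorSketch and ToyExtractor are hypothetical, and the method bodies are toy list operations rather than real network code; only the three method names come from the docstring quoted earlier.

```python
# Sketch of an abstract extractor skeleton and one concrete subclass.
# Class names and method bodies are hypothetical illustrations.
import abc


class FeatureExtractorSketch(abc.ABC):
    def __init__(self, is_training, first_stage_features_stride):
        self.is_training = is_training
        self.first_stage_features_stride = first_stage_features_stride

    @abc.abstractmethod
    def preprocess(self, resized_inputs):
        """Maps raw pixel values to the range the backbone expects."""

    @abc.abstractmethod
    def _extract_proposal_features(self, preprocessed_inputs):
        """First stage: features used to generate proposals."""

    @abc.abstractmethod
    def _extract_box_classifier_features(self, proposal_feature_maps):
        """Second stage: features used to classify each box."""


class ToyExtractor(FeatureExtractorSketch):
    def preprocess(self, resized_inputs):
        # Toy normalization: scale 0..255 pixel values to 0..1.
        return [pixel / 255.0 for pixel in resized_inputs]

    def _extract_proposal_features(self, preprocessed_inputs):
        # Toy "backbone": keep every stride-th value.
        return preprocessed_inputs[::self.first_stage_features_stride]

    def _extract_box_classifier_features(self, proposal_feature_maps):
        return [value * 2 for value in proposal_feature_maps]


toy = ToyExtractor(is_training=False, first_stage_features_stride=2)
scaled = toy.preprocess([0, 255])
```

    The base class cannot be instantiated until every abstract method is overridden, which is why each backbone ships its own extractor file.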
    References:
    haixwang's CSDN blog
    TensorFlow Object Detection API source code (3): builders

    Article link: https://www.haomeiwen.com/subject/jlncaftx.html