object_detection API source code reading notes (8-fast

Author: yanghedada | Published 2018-10-06 21:54

    faster_rcnn_inception_resnet_v2_feature_extractor.py

    The file object_detection\models\faster_rcnn_inception_resnet_v2_feature_extractor.py contains a single class, FasterRCNNInceptionResnetV2FeatureExtractor, which is a subclass of FasterRCNNFeatureExtractor.

    For each CNN backbone, object_detection\models\ holds a corresponding feature_extractor implementation.

    import tensorflow as tf
    
    from object_detection.meta_architectures import faster_rcnn_meta_arch
    from nets import inception_resnet_v2
    
    slim = tf.contrib.slim
    
    class FasterRCNNInceptionResnetV2FeatureExtractor(
        faster_rcnn_meta_arch.FasterRCNNFeatureExtractor):
      """Faster R-CNN with Inception Resnet v2 feature extractor implementation."""
    

    Overview of the methods of FasterRCNNInceptionResnetV2FeatureExtractor

    import tensorflow as tf
    
    from object_detection.meta_architectures import faster_rcnn_meta_arch
    from nets import inception_resnet_v2
    
    slim = tf.contrib.slim
    
    class FasterRCNNInceptionResnetV2FeatureExtractor(
        faster_rcnn_meta_arch.FasterRCNNFeatureExtractor):
      """Faster R-CNN with Inception Resnet v2 feature extractor implementation."""
    
      def __init__(self,
                   is_training,
                   first_stage_features_stride,
                   batch_norm_trainable=False,
                   reuse_weights=None,
                   weight_decay=0.0):
     
        if first_stage_features_stride != 8 and first_stage_features_stride != 16:
          raise ValueError('`first_stage_features_stride` must be 8 or 16.')
        super(FasterRCNNInceptionResnetV2FeatureExtractor, self).__init__(
            is_training, first_stage_features_stride, batch_norm_trainable,
            reuse_weights, weight_decay)
    
      def preprocess(self, resized_inputs):
        return (2.0 / 255.0) * resized_inputs - 1.0
    
      def _extract_proposal_features(self, preprocessed_inputs, scope):
        ...  # body elided here; shown in full below
        return rpn_feature_map
    
      def _extract_box_classifier_features(self, proposal_feature_maps, scope):
        ...  # body elided here; shown in full below
        return proposal_classifier_features
    
      def restore_from_classification_checkpoint_fn(
          self,
          first_stage_feature_extractor_scope,
          second_stage_feature_extractor_scope):
        ...  # body elided here; shown in full below
        return variables_to_restore
    

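    As the skeleton above shows, FasterRCNNInceptionResnetV2FeatureExtractor implements the methods the base class expects from a feature extractor: preprocess, _extract_proposal_features and _extract_box_classifier_features are abstract in the base class, and restore_from_classification_checkpoint_fn overrides the base-class default. These methods were mentioned in the object_detection API source code reading notes (6-faster_rcnn_meta_arch.py).
    Each of them is described below.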
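
    The relationship between the base class and this subclass can be illustrated with a small, self-contained sketch. It does not use TensorFlow: FeatureExtractorBase and ToyExtractor are hypothetical stand-ins, and only the pattern is meant to carry over (abstract hooks in the base class, concrete implementations in the subclass, a public wrapper that calls the private hook).

```python
import abc


class FeatureExtractorBase(abc.ABC):
    """Hypothetical stand-in for FasterRCNNFeatureExtractor (not the real class)."""

    @abc.abstractmethod
    def preprocess(self, resized_inputs):
        """Subclasses decide how raw pixel values are normalized."""

    def extract_proposal_features(self, preprocessed_inputs, scope):
        # Public entry point: delegates to the hook that subclasses implement.
        return self._extract_proposal_features(preprocessed_inputs, scope)

    @abc.abstractmethod
    def _extract_proposal_features(self, preprocessed_inputs, scope):
        """Subclasses build the backbone-specific first-stage features."""


class ToyExtractor(FeatureExtractorBase):
    """Hypothetical subclass that fills in the abstract hooks."""

    def preprocess(self, resized_inputs):
        # Same arithmetic as the Inception Resnet v2 extractor, on a plain list.
        return [(2.0 / 255.0) * v - 1.0 for v in resized_inputs]

    def _extract_proposal_features(self, preprocessed_inputs, scope):
        # Toy "feature map": the inputs tagged with the scope name.
        return (scope, preprocessed_inputs)
```

    Instantiating FeatureExtractorBase directly raises a TypeError because its abstract methods are missing, whereas ToyExtractor can be created and used through the public wrapper.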

    • __init__()
      def __init__(self,
                   is_training,
                   first_stage_features_stride,
                   batch_norm_trainable=False,
                   reuse_weights=None,
                   weight_decay=0.0):
        """Constructor.
    
        Args:
          is_training: See base class.
          first_stage_features_stride: See base class.
          batch_norm_trainable: See base class.
          reuse_weights: See base class.
          weight_decay: See base class.
    
        Raises:
          ValueError: If `first_stage_features_stride` is not 8 or 16.
        """
        if first_stage_features_stride != 8 and first_stage_features_stride != 16:
          raise ValueError('`first_stage_features_stride` must be 8 or 16.')
        super(FasterRCNNInceptionResnetV2FeatureExtractor, self).__init__(
            is_training, first_stage_features_stride, batch_norm_trainable,
            reuse_weights, weight_decay)
    

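    The constructor initializes FasterRCNNInceptionResnetV2FeatureExtractor and, through the super().__init__ call, its base class FasterRCNNFeatureExtractor. first_stage_features_stride is the output stride of the first-stage feature map and must be 8 or 16; any other value raises a ValueError.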
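
    The stride check can be tried in isolation. The sketch below is a standalone restatement of the constructor's guard, not the library code: check_first_stage_stride and ALLOWED_STRIDES are hypothetical names introduced for illustration.

```python
# Hypothetical helper mirroring the guard in __init__; not part of the API.
ALLOWED_STRIDES = (8, 16)


def check_first_stage_stride(first_stage_features_stride):
    """Return the stride if it is 8 or 16, otherwise raise ValueError."""
    if first_stage_features_stride not in ALLOWED_STRIDES:
        raise ValueError('`first_stage_features_stride` must be 8 or 16.')
    return first_stage_features_stride
```

    The membership test is equivalent to the original `!= 8 and != 16` condition; it just keeps the accepted values in one place.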

    • preprocess(self, resized_inputs):
    def preprocess(self, resized_inputs):
        """Maps pixel values to the range [-1, 1].

        Args:
          resized_inputs: A [batch, height_in, width_in, channels] float32 tensor
            representing a batch of images with values between 0 and 255.0.

        Returns:
          preprocessed_inputs: A [batch, height_out, width_out, channels] float32
            tensor representing a batch of images.
        """
        return (2.0 / 255.0) * resized_inputs - 1.0
    

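    This is the preprocessing function for Faster R-CNN with Inception Resnet v2. It maps pixel values into the range [-1, 1] (normalization). In effect it computes resized_inputs = (resized_inputs / 255) * 2 - 1, so a pixel value of 0 becomes -1 and a pixel value of 255 becomes 1.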
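
    The same arithmetic can be checked on plain Python numbers. This is only a sketch of the formula, applied to a flat list instead of a tensor; preprocess_pixels and undo_preprocess are hypothetical helper names.

```python
def preprocess_pixels(pixels):
    """Map pixel values from [0, 255] to [-1, 1], element by element."""
    return [(2.0 / 255.0) * p - 1.0 for p in pixels]


def undo_preprocess(values):
    """Inverse mapping, from [-1, 1] back to [0, 255]."""
    return [(v + 1.0) * 255.0 / 2.0 for v in values]


# The two endpoints and the midpoint of the pixel range.
normalized = preprocess_pixels([0.0, 127.5, 255.0])
```

    Up to floating-point rounding, normalized holds -1, 0 and 1, and undo_preprocess recovers the original pixel values.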

    • _extract_proposal_features()
    def _extract_proposal_features(self, preprocessed_inputs, scope):
        '''
        Extracts first-stage RPN features.

        Args:
          preprocessed_inputs: A [batch, height, width, channels] tensor that
            has already been normalized by preprocess().
          scope: A scope name.

        Returns:
          rpn_feature_map: A [batch, height, width, depth] tensor, the feature
            map that the RPN will use.
        '''
        if len(preprocessed_inputs.get_shape().as_list()) != 4:
          raise ValueError('`preprocessed_inputs` must be 4 dimensional, got a '
                           'tensor of shape %s' % preprocessed_inputs.get_shape())
    
        with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope(
            weight_decay=self._weight_decay)):
      
          # Forces is_training to False to disable batch norm update.
          with slim.arg_scope([slim.batch_norm],
                              is_training=self._train_batch_norm):
            with tf.variable_scope('InceptionResnetV2',
                                   reuse=self._reuse_weights) as scope:
              # Passing reuse=self._reuse_weights to variable_scope lets the
              # variables under the same scope be shared.
              rpn_feature_map, _ = (
                  inception_resnet_v2.inception_resnet_v2_base(
                      preprocessed_inputs, final_endpoint='PreAuxLogits',
                      scope=scope, output_stride=self._first_stage_features_stride,
                      align_feature_maps=True))
        return rpn_feature_map
    

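    This method extracts the first-stage features that will be used by the RPN and returns the feature map. It implements the abstract method from faster_rcnn_meta_arch, using the first half of the Inception Resnet v2 network (up to the PreAuxLogits endpoint) to extract those features.
    When the network is built with align_feature_maps=True, the convolutions that would otherwise use VALID padding use SAME padding instead, so that the feature maps are aligned.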
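
    Why the padding mode matters for alignment can be seen from the output-size rules of the two modes. The sketch below uses the standard TensorFlow formulas (SAME: ceil(in / stride); VALID: ceil((in - kernel + 1) / stride)) on a simplified stack of four 3x3, stride-2 layers. That stack is an assumption made for illustration, not the actual layer sequence of Inception Resnet v2, and both helper functions are hypothetical.

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a conv/pool layer under TensorFlow's padding rules."""
    if padding == 'SAME':
        return math.ceil(input_size / stride)
    if padding == 'VALID':
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError('padding must be SAME or VALID')


def stacked_size(input_size, num_layers, padding):
    """Size after num_layers 3x3, stride-2 layers (illustrative stack only)."""
    size = input_size
    for _ in range(num_layers):
        size = conv_output_size(size, 3, 2, padding)
    return size


same_size = stacked_size(600, 4, 'SAME')    # four halvings, total stride 16
valid_size = stacked_size(600, 4, 'VALID')  # each layer also trims the border
```

    With SAME padding the result equals ceil(600 / 16), so a feature-map cell maps back to the image by a plain multiplication with the stride. With VALID padding each layer loses border pixels and the result is smaller, which is the misalignment that align_feature_maps=True avoids.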

    • _extract_box_classifier_features():
      def _extract_box_classifier_features(self, proposal_feature_maps, scope):
        """
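        Extracts the features that will be used by the second-stage box classifier.

        This method reconstructs the "second half" of the Inception ResNet v2
        network, i.e. the part that comes after the "first half" defined in
        `_extract_proposal_features`. It corresponds to the layers that follow
        ROIPooling in the original paper.

        Args:
          proposal_feature_maps: A tensor of shape
            [batch_size * self.max_num_proposals, crop_height, crop_width, depth]
            holding the feature map cropped to each proposal.
          scope: A scope name.

        Returns:
          proposal_classifier_features: A tensor of shape
            [batch_size * self.max_num_proposals, height, width, depth]
            holding the box classifier features for each proposal.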
        """
        with tf.variable_scope('InceptionResnetV2', reuse=self._reuse_weights):
          with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope(
              weight_decay=self._weight_decay)):
            # Forces is_training to False to disable batch norm update.
            with slim.arg_scope([slim.batch_norm],
                                is_training=self._train_batch_norm):
              with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                                  stride=1, padding='SAME'):
                with tf.variable_scope('Mixed_7a'):
                  with tf.variable_scope('Branch_0'):
                    tower_conv = slim.conv2d(proposal_feature_maps,
                                             256, 1, scope='Conv2d_0a_1x1')
                    tower_conv_1 = slim.conv2d(
                        tower_conv, 384, 3, stride=2,
                        padding='VALID', scope='Conv2d_1a_3x3')
                  with tf.variable_scope('Branch_1'):
                    tower_conv1 = slim.conv2d(
                        proposal_feature_maps, 256, 1, scope='Conv2d_0a_1x1')
                    tower_conv1_1 = slim.conv2d(
                        tower_conv1, 288, 3, stride=2,
                        padding='VALID', scope='Conv2d_1a_3x3')
                  with tf.variable_scope('Branch_2'):
                    tower_conv2 = slim.conv2d(
                        proposal_feature_maps, 256, 1, scope='Conv2d_0a_1x1')
                    tower_conv2_1 = slim.conv2d(tower_conv2, 288, 3,
                                                scope='Conv2d_0b_3x3')
                    tower_conv2_2 = slim.conv2d(
                        tower_conv2_1, 320, 3, stride=2,
                        padding='VALID', scope='Conv2d_1a_3x3')
                  with tf.variable_scope('Branch_3'):
                    tower_pool = slim.max_pool2d(
                        proposal_feature_maps, 3, stride=2, padding='VALID',
                        scope='MaxPool_1a_3x3')
                  net = tf.concat(
                      [tower_conv_1, tower_conv1_1, tower_conv2_2, tower_pool], 3)
                net = slim.repeat(net, 9, inception_resnet_v2.block8, scale=0.20)
                net = inception_resnet_v2.block8(net, activation_fn=None)
                proposal_classifier_features = slim.conv2d(
                    net, 1536, 1, scope='Conv2d_7b_1x1')
            return proposal_classifier_features
    

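    At the end of this method, a 1x1 convolution produces a feature map with 1536 channels.

    Note that the input tensor has shape [batch_size * self.max_num_proposals, crop_height, crop_width, depth]. The input has therefore already gone through region proposal and cropping, and each batch contains batch_size * self.max_num_proposals cropped feature maps. The output tensor, proposal_classifier_features, holds the features that the box classifier will consume for each proposal (no classification has happened yet) and has shape [batch_size * self.max_num_proposals, height, width, depth].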
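
    The shapes can be traced by hand from the code above. The sketch below does that bookkeeping in plain Python. The example input (300 crops of size 17x17 with 1088 channels) is an assumed value chosen for illustration, the helper names are hypothetical, and the sketch relies on block8 being a residual unit that keeps the shape of its input.

```python
def valid_stride2_3x3(size):
    """Output size of a 3x3, stride-2, VALID-padded conv or max-pool."""
    return (size - 3) // 2 + 1


def second_stage_output_shape(input_shape):
    """Trace [N, crop_h, crop_w, depth] through Mixed_7a, block8 and Conv2d_7b."""
    num_crops, crop_h, crop_w, depth = input_shape
    # Every Mixed_7a branch ends in a 3x3, stride-2, VALID op, so all four
    # branches shrink the spatial size in the same way.
    out_h = valid_stride2_3x3(crop_h)
    out_w = valid_stride2_3x3(crop_w)
    # Channel-wise concat of Branch_0..Branch_3: 384 + 288 + 320 + pooled input.
    mixed_7a_depth = 384 + 288 + 320 + depth
    # The block8 stack leaves the shape unchanged; the final 1x1 conv outputs
    # 1536 channels regardless of mixed_7a_depth.
    return (num_crops, out_h, out_w, 1536), mixed_7a_depth


# Assumed example: batch_size * max_num_proposals = 300, 17x17 crops, depth 1088.
output_shape, mixed_7a_depth = second_stage_output_shape((300, 17, 17, 1088))
```

    Under these assumptions the 17x17 crops shrink to 8x8, Mixed_7a concatenates to 2080 channels, and the final feature map has shape (300, 8, 8, 1536).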

    • restore_from_classification_checkpoint_fn()
    def restore_from_classification_checkpoint_fn(
          self,
          first_stage_feature_extractor_scope,
          second_stage_feature_extractor_scope):
      
        variables_to_restore = {}
        for variable in tf.global_variables():
          if variable.op.name.startswith(
              first_stage_feature_extractor_scope):
            var_name = variable.op.name.replace(
                first_stage_feature_extractor_scope + '/', '')
            variables_to_restore[var_name] = variable
          if variable.op.name.startswith(
              second_stage_feature_extractor_scope):
            var_name = variable.op.name.replace(
                second_stage_feature_extractor_scope
                + '/InceptionResnetV2/Repeat', 'InceptionResnetV2/Repeat_2')
            var_name = var_name.replace(
                second_stage_feature_extractor_scope + '/', '')
            variables_to_restore[var_name] = variable
        return variables_to_restore
    

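    This method overrides the base-class implementation. It relies on the variable scopes created by _extract_box_classifier_features(): the second-stage weights it maps are exactly the ones that method builds. Because the slim.repeat call there creates the scope InceptionResnetV2/Repeat, the code renames it to InceptionResnetV2/Repeat_2 before stripping the stage prefix, so that the names match those in the classification checkpoint.
    Args:
      first_stage_feature_extractor_scope: the variable scope of the first stage.
      second_stage_feature_extractor_scope: the variable scope of the second stage.
    Returns:
      A dictionary mapping checkpoint variable names to the variables to restore.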
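
    The renaming logic only manipulates strings, so it can be reproduced without TensorFlow. In the sketch below, build_restore_names is a hypothetical helper that applies the same replace calls to a list of variable names; the scope names and variable names in the example are assumed values for illustration.

```python
def build_restore_names(var_names, first_scope, second_scope):
    """Map checkpoint names to graph variable names, mirroring the method above."""
    mapping = {}
    for name in var_names:
        if name.startswith(first_scope):
            # First stage: just drop the stage prefix.
            mapping[name.replace(first_scope + '/', '')] = name
        if name.startswith(second_scope):
            # Second stage: 'Repeat' in the graph corresponds to 'Repeat_2'
            # in the classification checkpoint, then drop the stage prefix.
            ckpt_name = name.replace(
                second_scope + '/InceptionResnetV2/Repeat',
                'InceptionResnetV2/Repeat_2')
            ckpt_name = ckpt_name.replace(second_scope + '/', '')
            mapping[ckpt_name] = name
    return mapping


# Assumed example names; real graphs contain many more variables.
names = [
    'FirstStageFeatureExtractor/InceptionResnetV2/Conv2d_1a_3x3/weights',
    'SecondStageFeatureExtractor/InceptionResnetV2/Repeat/block8_1/Conv2d_1x1/weights',
    'SecondStageFeatureExtractor/InceptionResnetV2/Mixed_7a/Branch_0/Conv2d_0a_1x1/weights',
]
restore_names = build_restore_names(
    names, 'FirstStageFeatureExtractor', 'SecondStageFeatureExtractor')
```

    Each key of restore_names is the name expected in the classification checkpoint, and each value is the corresponding variable name in the detection graph. Variables outside both scopes are left out.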

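    This file deals only with feature extraction. Where the extracted features go and how they are used is covered in the object_detection API source code reading notes (6-faster_rcnn_meta_arch.py).
    FasterRCNNInceptionResnetV2FeatureExtractor is a class that supplies the raw material; prediction, detection and preprocessing are implemented in the FasterRCNNMetaArch class.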

    Reference:

    Tensorflow开源的object detection API中的源码解析(三) (Source code analysis of TensorFlow's open-source object detection API, part 3)

    Article link: https://www.haomeiwen.com/subject/tnjhaftx.html