object_detection API source code reading notes (8-fast

Author: yanghedada | Published 2018-10-06 21:54 | Read 159 times


faster_rcnn_inception_resnet_v2_feature_extractor.py

The file object_detection\models\faster_rcnn_inception_resnet_v2_feature_extractor.py contains a single class, FasterRCNNInceptionResnetV2FeatureExtractor, which is a subclass of FasterRCNNFeatureExtractor.

For each CNN backbone there is a corresponding feature_extractor implementation under object_detection\models\.

import tensorflow as tf

from object_detection.meta_architectures import faster_rcnn_meta_arch
from nets import inception_resnet_v2

slim = tf.contrib.slim

class FasterRCNNInceptionResnetV2FeatureExtractor(
    faster_rcnn_meta_arch.FasterRCNNFeatureExtractor):
  """Faster R-CNN with Inception Resnet v2 feature extractor implementation."""

Overview of the methods of FasterRCNNInceptionResnetV2FeatureExtractor

import tensorflow as tf

from object_detection.meta_architectures import faster_rcnn_meta_arch
from nets import inception_resnet_v2

slim = tf.contrib.slim

class FasterRCNNInceptionResnetV2FeatureExtractor(
    faster_rcnn_meta_arch.FasterRCNNFeatureExtractor):
  """Faster R-CNN with Inception Resnet v2 feature extractor implementation."""

  def __init__(self,
               is_training,
               first_stage_features_stride,
               batch_norm_trainable=False,
               reuse_weights=None,
               weight_decay=0.0):
 
    if first_stage_features_stride != 8 and first_stage_features_stride != 16:
      raise ValueError('`first_stage_features_stride` must be 8 or 16.')
    super(FasterRCNNInceptionResnetV2FeatureExtractor, self).__init__(
        is_training, first_stage_features_stride, batch_norm_trainable,
        reuse_weights, weight_decay)

  def preprocess(self, resized_inputs):
    return (2.0 / 255.0) * resized_inputs - 1.0

  def _extract_proposal_features(self, preprocessed_inputs, scope):
    
   .........
    return rpn_feature_map

  def _extract_box_classifier_features(self, proposal_feature_maps, scope):
      ......
        return proposal_classifier_features

  def restore_from_classification_checkpoint_fn(
      self,
      first_stage_feature_extractor_scope,
      second_stage_feature_extractor_scope):
    .........
    return variables_to_restore

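As the outline above shows, besides its constructor FasterRCNNInceptionResnetV2FeatureExtractor implements or overrides four methods of its base class: preprocess, _extract_proposal_features, _extract_box_classifier_features and restore_from_classification_checkpoint_fn. These methods were mentioned in object_detection API source code reading notes (6-faster_rcnn_meta_arch.py).
Each of them is described in detail below.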

__init__()

  def __init__(self,
               is_training,
               first_stage_features_stride,
               batch_norm_trainable=False,
               reuse_weights=None,
               weight_decay=0.0):
    """Constructor.

    Args:
      is_training: See base class.
      first_stage_features_stride: See base class.
      batch_norm_trainable: See base class.
      reuse_weights: See base class.
      weight_decay: See base class.

    Raises:
      ValueError: If `first_stage_features_stride` is not 8 or 16.
    """
    if first_stage_features_stride != 8 and first_stage_features_stride != 16:
      raise ValueError('`first_stage_features_stride` must be 8 or 16.')
    super(FasterRCNNInceptionResnetV2FeatureExtractor, self).__init__(
        is_training, first_stage_features_stride, batch_norm_trainable,
        reuse_weights, weight_decay)

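The constructor initializes FasterRCNNInceptionResnetV2FeatureExtractor by forwarding its arguments to the base class FasterRCNNFeatureExtractor. Before doing so it checks first_stage_features_stride: this is the output stride (downsampling factor) of the first-stage feature map and must be 8 or 16. Any other value raises a ValueError.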
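The validate-then-delegate pattern of this constructor can be tried without TensorFlow. The following is only a minimal sketch: `BaseExtractor` and `StrideCheckedExtractor` are hypothetical stand-ins for the real base class and subclass, and `try_build` is a helper invented for illustration.

```python
class BaseExtractor(object):
    """Hypothetical stand-in for FasterRCNNFeatureExtractor (not the real class)."""

    def __init__(self, is_training, first_stage_features_stride,
                 batch_norm_trainable=False, reuse_weights=None,
                 weight_decay=0.0):
        # The stand-in simply stores the arguments it receives.
        self._is_training = is_training
        self._first_stage_features_stride = first_stage_features_stride
        self._batch_norm_trainable = batch_norm_trainable
        self._reuse_weights = reuse_weights
        self._weight_decay = weight_decay


class StrideCheckedExtractor(BaseExtractor):
    """Hypothetical subclass that mirrors the stride check shown above."""

    def __init__(self, is_training, first_stage_features_stride,
                 batch_norm_trainable=False, reuse_weights=None,
                 weight_decay=0.0):
        # Reject unsupported strides before touching the base class.
        if first_stage_features_stride not in (8, 16):
            raise ValueError('`first_stage_features_stride` must be 8 or 16.')
        super(StrideCheckedExtractor, self).__init__(
            is_training, first_stage_features_stride, batch_norm_trainable,
            reuse_weights, weight_decay)


def try_build(stride):
    """Illustrative helper: returns 'ok' or the error message."""
    try:
        StrideCheckedExtractor(is_training=False,
                               first_stage_features_stride=stride)
        return 'ok'
    except ValueError as err:
        return 'error: %s' % err
```

Calling `try_build(16)` or `try_build(8)` succeeds, while a value such as 32 is rejected before the base-class constructor runs.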

preprocess(self, resized_inputs):

  def preprocess(self, resized_inputs):
    """Faster R-CNN with Inception Resnet v2 preprocessing.

    Maps pixel values to the range [-1, 1].

    Args:
      resized_inputs: A [batch, height_in, width_in, channels] float32 tensor
        representing a batch of images with values between 0 and 255.0.

    Returns:
      preprocessed_inputs: A [batch, height_out, width_out, channels] float32
        tensor representing a batch of images.
    """
    return (2.0 / 255.0) * resized_inputs - 1.0

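This is the preprocessing function for Faster R-CNN with Inception Resnet v2. It maps pixel values into the range [-1, 1] (normalization). The computation is roughly resized_inputs = (resized_inputs / 255) * 2 - 1, so a pixel value of 0 becomes -1 and 255 becomes 1.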
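To check the mapping by hand, the same arithmetic can be applied to plain Python floats. This is only an illustrative sketch; `preprocess_pixel` is a hypothetical helper, whereas the real method operates on a whole float32 tensor at once.

```python
def preprocess_pixel(value):
    """Apply the same linear map as preprocess() to one pixel value."""
    return (2.0 / 255.0) * value - 1.0


# A few sample pixel values spanning the input range [0, 255].
samples = [0.0, 63.75, 127.5, 191.25, 255.0]
mapped = [preprocess_pixel(v) for v in samples]
```

The endpoints 0 and 255 land on -1 and 1 (up to floating-point rounding), and the midpoint 127.5 lands on 0.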

_extract_proposal_features()

def _extract_proposal_features(self, preprocessed_inputs, scope):
    '''
    Args:
      preprocessed_inputs: a tensor of shape [batch, height, width, channels]; these are the inputs already normalized by preprocess().
      scope: the variable scope name.
    Returns:
      rpn_feature_map: a tensor of shape [batch, height, width, depth] holding the features that the RPN will consume.
    '''
    if len(preprocessed_inputs.get_shape().as_list()) != 4:
      raise ValueError('`preprocessed_inputs` must be 4 dimensional, got a '
                       'tensor of shape %s' % preprocessed_inputs.get_shape())

    with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope(
        weight_decay=self._weight_decay)):
  
      # Forces is_training to False to disable batch norm update.
      with slim.arg_scope([slim.batch_norm],
                          is_training=self._train_batch_norm):
        with tf.variable_scope('InceptionResnetV2',
                               reuse=self._reuse_weights) as scope:
          # Setting reuse=self._reuse_weights lets variable_scope share
          # (reuse) variables already created under the same scope.
          rpn_feature_map, _ = (
              inception_resnet_v2.inception_resnet_v2_base(
                  preprocessed_inputs, final_endpoint='PreAuxLogits',
                  scope=scope, output_stride=self._first_stage_features_stride,
                  align_feature_maps=True))
    return rpn_feature_map

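This method extracts the first-stage features that are fed to the RPN and returns the feature map. It implements the abstract method declared in faster_rcnn_meta_arch, using the first half of the Inception Resnet v2 network (up to the PreAuxLogits endpoint) as the feature extractor.
When the network is built with align_feature_maps=True, the convolutions that would otherwise use VALID padding use SAME padding instead, so that the feature maps stay aligned with the input.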
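The effect of switching from VALID to SAME padding can be seen from the standard output-size formulas for a strided convolution. The sketch below is plain arithmetic, not code from the API; `conv_output_size` is a hypothetical helper, and the 299-pixel input and 3x3 stride-2 layer are example values chosen for illustration.

```python
import math


def conv_output_size(input_size, kernel_size, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    if padding == 'SAME':
        # SAME pads the input so the output depends only on the stride.
        return int(math.ceil(float(input_size) / stride))
    if padding == 'VALID':
        # VALID uses no padding, so border pixels are lost.
        return (input_size - kernel_size) // stride + 1
    raise ValueError('padding must be SAME or VALID')


valid_size = conv_output_size(299, 3, 2, 'VALID')
same_size = conv_output_size(299, 3, 2, 'SAME')
```

With SAME padding every stride-2 layer gives exactly ceil(n / 2), so the final feature map is a clean 1/8 or 1/16 grid over the input, which is what makes it straightforward to map anchors on the feature map back to image coordinates.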

_extract_box_classifier_features():

  def _extract_box_classifier_features(self, proposal_feature_maps, scope):
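    """
    Extracts the features used by the second-stage box classifier.
    This method rebuilds the "second half" of the Inception ResNet v2
    network, i.e. the part that follows the "first half" defined in
    `_extract_proposal_features`. It roughly corresponds to the layers
    applied after ROI pooling in the original paper.
    Args:
      proposal_feature_maps: feature maps cropped out for each proposal,
        [batch_size * self.max_num_proposals, crop_height, crop_width, depth]
      scope: A scope name.
    Returns:
      proposal_classifier_features: box classifier features for each proposal,
        [batch_size * self.max_num_proposals, height, width, depth]
    """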
    with tf.variable_scope('InceptionResnetV2', reuse=self._reuse_weights):
      with slim.arg_scope(inception_resnet_v2.inception_resnet_v2_arg_scope(
          weight_decay=self._weight_decay)):
        # Forces is_training to False to disable batch norm update.
        with slim.arg_scope([slim.batch_norm],
                            is_training=self._train_batch_norm):
          with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d],
                              stride=1, padding='SAME'):
            with tf.variable_scope('Mixed_7a'):
              with tf.variable_scope('Branch_0'):
                tower_conv = slim.conv2d(proposal_feature_maps,
                                         256, 1, scope='Conv2d_0a_1x1')
                tower_conv_1 = slim.conv2d(
                    tower_conv, 384, 3, stride=2,
                    padding='VALID', scope='Conv2d_1a_3x3')
              with tf.variable_scope('Branch_1'):
                tower_conv1 = slim.conv2d(
                    proposal_feature_maps, 256, 1, scope='Conv2d_0a_1x1')
                tower_conv1_1 = slim.conv2d(
                    tower_conv1, 288, 3, stride=2,
                    padding='VALID', scope='Conv2d_1a_3x3')
              with tf.variable_scope('Branch_2'):
                tower_conv2 = slim.conv2d(
                    proposal_feature_maps, 256, 1, scope='Conv2d_0a_1x1')
                tower_conv2_1 = slim.conv2d(tower_conv2, 288, 3,
                                            scope='Conv2d_0b_3x3')
                tower_conv2_2 = slim.conv2d(
                    tower_conv2_1, 320, 3, stride=2,
                    padding='VALID', scope='Conv2d_1a_3x3')
              with tf.variable_scope('Branch_3'):
                tower_pool = slim.max_pool2d(
                    proposal_feature_maps, 3, stride=2, padding='VALID',
                    scope='MaxPool_1a_3x3')
              net = tf.concat(
                  [tower_conv_1, tower_conv1_1, tower_conv2_2, tower_pool], 3)
            net = slim.repeat(net, 9, inception_resnet_v2.block8, scale=0.20)
            net = inception_resnet_v2.block8(net, activation_fn=None)
            proposal_classifier_features = slim.conv2d(
                net, 1536, 1, scope='Conv2d_7b_1x1')
        return proposal_classifier_features

At the end of this method a 1x1 convolution produces a feature map with 1536 channels.

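Note that the input tensor has shape [batch_size * self.max_num_proposals, crop_height, crop_width, depth]. This means the input has already gone through region proposal and cropping: each batch contains batch_size * self.max_num_proposals cropped feature maps, one per proposal. The output tensor proposal_classifier_features has shape [batch_size * self.max_num_proposals, height, width, depth]. It holds the features that the box classifier will use for each proposal; the proposals have not been classified yet at this point.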
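The shapes through this method can be traced with simple arithmetic. The sketch below is not part of the API: `box_classifier_feature_shape` is a hypothetical helper, and the inputs used in the example (600 crops of size 17x17 with 1088 channels) are assumed values for illustration, since the real crop size and proposal count come from the pipeline config.

```python
def box_classifier_feature_shape(num_crops, crop_size, depth_in):
    """Trace tensor shapes through Mixed_7a, the block8 stack and the 1x1 conv."""
    # Every branch of Mixed_7a ends with a 3x3, stride-2, VALID op.
    out_size = (crop_size - 3) // 2 + 1
    # Branch outputs are concatenated along the channel axis:
    # 384 (Branch_0) + 288 (Branch_1) + 320 (Branch_2) + max-pooled input.
    mixed_7a_depth = 384 + 288 + 320 + depth_in
    mixed_7a = (num_crops, out_size, out_size, mixed_7a_depth)
    # The block8 layers are residual, so they keep the shape unchanged.
    # The final 1x1 convolution only changes the channel count to 1536.
    final = (num_crops, out_size, out_size, 1536)
    return mixed_7a, final


mixed_7a_shape, final_shape = box_classifier_feature_shape(600, 17, 1088)
```

Under these assumed inputs the spatial size shrinks from 17 to 8 in Mixed_7a, and only the channel count changes afterwards.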

restore_from_classification_checkpoint_fn()

def restore_from_classification_checkpoint_fn(
      self,
      first_stage_feature_extractor_scope,
      second_stage_feature_extractor_scope):
  
    variables_to_restore = {}
    for variable in tf.global_variables():
      if variable.op.name.startswith(
          first_stage_feature_extractor_scope):
        var_name = variable.op.name.replace(
            first_stage_feature_extractor_scope + '/', '')
        variables_to_restore[var_name] = variable
      if variable.op.name.startswith(
          second_stage_feature_extractor_scope):
        var_name = variable.op.name.replace(
            second_stage_feature_extractor_scope
            + '/InceptionResnetV2/Repeat', 'InceptionResnetV2/Repeat_2')
        var_name = var_name.replace(
            second_stage_feature_extractor_scope + '/', '')
        variables_to_restore[var_name] = variable
    return variables_to_restore

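This method overrides the base-class implementation. It walks over all global variables and builds a mapping from checkpoint variable names to graph variables. For first-stage variables it strips the first-stage scope prefix. For second-stage variables, which live in the scopes built by _extract_box_classifier_features(), it first renames InceptionResnetV2/Repeat to InceptionResnetV2/Repeat_2 and then strips the second-stage scope prefix, so that the names match those in the classification checkpoint.
Args:
  first_stage_feature_extractor_scope: the variable scope of the first stage.
  second_stage_feature_extractor_scope: the variable scope of the second stage.
Returns:
  a dictionary mapping checkpoint variable names to the variables to restore.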
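The renaming logic can be reproduced on plain strings, without building a graph. This is a minimal sketch: `map_to_checkpoint_names` is a hypothetical helper that follows the same two `if` branches as the method above, and the scope names and variable names in the example are assumed values used only for illustration.

```python
def map_to_checkpoint_names(graph_names, first_scope, second_scope):
    """Map graph variable names to the names expected in the checkpoint."""
    mapping = {}
    for name in graph_names:
        if name.startswith(first_scope):
            # First stage: just drop the scope prefix.
            mapping[name.replace(first_scope + '/', '')] = name
        if name.startswith(second_scope):
            # Second stage: 'Repeat' must become 'Repeat_2' ...
            ckpt_name = name.replace(
                second_scope + '/InceptionResnetV2/Repeat',
                'InceptionResnetV2/Repeat_2')
            # ... and any remaining scope prefix is dropped.
            ckpt_name = ckpt_name.replace(second_scope + '/', '')
            mapping[ckpt_name] = name
    return mapping


# Assumed example names, chosen only to exercise each branch.
example = map_to_checkpoint_names(
    ['FirstStageFeatureExtractor/InceptionResnetV2/Conv2d_1a_3x3/weights',
     'SecondStageFeatureExtractor/InceptionResnetV2/Repeat/block8_1/weights',
     'SecondStageFeatureExtractor/InceptionResnetV2/Mixed_7a/weights',
     'SomeOtherScope/weights'],
    'FirstStageFeatureExtractor', 'SecondStageFeatureExtractor')
```

Variables outside both scopes are left out of the mapping, so they are not restored from the classification checkpoint.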

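This file deals only with feature extraction. How the extracted features are used is covered in object_detection API source code reading notes (6-faster_rcnn_meta_arch.py).
FasterRCNNInceptionResnetV2FeatureExtractor is a class that supplies the raw material; how prediction, detection and preprocessing are carried out is implemented in the FasterRCNNMetaArch class.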

Reference:

Source code analysis of TensorFlow's open-source object detection API (part 3)
