Debugging the FPN source code

Author: yanghedada | Published 2018-11-26 16:09 · 205 reads

    For particular reasons, I needed to debug this code.

    Code repository
    Original code
    Discussion with the author
    The code was debugged on Ubuntu.

    Contents:


    Preparing the VOC data

    Use convert_data_to_tfrecord.py to build the tfrecord files.

    tf.app.flags.DEFINE_string('VOC_dir', 'data/{}/'.format(cfgs.DATASET_NAME), 'Voc dir')
    tf.app.flags.DEFINE_string('xml_dir', 'Annotations', 'xml dir')
    tf.app.flags.DEFINE_string('image_dir', 'JPEGImages', 'image dir')
    tf.app.flags.DEFINE_string('save_name', 'train', 'save name')
    tf.app.flags.DEFINE_string('save_dir', cfgs.ROOT_PATH + '/data/tfrecords/', 'save name')
    tf.app.flags.DEFINE_string('img_format', '.jpg', 'format of image')
    

    These flags are just the data directories;
    pascal is the folder where the data is stored.

    • Modify line 141:
     'img_name': _bytes_feature(img_name.encode('utf-8')),
    

    Run the following; I changed nothing else:

    python convert_data_to_tfrecord.py
    

    This produces train.tfrecord.

    Split the data into two parts to get the test and train sets.
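A minimal way to script that split (the helper name, the 80/20 ratio, and the fixed seed are my own choices, not from the repo):

```python
import random

def split_train_test(xml_names, test_ratio=0.2, seed=0):
    """Deterministically split annotation file names into (train, test) lists."""
    names = sorted(xml_names)          # sort first so the split is reproducible
    random.Random(seed).shuffle(names)
    n_test = int(len(names) * test_ratio)
    return names[n_test:], names[:n_test]
```

Running convert_data_to_tfrecord.py once per list (with save_name set to train or test) then yields the two tfrecord files.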

    Start training

    The VOC data has only 20 classes.
    Modify config.py as follows; the defaults do not match VOC.

    NET_NAME = 'resnet_v1_101'
    DATASET_NAME = 'pascal'
    VERSION = 'v1_{}'.format(DATASET_NAME)
    CLASS_NUM = 20  # excludes background
    
    • 无预训练模型训练

    Modify train.py, commenting out the lines below, then start training:

        #restorer, restore_ckpt = restore_model.get_restorer(test=False)
        saver = tf.train.Saver(max_to_keep=3)
    
        config = tf.ConfigProto()
        # config.gpu_options.per_process_gpu_memory_fraction = 0.5
        config.gpu_options.allow_growth = True
        with tf.Session(config=config) as sess:
          sess.run(init_op)
          # if not restorer is None:
          #   restorer.restore(sess, restore_ckpt)
          #   print('restore model')
    

    Modify the save paths in train.py:

          summary_path = os.path.join('../output/{}'.format(cfgs.DATASET_NAME),
                                      FLAGS.summary_path, cfgs.VERSION)
    
    ........................
    
    
              save_dir = os.path.join('../output/{}'.format(cfgs.DATASET_NAME),
                                      FLAGS.trained_checkpoint, cfgs.VERSION)
    

    This creates an output directory under the project root.

    Training:

    python train.py
    

    (Screenshots: training log output, the saved parameters, and the loss curve.)

    • Training from pretrained weights

    Modify the flags in config_res101.py:

    tf.app.flags.DEFINE_string(
        'pretrained_model_path',
        '../output/pascal/res101_trained_weights/v1_pascal/pascal_500model.ckpt',
        #YangHe_MyCode/FPN_TensorFlow-master/output/pascal/res101_trained_weights
        # 'output-1/res101_trained_weights/v1_layer/voc_50000model.ckpt',
        'the path of pretrained weights'
    )
    

    At the same time, restore the code that was commented out earlier:

        restorer, restore_ckpt = restore_model.get_restorer(test=False)
        saver = tf.train.Saver(max_to_keep=3)
    
        config = tf.ConfigProto()
        # config.gpu_options.per_process_gpu_memory_fraction = 0.5
        config.gpu_options.allow_growth = True
        with tf.Session(config=config) as sess:
          sess.run(init_op)
          if not restorer is None:
            restorer.restore(sess, restore_ckpt)
            print('restore model')
    

    Now training can start:

    python train.py
    

    eval.py

    This evaluates the model. I couldn't pass the arguments on the command line (my shortcoming), so I had to set default values instead.

    As follows:

    Note: the data used here is the pascal_test.tfrecord produced earlier.
    That's it.

    Two of the parameters, src_folder and des_folder, were unused, so I deleted them.

      parser = argparse.ArgumentParser(description='Evaluate a trained FPN model')
      parser.add_argument('--weights', dest='weights',
                          help='model path',
                          default='../output/pascal/res101_trained_weights/v1_pascal/pascal_500model.ckpt',
                          type=str)
      parser.add_argument('--img_num', dest='img_num',
                          help='image numbers',
                          default=20, type=int)
    

    The pretrained weights path needs to be set here.
    Also fix the pickle reads and writes, which have a bug:

    • the pickle writes in eval.py
    • the pickle reads in eval.py
    
      fr1 = open('predict_dict.pkl', 'rb')
      fr2 = open('gtboxes_dict.pkl', 'rb')
    
      predict_dict = pickle.load(fr1,encoding='iso-8859-1')
      gtboxes_dict = pickle.load(fr2,encoding='iso-8859-1')
    
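The encoding='iso-8859-1' argument is the standard trick for reading, under Python 3, a pickle that was written under Python 2: byte strings get decoded as latin-1 instead of raising. A self-contained round trip showing the pattern (the file names follow eval.py; the helper names are mine):

```python
import pickle

def dump_dicts(predict_dict, gtboxes_dict, predict_path, gtboxes_path):
    # protocol=2 keeps the files readable from Python 2 as well
    with open(predict_path, 'wb') as fw1, open(gtboxes_path, 'wb') as fw2:
        pickle.dump(predict_dict, fw1, protocol=2)
        pickle.dump(gtboxes_dict, fw2, protocol=2)

def load_dicts(predict_path, gtboxes_path):
    # encoding='iso-8859-1' decodes Python-2 byte strings without errors
    with open(predict_path, 'rb') as fr1, open(gtboxes_path, 'rb') as fr2:
        predict_dict = pickle.load(fr1, encoding='iso-8859-1')
        gtboxes_dict = pickle.load(fr2, encoding='iso-8859-1')
    return predict_dict, gtboxes_dict
```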

    One more change:

    Add an if len(rboxes) != 0 check here, so that rec[-1] does not raise an IndexError.
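The guard is simply a length check before indexing; a sketch with hypothetical surrounding names (the post only specifies the if len(rboxes) != 0 condition itself):

```python
def final_recall(rec, rboxes):
    # For images where nothing was detected, rboxes (and hence rec) is
    # empty, and rec[-1] would raise IndexError without this check.
    if len(rboxes) != 0:
        return rec[-1]
    return 0.0
```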

    Run eval.py:

    python eval.py
    

    This produces two files, gtboxes_dict.pkl and predict_dict.pkl.

    This yields the mAP (screenshot):

    You read that right: mAP = 0.
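An mAP of exactly 0 means no detection was counted as a true positive for any class. The AP computation itself is small enough to sanity-check by hand; a sketch of the all-points interpolated VOC-style AP (my own minimal version, not the repo's code):

```python
import numpy as np

def voc_ap(rec, prec):
    # Pad the recall/precision curve, build the monotone precision
    # envelope, then integrate precision over recall.
    mrec = np.concatenate(([0.0], rec, [1.0]))
    mpre = np.concatenate(([0.0], prec, [0.0]))
    for i in range(len(mpre) - 2, -1, -1):
        mpre[i] = max(mpre[i], mpre[i + 1])
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))
```

With every precision value at zero this returns 0.0, which is exactly what mislabeled training data produces.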

    Errors

    2018-11-27 09:33:31: step2050 image_name:b'2008_008546.jpg'
                         rpn_loc_loss:0.2765 | rpn_cla_loss:0.2248 | rpn_total_loss:0.5013
                         fast_rcnn_loc_loss:0.0773 | fast_rcnn_cla_loss:0.2025 | fast_rcnn_total_loss:0.2797
                         added_loss:0.7811 | total_loss:6.3684 | pre_cost_time:0.3456s
    2018-11-27 09:33:48: step2100 image_name:b'2008_008745.jpg'
                         rpn_loc_loss:0.2128 | rpn_cla_loss:0.1975 | rpn_total_loss:0.4103
                         fast_rcnn_loc_loss:0.1009 | fast_rcnn_cla_loss:0.0907 | fast_rcnn_total_loss:0.1916
                         added_loss:0.6019 | total_loss:6.1892 | pre_cost_time:0.3453s
    2018-11-27 09:34:10: step2150 image_name:b'2009_000161.jpg'
                         rpn_loc_loss:nan | rpn_cla_loss:0.6903 | rpn_total_loss:nan
                         fast_rcnn_loc_loss:nan | fast_rcnn_cla_loss:3.0358 | fast_rcnn_total_loss:nan
                         added_loss:nan | total_loss:10000000000.0000 | pre_cost_time:0.3478s
    2018-11-27 09:34:29: step2200 image_name:b'2009_000393.jpg'
                         rpn_loc_loss:nan | rpn_cla_loss:0.6902 | rpn_total_loss:nan
                         fast_rcnn_loc_loss:nan | fast_rcnn_cla_loss:3.0358 | fast_rcnn_total_loss:nan
                         added_loss:nan | total_loss:10000000000.0000 | pre_cost_time:0.3507s
    
    

    The loss becomes NaN, presumably because of a problem in loc_loss.

    Fixing the loss

    def l1_smooth_losses(predict_boxes, gtboxes, object_weights, classes_weights=None):
      '''
      :param predict_boxes: [minibatch_size, -1]
      :param gtboxes: [minibatch_size, -1]
      :param object_weights: [minibatch_size, ]. 1.0 represent object, 0.0 represent others(ignored or background)
      :return:
      '''
      diff = predict_boxes - gtboxes
      abs_diff = tf.cast(tf.abs(diff), tf.float32)
    
      if classes_weights is None:
        '''
        first_stage:
        predict_boxes :[minibatch_size, 4]
        gtboxes: [minibatchs_size, 4]
        '''
        anchorwise_smooth_l1norm = tf.reduce_sum(
            tf.where(tf.less(abs_diff, 1), 0.5 * tf.square(abs_diff), abs_diff - 0.5), axis=1) * object_weights
      else:
        '''
        fast_rcnn:
        predict_boxes: [minibatch_size, 4*num_classes]
        gtboxes: [minibatch_size, 4*num_classes]
        classes_weights : [minibatch_size, 4*num_classes]
        '''
        anchorwise_smooth_l1norm = tf.reduce_sum(tf.where(tf.less(abs_diff, 1), 0.5*tf.square(
            abs_diff)*classes_weights, (abs_diff - 0.5)*classes_weights), axis=1)*object_weights
      anchorwise_smooth_l1norm = tf.clip_by_value(anchorwise_smooth_l1norm, 1e-10, 1e10)
      return tf.reduce_mean(anchorwise_smooth_l1norm, axis=0)  # reduce mean
    

    The essential addition is this:

    tf.clip_by_value(anchorwise_smooth_l1norm, 1e-10, 1e10)
      return tf.reduce_mean(anchorwise_smooth_l1norm, axis=0)
    

    Although the message "signed integer is less than minimum" still appears in the log:

    2018-11-27 11:00:27: step1150 image_name:b'2008_004903.jpg'
                         rpn_loc_loss:0.1078 | rpn_cla_loss:0.1171 | rpn_total_loss:0.2249
                         fast_rcnn_loc_loss:0.0625 | fast_rcnn_cla_loss:0.0605 | fast_rcnn_total_loss:0.1230
                         added_loss:0.3479 | total_loss:5.9376 | pre_cost_time:0.4136s
    signed integer is less than minimum
    signed integer is less than minimum
    2018-11-27 11:00:49: step1200 image_name:b'2008_005101.jpg'
                         rpn_loc_loss:0.3705 | rpn_cla_loss:0.2455 | rpn_total_loss:0.6160
                         fast_rcnn_loc_loss:0.0000 | fast_rcnn_cla_loss:0.0069 | fast_rcnn_total_loss:0.0069
                         added_loss:0.6230 | total_loss:6.2127 | pre_cost_time:0.3253s
    2018-11-27 11:01:11: step1250 image_name:b'2008_005321.jpg'
                         rpn_loc_loss:0.0295 | rpn_cla_loss:0.0396 | rpn_total_loss:0.0692
                         fast_rcnn_loc_loss:0.0170 | fast_rcnn_cla_loss:0.0240 | fast_rcnn_total_loss:0.0410
                         added_loss:0.1102 | total_loss:5.6999 | pre_cost_time:0.3905s
    2018-11-27 11:01:33: step1300 image_name:b'2008_005514.jpg'
                         rpn_loc_loss:0.0564 | rpn_cla_loss:0.0632 | rpn_total_loss:0.1196
                         fast_rcnn_loc_loss:0.0000 | fast_rcnn_cla_loss:0.0057 | fast_rcnn_total_loss:0.0057
                         added_loss:0.1253 | total_los
    

    Change it to this instead:

    def l1_smooth_losses(predict_boxes, gtboxes, object_weights, classes_weights=None):
      '''
      :param predict_boxes: [minibatch_size, -1]
      :param gtboxes: [minibatch_size, -1]
      :param object_weights: [minibatch_size, ]. 1.0 represent object, 0.0 represent others(ignored or background)
      :return:
      '''
      diff = predict_boxes - gtboxes
      abs_diff = tf.cast(tf.abs(diff), tf.float32)
      abs_diff = tf.clip_by_value(abs_diff, 1e-5, 1e2)  # clip
      if classes_weights is None:
        '''
        first_stage:
        predict_boxes :[minibatch_size, 4]
        gtboxes: [minibatchs_size, 4]
        '''
        anchorwise_smooth_l1norm = tf.reduce_sum(
            tf.where(tf.less(abs_diff, 1), 0.5 * tf.square(abs_diff), abs_diff - 0.5), axis=1) * object_weights
      else:
        '''
        fast_rcnn:
        predict_boxes: [minibatch_size, 4*num_classes]
        gtboxes: [minibatch_size, 4*num_classes]
        classes_weights : [minibatch_size, 4*num_classes]
        '''
        anchorwise_smooth_l1norm = tf.reduce_sum(tf.where(tf.less(abs_diff, 1), 0.5*tf.square(
            abs_diff)*classes_weights, (abs_diff - 0.5)*classes_weights), axis=1)*object_weights
    
      return tf.reduce_mean(anchorwise_smooth_l1norm, axis=0)  # reduce mean
    

    At least this way it no longer crashes.
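The effect of the clip is easy to check outside TensorFlow; a NumPy re-implementation of the first-stage branch (a sketch mirroring the fix above, not the repo's code):

```python
import numpy as np

def l1_smooth_losses_np(predict_boxes, gtboxes, object_weights):
    diff = predict_boxes - gtboxes
    abs_diff = np.abs(diff).astype(np.float32)
    abs_diff = np.clip(abs_diff, 1e-5, 1e2)  # same bounds as the fix above
    per_anchor = np.sum(
        np.where(abs_diff < 1, 0.5 * abs_diff ** 2, abs_diff - 0.5),
        axis=1) * object_weights
    return float(per_anchor.mean())
```

Even for an absurdly large regression error the clipped loss stays finite, so a single bad box no longer blows up the batch loss; note that a coordinate that is already NaN would still propagate through the clip, which is why fixing the data mattered more.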

    While inspecting the data, I found that the data preparation itself had gone wrong:

    the training data was completely mislabeled.

    The fix:

    # -*- coding: utf-8 -*-
    from __future__ import division, print_function, absolute_import
    import sys
    sys.path.append('../../')
    import xml.etree.cElementTree as ET
    import numpy as np
    import tensorflow as tf
    import glob
    import cv2
    from help_utils.tools import *
    from libs.label_name_dict.label_dict import *
    from lxml import etree
    '''
    python convert_data_to_tfrecord.py --VOC_dir=VOCdevkit_train/ --save_name=train --dataset=pascal
    '''
    
    tf.app.flags.DEFINE_string('VOC_dir', None, 'Voc dir')
    tf.app.flags.DEFINE_string('xml_dir', 'Annotations', 'xml dir')
    tf.app.flags.DEFINE_string('image_dir', 'JPEGImages', 'image dir')
    tf.app.flags.DEFINE_string('save_name', 'train', 'save name')
    tf.app.flags.DEFINE_string('save_dir', cfgs.ROOT_PATH + '/data/tfrecords/', 'save name')
    tf.app.flags.DEFINE_string('img_format', '.jpg', 'format of image')
    tf.app.flags.DEFINE_string('dataset', 'car', 'dataset')
    FLAGS = tf.app.flags.FLAGS
    
    
    def _int64_feature(value):
        return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
    
    
    def _bytes_feature(value):
        return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
    
    
    
    def recursive_parse_xml_to_dict(xml):
      """Recursively parses XML contents to python dict.
    
      We assume that `object` tags are the only ones that can appear
      multiple times at the same level of a tree.
    
      Args:
        xml: xml tree obtained by parsing XML file contents using lxml.etree
    
      Returns:
        Python dictionary holding XML contents.
      """
      if not xml:
        return {xml.tag: xml.text}
      result = {}
      for child in xml:
        child_result = recursive_parse_xml_to_dict(child)
        if child.tag != 'object':
          result[child.tag] = child_result[child.tag]
        else:
          if child.tag not in result:
            result[child.tag] = []
          result[child.tag].append(child_result[child.tag])
      return {xml.tag: result}
    
    
    
    def read_xml_gtbox_and_label(xml_path):
    
        """
        :param xml_path: the path of voc xml
        :return: a list contains gtboxes and labels, shape is [num_of_gtboxes, 5],
               and has [xmin, ymin, xmax, ymax, label] in a per row
        """
        box_list = []
        with open(xml_path,) as f:
            xml_str = f.read()
            #show_all_image_test()
            xml = etree.fromstring(xml_str)
            data = recursive_parse_xml_to_dict(xml)['annotation']
        img_width = int(data['size']['width'])
        img_height = int(data['size']['height'])
    
        for obj in data['object']:
            xmin = int(obj['bndbox']['xmin'])
            ymin = int(obj['bndbox']['ymin'])
            ymax = int(obj['bndbox']['ymax'])
            xmax = int(obj['bndbox']['xmax'])
            label = NAME_LABEL_MAP[obj['name']]
            box_list.append([ymin, xmin, ymax, xmax, label])
        gtbox_label = np.array(box_list, dtype=np.int32)
        ymin, xmin, ymax, xmax, label = gtbox_label[:, 0], gtbox_label[:, 1], gtbox_label[:, 2], gtbox_label[:, 3], \
                                        gtbox_label[:, 4]
        xmin = np.where(xmin <= 0, 0, xmin)
        ymin = np.where(ymin <= 0, 0, ymin)
        xmax = np.where(xmax >= img_width, img_width , xmax)
        ymax = np.where(ymax >= img_height, img_height, ymax)
        gtbox_label = np.transpose(np.stack([ymin, xmin, ymax, xmax, label], axis=0))  # [ymin, xmin, ymax, xmax, label]
    
        return img_height, img_width, gtbox_label
    
    
    def convert_pascal_to_tfrecord():
        xml_path = FLAGS.VOC_dir + FLAGS.xml_dir
        image_path = FLAGS.VOC_dir + FLAGS.image_dir
        save_path = FLAGS.save_dir + FLAGS.dataset + '_' + FLAGS.save_name + '.tfrecord'
        mkdir(FLAGS.save_dir)
    
        # writer_options = tf.python_io.TFRecordOptions(tf.python_io.TFRecordCompressionType.ZLIB)
        # writer = tf.python_io.TFRecordWriter(path=save_path, options=writer_options)
        writer = tf.python_io.TFRecordWriter(path=save_path)
    
        for count, xml in enumerate(glob.glob(xml_path + '/*.xml')):
            # to avoid path error in different development platform
            xml = xml.replace('\\', '/')
    
            img_name = xml.split('/')[-1].split('.')[0] + FLAGS.img_format
            img_path = image_path + '/' + img_name
    
            if not os.path.exists(img_path):
                print('{} is not exist!'.format(img_path))
                continue
    
            img_height, img_width, gtbox_label = read_xml_gtbox_and_label(xml)
            # img = np.array(Image.open(img_path))
            img = cv2.imread(img_path)
    
            feature = tf.train.Features(feature={
                # maybe do not need encode() in linux
                'img_name': _bytes_feature(img_name.encode('utf8')),
                'img_height': _int64_feature(img_height),
                'img_width': _int64_feature(img_width),
                'img': _bytes_feature(img.tostring()),
                'gtboxes_and_label': _bytes_feature(gtbox_label.tostring()),
                'num_objects': _int64_feature(gtbox_label.shape[0])
            })
    
            example = tf.train.Example(features=feature)
    
            writer.write(example.SerializeToString())
    
            view_bar('Conversion progress', count + 1, len(glob.glob(xml_path + '/*.xml')))
    
        print('\nConversion is complete!')
    
    
    def show_all_image_test():
        NAME_LABEL = list(NAME_LABEL_MAP.keys())
        xml_path = 'VOCdevkit_train/' + FLAGS.xml_dir
        image_path = 'VOCdevkit_train/'+ FLAGS.image_dir
        for count, xml in enumerate(glob.glob(xml_path + '/*.xml')):
            # to avoid path error in different development platform
            xml = xml.replace('\\', '/')
    
            img_name = xml.split('/')[-1].split('.')[0] + FLAGS.img_format
            img_path = image_path + '/' + img_name
    
            if not os.path.exists(img_path):
                print('{} is not exist!'.format(img_path))
                continue
    
            img_height, img_width, gtbox_label = read_xml_gtbox_and_label(xml)
            image = cv2.imread(img_path)
            for i in range(len(gtbox_label)):
                object = gtbox_label[i]
                ymin, xmin, ymax, xmax, label = object
                image = cv2.rectangle(image, (object[1], object[0]),
                                       (object[3], object[2]),
                                       color=(0, 255, 0))
                cv2.putText(image,
                            text=str(len(gtbox_label)),
                            org=((image.shape[1]) // 2, (image.shape[0]) // 2),
                            fontFace=3,
                            fontScale=1,
                            color=(255, 0, 0))
                if ymin <= 0 or  xmin <= 0 or ymax >= img_height or xmax>=img_width:
                    cv2.putText(image,
                                text='error',
                                org=((image.shape[1]) // 2, (image.shape[0]) // 2),
                                fontFace=3,
                                fontScale=1,
                                color=(255, 0, 0))
                else:
                    cv2.putText(image,
                                text=str(NAME_LABEL[object[4]]),
                                org=(object[1], object[0] + 10),
                                fontFace=1,
                                fontScale=1,
                                thickness=2,
                                color=(255, 0, 0))
                cv2.imshow("s", image)
                cv2.waitKey(500)
    if __name__ == '__main__':
        # xml_path = 'VOCdevkit_test/Annotations/2008_000082.xml'
        # print(read_xml_gtbox_and_label(xml_path))  # show xml
        show_all_image_test()  # show labels and boxes with cv2

        # convert_pascal_to_tfrecord()  # create the dataset
    
    

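The box-clamping step at the end of read_xml_gtbox_and_label can be exercised on its own; a NumPy-only extraction of that logic (same [ymin, xmin, ymax, xmax, label] row order as the script above, helper name is mine):

```python
import numpy as np

def clamp_gtboxes(gtbox_label, img_height, img_width):
    # Rows are [ymin, xmin, ymax, xmax, label]; clip corners to the image.
    ymin, xmin, ymax, xmax, label = [gtbox_label[:, i] for i in range(5)]
    xmin = np.where(xmin <= 0, 0, xmin)
    ymin = np.where(ymin <= 0, 0, ymin)
    xmax = np.where(xmax >= img_width, img_width, xmax)
    ymax = np.where(ymax >= img_height, img_height, ymax)
    return np.stack([ymin, xmin, ymax, xmax, label], axis=1)
```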
    Update

    The author's mAP code is also fairly complete; it follows facebookresearch's evaluation method (on GitHub), so there is no need to worry that the mAP is not official.
    Here is an explanation of the mAP code.

    The Faster R-CNN debugging guide recommended by the author

    Issues when debugging Faster R-CNN

    Update

    Today the author updated the FPN code; its style follows the Faster R-CNN code directly.
    FPN_Tensorflow

    The data preparation here differs only slightly from before:

        gtbox_label = np.transpose(np.stack([xmin, ymin, xmax, ymax, label], axis=0))
        # old FPN:      [ymin, xmin, ymax, xmax, label]
        # new FPN:      [xmin, ymin, xmax, ymax, label]
        # Faster R-CNN: [xmin, ymin, xmax, ymax, label]
        return img_height, img_width, gtbox_label
    

    In the new version, FPN uses the same data format as Faster R-CNN.
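Converting existing annotations between the two layouts is a single column permutation; a sketch (the helper name is mine):

```python
import numpy as np

def old_fpn_to_new(gtbox_label):
    # old FPN rows:                [ymin, xmin, ymax, xmax, label]
    # new FPN / Faster R-CNN rows: [xmin, ymin, xmax, ymax, label]
    return gtbox_label[:, [1, 0, 3, 2, 4]]
```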

    VOC submission results

    This is the result from the VOC evaluation server after training from the pretrained weights.

    http://host.robots.ox.ac.uk/anonymous/RXNDLK.html

    The results are quite good.

    References:

    Object detection: the SmoothL1Loss loss function

    Article link: https://www.haomeiwen.com/subject/vbvbqqtx.html