string(20761) "
<h1>TensorFlow Environment Setup</h1>
<blockquote>
<p>Installing the Windows GPU build</p></blockquote>
<h2>Required packages</h2>
<ul>
<li>TensorFlow 1.5.0/1.6.0</li>
<li><a href="https://developer.nvidia.com/cuda-toolkit-archive" target="_blank" rel="nofollow">CUDA v9.0</a></li>
<li><a href="https://developer.nvidia.com/rdp/cudnn-archive#a-collapse705-9" target="_blank" rel="nofollow">cuDNN v7.0.5 for CUDA 9.0</a></li>
</ul>
<blockquote>
<p>After extracting cuDNN v7.0.5, copy its folders (bin, include, lib) into the CUDA installation directory (NVIDIA GPU Computing Toolkit/CUDA/v9.0)</p></blockquote>
<blockquote>
<p>All of these versions must match each other, otherwise you will hit version-mismatch errors; also make sure to pick the build for the correct operating system</p></blockquote>
<h2>Python environment setup (training/development environment)</h2>
<ul>
<li><a href="https://www.anaconda.com/download/" target="_blank" rel="nofollow">Anaconda</a></li>
<li><a href="https://github.com/google/protobuf/releases/download/v3.4.0/protoc-3.4.0-win32.zip" target="_blank" rel="nofollow">protoc</a></li>
</ul>
<p>Anaconda is recommended for the training environment: it is a popular Python platform for data science that ships with many preinstalled libraries and makes it easy to manage multiple Python versions and switch freely between environments</p>
<p>TensorFlow is built on the gRPC framework and uses Protocol Buffers as its data-interchange format; protoc is the compiler that turns .proto definition files into bindings for multiple languages</p>
<blockquote>
<p>Version 3.4.0 is used here. Newer versions may use different compile commands, so to avoid problems later, it is safest to stick with 3.4.0</p></blockquote>
<p>Installation</p>
<ol>
<li>Download and install <a href="https://www.anaconda.com/download/" target="_blank" rel="nofollow">Anaconda</a></li>
<li>Add <code>InstallDir\Anaconda3;InstallDir\Anaconda3\Scripts;InstallDir\Anaconda3\Library\bin;</code> to PATH (system environment variable);<br>
configure a domestic (China) mirror</li>
</ol>
<pre><code class="python">  # 添加Anaconda的TUNA镜像
  conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
  # 设置搜索时显示通道地址
  conda config --set show_channel_urls yes
</code></pre>
<ol start="3">
<li>Create the Python environment</li>
</ol>
<pre><code># List the Python environments currently on the system
conda info --envs
# Create an environment with a specific Python version
conda create --name py35 python=3.5
# Activate the environment
activate py35
# Deactivate it and return to the previous environment
deactivate
# Delete the environment
conda remove --name py35 --all
</code></pre>
<h2>Installing TensorFlow in the Python 3 environment</h2>
<p>Install TensorFlow:</p>
<pre><code class="shell"># For CPU
pip install tensorflow==1.6
# For GPU
pip install tensorflow-gpu==1.6
</code></pre>
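<p>To confirm the install worked (a quick sanity check, not part of the original steps; for the GPU build it also verifies that CUDA/cuDNN load correctly):</p>
<pre><code class="python">import tensorflow as tf

# Should print 1.6.0 (or 1.5.0); is_gpu_available() returns True
# only if the GPU build found a usable CUDA device
print(tf.__version__)
print(tf.test.is_gpu_available())
</code></pre>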
<p>Install the remaining dependencies with pip:</p>
<pre><code class="shell">pip install Cython
pip install pillow
pip install lxml
pip install jupyter
pip install matplotlib
</code></pre>
<h2>Build preparation for the training project</h2>
<ul>
<li>Protobuf Compilation</li>
</ul>
<pre><code>protoc object_detection/protos/*.proto --python_out=.
</code></pre>
<ul>
<li>Add Libraries to PYTHONPATH</li>
</ul>
<pre><code>1. 在你的Anaconda3安装路/Anaconda3/Lib/site-packages 下新建一个txt文件 
(我这里的安装路径是C:\ProgramData\Anaconda3\Lib\site-packages);如果安装有其他 python 环境,则在对应的环境目录(Anaconda3\envs\py35\Lib\site-packages)下新建一个txt文件 。

2. 在新建的txt文件中写入自己对应的 Tensorflow object_detection 工程的目录路径:
F:\project\project
F:\project\project\slim

3. 将文件名改为 tensorflow_model.pth (注意这里的后缀一定要以pth结尾)
</code></pre>
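<p>How this works: at interpreter startup, the site module reads every *.pth file in site-packages and appends each line to sys.path. A quick check (run inside the py35 environment; the F:\ paths are the example ones above):</p>
<pre><code class="python">import sys

# Both project directories from the .pth file should show up here
print([p for p in sys.path if p.startswith('F:')])

# And the package should now import without error
import object_detection
print(object_detection.__file__)
</code></pre>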
<ul>
<li>Testing the Installation</li>
</ul>
<pre><code># From tensorflow/models/research/
python object_detection/builders/model_builder_test.py
</code></pre>
<hr>
<h2>Model training</h2>
<h3>Sample annotation</h3>
<p>Use the <a href="https://github.com/tzutalin/labelImg" target="_blank" rel="nofollow">labelImg</a> tool to annotate the images and produce Pascal VOC format annotation files</p>
<h3>Generating the TFRecord file TensorFlow consumes</h3>
<p>Working directory layout</p>
<pre><code>|- template
|  |- annotations (annotation xml files)
|  |- images (sample images)
|  |- label_maps
|  |  |- *.pbtxt (label map file; ids start at 1)
</code></pre>
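<p>For reference, a minimal label map entry looks like this (the class names here are placeholders; use the labels from your own annotations):</p>
<pre><code>item {
  id: 1
  name: 'a_hn_101'
}
item {
  id: 2
  name: 'a_hn_102'
}
</code></pre>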
<p>Script: tfrecord_util.py (Python 3 environment)</p>
<pre><code>
import os
import io
import tensorflow as tf

from PIL import Image

from object_detection.utils import dataset_util
from object_detection.utils import label_map_util
from collections import namedtuple
import glob
import pandas as pd
import xml.etree.ElementTree as ET


current_path = 'directory containing the template folder'
train_path = os.path.join(current_path, "template")
# directory of image annotation (xml) files
annotations_dir = os.path.join(train_path, "annotations")
# image directory
images_path = os.path.join(train_path, "images")
# label map file
labels_path = os.path.join(train_path, "label_maps")
labels_file = os.path.join(labels_path, "mscoco_label_map.pbtxt")
# csv file (full path)
csv_file = os.path.join(train_path, "temp_csv_name.csv")
# record file (full path)
tf_record_file = os.path.join(train_path, "tf_record_file.record")
tf_record_file = os.path.join(train_path, "tf_record_file.record")
# ---------------------------------------------------------------------- xml operator

def xml_to_csv(path):
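    # parse every Pascal VOC xml under `path` into one row per annotated object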
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # if member[0].text != 'a_hn_101':
            #     continue

            file_path = root.find('path').text
            filename = file_path.split("/")[-1].split("\\")[-1]
            value = (filename,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )

            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df

# ---------------------------------------------------------------------- tfrecord operator


classes_num = 100

label_map = label_map_util.load_labelmap(labels_file)
print("success loading label map file["+str(labels_file)+"]")
# print('\n-------------label_map------------------\n')
# print(label_map)
# categories array [{'id':id,'name':name},···]
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=classes_num, use_display_name=True)
# category_index  dic  {id : {'id':id,'name':name}, ···}
# category_index = label_map_util.create_category_index(categories)

# category_index  dic  {name : {'id':id,'name':name}, ···}
category_index = {}
for cat in categories:
    category_index[cat['name']] = cat
print(category_index)
print("success generating categories dic")


def class_text_to_int(row_label):
    if row_label in category_index.keys():
        # print(str(category_index[row_label]['id']))
        return category_index[row_label]['id']
    else:
        # print(row_label)
        return 0


def split(df, group):
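    # group annotation rows by image filename: one (filename, rows) pair per image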
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


def create_tf_example(group, path):

    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size

    filename = group.filename.encode('utf8')
    # image_format = b'jpg'
    if image.format != 'JPEG':
        print(group.filename)
        raise ValueError('Image format not JPEG')
    else:
        image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []

    for index, row in group.object.iterrows():

        if class_text_to_int(row['class']) == 0:
            print(group.filename)
            # print(row['class'].encode('utf8'))
            continue
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example

# ----------------------------------------------------------------------


def generate_tf_record_file(recreate=True):
    """
    generate the tensorflow record file from the label xml files that belong to the sample images
    :param recreate:  if create a new record file
    :return:  tf_record_file path
    """
    if recreate:
        # 1. read all xml files in the annotations directory and convert them to a csv file
        xml_df = xml_to_csv(annotations_dir)
        xml_df.to_csv(csv_file, index=None)
        print('Successfully converted xml['+str(annotations_dir)+'] to csv['+str(csv_file)+'].')

        print(csv_file)
        # 2. convert the csv file to a record file
        examples = pd.read_csv(csv_file)
        grouped = split(examples, 'filename')

        writer = tf.python_io.TFRecordWriter(tf_record_file)
        for group in grouped:
            try:
                tf_example = create_tf_example(group, images_path)
            except Exception:
                # skip images that fail to open or are not JPEG
                print(group.filename)
                continue
            writer.write(tf_example.SerializeToString())
        writer.close()
        print('Successfully created the TFRecords: {}'.format(tf_record_file))
        return tf_record_file

    else:
        # TODO - look up the existing file
        return None

def main(_):
    my_tf_record_file = generate_tf_record_file()
    print(my_tf_record_file)

if __name__ == '__main__':
    tf.app.run()
</code></pre>
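<p>The configuration below points at separate train and eval record files, while the script above writes a single one. A minimal sketch of splitting the csv by image before generating two record files (the 90/10 ratio and the fixed seed are arbitrary choices, not from the original):</p>
<pre><code class="python">import numpy as np
import pandas as pd

df = pd.read_csv(csv_file)
filenames = df['filename'].unique()

# hold out roughly 10% of the images for evaluation
np.random.seed(42)
eval_names = set(np.random.choice(filenames, size=max(1, len(filenames) // 10), replace=False))

df[~df['filename'].isin(eval_names)].to_csv('train.csv', index=None)
df[df['filename'].isin(eval_names)].to_csv('eval.csv', index=None)
# then run the csv-to-record step once per csv to produce the two .record files
</code></pre>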
<h2>Training configuration</h2>
<p>Configuration file: faster_rcnn_resnet101.config</p>
<pre><code># Faster R-CNN with Resnet-101 (v1) configuration for MSCOCO Dataset.
# Users should configure the fine_tune_checkpoint field in the train config as
# well as the label_map_path and input_path fields in the train_input_reader and
# eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
# should be configured.

model {
  faster_rcnn {
    num_classes: 23
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 1024
        max_dimension: 1280  
      }
    }
    feature_extractor {
      type: 'faster_rcnn_resnet101'
      first_stage_features_stride: 16
    }
    first_stage_anchor_generator {
      grid_anchor_generator {
        scales: [0.25, 0.5, 1.0, 2.0]
        aspect_ratios: [0.5, 1.0, 2.0]
        height_stride: 16
        width_stride: 16
      }
    }
    first_stage_box_predictor_conv_hyperparams {
      op: CONV
      regularizer {
        l2_regularizer {
          weight: 0.0
        }
      }
      initializer {
        truncated_normal_initializer {
          stddev: 0.01
        }
      }
    }
    first_stage_nms_score_threshold: 0.0
    first_stage_nms_iou_threshold: 0.6
    first_stage_max_proposals: 400
    first_stage_localization_loss_weight: 2.0
    first_stage_objectness_loss_weight: 1.0
    initial_crop_size: 14
    maxpool_kernel_size: 2
    maxpool_stride: 2
    second_stage_box_predictor {
      mask_rcnn_box_predictor {
        use_dropout: false
        dropout_keep_probability: 1.0
        fc_hyperparams {
          op: FC
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            variance_scaling_initializer {
              factor: 1.0
              uniform: true
              mode: FAN_AVG
            }
          }
        }
      }
    }
    second_stage_post_processing {
      batch_non_max_suppression {
        score_threshold: 0.0
        iou_threshold: 0.7
        max_detections_per_class: 100
        max_total_detections: 300
      }
      score_converter: SOFTMAX
    }
    second_stage_localization_loss_weight: 2.0
    second_stage_classification_loss_weight: 1.0
  }
}

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00003
          }
          schedule {
            step: 1200000
            learning_rate: .000003
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  gradient_clipping_by_norm: 10.0
  # fine_tune_checkpoint: "F:/project/project/faster_rcnn_resnet101_coco_2018_01_28/model.ckpt"
  # from_detection_checkpoint: true
  # Note: uncommenting the line below would cap training at 10K steps, which
  # effectively bypasses the learning rate schedule (the learning rate would
  # never decay). Leave it commented out to train indefinitely.
  #num_steps: 10000
  data_augmentation_options {
    random_adjust_brightness {
      max_delta: 0.1
    }
  }
  data_augmentation_options {
    random_image_scale {
      min_scale_ratio: 0.8
      max_scale_ratio: 1.2
    }
  }
  #data_augmentation_options {
  #  random_crop_to_aspect_ratio {
  #  }
  #}

  #data_augmentation_options {
  #  random_adjust_contrast {
  #      min_delta: 0.5
  #      max_delta: 1.5
  #  }
  #}
  #data_augmentation_options {
  #  random_adjust_saturation {
  #    min_delta: 0.5
  #    max_delta: 1.5
  #  }
  #}
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "D:/Workspace/train_dir/all/tf_record_file_23_3035_20180724.record"
  }
  label_map_path: "D:/Workspace/train_dir/all/mscoco_label_map_23.pbtxt"
  shuffle: true
}

eval_config: {
  # num_examples: 1
  num_visualizations: 200
  # Note: The below line limits the evaluation process to 2 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 2
  visualization_export_dir: "D:/Workspace/train_dir/all/20180724/eval/exportfrcnn"
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "D:/Workspace/train_dir/all/tf_record_file_23_3035_20180724_eval.record"
  }
  label_map_path: "D:/Workspace/train_dir/all/mscoco_label_map_23.pbtxt"
  shuffle: true
  num_readers: 5
  num_epochs: 1
}
</code></pre>
<p>The configuration file has five main parts:</p>
<ul>
<li>model: defines the network architecture and its hyperparameters</li>
<li>train_config: training settings</li>
<li>train_input_reader: training-sample input settings</li>
<li>eval_config: model evaluation settings</li>
<li>eval_input_reader: evaluation-sample input settings</li>
</ul>
<h3>model section</h3>
<ul>
<li>num_classes: the total number of object classes to detect (how many distinct labels your annotations use)</li>
<li>keep_aspect_ratio_resizer.min_dimension and keep_aspect_ratio_resizer.max_dimension control the size of the input image after resizing (see the sketch after this list)</li>
<li>feature_extractor.first_stage_features_stride: feature stride of the first stage. Keeping it at 16 usually works; if the SKUs in your samples are dense, mostly shot from a distance, and therefore small, 16 may train poorly, and reducing it to 8 is worth trying</li>
<li>grid_anchor_generator.height_stride and grid_anchor_generator.width_stride: sliding stride of the anchor boxes during training. Keeping them at 16 usually works; as above, for dense, distant, small SKUs, consider reducing them to 8</li>
<li>first_stage_nms_iou_threshold: IOU threshold for first-stage NMS; lowering it can raise recall at the likely cost of precision; range 0~1</li>
<li>first_stage_max_proposals: number of proposals kept by the first stage; raising it can raise recall at the likely cost of precision</li>
<li>batch_non_max_suppression.iou_threshold: second-stage IOU threshold; lowering it can raise recall at the likely cost of precision; range 0~1</li>
<li>batch_non_max_suppression.max_detections_per_class: maximum number of detections per class</li>
<li>batch_non_max_suppression.max_total_detections: maximum total number of detections across all classes</li>
</ul>
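<p>To make the resizer behavior concrete, here is a plain-Python sketch of the keep-aspect-ratio rule as configured above (min_dimension=1024, max_dimension=1280); it mirrors the standard behavior of scaling the short side to min_dimension unless the long side would then exceed max_dimension:</p>
<pre><code class="python">def resized_shape(height, width, min_dim=1024, max_dim=1280):
    # scale so the short side reaches min_dim...
    scale = min_dim / min(height, width)
    # ...unless that pushes the long side past max_dim
    if max(height, width) * scale > max_dim:
        scale = max_dim / max(height, width)
    return round(height * scale), round(width * scale)

print(resized_shape(2000, 3000))  # -> (853, 1280): capped by max_dimension
print(resized_shape(1100, 1200))  # -> (1024, 1117): short side hits min_dimension
</code></pre>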
<h3>train_config section</h3>
<ul>
<li>initial_learning_rate: initial learning rate; 0.0003 and 0.0002 both work</li>
<li>data_augmentation_options: data augmentation options
<ul>
<li>random_adjust_brightness: randomly adjust brightness</li>
<li>random_image_scale: randomly scale the image</li>
<li>random_crop_to_aspect_ratio: randomly crop to a given aspect ratio<br>
These are the most commonly used augmentations</li>
</ul>
</li>
</ul>
<h3>train_input_reader section</h3>
<ul>
<li>tf_record_input_reader.input_path: path to the tfrecord file</li>
<li>label_map_path: path to the label map file</li>
<li>shuffle: whether to shuffle the samples so they are fed to training in random order</li>
</ul>
<h3>eval_config section</h3>
<ul>
<li>num_visualizations: number of images exported during evaluation; size it to your eval set, it does not need to be large; mainly used to visualize evaluation results</li>
<li>visualization_export_dir: directory where the evaluation images are saved</li>
</ul>
<h3>eval_input_reader section</h3>
<ul>
<li>tf_record_input_reader.input_path: path to the tfrecord file</li>
<li>label_map_path: path to the label map file</li>
<li>shuffle: whether to shuffle the samples so they are fed in random order</li>
<li>num_epochs: how many passes to make over the eval samples; usually left as-is</li>
</ul>
<h2>Running training</h2>
<p>Train:</p>
<pre><code># run the following from the object_detection project directory
python object_detection/train.py  --logtostderr --pipeline_config_path=F:/Workspaces/hongniu3sku/train/faster_rcnn_resnet101_20180530.config  --train_dir=F:/Workspaces/hongniu3sku/train/train_data/train/20180530

# pipeline_config_path: path to the training configuration file
# train_dir: directory where training checkpoints and intermediate files are saved
</code></pre>
<p>Evaluate:</p>
<pre><code># run the following from the object_detection project directory
python object_detection/eval.py --logtostderr  --pipeline_config_path=F:/Workspaces/hongniu3sku/train/faster_rcnn_resnet101_20180530.config  --checkpoint_dir=F:/Workspaces/hongniu3sku/train/train_data/train/20180530  --eval_dir=F:/Workspaces/hongniu3sku/train/train_data/eval/20180530

# pipeline_config_path: path to the training configuration file
# checkpoint_dir: directory holding the checkpoints produced by training
# eval_dir: directory where evaluation output is saved
</code></pre>
<p>Export the model:</p>
<pre><code># run the following from the object_detection project directory
python object_detection/export_inference_graph.py --input_type image_tensor --pipeline_config_path=F:/Workspaces/hongniu3sku/train/faster_rcnn_resnet101_20180530.config  --trained_checkpoint_prefix=F:/Workspaces/hongniu3sku/train/train_data/train/20180530/model.ckpt-157978  --output_directory=F:/Workspaces/hongniu3sku/train/train_data/export/20180530

# pipeline_config_path: path to the training configuration file
# trained_checkpoint_prefix: checkpoint to export; the number in model.ckpt-[step] selects which step's weights go into the final model
# output_directory: directory the final model is exported to
</code></pre>
<p>The export produces the following files:</p>
<pre><code>|- saved_model
|  |- variables
|  |- saved_model.pb   (model file used by TensorFlow Serving)
|- checkpoint (checkpoint bookkeeping file)
|- frozen_inference_graph.pb  (frozen graph for inference)
|- model.ckpt.*  (model data: weights, graph structure, etc.)
</code></pre>
<blockquote>
<p>It is recommended to keep checkpoint, frozen_inference_graph.pb, and model.ckpt.* after every training run, so the model can be refined later</p></blockquote>
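<p>Once exported, frozen_inference_graph.pb can be loaded for inference. A minimal sketch (assumes the standard output tensor names of the Object Detection API export: image_tensor, detection_boxes, detection_scores, detection_classes, num_detections; the image path is a placeholder):</p>
<pre><code class="python">import numpy as np
import tensorflow as tf
from PIL import Image

# load the frozen graph
graph = tf.Graph()
with graph.as_default():
    graph_def = tf.GraphDef()
    with tf.gfile.GFile('frozen_inference_graph.pb', 'rb') as f:
        graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

with tf.Session(graph=graph) as sess:
    image = np.array(Image.open('test.jpg'))  # HxWx3 uint8
    outputs = [graph.get_tensor_by_name(name + ':0') for name in
               ('detection_boxes', 'detection_scores',
                'detection_classes', 'num_detections')]
    boxes, scores, classes, num = sess.run(
        outputs, feed_dict={'image_tensor:0': image[None, ...]})
    print(int(num[0]), 'detections; top score:', scores[0, 0])
</code></pre>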
"
Tensorflow 模型训练
美文网首页
Tensorflow 模型训练

Tensorflow 模型训练

作者: HansenGuan | 来源:发表于2018-10-12 16:14 被阅读0次

    Tensorflow 环境搭建

    Windows GPU 版安装

    依赖软件包

    cuDNN v7.0.5 解压后将文件(bin、include、lib)拷贝到 CUDA 安装目录(NVIDIA GPU Computing Toolkit/CUDA/v9.0)下

    各个版本需要保持一致,不然会存在版本不一致问题,注意选择正确的系统版本

    python 环境安装(训练环境/开发环境)

    训练环境建议安装 Anaconda , 它是一个流行的进行数据科学研究的 python 平台,预安装了很多库,可以很方便的管理多个版本的 python 环境,实现 python 环境的自由切换

    Tensorflow 底层使用了 gRPC 框架,使用 Protocol Buffers 数据交换协议,protoc 工具是一个编译器,可以很方便将 proto 协议文件编译成供多个语言版本使用

    此处使用 3.4.0 版本,新版本编译命令可能不同,为避免后续出现错误,可以直接使用 3.4.0 版本

    安装

    1. 下载Anaconda并安装
    2. 配置环境变量 安装目录\Anaconda3;安装目录\Anaconda3\Scripts;安装目录\Anaconda3\Library\bin; 到 path( 系统环境变量)中;
      配置国内源
      # 添加Anaconda的TUNA镜像
      conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
      # 设置搜索时显示通道地址
      conda config --set show_channel_urls yes
    
    1. 安装 python 环境
    #查看系统当前已有的Python环境,
    conda info --envs
    #安装指定版本的 python 环境
    conda create --name py35 python=3.5
    #切换 python 环境
    activate py35
    #切回原来的Python环境
    deactivate py35
    #删除环境
    conda remove --name py35 --all
    

    python 3 环境下的 Tensorflow 安装

    install Tensorflow

    # For CPU
    pip install tensorflow==1.6
    # For GPU
    pip install tensorflow-gpu==1.6
    

    users can install dependencies using pip:

    pip install Cython
    pip install pillow
    pip install lxml
    pip install jupyter
    pip install matplotlib
    

    模型训练项目的编译准备

    • Protobuf Compilation
    protoc object_detection/protos/*.proto --python_out=.
    
    • Add Libraries to PYTHONPATH
    1. 在你的Anaconda3安装路/Anaconda3/Lib/site-packages 下新建一个txt文件 
    (我这里的安装路径是C:\ProgramData\Anaconda3\Lib\site-packages);如果安装有其他 python 环境,则在对应的环境目录(Anaconda3\envs\py35\Lib\site-packages)下新建一个txt文件 。
    
    2. 在新建的txt文件中写入自己对应的 Tensorflow object_detection 工程的目录路径:
    F:\project\project
    F:\project\project\slim
    
    3. 将文件名改为 tensorflow_model.pth (注意这里的后缀一定要以pth结尾)
    
    • Testing the Installation
    #From tensorflow/models/research/
    python object_detection/builders/model_builder_test.py
    

    模型训练

    样本标注

    使用 label_images 工具用于标记图片,生成 Pascal voc 格式的 标注文件

    生成 tensorflow 支持的 tfrecord 文件

    工作目录结构

    |- template
    |  |- annotations (标注文件)
    |  |- images (样本图片)
    |  |- label_maps
    |  |  |- *.pbtxt (标注映射文件,id 从 1 开始)
    

    脚本工具 - tfrecord_util.py 【python 3 环境】

    
    import os
    import io
    import tensorflow as tf
    
    from PIL import Image
    
    from object_detection.utils import dataset_util
    from object_detection.utils import label_map_util
    from collections import namedtuple
    import glob
    import pandas as pd
    import xml.etree.ElementTree as ET
    
    
    current_path = 'template所在目录'
    train_path = os.path.join(current_path, "template")
    # 图片标注文件目录
    annotations_dir = os.path.join(train_path, "annotations")
    # 图片目录
    images_path = os.path.join(train_path, "images")
    # 映射文件
    labels_path = os.path.join(train_path, "label_maps")
    labels_file = os.path.join(labels_path, "mscoco_label_map.pbtxt")
    # csv 文件(全路径)
    csv_file = os.path.join(train_path, "temp_csv_name.csv")
    # record 文件(全路径)
    tf_record_file = os.path.join(train_path, "tf_record_file.record")
    # ---------------------------------------------------------------------- xml operator
    
    def xml_to_csv(path):
        xml_list = []
        for xml_file in glob.glob(path + '/*.xml'):
            tree = ET.parse(xml_file)
            root = tree.getroot()
            for member in root.findall('object'):
                # if member[0].text != 'a_hn_101':
                #     continue
    
                file_path = root.find('path').text
                filename = file_path.split("/")[-1].split("\\")[-1]
                value = (filename,
                         int(root.find('size')[0].text),
                         int(root.find('size')[1].text),
                         member[0].text,
                         int(member[4][0].text),
                         int(member[4][1].text),
                         int(member[4][2].text),
                         int(member[4][3].text)
                         )
    
                xml_list.append(value)
        column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
        xml_df = pd.DataFrame(xml_list, columns=column_name)
        return xml_df
    
    # ---------------------------------------------------------------------- tfrecord operator
    
    
    classes_num = 100
    
    label_map = label_map_util.load_labelmap(labels_file)
    print("success loading label map file["+str(labels_file)+"]")
    # print('\n-------------label_map------------------\n')
    # print(label_map)
    # categories array [{'id':id,'name':name},···]
    categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=classes_num, use_display_name=True)
    # category_index  dic  {id : {'id':id,'name':name}, ···}
    # category_index = label_map_util.create_category_index(categories)
    
    # category_index  dic  {name : {'id':id,'name':name}, ···}
    category_index = {}
    for cat in categories:
        category_index[cat['name']] = cat
    print(category_index)
    print("success generating categories dic")
    
    
    def class_text_to_int(row_label):
        if row_label in category_index.keys():
            # print(str(category_index[row_label]['id']))
            return category_index[row_label]['id']
        else:
            # print(row_label)
            return 0
    
    
    def split(df, group):
        data = namedtuple('data', ['filename', 'object'])
        gb = df.groupby(group)
        return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
    
    
    def create_tf_example(group, path):
    
        with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
            encoded_jpg = fid.read()
        encoded_jpg_io = io.BytesIO(encoded_jpg)
        image = Image.open(encoded_jpg_io)
        width, height = image.size
    
        filename = group.filename.encode('utf8')
        # image_format = b'jpg'
        if image.format != 'JPEG':
            print(group.filename)
            raise ValueError('Image format not JPEG')
        else:
            image_format = b'jpg'
        xmins = []
        xmaxs = []
        ymins = []
        ymaxs = []
        classes_text = []
        classes = []
    
        for index, row in group.object.iterrows():
    
            if class_text_to_int(row['class']) == 0:
                print(group.filename)
                # print(row['class'].encode('utf8'))
                continue
            xmins.append(row['xmin'] / width)
            xmaxs.append(row['xmax'] / width)
            ymins.append(row['ymin'] / height)
            ymaxs.append(row['ymax'] / height)
            classes_text.append(row['class'].encode('utf8'))
            classes.append(class_text_to_int(row['class']))
    
        tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))
        return tf_example
    
    # ----------------------------------------------------------------------
    
    
    def generate_tf_record_file(recreate=True):
        """
        generate the tensorflow record file from label xml files which belongs sample images
        :param recreate:  if create a new record file
        :return:  tf_record_file path
        """
        if recreate:
            # 1. 读取图片标注文件目录下的所有 xml 文件,并转化为 csv 文件
            xml_df = xml_to_csv(annotations_dir)
            xml_df.to_csv(csv_file, index=None)
            print('Successfully converted xml['+str(annotations_dir)+'] to csv['+str(csv_file)+'].')
    
            print(csv_file)
            # 2. 将 csv 文件转 record 文件
            examples = pd.read_csv(csv_file)
            grouped = split(examples, 'filename')
    
            writer = tf.python_io.TFRecordWriter(tf_record_file)
            for group in grouped:
                try:
                    tf_example = create_tf_example(group, images_path)
                except:
                    print(group.filename)
                    continue
                writer.write(tf_example.SerializeToString())
            writer.close()
            print('Successfully created the TFRecords: {}'.format(tf_record_file))
            return tf_record_file
    
        else:
            # TODO - look up the exist file
            return None
    
    def main(_):
        my_tf_record_file = generate_tf_record_file()
        print(my_tf_record_file)
    
    if __name__ == '__main__':
        tf.app.run()
    

    模型训练相关配置

    配置文件 faster_rcnn_resnet101.config

    # Faster R-CNN with Resnet-101 (v1) configuration for MSCOCO Dataset.
    # Users should configure the fine_tune_checkpoint field in the train config as
    # well as the label_map_path and input_path fields in the train_input_reader and
    # eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
    # should be configured.
    
    model {
      faster_rcnn {
        num_classes: 23
        image_resizer {
          keep_aspect_ratio_resizer {
            min_dimension: 1024
            max_dimension: 1280  
          }
        }
        feature_extractor {
          type: 'faster_rcnn_resnet101'
          first_stage_features_stride: 16
        }
        first_stage_anchor_generator {
          grid_anchor_generator {
            scales: [0.25, 0.5, 1.0, 2.0]
            aspect_ratios: [0.5, 1.0, 2.0]
            height_stride: 16
            width_stride: 16
          }
        }
        first_stage_box_predictor_conv_hyperparams {
          op: CONV
          regularizer {
            l2_regularizer {
              weight: 0.0
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.01
            }
          }
        }
        first_stage_nms_score_threshold: 0.0
        first_stage_nms_iou_threshold: 0.6
        first_stage_max_proposals: 400
        first_stage_localization_loss_weight: 2.0
        first_stage_objectness_loss_weight: 1.0
        initial_crop_size: 14
        maxpool_kernel_size: 2
        maxpool_stride: 2
        second_stage_box_predictor {
          mask_rcnn_box_predictor {
            use_dropout: false
            dropout_keep_probability: 1.0
            fc_hyperparams {
              op: FC
              regularizer {
                l2_regularizer {
                  weight: 0.0
                }
              }
              initializer {
                variance_scaling_initializer {
                  factor: 1.0
                  uniform: true
                  mode: FAN_AVG
                }
              }
            }
          }
        }
        second_stage_post_processing {
          batch_non_max_suppression {
            score_threshold: 0.0
            iou_threshold: 0.7
            max_detections_per_class: 100
            max_total_detections: 300
          }
          score_converter: SOFTMAX
        }
        second_stage_localization_loss_weight: 2.0
        second_stage_classification_loss_weight: 1.0
      }
    }
    
    train_config: {
      batch_size: 1
      optimizer {
        momentum_optimizer: {
          learning_rate: {
            manual_step_learning_rate {
              initial_learning_rate: 0.0002
              schedule {
                step: 900000
                learning_rate: .00003
              }
              schedule {
                step: 1200000
                learning_rate: .000003
              }
            }
          }
          momentum_optimizer_value: 0.9
        }
        use_moving_average: false
      }
      gradient_clipping_by_norm: 10.0
      # fine_tune_checkpoint: "F:/project/project/faster_rcnn_resnet101_coco_2018_01_28/model.ckpt"
      # from_detection_checkpoint: true
      # Note: The below line limits the training process to 200K steps, which we
      # empirically found to be sufficient enough to train the pets dataset. This
      # effectively bypasses the learning rate schedule (the learning rate will
      # never decay). Remove the below line to train indefinitely.
      #num_steps: 10000
      data_augmentation_options {
        random_adjust_brightness {
          max_delta: 0.1
        }
      }
      data_augmentation_options {
        random_image_scale {
          min_scale_ratio: 0.8
          max_scale_ratio: 1.2
        }
      }
      #data_augmentation_options {
      #  random_crop_to_aspect_ratio {
      #  }
      #}
    
      #data_augmentation_options {
      #  random_adjust_contrast {
      #      min_delta: 0.5
      #      max_delta: 1.5
      #  }
      #}
      #data_augmentation_options {
      #  random_adjust_saturation {
      #    min_delta: 0.5
      #    max_delta: 1.5
      #  }
      #}
    }
    
    train_input_reader: {
      tf_record_input_reader {
        input_path: "D:/Workspace/train_dir/all/tf_record_file_23_3035_20180724.record"
      }
      label_map_path: "D:/Workspace/train_dir/all/mscoco_label_map_23.pbtxt"
      shuffle: true
    }
    
    eval_config: {
      # num_examples: 1
      num_visualizations: 200
      # Note: The below line limits the evaluation process to 10 evaluations.
      # Remove the below line to evaluate indefinitely.
      max_evals: 2
      visualization_export_dir: "D:/Workspace/train_dir/all/20180724/eval/exportfrcnn"
    }
    
    eval_input_reader: {
      tf_record_input_reader {
        input_path: "D:/Workspace/train_dir/all/tf_record_file_23_3035_20180724_eval.record"
      }
      label_map_path: "D:/Workspace/train_dir/all/mscoco_label_map_23.pbtxt"
      shuffle: true
      num_readers: 5
      num_epochs: 1
    }
    

    配置文件主要分为 5 个部分:

    • model :定义 神经网络模型结构,及相关超参数
    • train_config: 训练相关配置
    • train_input_reader: 训练样本输入相关配置
    • eval_config: 模型评估相关配置
    • eval_input_reader:评估样本输入相关配置

    model 部分

    • num_classes 对应待检测物体的总数(一共有多少个标注样本)
    • keep_aspect_ratio_resizer.min_dimension、keep_aspect_ratio_resizer.max_dimension 控制样本输入缩放后的大小
    • feature_extractor.first_stage_features_stride 第一阶段特征提取步长,训练时可以保持 16 不变,如果样本中 sku 比较密集,多是远拍,sku 比较小,16 的情况下的训练效果不佳,可以考虑减小该值为 8
    • grid_anchor_generator.height_stride、grid_anchor_generator.width_stride 物体框训练时的滑动步长,训练时可以保持 16 不变,如果样本中 sku 比较密集,多是远拍,sku 比较小,如果样本中 sku 比较密集,多是远拍,sku 比较小,16 的情况下的训练效果不佳,可以考虑减小该值为 8
    • first_stage_nms_iou_threshold 第一阶段框 IOU 阈值,可以适当减小来增大查全率,但相应准确率可能降低,范围 0~1
    • first_stage_max_proposals 第一阶段选取得推荐框的个数,可以适当增大来增大查全率,但相应准确率可能降低
    • batch_non_max_suppression.iou_threshold 第二阶段 IOU 阈值,可以适当减小来增大查全率,但相应准确率可能降低,范围 0~1
    • batch_non_max_suppression.max_detections_per_class 每类样本的最大检测数量
    • batch_non_max_suppression.max_total_detections 所有样本的最大检测数量

    train_config 部分

    • initial_learning_rate 初始学习率 , 0.0003、0.0002都可以
    • data_augmentation_options 数据增强选项
      • random_adjust_brightness 随机调节亮度
      • random_image_scale 随机缩放图片大小
      • random_crop_to_aspect_ratio 随机裁剪到指定比例大小
        以上几类增强比较常用

    train_input_reader 部分

    • tf_record_input_reader.input_path 指定 tfrecord 文件路径
    • label_map_path 指定标注映射文件路径
    • shuffle 是否打乱样本原有顺序,随机输入训练

    eval_config 部分

    • num_visualizations 评估导出图片数量,根据评估输入样本决定,不用太大,主要用于评估结果的可视化
    • visualization_export_dir 指定评估图片的保存路径

    eval_input_reader 部分

    • tf_record_input_reader.input_path 指定 tfrecord 文件路径
    • label_map_path 指定标注映射文件路径
    • shuffle 是否打乱样本原有顺序,随机输入训练
    • num_epochs 评估样本几次,一般不用改

    模型训练

    训练:

    # object_detection 工程所在目录下,执行如下命令
    python object_detection/train.py  --logtostderr --pipeline_config_path=F:/Workspaces/hongniu3sku/train/faster_rcnn_resnet101_20180530.config  --train_dir=F:/Workspaces/hongniu3sku/train/train_data/train/20180530
    
    # pipeline_config_path :训练配置文件所在路径
    # train_dir : 训练所产生的中间文件保存目录
    

    评估:

    # object_detection 工程所在目录下,执行如下命令
    python object_detection/eval.py --logtostderr  --pipeline_config_path=F:/Workspaces/hongniu3sku/train/faster_rcnn_resnet101_20180530.config  --checkpoint_dir=F:/Workspaces/hongniu3sku/train/train_data/train/20180530  --eval_dir=F:/Workspaces/hongniu3sku/train/train_data/eval/20180530
    
    # pipeline_config_path :训练配置文件所在路径
    # checkpoint_dir : 指定训练时所产生的中间文件的保存目录
    # eval_dir: 评估时所产生的中间文件保存目录
    

    导出模型:

    # object_detection 工程所在目录下,执行如下命令
    python object_detection/export_inference_graph.py --input_type image_tensor --pipeline_config_path=F:/Workspaces/hongniu3sku/train/faster_rcnn_resnet101_20180530.config  --trained_checkpoint_prefix=F:/Workspaces/hongniu3sku/train/train_data/train/20180530/model.ckpt-157978  --output_directory=F:/Workspaces/hongniu3sku/train/train_data/export/20180530
    
    # pipeline_config_path :训练配置文件所在路径
    # trained_checkpoint_prefix:指定模型导出使用的中间文件 ,model.ckpt-【数字】 对应导出哪一步的参数到最终模型中
    # output_directory:指定模型最终的导出目录
    

    最终导出的文件有:

    |- saved_model
    |  |- variables
    |  |- saved_model.pb   (tensorflow serving 使用的模型文件)
    |- checkpoint (检查点临时文件)
    |- frozen_inference_graph.pb  (冻结参数的用于推理的图文件)
    |- model.ckpt.*  (模型数据,参数、结构等)
    

    建议每次训练后 checkpoint、frozen_inference_graph.pb、model.ckpt.* 都保存,方便后续对该模型进行优化

    相关文章

      网友评论

          本文标题:Tensorflow 模型训练

          本文链接:https://www.haomeiwen.com/subject/mupvoftx.html