TensorFlow Study Notes: Image Classification with Inception v3

Author: DexterLei | Published: 2017-10-03 19:32 | Views: 3,685
    Inception

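    0. Introduction to the Google Inception Model

    Inception is a CNN model open-sourced by Google. Four versions have been released so far, each trained on data from the large-scale ImageNet image database, so we can use Google's Inception models directly for image classification. This article is based on the Inception_v3 model. Inception v3 has roughly 25 million parameters and needs about 5 billion multiply-add operations to classify a single image. Even so, on a modern PC without a GPU, classifying one image takes only a fraction of a second.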

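    1. Evolution of the Google Inception Model

    Each of the four Inception versions has a corresponding paper, listed with its ILSVRC top-5 error rate:

    [Table: the paper for each of the four Inception versions, with its ILSVRC top-5 error rate]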

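    2. Downloading the Inception_v3 Model

    Extract the downloaded archive to get the following files: [Figure: file list]
    • classify_image_graph_def.pb is the Inception_v3 model itself (a serialized GraphDef).
    • imagenet_2012_challenge_label_map_proto.pbtxt has the following content: [Figure: imagenet_2012_challenge_label_map_proto.pbtxt]

      Each entry contains target_class and target_class_string. The former is the integer class code, running from 1 to 1000 (1k classes in total), referred to here as Node_ID. The latter is an identifier string of the form "n********", which acts as the "bridge" between the two label files and is referred to here as UID.

    • imagenet_synset_to_human_label_map.txt has the following content: [Figure: imagenet_synset_to_human_label_map.txt]

      It maps each UID to a class name; this text label is referred to as human_string.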
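
    The Node_ID -> UID -> human_string chain described above can be sketched without TensorFlow. The sample text below is made up for illustration: the `entry { ... }` wrapper, the class number 7 and the label text are assumptions, and only the field names and the tab-separated layout follow the description above. The helper names parse_label_map and parse_synsets are hypothetical.

```python
# Made-up sample data in the two layouts described above; the IDs and the
# label text are placeholders, not values copied from the real files.
LABEL_MAP_PBTXT = '''entry {
  target_class: 7
  target_class_string: "n01484850"
}
'''
SYNSET_TXT = "n01484850\texample label, another name\n"


def parse_label_map(text):
    """Return {Node_ID: UID} from pbtxt-style text."""
    node_id_to_uid = {}
    node_id = None
    for line in text.splitlines():
        key, _, value = line.strip().partition(': ')
        if key == 'target_class':
            node_id = int(value)
        elif key == 'target_class_string':
            # The UID is wrapped in double quotes in the file.
            node_id_to_uid[node_id] = value.strip('"')
    return node_id_to_uid


def parse_synsets(text):
    """Return {UID: human_string} from tab-separated text."""
    uid_to_human = {}
    for line in text.splitlines():
        uid, _, human = line.partition('\t')
        if uid:
            uid_to_human[uid] = human
    return uid_to_human


node_id_to_uid = parse_label_map(LABEL_MAP_PBTXT)
uid_to_human = parse_synsets(SYNSET_TXT)
# Chain the two lookups: Node_ID -> UID -> human_string.
node_id_to_name = {n: uid_to_human[u] for n, u in node_id_to_uid.items()}
print(node_id_to_name)
```

    The NodeLookup class in section 4 builds the same kind of dictionary from the real files.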

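    3. Preparation

    Download any image from the web and name it husky.jpg:


    [Figure: husky.jpg]

    The code below uses the Inception_v3 model to classify this husky image.

    4. Code

    First, define a class NodeLookup that maps the indices of the softmax probabilities to labels; then define a function create_graph() that reads the model file and builds the graph; finally, read the husky image and classify it: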

    # -*- coding: utf-8 -*-
    
    import tensorflow as tf
    import numpy as np
    #import re
    import os
    
    model_dir='C:/Users/Dexter/Documents/ML_files/171003_Inception_v3/Inception_model'
    image = 'C:/Users/Dexter/Documents/ML_files/171003_Inception_v3/Images/husky.jpg'
    
    
    # Convert class IDs into human-readable labels
    class NodeLookup(object):
        def __init__(self, label_lookup_path=None, uid_lookup_path=None):
            if not label_lookup_path:
                # Default path of the label_lookup_path file.
                # This file maps each class number in the dataset (1-1000) to an
                # identifier called target_class_string,
                # encoded as "n********" (each asterisk stands for a digit).
                label_lookup_path = os.path.join(
                        model_dir, 'imagenet_2012_challenge_label_map_proto.pbtxt')
            if not uid_lookup_path:
                # Default path of the uid_lookup_path file.
                # This file maps each "n********" identifier (UID) to the
                # human-readable name of its class.
                uid_lookup_path = os.path.join(
                        model_dir, 'imagenet_synset_to_human_label_map.txt')
            self.node_lookup = self.load(label_lookup_path, uid_lookup_path)
    
        def load(self, label_lookup_path, uid_lookup_path):
            if not tf.gfile.Exists(uid_lookup_path):
                # Check up front that the file exists
                tf.logging.fatal('File does not exist %s', uid_lookup_path)
            if not tf.gfile.Exists(label_lookup_path):
                # Check up front that the file exists
                tf.logging.fatal('File does not exist %s', label_lookup_path)


            # Loads mapping from string UID to human-readable string
            # (the dictionary uid_to_human: "n********" UID -> class name)

            # Read every line of uid_lookup_path.
            # readlines(): Returns all lines from the file in a list.
            # Leaves the '\n' at the end.
            proto_as_ascii_lines = tf.gfile.GFile(uid_lookup_path).readlines()

            # Empty dictionary that will hold the UID -> class name mapping
            uid_to_human = {}
    # =============================================================================
    #         # Alternative: parse the file with a regular expression:
    #         p = re.compile(r'[n\d]*[ \S,]*')
    #         for line in proto_as_ascii_lines:
    #             parsed_items = p.findall(line)
    #             uid = parsed_items[0]
    #             human_string = parsed_items[2]
    #             uid_to_human[uid] = human_string
    # =============================================================================
            # Simple approach to parsing the file:
            # process the data line by line
            for line in proto_as_ascii_lines:
                # Strip the trailing newline
                line = line.strip('\n')
                # Split on '\t' (tab) into two parts
                parse_items = line.split('\t')
                # The class identifier, i.e. the UID
                uid = parse_items[0]
                # The class name
                human_string = parse_items[1]
                # Store the UID ("n********") -> class name entry in uid_to_human
                uid_to_human[uid] = human_string
                
    
            # Loads mapping from string UID to integer node ID.
            # (stored below as the dictionary node_id_to_uid: node ID -> UID)

            # Read the file that pairs each "n********" UID with a class number 1-1000
            proto_as_ascii = tf.gfile.GFile(label_lookup_path).readlines()
            # Empty dictionary that will hold the node ID -> UID mapping
            node_id_to_uid = {}
            for line in proto_as_ascii:
                # Note the two leading spaces
                if line.startswith('  target_class:'):
                    # The class number
                    target_class = int(line.split(': ')[1])
                if line.startswith('  target_class_string:'):
                    # The UID, still wrapped in double quotes, e.g. "n01484850"
                    target_class_string = line.split(': ')[1]
                    # Slice off the leading quote and the trailing quote plus
                    # newline, then record the mapping
                    node_id_to_uid[target_class] = target_class_string[1:-2]
        
            # Loads the final mapping of integer node ID to human-readable string
            node_id_to_name = {}
            for key, val in node_id_to_uid.items():
                # Report an error if the UID is missing from uid_to_human
                if val not in uid_to_human:
                    tf.logging.fatal('Failed to locate: %s', val)
                # Look up the class name
                name = uid_to_human[val]
                # Map class number (1-1000) to class name: key is node_id, value is name
                node_id_to_name[key] = name

            return node_id_to_name

        # Given a class number (1-1000), return the class name
        def id_to_string(self, node_id):
            # Return an empty string if the ID is unknown
            if node_id not in self.node_lookup:
                return ''
            return self.node_lookup[node_id]
    
    # Read the Inception_v3 model trained by Google and import it into the default graph
    def create_graph():
        with tf.gfile.FastGFile(os.path.join(
                model_dir, 'classify_image_graph_def.pb'), 'rb') as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
            tf.import_graph_def(graph_def, name='')

    # Read the image as raw bytes
    image_data = tf.gfile.FastGFile(image, 'rb').read()

    # Build the graph
    create_graph()
    
    # Create a session; the graph is restored from the pretrained Inception_v3
    # model, so no variable initialization is needed
    with tf.Session() as sess:
        # Output of the final softmax layer of Inception_v3.
        # A name like 'conv1' refers to a node, while 'conv1:0' refers to a tensor:
        # the first output of that node
        softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')
        # Feed the raw JPEG bytes and get the softmax probabilities
        # (an array of shape (1, 1008))
        predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0': image_data})
        # Flatten the result to 1-D
        predictions = np.squeeze(predictions)
        # Build the lookup: ID --> English string label.
        node_lookup = NodeLookup()
        # Sort and take the five highest probabilities (top-5).
        # argsort() returns the indices that would sort the array in ascending order
        top_5 = predictions.argsort()[-5:][::-1]
        for node_id in top_5:
            # Class name
            human_string = node_lookup.id_to_string(node_id)
            # Confidence for this class
            score = predictions[node_id]
            print('%s (score = %.5f)' % (human_string, score))
    

    Final output:

    runfile('C:/Users/Dexter/Documents/ML_files/171003_Inception_v3/test.py', wdir='C:/Users/Dexter/Documents/ML_files/171003_Inception_v3')
    Siberian husky (score = 0.51033)
    Eskimo dog, husky (score = 0.41048)
    malamute, malemute, Alaskan malamute (score = 0.00653)
    kelpie (score = 0.00136)
    dogsled, dog sled, dog sleigh (score = 0.00133)
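
    To see why the slice [-5:][::-1] in the listing yields the best classes in descending order, here is a minimal sketch in pure Python. The argsort function below is a stand-in written for this example that imitates what numpy's argsort returns for a 1-D array; the scores are invented, and three classes are kept instead of five to keep the example short.

```python
def argsort(values):
    """Pure-Python stand-in for argsort: indices in ascending order of value."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical scores for six classes (not real model output).
scores = [0.02, 0.51, 0.01, 0.41, 0.03, 0.02]

order = argsort(scores)      # indices from lowest score to highest
top_3 = order[-3:][::-1]     # keep the last three, then reverse: highest first
print(top_3)
for node_id in top_3:
    print(node_id, scores[node_id])
```

    Taking the tail of the ascending order and reversing it avoids a separate descending sort.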
    

    With a small change, the code can take multiple images as input and output each image's path, the image itself, and its predictions:

    # -*- coding: utf-8 -*-
    """
    Created on Fri Oct  6 19:32:04 2017
    test2: changes test so that it takes a folder of images instead of a single image, and makes the output visible
    @author: Dexter
    """
    
    import tensorflow as tf
    import numpy as np
    #import re
    import os
    from PIL import Image
    import matplotlib.pyplot as plt
    
    model_dir='C:/Users/Dexter/Documents/ML_files/171003_Inception_v3/Inception_model'
    image = 'C:/Users/Dexter/Documents/ML_files/171003_Inception_v3/Images/'
    
    
    # Convert class IDs into human-readable labels
    class NodeLookup(object):
        def __init__(self, label_lookup_path=None, uid_lookup_path=None):
            if not label_lookup_path:
                # Default path of the label_lookup_path file.
                # This file maps each class number in the dataset (1-1000) to an
                # identifier called target_class_string,
                # encoded as "n********" (each asterisk stands for a digit).
                label_lookup_path = os.path.join(
                        model_dir, 'imagenet_2012_challenge_label_map_proto.pbtxt')
            if not uid_lookup_path:
                # Default path of the uid_lookup_path file.
                # This file maps each "n********" identifier (UID) to the
                # human-readable name of its class.
                uid_lookup_path = os.path.join(
                        model_dir, 'imagenet_synset_to_human_label_map.txt')
            self.node_lookup = self.load(label_lookup_path, uid_lookup_path)
    
        def load(self, label_lookup_path, uid_lookup_path):
            if not tf.gfile.Exists(uid_lookup_path):
                # Check up front that the file exists
                tf.logging.fatal('File does not exist %s', uid_lookup_path)
            if not tf.gfile.Exists(label_lookup_path):
                # Check up front that the file exists
                tf.logging.fatal('File does not exist %s', label_lookup_path)


            # Loads mapping from string UID to human-readable string
            # (the dictionary uid_to_human: "n********" UID -> class name)

            # Read every line of uid_lookup_path.
            # readlines(): Returns all lines from the file in a list.
            # Leaves the '\n' at the end.
            proto_as_ascii_lines = tf.gfile.GFile(uid_lookup_path).readlines()

            # Empty dictionary that will hold the UID -> class name mapping
            uid_to_human = {}
    # =============================================================================
    #         # Alternative: parse the file with a regular expression:
    #         p = re.compile(r'[n\d]*[ \S,]*')
    #         for line in proto_as_ascii_lines:
    #             parsed_items = p.findall(line)
    #             uid = parsed_items[0]
    #             human_string = parsed_items[2]
    #             uid_to_human[uid] = human_string
    # =============================================================================
            # Simple approach to parsing the file:
            # process the data line by line
            for line in proto_as_ascii_lines:
                # Strip the trailing newline
                line = line.strip('\n')
                # Split on '\t' (tab) into two parts
                parse_items = line.split('\t')
                # The class identifier, i.e. the UID
                uid = parse_items[0]
                # The class name
                human_string = parse_items[1]
                # Store the UID ("n********") -> class name entry in uid_to_human
                uid_to_human[uid] = human_string
                
    
            # Loads mapping from string UID to integer node ID.
            # (stored below as the dictionary node_id_to_uid: node ID -> UID)

            # Read the file that pairs each "n********" UID with a class number 1-1000
            proto_as_ascii = tf.gfile.GFile(label_lookup_path).readlines()
            # Empty dictionary that will hold the node ID -> UID mapping
            node_id_to_uid = {}
            for line in proto_as_ascii:
                # Note the two leading spaces
                if line.startswith('  target_class:'):
                    # The class number
                    target_class = int(line.split(': ')[1])
                if line.startswith('  target_class_string:'):
                    # The UID, still wrapped in double quotes, e.g. "n01484850"
                    target_class_string = line.split(': ')[1]
                    # Slice off the leading quote and the trailing quote plus
                    # newline, then record the mapping
                    node_id_to_uid[target_class] = target_class_string[1:-2]
        
            # Loads the final mapping of integer node ID to human-readable string
            node_id_to_name = {}
            for key, val in node_id_to_uid.items():
                # Report an error if the UID is missing from uid_to_human
                if val not in uid_to_human:
                    tf.logging.fatal('Failed to locate: %s', val)
                # Look up the class name
                name = uid_to_human[val]
                # Map class number (1-1000) to class name: key is node_id, value is name
                node_id_to_name[key] = name

            return node_id_to_name

        # Given a class number (1-1000), return the class name
        def id_to_string(self, node_id):
            # Return an empty string if the ID is unknown
            if node_id not in self.node_lookup:
                return ''
            return self.node_lookup[node_id]
    
    # Read the Inception_v3 model trained by Google and import it into the default graph
    def create_graph():
        with tf.gfile.FastGFile(os.path.join(
                model_dir, 'classify_image_graph_def.pb'), 'rb') as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
            tf.import_graph_def(graph_def, name='')

    # Build the graph
    create_graph()
    
    # Create a session; the graph is restored from the pretrained Inception_v3
    # model, so no variable initialization is needed
    with tf.Session() as sess:
        # Output of the final softmax layer of Inception_v3
        softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')

        # Walk the directory
        for root, dirs, files in os.walk('images/'):
            for file in files:
                # Load the image as raw bytes
                image_data = tf.gfile.FastGFile(os.path.join(root, file), 'rb').read()
                # Feed the raw JPEG bytes and get the softmax probabilities
                # (an array of shape (1, 1008))
                predictions = sess.run(softmax_tensor,{'DecodeJpeg/contents:0': image_data})
                # Flatten the result to 1-D
                predictions = np.squeeze(predictions)

                # Print the image path and file name
                image_path = os.path.join(root, file)
                print(image_path)
                # Display the image
                img = Image.open(image_path)
                plt.imshow(img)
                plt.axis('off')
                plt.show()

                # Build the lookup: ID --> English string label.
                node_lookup = NodeLookup()
                # Sort and take the five highest probabilities (top-5).
                # argsort() returns the indices that would sort the array in ascending order
                top_5 = predictions.argsort()[-5:][::-1]
                for node_id in top_5:
                    # Class name
                    human_string = node_lookup.id_to_string(node_id)
                    # Confidence for this class
                    score = predictions[node_id]
                    print('%s (score = %.5f)' % (human_string, score))
                print()
    

    Final output:

    runfile('C:/Users/Dexter/Documents/ML_files/171003_Inception_v3/test2.py', wdir='C:/Users/Dexter/Documents/ML_files/171003_Inception_v3')
    images/dog.jpg
    
    dingo, warrigal, warragal, Canis dingo (score = 0.46103)
    Chihuahua (score = 0.05741)
    Eskimo dog, husky (score = 0.04384)
    dhole, Cuon alpinus (score = 0.04106)
    Pembroke, Pembroke Welsh corgi (score = 0.02823)
    
    images/husky.jpg
    
    Siberian husky (score = 0.51033)
    Eskimo dog, husky (score = 0.41048)
    malamute, malemute, Alaskan malamute (score = 0.00653)
    kelpie (score = 0.00136)
    dogsled, dog sled, dog sleigh (score = 0.00133)
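
    The folder loop above feeds every file that os.walk finds into 'DecodeJpeg/contents:0', so a stray non-JPEG file in the folder would likely make the run fail. One possible guard, sketched here with the standard library only, is to collect the JPEG paths first. The helper name list_jpeg_files and the accepted extensions are assumptions made for this example, not part of the original script.

```python
import os
import tempfile

JPEG_EXTENSIONS = ('.jpg', '.jpeg')


def list_jpeg_files(root_dir):
    """Return sorted paths of all files under root_dir with a JPEG extension."""
    paths = []
    for root, _dirs, files in os.walk(root_dir):
        for name in files:
            # Case-insensitive match on the file extension.
            if name.lower().endswith(JPEG_EXTENSIONS):
                paths.append(os.path.join(root, name))
    return sorted(paths)


# Demonstration on a throwaway directory with placeholder (empty) files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('dog.jpg', 'husky.JPEG', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    found = [os.path.basename(p) for p in list_jpeg_files(tmp)]
    print(found)
```

    In the script, the nested os.walk loops could then be replaced by a single loop over list_jpeg_files('images/'). Creating NodeLookup() once before that loop would also avoid re-reading the two label files for every image.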
    

    5. Notes on Related Functions

    • tf.get_default_graph()
      Returns the default graph for the current thread (a different graph can be made the default with Graph.as_default()).

    Returns the default graph for the current thread.
    The returned graph will be the innermost graph on which a Graph.as_default() context has been entered, or a global default graph if none has been explicitly created.
    NOTE: The default graph is a property of the current thread. If you create a new thread, and wish to use the default graph in that thread, you must explicitly add a with g.as_default(): in that thread's function.

    Returns:
    The default Graph being used in the current thread.


    • tf.Graph.as_default()
      Makes this Graph the default graph.

    Returns a context manager that makes this Graph the default graph.


    • tf.Graph.get_tensor_by_name()

    Every node (op) in a graph has a string name, which you can list as follows:

    [node.name for node in tf.get_default_graph().as_graph_def().node]
    

    Once you know a node's name, you can fetch its output Tensor as <name>:0 (the 0 is the output index, which looks redundant for ops with a single output).

    import tensorflow as tf
    
    c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
    e = tf.matmul(c, d, name='example')
    
    with tf.Session() as sess:
        test = sess.run(e)
        print (e.name)  
        #example:0
        #<name>:0 (0 refers to endpoint which is somewhat redundant)
        test = tf.get_default_graph().get_tensor_by_name("example:0")
        print (test)    
        #Tensor("example:0", shape=(2, 2), dtype=float32)
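
    The "<name>:<index>" convention can be made concrete with a small string helper. This is only an illustrative sketch: split_tensor_name is a hypothetical function written for this note, not a TensorFlow API, and it does nothing more than split the name at the last colon.

```python
def split_tensor_name(tensor_name):
    """Split '<op_name>:<output_index>' into (op_name, output_index)."""
    op_name, sep, index = tensor_name.rpartition(':')
    if not sep or not op_name or not index.isdigit():
        raise ValueError('expected "<op_name>:<output_index>", got %r' % tensor_name)
    return op_name, int(index)


# Tensor names that appear in this article.
for name in ('example:0', 'softmax:0', 'DecodeJpeg/contents:0'):
    print(split_tensor_name(name))
```

    A bare node name such as 'softmax' is rejected by this helper, which mirrors the distinction between node names and tensor names drawn above.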
    

    6. Some Improvements

    6.1 Using PNG or other image formats instead of JPEG as input

    The shipped InceptionV3 graph used in classify_image.py only supports JPEG images out-of-the-box. There are two ways you could use this graph with PNG images:

    1. Convert the PNG image to a height x width x 3 (channels) Numpy array, for example using PIL, then feed the 'DecodeJpeg:0' tensor:
    import numpy as np
    from PIL import Image
    # ...
    
    image = Image.open("example.png")
    image_array = np.array(image)[:, :, 0:3]  # Select RGB channels only.
    
    prediction = sess.run(softmax_tensor, {'DecodeJpeg:0': image_array})
    

    Perhaps confusingly, 'DecodeJpeg:0' is the output of the DecodeJpeg op, so by feeding this tensor, you are able to feed raw image data.

    2. Add a tf.image.decode_png() op to the imported graph. Simply switching the name of the fed tensor from 'DecodeJpeg/contents:0' to 'DecodePng/contents:0' does not work because there is no 'DecodePng' op in the shipped graph. You can add such a node to the graph by using the input_map argument to tf.import_graph_def():
    png_data = tf.placeholder(tf.string, shape=[])
    decoded_png = tf.image.decode_png(png_data, channels=3)
    # ...
    
    graph_def = ...
    softmax_tensor = tf.import_graph_def(
        graph_def,
        input_map={'DecodeJpeg:0': decoded_png},
        return_elements=['softmax:0'])
    
    sess.run(softmax_tensor, {png_data: ...})
    
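    The following code should handle both cases:
    import numpy as np
    import tensorflow as tf
    from PIL import Image
    
    image_file = 'test.jpeg'
    with tf.Session() as sess:
    
        # 'final_result:0' is the output tensor name in the graph this snippet was
        # written for; the classify_image graph used earlier exposes 'softmax:0'.
        #     softmax_tensor = sess.graph.get_tensor_by_name('final_result:0')
        if image_file.lower().endswith(('.jpg', '.jpeg')):
            # JPEG: feed the encoded bytes to the input of the DecodeJpeg op
            image_data = tf.gfile.FastGFile(image_file, 'rb').read()
            prediction = sess.run('final_result:0', {'DecodeJpeg/contents:0': image_data})
        elif image_file.lower().endswith('.png'):
            # PNG: decode with PIL and feed the output of the DecodeJpeg op
            image = Image.open(image_file)
            image_array = np.array(image)[:, :, 0:3]
            prediction = sess.run('final_result:0', {'DecodeJpeg:0': image_array})
        else:
            raise ValueError('Unsupported image format: %s' % image_file)
    
        prediction = prediction[0]    
        print(prediction)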
    

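    A shorter version feeds the raw file bytes directly through a string placeholder. Note that it is only a fragment: ph must be connected to the imported graph (for example through a decode op passed via input_map, as in the tf.import_graph_def() example above), and output_layer_name must hold the name of the output tensor:

    image_file = 'test.png' # or 'test.jpeg'
    image_data = tf.gfile.FastGFile(image_file, 'rb').read()
    # String placeholder for the encoded image bytes; not yet wired into the graph
    ph = tf.placeholder(tf.string, shape=[])
    
    with tf.Session() as sess:        
        # output_layer_name is not defined in this fragment
        predictions = sess.run(output_layer_name, {ph: image_data} )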
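
    The choice between the two feed points in this section ('DecodeJpeg/contents:0' for encoded JPEG bytes, 'DecodeJpeg:0' for an already decoded array) can be isolated into a small helper. The sketch below is pure Python and does not run the model; choose_feed_tensor is a hypothetical name, and treating both '.jpg' and '.jpeg' as JPEG is an assumption of this example.

```python
import os

# Tensor names taken from the snippets above; which one to feed depends on
# whether the data is still encoded (JPEG bytes) or already decoded (array).
JPEG_BYTES_TENSOR = 'DecodeJpeg/contents:0'
DECODED_ARRAY_TENSOR = 'DecodeJpeg:0'


def choose_feed_tensor(image_file):
    """Return (tensor_name, needs_decoding) for a given file name.

    needs_decoding is True when the caller must first decode the file into a
    height x width x 3 array (for example with PIL) before feeding it.
    """
    ext = os.path.splitext(image_file)[1].lower()
    if ext in ('.jpg', '.jpeg'):
        return JPEG_BYTES_TENSOR, False
    if ext == '.png':
        return DECODED_ARRAY_TENSOR, True
    raise ValueError('unsupported image extension: %r' % ext)


for name in ('husky.jpg', 'test.JPEG', 'example.png'):
    print(name, choose_feed_tensor(name))
```

    Raising on unknown extensions keeps an unsupported file from reaching sess.run with the wrong kind of data.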
    



    Article link: https://www.haomeiwen.com/subject/iemfyxtx.html