TensorFlow Face Recognition (with Your Own Dataset)

Author: yanghedada | Published 2018-10-03 21:55

    The packaged files (the API plus the data) can be downloaded from the cloud drive below.
    First delete the object_detection folder under the original directory: its __init__.py files could not be uploaded to Baidu cloud no matter how many times I tried. So after downloading, delete the contents of object_detection/object_detection/ and unzip object_detection.zip into object_detection.
    Link: https://pan.baidu.com/s/1BkMpGOF1cVjJl2Hpip-Hpg
    Extraction code: 9stc

    First, download the images.

    The crawler below, written by user ACLJW, is simple and efficient.


    pachong.py
    # @File    : pachong.py

    import requests
    import re
    import os
    from pypinyin import lazy_pinyin

    def getHTMLText(url):
        try:
            r = requests.get(url, timeout=30)
            r.raise_for_status()
            r.encoding = r.apparent_encoding
            return r.text
        except requests.RequestException:
            print("request failed: " + url)

    def getPageUrls(text, name):
        # each result on the gallery page links to an album's first photo page
        re_pageUrl = r'href="(.+)">\s*<img src="(.+)" alt="' + name
        return re.findall(re_pageUrl, text)

    def downPictures(text, root, name, L):
        pageUrls = getPageUrls(text, name)
        titles = re.findall(r'alt="' + name + r'(.+)" ', text)
        for i in range(len(pageUrls)):
            pageUrl = pageUrls[i][0]
            path = root + titles[i] + "/"
            if not os.path.exists(path):
                os.mkdir(path)
            if not os.listdir(path):
                pageText = getHTMLText(pageUrl)
                totalPics = int(re.findall(r'<em>(.+)</em>', pageText)[0])
                # "下载图片" is the site's "download image" link text
                downUrl = re.findall(r'href="(.+?)" class="">下载图片', pageText)[0]
                cnt = 1
                while cnt <= totalPics:
                    L += 1
                    picPath = path + "%s.jpg" % str(L)
                    r = requests.get(downUrl)
                    with open(picPath, 'wb') as f:
                        f.write(r.content)
                    print('{} - picture {} downloaded\n'.format(titles[i], L))
                    cnt += 1
                    # "下一张" is the site's "next image" link text
                    nextPageUrl = re.findall(r'href="(.+?)">下一张', pageText)[0]
                    pageText = getHTMLText(nextPageUrl)
                    downUrl = re.findall(r'href="(.+?)" class="">下载图片', pageText)[0]
        return L

    def main():
        name = input("Enter the celebrity's name (in Chinese): ")
        nameUrl = "http://www.win4000.com/mt/" + ''.join(lazy_pinyin(name)) + ".html"
        L = 0
        try:
            text = getHTMLText(nameUrl)
            # "暂无...!" is the site's "no results" message
            if not re.findall(r'暂无(.+)!', text):
                root = "C:/Users/yanghe/Desktop/data/" + name + "/"
                if not os.path.exists(root):
                    os.mkdir(root)
                L = downPictures(text, root, name, L)
                try:
                    nextPage = re.findall(r'next" href="(.+)"', text)[0]
                    while nextPage:
                        nextText = getHTMLText(nextPage)
                        L = downPictures(nextText, root, name, L)
                        nextPage = re.findall(r'next" href="(.+)"', nextText)[0]
                except IndexError:
                    print("All downloads finished")
        except TypeError:
            print("Sorry, no photos of {} were found".format(name))
        return

    if __name__ == '__main__':
        main()
    
    
    

    Labeling the images

    1. The labeling tool is labelImg.exe, which is very easy to use.
    labelImg.exe keyboard shortcuts


    2. Set up your classes here:

    "Open Dir" selects the directory the photos are loaded from, and Ctrl+R changes the default directory the XML files are saved to. This produces annotation files in the same format as the Pascal VOC 2007 dataset.
    Save after annotating each image; just click OK, and it seems to save automatically. Extremely simple.

    Labeling Qi Wei

    The crawled images have quality problems: most are profile shots, and every one of Liu Yan's photos has her in a hat. Looking through all these celebrity photo shoots made my eyes blur.

    A brief look at the Pascal VOC 2007 dataset

    For the full details, see: Dataset: an analysis of the Pascal VOC 2007 dataset
    In Pascal VOC 2007, the image 2007_000392.jpg has the following XML file (the image itself appears further down).

    #2007_000392.xml
    <annotation>
        <folder>VOC2012</folder>
        <filename>2007_000392.jpg</filename>                               //file name
        <source>                                                           //image source (not important)
            <database>The VOC2007 Database</database>
            <annotation>PASCAL VOC2007</annotation>
            <image>flickr</image>
        </source>
        <size>                                                             //image size (width, height, channels)
            <width>500</width>
            <height>332</height>
            <depth>3</depth>
        </size>
        <segmented>1</segmented>                                           //used for segmentation? (irrelevant for detection)
        <object>                                                           //a detected object
            <name>horse</name>                                             //object class
            <pose>Right</pose>                                             //shooting angle
            <truncated>0</truncated>                                       //truncated? (0 = complete)
            <difficult>0</difficult>                                       //hard to recognize? (0 = easy)
            <bndbox>                                                       //bounding box: (xmin,ymin) is the top-left corner, (xmax,ymax) the bottom-right
                <xmin>100</xmin>
                <ymin>96</ymin>
                <xmax>355</xmax>
                <ymax>324</ymax>
            </bndbox>
        </object>
        <object>                                                           //a second detected object
            <name>person</name>
            <pose>Unspecified</pose>
            <truncated>0</truncated>
            <difficult>0</difficult>
            <bndbox>
                <xmin>198</xmin>
                <ymin>58</ymin>
                <xmax>286</xmax>
                <ymax>197</ymax>
            </bndbox>
        </object>
    </annotation>
    
    

    In 2007_000392.jpg a person is riding a horse, so the XML contains two object entries (person and horse). Each one records the pose, whether the object is truncated, whether it is hard to recognize, and the top-left and bottom-right coordinates of its bounding box.
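
    To see how these fields map onto code, here is a minimal ElementTree sketch (my addition; it assumes a clean copy of 2007_000392.xml without the // annotations above, since those are not valid XML). The xml_to_csv script below follows the same pattern:

    import xml.etree.ElementTree as ET

    root = ET.parse('2007_000392.xml').getroot()
    size = root.find('size')
    print(size.find('width').text, size.find('height').text)   # 500 332
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        print(obj.find('name').text,                           # horse, person
              box.find('xmin').text, box.find('ymin').text,
              box.find('xmax').text, box.find('ymax').text)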

    2007_000392.jpg

    My classes are as follows:


    Converting the XML files to CSV

    Here path is the directory the XML files are saved in, and data is the directory the CSV files are written to.

    For details, see: TensorFlow Object Detection API tutorial: training, predicting, and testing on a dataset you made yourself
    Remember to set the sizes of your training and test sets here; I use a 0.67 split.

    # -*- coding: utf-8 -*-
    import glob
    import pandas as pd
    import xml.etree.ElementTree as ET

    def xml_to_csv(path):
        xml_list = []
        # read the annotation files
        for xml_file in glob.glob(path + '/*.xml'):
            tree = ET.parse(xml_file)
            root = tree.getroot()
            for member in root.findall('object'):
                # '.jpg' is appended because the <filename> field in these
                # XMLs lacks the extension; drop it if yours already has one
                value = (root.find('filename').text + '.jpg',
                         int(root.find('size')[0].text),
                         int(root.find('size')[1].text),
                         member[0].text,
                         int(member[4][0].text),
                         int(member[4][1].text),
                         int(member[4][2].text),
                         int(member[4][3].text)
                         )
                xml_list.append(value)
        column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']

        # split the samples into a training set and an eval set, roughly 3:1
        # (consider shuffling xml_list first so the classes are mixed)
        split_idx = int(len(xml_list) * 0.67)
        train_list = xml_list[:split_idx]
        eval_list = xml_list[split_idx:]

        # save as CSV
        train_df = pd.DataFrame(train_list, columns=column_name)
        eval_df = pd.DataFrame(eval_list, columns=column_name)
        train_df.to_csv('data/train.csv', index=None)
        eval_df.to_csv('data/eval.csv', index=None)


    def main():
        path = './xml'
        xml_to_csv(path)
        print('Successfully converted xml to csv.')

    if __name__ == '__main__':
        main()
    
    

    Converting the CSV files to TFRecord

    import os
    import io
    import pandas as pd
    import tensorflow as tf

    from PIL import Image
    from object_detection.utils import dataset_util
    from collections import namedtuple

    # flags allow command-line use, but __main__ below hardcodes the paths
    flags = tf.app.flags
    flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
    flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
    FLAGS = flags.FLAGS


    # map class names to integer IDs (must agree with label_map.pbtxt)
    def class_text_to_int(row_label):
        if row_label == 'damimi':
            return 1
        elif row_label == 'fanbingbing':
            return 2
        elif row_label == 'liuyan':
            return 3
        elif row_label == 'nazha':
            return 4
        elif row_label == 'xiaowei':
            return 5
        else:
            print('NONE: ' + row_label)
            return None


    def split(df, group):
        # one group of CSV rows per image file
        data = namedtuple('data', ['filename', 'object'])
        gb = df.groupby(group)
        return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]


    def create_tf_example(group, path):
        print(os.path.join(path, '{}'.format(group.filename)))
        with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
            encoded_jpg = fid.read()
        encoded_jpg_io = io.BytesIO(encoded_jpg)
        image = Image.open(encoded_jpg_io)
        width, height = image.size

        # the CSV filename already ends in '.jpg' (xml_to_csv appended it),
        # so it is used as-is here
        filename = group.filename.encode('utf8')
        image_format = b'jpg'
        xmins = []
        xmaxs = []
        ymins = []
        ymaxs = []
        classes_text = []
        classes = []

        # box coordinates are stored normalized to [0, 1]
        for index, row in group.object.iterrows():
            xmins.append(row['xmin'] / width)
            xmaxs.append(row['xmax'] / width)
            ymins.append(row['ymin'] / height)
            ymaxs.append(row['ymax'] / height)
            classes_text.append(row['class'].encode('utf8'))
            classes.append(class_text_to_int(row['class']))

        tf_example = tf.train.Example(features=tf.train.Features(feature={
            'image/height': dataset_util.int64_feature(height),
            'image/width': dataset_util.int64_feature(width),
            'image/filename': dataset_util.bytes_feature(filename),
            'image/source_id': dataset_util.bytes_feature(filename),
            'image/encoded': dataset_util.bytes_feature(encoded_jpg),
            'image/format': dataset_util.bytes_feature(image_format),
            'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
            'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
            'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
            'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
            'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
            'image/object/class/label': dataset_util.int64_list_feature(classes),
        }))
        return tf_example


    def main(csv_input, output_path, imgPath):
        writer = tf.python_io.TFRecordWriter(output_path)
        path = imgPath
        examples = pd.read_csv(csv_input)
        grouped = split(examples, 'filename')
        for group in grouped:
            tf_example = create_tf_example(group, path)
            writer.write(tf_example.SerializeToString())

        writer.close()
        print('Successfully created the TFRecords: {}'.format(output_path))


    if __name__ == '__main__':

        imgPath = './xml'  # directory holding your images

        # generate train.record
        output_path = 'data/train.record'  # where the record is saved
        csv_input = 'data/train.csv'       # the CSV produced above
        main(csv_input, output_path, imgPath)

        # generate the eval file, eval.record
        output_path = 'data/eval.record'
        csv_input = 'data/eval.csv'
        main(csv_input, output_path, imgPath)
    
    
    

    Just set the image directory and the paths where the record files are saved.
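
    As a quick sanity check (my addition, not part of the original workflow), you can count how many examples actually landed in each file with TF 1.x's record iterator:

    import tensorflow as tf

    # count the serialized examples in each record file
    for rec in ['data/train.record', 'data/eval.record']:
        n = sum(1 for _ in tf.python_io.tf_record_iterator(rec))
        print(rec, n)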

    Modifying the ssd_inception_v2_coco.config file

    Only a few things need to change: the paths, the number of training steps, the record files, and the label map file:
    num_classes: 5
    num_steps: 1000
    batch_size: 20
    fine_tune_checkpoint: "ssd_inception_v2_coco_2018_01_28/model.ckpt"
    under train_input_reader: {
    input_path: "record/train.record"
    label_map_path: "record/label_map.pbtxt"}
    under eval_input_reader: {
    input_path: "record/val.record"
    label_map_path: "record/label_map.pbtxt"}
    (Note the directories: the conversion script above wrote data/train.record and data/eval.record, so either move and rename the records to match these paths or point the config at data/.)
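
    The label_map.pbtxt referenced here is never shown in this post. A minimal version consistent with class_text_to_int above would look like the following (my reconstruction, not the author's original file; the name strings must match the class column in your CSVs exactly):

    # record/label_map.pbtxt (reconstructed to match class_text_to_int)
    item {
      id: 1
      name: 'damimi'
    }
    item {
      id: 2
      name: 'fanbingbing'
    }
    item {
      id: 3
      name: 'liuyan'
    }
    item {
      id: 4
      name: 'nazha'
    }
    item {
      id: 5
      name: 'xiaowei'
    }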

    # SSD with Inception v2 configuration for MSCOCO Dataset.
    # Users should configure the fine_tune_checkpoint field in the train config as
    # well as the label_map_path and input_path fields in the train_input_reader and
    # eval_input_reader. Search for "PATH_TO_BE_CONFIGURED" to find the fields that
    # should be configured.
    
    model {
      ssd {
        num_classes: 5
        box_coder {
          faster_rcnn_box_coder {
            y_scale: 10.0
            x_scale: 10.0
            height_scale: 5.0
            width_scale: 5.0
          }
        }
        matcher {
          argmax_matcher {
            matched_threshold: 0.5
            unmatched_threshold: 0.5
            ignore_thresholds: false
            negatives_lower_than_unmatched: true
            force_match_for_each_row: true
          }
        }
        similarity_calculator {
          iou_similarity {
          }
        }
        anchor_generator {
          ssd_anchor_generator {
            num_layers: 6
            min_scale: 0.2
            max_scale: 0.95
            aspect_ratios: 1.0
            aspect_ratios: 2.0
            aspect_ratios: 0.5
            aspect_ratios: 3.0
            aspect_ratios: 0.3333
            reduce_boxes_in_lowest_layer: true
          }
        }
        image_resizer {
          fixed_shape_resizer {
            height: 300
            width: 300
          }
        }
        box_predictor {
          convolutional_box_predictor {
            min_depth: 0
            max_depth: 0
            num_layers_before_predictor: 0
            use_dropout: false
            dropout_keep_probability: 0.8
            kernel_size: 3
            box_code_size: 4
            apply_sigmoid_to_scores: false
            conv_hyperparams {
              activation: RELU_6,
              regularizer {
                l2_regularizer {
                  weight: 0.00004
                }
              }
              initializer {
                truncated_normal_initializer {
                  stddev: 0.03
                  mean: 0.0
                }
              }
            }
          }
        }
        feature_extractor {
          type: 'ssd_inception_v2'
          min_depth: 16
          depth_multiplier: 1.0
          conv_hyperparams {
            activation: RELU_6,
            regularizer {
              l2_regularizer {
                weight: 0.00004
              }
            }
            initializer {
              truncated_normal_initializer {
                stddev: 0.03
                mean: 0.0
              }
            }
            batch_norm {
              train: true,
              scale: true,
              center: true,
              decay: 0.9997,
              epsilon: 0.001,
            }
          }
        }
        loss {
          classification_loss {
            weighted_sigmoid {
              anchorwise_output: true
            }
          }
          localization_loss {
            weighted_smooth_l1 {
              anchorwise_output: true
            }
          }
          hard_example_miner {
            num_hard_examples: 3000
            iou_threshold: 0.99
            loss_type: CLASSIFICATION
            max_negatives_per_positive: 3
            min_negatives_per_image: 0
          }
          classification_weight: 1.0
          localization_weight: 1.0
        }
        normalize_loss_by_num_matches: true
        post_processing {
          batch_non_max_suppression {
            score_threshold: 1e-8
            iou_threshold: 0.6
            max_detections_per_class: 100
            max_total_detections: 100
          }
          score_converter: SIGMOID
        }
      }
    }
    
    train_config: {
      batch_size: 20
      optimizer {
        rms_prop_optimizer: {
          learning_rate: {
            exponential_decay_learning_rate {
              initial_learning_rate: 0.004
              decay_steps: 10000
              decay_factor: 0.95
            }
          }
          momentum_optimizer_value: 0.9
          decay: 0.9
          epsilon: 1.0
        }
      }
      fine_tune_checkpoint: "ssd_inception_v2_coco_2018_01_28/model.ckpt"
      from_detection_checkpoint: true
      # Note: the line below limits training to 1000 steps (the template's
      # comment said 200K for the pets dataset; the value is lowered here).
      # Remove the line to train indefinitely.
      num_steps: 1000
      data_augmentation_options {
        random_horizontal_flip {
        }
      }
      data_augmentation_options {
        ssd_random_crop {
        }
      }
    }
    
    train_input_reader: {
      tf_record_input_reader {
        input_path: "record/train.record"
      }
      label_map_path: "record/label_map.pbtxt"
    }
    
    eval_config: {
      num_examples: 4952
      # Note: The below line limits the evaluation process to 10 evaluations.
      # Remove the below line to evaluate indefinitely.
      max_evals: 10
    }
    
    eval_input_reader: {
      tf_record_input_reader {
        input_path: "record/val.record"
      }
      label_map_path: "record/label_map.pbtxt"
      shuffle: false
      num_readers: 1
      num_epochs: 1
    }
    
    

    Training

    Here is my directory layout:



    Run the following in cmd. A whole night of CPU training only reached step 666, so the accuracy is nothing special.

    python train.py \
    --logtostderr  \
    --train_dir=train \
    --pipeline_config_path=ssd_inception_v2_coco.config
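
    To watch the loss while this runs (my addition, standard TensorBoard usage), point TensorBoard at the same train directory:

    tensorboard --logdir=train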
    

    Generating the .pb file

    Training reached step 666, and in the morning I just hit Ctrl+C to stop it. To keep training from step 666, simply rerun the command above; it picks up the checkpoint files under train.

    python export_inference_graph.py \
    --pipeline_config_path ssd_inception_v2_coco.config \
    --trained_checkpoint_prefix "pb/model.ckpt-666" \
    --output_directory pb
    

    I hit an error while generating the .pb file:

    ValueError: Protocol message RewriterConfig has no "layout_optimizer" field
    

    In the /object_detection/exporter.py file, change layout_optimizer on line 72 to optimize_tensor_layout, and the problem is solved.
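
    Roughly, that line builds a RewriterConfig proto, and older TensorFlow builds use the other field name. A sketch of the change (the exact surrounding code depends on your TF and Object Detection API versions):

    # object_detection/exporter.py, around line 72
    # before (field name in newer TF protos):
    #   rewrite_options = rewriter_config_pb2.RewriterConfig(layout_optimizer=...)
    # after (field name in older TF protos):
    rewrite_options = rewriter_config_pb2.RewriterConfig(
        optimize_tensor_layout=True)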

    Testing

    test_image.py

    import matplotlib.pyplot as plt
    import numpy as np
    import os
    import tensorflow as tf
    from object_detection.utils import label_map_util
    from object_detection.utils import visualization_utils as vis_util
    from PIL import Image


    def test():
        # reset the default graph
        tf.reset_default_graph()
        '''
        Load the model and the dataset's label map, and locate the test images
        '''
        # path to the frozen model (graph structure plus weights)
        PATH_TO_CKPT = 'pb/frozen_inference_graph.pb'

        # directory containing the test images, named 1.jpg .. 10.jpg
        PATH_TO_TEST_IMAGES_DIR = './test_images'

        TEST_IMAGE_PATHS = [os.path.join(PATH_TO_TEST_IMAGES_DIR, '{}.jpg'.format(i)) for i in range(1, 11)]

        # label_map.pbtxt stores the mapping between indices and class names
        PATH_TO_LABELS = "record/label_map.pbtxt"

        NUM_CLASSES = 5

        # define a new graph
        output_graph_def = tf.GraphDef()

        with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
            # read the *.pb file into serialized_graph
            serialized_graph = fid.read()
            # restore its contents into the GraphDef
            output_graph_def.ParseFromString(serialized_graph)
            #print(output_graph_def)
            # import output_graph_def into the current default graph (load the model)
            tf.import_graph_def(output_graph_def, name='')

        print('Model loaded')

        # load our dataset's label map
        label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
        categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
        category_index = label_map_util.create_category_index(categories)


        '''
        Define the session
        '''
        def load_image_into_numpy_array(image):
            '''
            Convert a PIL image into an ndarray
            '''
            im_width, im_height = image.size
            return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)

        # output figure size
        IMAGE_SIZE = (12, 8)

        # use the default graph, which now holds the model
        detection_graph = tf.get_default_graph()

        with tf.Session(graph=detection_graph) as sess:
            for image_path in TEST_IMAGE_PATHS:
                image = Image.open(image_path)
                # convert the image to a numpy array
                image_np = load_image_into_numpy_array(image)

                '''
                Define the nodes, run, and visualize
                '''
                # add a batch dimension; the network expects [1, ?, ?, 3]
                image_np_expanded = np.expand_dims(image_np, axis=0)

                '''
                Fetch the tensors from the model
                '''
                image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')

                # boxes holds the detection results
                boxes = detection_graph.get_tensor_by_name('detection_boxes:0')

                # each score is the confidence that the detected object matches its label
                scores = detection_graph.get_tensor_by_name('detection_scores:0')
                classes = detection_graph.get_tensor_by_name('detection_classes:0')
                num_detections = detection_graph.get_tensor_by_name('num_detections:0')

                # run detection
                boxes, scores, classes, num_detections = sess.run(
                    [boxes, scores, classes, num_detections],
                    feed_dict={image_tensor: image_np_expanded})

                # visualize the results
                vis_util.visualize_boxes_and_labels_on_image_array(
                        image_np,
                        np.squeeze(boxes),
                        np.squeeze(classes).astype(np.int32),
                        np.squeeze(scores),
                        category_index,
                        use_normalized_coordinates=True,
                        line_thickness=8)
                plt.figure(figsize=IMAGE_SIZE)
                plt.imshow(image_np)
                plt.show()


    if __name__ == '__main__':
        test()
    

    Sample results:


    Fan Bingbing



    This is "damimi"




    Recognition of the other people is not very confident, which probably has to do with the training images.
