TensorFlow Lite: image object detection with ssd_keras

Author: 陆号 | Published 2018-04-24 11:07

    This writeup uses the pierluigiferrari/ssd_keras project.

    1. Dataset annotation

    a. Use VoTT to annotate the detection dataset and export it in VOC format.

    2. Model training

    Use the VOC-format dataset loading code from https://github.com/pierluigiferrari/ssd_keras/blob/master/ssd300_training.ipynb.
    Use https://github.com/pierluigiferrari/ssd_keras/blob/master/ssd7_training.ipynb to train a shallow model; note that when this model is deployed with ncnn, several of its layers are unsupported.
    Because the dataset I built with VoTT is missing some attributes, loading it raises errors, so data_generator/object_detection_2d_data_generator.py needs the following changes:
    change difficult = int(obj.difficult.text) to difficult = 0
    change if batch_inverse_transforms: batch_inverse_transforms.pop(j) to
    if batch_inverse_transforms and j < len(batch_inverse_transforms): batch_inverse_transforms.pop(j)
    I also mislabeled a few samples during annotation, which I patch here (a data-loading sketch follows the snippet below):

                        for obj in objects:
                            print(filename)
                            class_name = obj.find('name').text
                            class_name = class_name.lower()
                            if class_name == 'thanks-1':
                               class_name = 'thanks' 
                            class_id = self.classes.index(class_name)
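
    For context, here is a minimal sketch (not from the original article) of how a VoTT-exported VOC dataset is typically fed to the repository's DataGenerator; the directory layout and class list are placeholder assumptions, and the exact parse_xml arguments may vary between ssd_keras versions:

    # Minimal sketch: loading a VOC-format dataset with ssd_keras's DataGenerator.
    # Paths and the class list below are placeholders, not the article's actual data.
    from data_generator.object_detection_2d_data_generator import DataGenerator

    classes = ['background', 'thanks']  # hypothetical class list matching the fix above

    train_dataset = DataGenerator(load_images_into_memory=False, hdf5_dataset_path=None)
    train_dataset.parse_xml(images_dirs=['dataset/JPEGImages'],
                            image_set_filenames=['dataset/ImageSets/Main/train.txt'],
                            annotations_dirs=['dataset/Annotations'],
                            classes=classes,
                            include_classes='all',
                            exclude_truncated=False,
                            exclude_difficult=False,  # 'difficult' is forced to 0 by the patch above
                            ret=False)
    print('Loaded {} training images'.format(train_dataset.get_dataset_size()))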
    

    3. Deploying to mobile with TensorFlow Lite

    a. The TensorFlow Lite iOS Demo App, which uses a quantized model for real-time prediction

    To use a non-quantized (float) model, the demo above needs the following changes:

    // If you have your own model, modify this to the file name, and make sure
    // you've added the file to your app resources too.
    //static NSString* model_file_name = @"mobilenet_quant_v1_224";
    static NSString* model_file_name = @"mobilenet_v1_1.0_224";
    static NSString* model_file_type = @"tflite";
    
    // Returns the top N confidence values over threshold in the provided vector,
    // sorted by confidence in descending order.
    static void GetTopN_float(const float* prediction, const int prediction_size, const int num_results,
                        const float threshold, std::vector<std::pair<float, int>>* top_results) {
        // Will contain top N results in ascending order.
        std::priority_queue<std::pair<float, int>, std::vector<std::pair<float, int>>,
        std::greater<std::pair<float, int>>>
        top_result_pq;
    
        const long count = prediction_size;
        for (int i = 0; i < count; ++i) {
            const float value = prediction[i];  // no need to divide by 255.0 for the float model
            // Only add it if it beats the threshold and has a chance at being in
            // the top N.
            if (value < threshold) {
                continue;
            }
            
            top_result_pq.push(std::pair<float, int>(value, i));
            
            // If at capacity, kick the smallest value out.
            if (top_result_pq.size() > num_results) {
                top_result_pq.pop();
            }
        }
        
        // Copy to output vector and reverse into descending order.
        while (!top_result_pq.empty()) {
            top_results->push_back(top_result_pq.top());
            top_result_pq.pop();
        }
        std::reverse(top_results->begin(), top_results->end());
    }
    
    // Note: with the camera's default settings, the image in pixelBuffer comes in sideways (landscape).
    - (void)runModelOnFrame_float:(CVPixelBufferRef)pixelBuffer {
        assert(pixelBuffer != NULL);
        
        OSType sourcePixelFormat = CVPixelBufferGetPixelFormatType(pixelBuffer);
        assert(sourcePixelFormat == kCVPixelFormatType_32ARGB ||
               sourcePixelFormat == kCVPixelFormatType_32BGRA);
        
        const int sourceRowBytes = (int)CVPixelBufferGetBytesPerRow(pixelBuffer);
        const int image_width = (int)CVPixelBufferGetWidth(pixelBuffer);
        const int fullHeight = (int)CVPixelBufferGetHeight(pixelBuffer);
        
        CVPixelBufferLockFlags unlockFlags = kNilOptions;
        CVPixelBufferLockBaseAddress(pixelBuffer, unlockFlags);
        
        unsigned char* sourceBaseAddr = (unsigned char*)(CVPixelBufferGetBaseAddress(pixelBuffer));
        int image_height;
        unsigned char* sourceStartAddr;
        if (fullHeight <= image_width) {
            image_height = fullHeight;
            sourceStartAddr = sourceBaseAddr;
        } else {
            image_height = image_width;
            const int marginY = ((fullHeight - image_width) / 2);
            sourceStartAddr = (sourceBaseAddr + (marginY * sourceRowBytes));
        }
        const int image_channels = 4;
        assert(image_channels >= wanted_input_channels);
        uint8_t* in = sourceStartAddr;
        // Fill the input tensor for this frame.
        // Get the first entry of the interpreter's input list, i.e. the classifier's only input tensor.
        // 'input' is an integer index into the interpreter's tensor table. The statement after it does two things:
        // 1) uses 'input' to look up the concrete tensor in the TfLiteTensor* content_.tensors table;
        int input = interpreter->inputs()[0];
        // 2) returns that tensor's data.raw, i.e. the memory block backing the tensor.
        //    With 'out', the app can copy the image data it wants to classify into the tensor.
        float* out = interpreter->typed_tensor<float>(input);
        
        const float input_mean = 127.5f;
        const float input_std = 127.5f;
        for (int y = 0; y < wanted_input_height; ++y) {
            float* out_row = out + (y * wanted_input_width * wanted_input_channels);
            for (int x = 0; x < wanted_input_width; ++x) {
                const int in_x = (y * image_width) / wanted_input_width;
                const int in_y = (x * image_height) / wanted_input_height;
                uint8_t* in_pixel = in + (in_y * image_width * image_channels) + (in_x * image_channels);
                float* out_pixel = out_row + (x * wanted_input_channels);
                for (int c = 0; c < wanted_input_channels; ++c) {
                    // normalize to [-1, 1] using mean/std of 127.5
                    out_pixel[c] = (in_pixel[c] - input_mean) / input_std;
                }
            }
        }
        // Call Invoke() to run inference.
        double startTimestamp = [[NSDate new] timeIntervalSince1970];
        if (interpreter->Invoke() != kTfLiteOk) {
            LOG(FATAL) << "Failed to invoke!";
        }
        double endTimestamp = [[NSDate new] timeIntervalSince1970];
        total_latency += (endTimestamp - startTimestamp);
        total_count += 1;
        NSLog(@"Time: %.4lf, avg: %.4lf, count: %d", endTimestamp - startTimestamp,
              total_latency / total_count, total_count);
        
        const int output_size = 1000;
        const int kNumResults = 5;
        const float kThreshold = 0.1f;
        
        std::vector<std::pair<float, int>> top_results;
        
        // Parse the output tensor to get the recognition results.
        // This returns the memory block backing the output tensor; 'output' is already a flat 1-D array,
        // so the results can be read from it directly.
        float* output = interpreter->typed_output_tensor<float>(0);
        // GetTopN_float finds the N largest values in 'output' (first) and their indices (second).
        GetTopN_float(output, output_size, kNumResults, kThreshold, &top_results);
        
        NSMutableDictionary* newValues = [NSMutableDictionary dictionary];
        for (const auto& result : top_results) {
            const float confidence = result.first;
            const int index = result.second;
            NSString* labelObject = [NSString stringWithUTF8String:labels[index].c_str()];
            NSNumber* valueObject = [NSNumber numberWithFloat:confidence];
            [newValues setObject:valueObject forKey:labelObject];
        }
        dispatch_async(dispatch_get_main_queue(), ^(void) {
            [self setPredictionValues:newValues];
        });
        
        // The buffer was locked exactly once above, so unlock it exactly once here.
        CVPixelBufferUnlockBaseAddress(pixelBuffer, unlockFlags);
    }
    

    Run the non-quantized model to classify images:
    cd tensorflow/contrib/lite/examples/ios/simple
    pod install
    pod update
    open tflite_simple_example.xcworkspace

    For an explanation of how the demo works, see:
    TensorFlow Lite(2/3):tflite文件和AI Smart

    The app crashes when using a non-quantized model; the solution is discussed here:
    mobilenet_quant_v1_224.tflite. but crash when i run my own model and give the error
    label_image.cc

    Quantizing a TensorFlow .pb model

    Fixed Point Quantization
    Using 8-bit calculations helps your models run faster and use less power.
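
    As a rough illustration (not from the TensorFlow docs), 8-bit fixed-point quantization maps real values to uint8 codes through a scale and zero point; the 0..6 range below mirrors the --default_ranges_min/--default_ranges_max values used in the toco command further down:

    # Sketch of affine uint8 quantization: real_value = scale * (q - zero_point).
    import numpy as np

    real = np.array([-1.0, 0.0, 0.5, 5.9], dtype=np.float32)  # hypothetical activations
    rmin, rmax = 0.0, 6.0                   # assumed representable range (cf. default_ranges_min/max)
    scale = (rmax - rmin) / 255.0           # real-value size of one uint8 step
    zero_point = int(round(-rmin / scale))  # uint8 code that represents real 0.0

    q = np.clip(np.round(real / scale) + zero_point, 0, 255).astype(np.uint8)
    dequantized = scale * (q.astype(np.float32) - zero_point)
    print(q, dequantized)  # quantized codes and the real values they stand for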

    b. Generating the TFLite file: first check out the TensorFlow source and install Bazel, then build TensorFlow from source.

    Installing Bazel

    Installing TensorFlow from source

    bazel build --config=opt //tensorflow/tools/pip_package:build_pip_package
    If your TensorFlow environment is already set up, the step above is all you need.

    If that build fails, you can instead run the following commands to build only the executables needed for model conversion:
    bazel build tensorflow/python/tools:freeze_graph
    bazel build --config=opt tensorflow/contrib/lite/toco:toco
    bazel build tensorflow/tools/graph_transforms:summarize_graph
    bazel build tensorflow/tools/quantization:quantize_graph

    Exporting the Inference Graph
    Convert the model format
    Saving a .pb file with TensorFlow (a freezing sketch follows below)
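
    As a rough sketch (the trained ssd_keras model object and the output file name are placeholder assumptions), a Keras model's graph can be frozen into a .pb that freeze_graph/summarize_graph/toco can then consume, using the TensorFlow 1.x graph_util API:

    # Sketch: freeze a trained Keras model's graph into a binary GraphDef (.pb).
    import tensorflow as tf
    from keras import backend as K

    def freeze_keras_model(model, pb_name='ssd_frozen.pb'):
        """Write the model's graph with variables folded into constants (TF 1.x)."""
        sess = K.get_session()
        # Output node names; verify them with the summarize_graph tool built above.
        output_names = [out.op.name for out in model.outputs]
        frozen_def = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph.as_graph_def(), output_names)
        tf.train.write_graph(frozen_def, '.', pb_name, as_text=False)
        return output_names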

    Tensorflow Lite之编译生成tflite文件
    TensorFlow Lite学习笔记2:生成TFLite模型文件
    TensorFlow Lite相关
    How to quantize an existing .pb model:
    bazel-bin/tensorflow/tools/quantization/quantize_graph --input=/home/ljg/下载/mobilenet_v1_1.0_224/mobilenet_v1_1.0_224_frozen.pb --output_node_names=MobilenetV1/Predictions/Reshape_1 --output=quantized_graph.pb --mode=eightbit

    bazel-bin/tensorflow/contrib/lite/toco/toco --input_file=quantized_graph.pb --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --output_file=quantized_graphh.tflite --inference_type=QUANTIZED_UINT8 --inference_input_type=QUANTIZED_UINT8 --input_shapes=1,224,224,3 --input_arrays=input --output_arrays=MobilenetV1/Predictions/Reshape_1 --mean_values=128 --std_values=128 --default_ranges_min=0 --default_ranges_max=6
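
    To sanity-check the converted file on the desktop before deploying, the Python interpreter can run it once; this is a sketch assuming a TensorFlow build that ships the interpreter under tf.contrib.lite (newer releases expose it as tf.lite.Interpreter), fed a random uint8 input of the shape declared above:

    # Sketch: load the converted model and run one inference on dummy data.
    import numpy as np
    import tensorflow as tf

    interpreter = tf.contrib.lite.Interpreter(model_path='quantized_graphh.tflite')
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # The toco command above declares a quantized uint8 input of shape (1, 224, 224, 3).
    dummy = np.random.randint(0, 256, size=tuple(input_details[0]['shape']), dtype=np.uint8)
    interpreter.set_tensor(input_details[0]['index'], dummy)
    interpreter.invoke()
    print(interpreter.get_tensor(output_details[0]['index']).shape)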

    tensorflow lite: error when convert frozen model to lite format
