An Incomplete Guide to YOLOv3 Pitfalls

Author: conner是位好少年 | Published 2019-07-29 14:13

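The previous post focused on how to train and test on your own samples with YOLOv3, but that workflow left a few pitfalls behind. Without further ado, let's face them one by one and fix them.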

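Pitfall 1: what do the outputs printed during training mean?
The detailed explanation was already given in the previous post, so I won't repeat it here. If you have questions, go read that post, and leave a like and a follow while you're there.
Pitfall 2: how far should training go before it exits? In other words, when does training stop?
conner was honestly baffled while training yesterday: it just kept running and running. I had launched it from the console, so I had no idea how to control the exit condition; both losses were clearly below 1, yet it would not stop. There are two options here. One is to use the weights that are saved to backup every hundred steps, but that never felt like a good approach. Is there really no way to set an exit condition? There is. The file cfg/yolov3-voc.cfg from the previous post holds the network configuration; look for this key in it: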

max_batches = 50200

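Set it to the number of steps you need. Training exits after that many steps and leaves a final weights file in your backup directory (backup/yolov3-voc_final.weights, the file used in the test command in the next section).

This file is your final training result.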
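
To make the exit condition concrete, here is a toy sketch in C. It is not darknet's actual source: it assumes the training loop simply counts iterations until it reaches max_batches, and that the final file name is built from the backup directory and the cfg base name, which matches the backup/yolov3-voc_final.weights path used in this post. Both function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: counts how many iterations a loop bounded by
   max_batches still runs, starting from an already completed count. */
int iterations_until_exit(int current_batch, int max_batches)
{
    int steps = 0;
    while (current_batch < max_batches) {
        /* one training iteration would happen here */
        ++current_batch;
        ++steps;
    }
    return steps;
}

/* Hypothetical helper: writes "<backup_dir>/<base>_final.weights" into buf.
   Returns 0 on success, -1 if buf is too small to hold the path. */
int final_weights_path(char *buf, size_t size,
                       const char *backup_dir, const char *base)
{
    int n = snprintf(buf, size, "%s/%s_final.weights", backup_dir, base);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With max_batches = 50200 and a fresh start, the loop runs 50200 times and then falls through, which is the point where the final weights would be written.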

Pitfall 3: how to test after training exits
Using the result from the previous section, run

./darknet detector test cfg/voc.data cfg/yolov3-voc.cfg backup/yolov3-voc_final.weights one.jpg

The output looks like this:

one.jpg: Predicted in 0.025834 seconds.
xxx: 99%
xxx: 86%
xxx: 69%
./darknet detector test cfg/voc.data cfg/yolov3-voc.cfg    4.16s user 0.50s system 99% cpu 4.669 total

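conner: Hey, what's going on, little bro? I don't want to be told what was found; I want to know where the thing I'm detecting is. I need the position, not the label. What now?
yolo: Easy, modify the code.
conner: But I don't know C++.
yolo: You're the Copy Ninja Kakashi, go find it online.
Following yolo's advice, I found the modification below (reference blog: https://blog.csdn.net/qq_36362060/article/details/88895161).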

First, modify the draw_detections function in src/image.c of YOLOv3:

void draw_detections(char *filename, image im, detection *dets, int num, float thresh, char **names, image **alphabet, int classes)
{
    int i,j;
    // Build the output name: replace the image extension (e.g. ".jpg") with ".txt".
    // Work on a copy so the caller's filename buffer is left untouched.
    char output[4096];
    snprintf(output, sizeof(output) - 4, "%s", filename);  // leave room for ".txt"
    char *dot = strrchr(output, '.');
    char *slash = strrchr(output, '/');
    if (dot && (!slash || dot > slash)) *dot = '\0';  // cut only a real extension
    strcat(output, ".txt");
    // Create the text file that stores the box coordinates.
    FILE *fp = fopen(output, "w");
    if (fp == NULL){
        printf("Failed to create %s\n", output);
    }
    for(i = 0; i < num; ++i){
        char labelstr[4096] = {0};
        int class = -1;
        for(j = 0; j < classes; ++j){
            // Only report detections of the "dog" class; change or remove this filter for your own classes.
            if (strcmp(names[j], "dog") != 0) continue;
            if (dets[i].prob[j] > thresh){
                if (class < 0) {
                    strcat(labelstr, names[j]);
                    class = j;
                } else {
                    strcat(labelstr, ", ");
                    strcat(labelstr, names[j]);
                }
                printf("%s: %.0f%%\n", names[j], dets[i].prob[j]*100);
            }
        }
        if(class >= 0){
            int width = im.h * .006;
            //printf("%d %s: %.0f%%\n", i, names[class], prob*100);
            int offset = class*123457 % classes;
            float red = get_color(2,offset,classes);
            float green = get_color(1,offset,classes);
            float blue = get_color(0,offset,classes);
            float rgb[3];
            rgb[0] = red;
            rgb[1] = green;
            rgb[2] = blue;
            box b = dets[i].bbox;
            int left  = (b.x-b.w/2.)*im.w;
            int right = (b.x+b.w/2.)*im.w;
            int top   = (b.y-b.h/2.)*im.h;
            int bot   = (b.y+b.h/2.)*im.h;
            if(left < 0) left = 0;
            if(right > im.w-1) right = im.w-1;
            if(top < 0) top = 0;
            if(bot > im.h-1) bot = im.h-1;
            draw_box_width(im, left, top, right, bot, width, red, green, blue);
            printf("left:%d, top:%d, right:%d, bot:%d \n",left,top,right,bot);
            if (fp) fprintf(fp, "%d %d %d %d\n",left, top, right, bot);
            if (alphabet) {
                image label = get_label(alphabet, labelstr, (im.h*.03));
                draw_label(im, top + width, left, label, rgb);
                free_image(label);
            }
            if (dets[i].mask){
                image mask = float_to_image(14, 14, 1, dets[i].mask);
                image resized_mask = resize_image(mask, b.w*im.w, b.h*im.h);
                image tmask = threshold_image(resized_mask, .5);
                embed_image(tmask, im, left, top);
                free_image(mask);
                free_image(resized_mask);
                free_image(tmask);
            }
        }
    }
    if (fp) fclose(fp);
}
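
The coordinate arithmetic in the function above can be tried on its own. The sketch below repeats the same conversion from a normalized center/size box to clamped pixel edges; the struct and function names are made up for illustration and are not part of darknet.

```c
/* Pixel-space rectangle, in the same order the modified function prints it. */
typedef struct { int left, top, right, bot; } pixel_box;

/* Convert a box given as normalized center (x, y) and size (w, h), all
   relative to the image dimensions, into pixel edges clamped to the image. */
pixel_box to_pixel_box(float x, float y, float w, float h, int img_w, int img_h)
{
    pixel_box p;
    p.left  = (int)((x - w / 2.f) * img_w);
    p.right = (int)((x + w / 2.f) * img_w);
    p.top   = (int)((y - h / 2.f) * img_h);
    p.bot   = (int)((y + h / 2.f) * img_h);
    if (p.left < 0) p.left = 0;
    if (p.right > img_w - 1) p.right = img_w - 1;
    if (p.top < 0) p.top = 0;
    if (p.bot > img_h - 1) p.bot = img_h - 1;
    return p;
}
```

For a 400x200 image, a box centered at (0.5, 0.5) with size (0.5, 0.25) becomes left 100, top 75, right 300, bot 125. A box larger than the image is clamped to the image border.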

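Then modify coco.c, yolo.c and detector.c under examples, and darknet.h under include, as shown below. For each file, the first line is the original and the second is the modified version. Any other call to draw_detections that your build compiles needs the same extra first argument.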

coco.c

draw_detections(im, dets, l.side*l.side*l.n, thresh, coco_classes, alphabet, 80);
draw_detections(input, im, dets, l.side*l.side*l.n, thresh, coco_classes, alphabet, 80);

yolo.c

draw_detections( im, dets, l.side*l.side*l.n, thresh, voc_names, alphabet, 20);
draw_detections(input, im, dets, l.side*l.side*l.n, thresh, voc_names, alphabet, 20);

detector.c

draw_detections(im, dets, nboxes, thresh, names, alphabet, l.classes);
draw_detections(input, im, dets, nboxes, thresh, names, alphabet, l.classes);

darknet.h

void draw_detections(image im, detection *dets, int num, float thresh, char **names, image **alphabet, int classes);
void draw_detections(char *filename, image im, detection *dets, int num, float thresh, char **names, image **alphabet, int classes);

Then go to the darknet directory and run make clean, followed by make.
Running the test command above again gives:

xxx: 100%
left:175, top:19, right:237, bot:104 
xxx: 100%
left:8, top:104, right:92, bot:187 
xxx: 98%
left:177, top:121, right:239, bot:185 
xxx: 85%
left:79, top:211, right:155, bot:278

OK, done. A predictions.jpg is also generated locally; open it and take a look: perfect. This is exactly the result we wanted.
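
Since the modified function also writes each box as a "left top right bot" line into a text file named after the image, a downstream program can read the positions back. The following is a minimal sketch under that assumption; read_boxes and box_area are hypothetical helpers, not darknet functions.

```c
#include <stdio.h>

/* One saved rectangle, in the order written by the fprintf call. */
typedef struct { int left, top, right, bot; } saved_box;

/* Read up to max_boxes lines of "left top right bot" from path.
   Returns the number of boxes read, or -1 if the file cannot be opened. */
int read_boxes(const char *path, saved_box *boxes, int max_boxes)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL) return -1;
    int n = 0;
    while (n < max_boxes &&
           fscanf(fp, "%d %d %d %d", &boxes[n].left, &boxes[n].top,
                  &boxes[n].right, &boxes[n].bot) == 4) {
        ++n;
    }
    fclose(fp);
    return n;
}

/* Area of one saved box in pixels (0 if the box is degenerate). */
int box_area(saved_box b)
{
    int w = b.right - b.left;
    int h = b.bot - b.top;
    return (w > 0 && h > 0) ? w * h : 0;
}
```

For the first box in the output above (left 175, top 19, right 237, bot 104) the width is 62 and the height is 85 pixels.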

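Pitfall 4: sample control during training (overfitting and underfitting)
To put it mysteriously, these are values that come from experience; to put it bluntly, I don't know the reason either. They are values I found by trial and error and cannot explain. For now I'll just give the values and look into the theory later.
Underfitting during training: there are too few samples, so add more training samples. batch=64, subdivisions=16. The former is the number of images used per training iteration; the latter is the number of groups that each batch is split into.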
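
The arithmetic behind these two values can be checked with a short sketch. It assumes batch is evenly divisible by subdivisions; the function names are only illustrative and do not come from darknet.

```c
/* Images pushed through the network in one forward/backward pass:
   each batch is split into `subdivisions` equally sized groups. */
int images_per_pass(int batch, int subdivisions)
{
    return batch / subdivisions;
}

/* Total images consumed over a run of max_batches iterations,
   with one full batch per iteration. */
long total_images_seen(int batch, long max_batches)
{
    return (long)batch * max_batches;
}
```

With batch=64 and subdivisions=16, only 4 images are processed at a time, which lowers memory use without changing the batch size. Combined with max_batches = 50200 from Pitfall 2, a full run consumes 64 * 50200 = 3212800 images in total, counting repeats.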