Training an FCN-8s Network on Your Own Dataset (Using NYUD as an Example)

Author: 李亚鑫 | Published: 2017-08-31 16:33

    Official FCN GitHub repository: shelhamer/fcn.berkeleyvision.org
    My modified GitHub repository: yxliwhu/NYUD-FCN8s

    Papers:
    Fully Convolutional Networks for Semantic Segmentation
    Evan Shelhamer, Jonathan Long, Trevor Darrell
    PAMI 2016
    arXiv:1605.06211
    Fully Convolutional Networks for Semantic Segmentation
    Jonathan Long, Evan Shelhamer, Trevor Darrell
    CVPR 2015
    arXiv:1411.4038

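    The official code provides complete (32s, 16s, 8s) code for the PASCAL VOC, SIFT Flow, and PASCAL-Context models, but for NYUD it only provides the 32s version. This post therefore uses NYUD as an example to walk through the full FCN-8s training process (there are many tutorials online, but they are either incomplete or contain errors).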

    Downloading the source code and preprocessing the dataset

    • Clone the official source code:
    git clone https://github.com/shelhamer/fcn.berkeleyvision.org.git
    
    • Download the pretrained VGG16 model and place it in the ilsvrc-nets folder of the FCN source tree:
    cd fcn.berkeleyvision.org/ilsvrc-nets
    wget http://www.robots.ox.ac.uk/~vgg/software/very_deep/caffe/VGG_ILSVRC_16_layers.caffemodel
    
    • Fetch the matching deploy file:
    wget https://gist.githubusercontent.com/ksimonyan/211839e770f7b538e2d8/raw/0067c9b32f60362c74f4c445a080beed06b07eb3/VGG_ILSVRC_16_layers_deploy.prototxt
    
    • Download the dataset:
    cd ../data/nyud/
    wget http://people.eecs.berkeley.edu/~sgupta/cvpr13/data.tgz
    tar -xvf data.tgz
    
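    • At this point the nyud folder should look like this:

    [Figure: 1.png]

      The data folder contains three subfolders: benchmarkData, colorImage, and pointCloud. benchmarkData/groundTruth stores all the segmentation ground truth we need, and colorImage stores the original RGB images. Because the ground-truth path expected by the source code differs from the actual one, copy the groundTruth files to the expected path: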

    mkdir segmentation
    cp data/benchmarkData/groundTruth/*.mat segmentation/
    
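    • Also merge train.txt and val.txt: create an empty text file named trainval.txt in the nyud folder, then copy the contents of train.txt and val.txt into it. The nyud folder should now look like this:
    [Figure: image.png]
    • The data is now ready, so we can start training the FCN-32s network.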
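    The trainval.txt merge described above can also be scripted. The following is a minimal sketch, assuming each split file lists one image id per line; the helper name merge_splits and its parameters are illustrative and not part of the FCN repository.

```python
from pathlib import Path


def merge_splits(nyud_dir, parts=("train.txt", "val.txt"), out_name="trainval.txt"):
    """Concatenate the split files into a single trainval list.

    Assumption: every split file holds one image id per line.
    """
    nyud_dir = Path(nyud_dir)
    ids = []
    for name in parts:
        lines = (nyud_dir / name).read_text().splitlines()
        # Skip blank lines so the merged file has exactly one id per line.
        ids.extend(line.strip() for line in lines if line.strip())
    (nyud_dir / out_name).write_text("\n".join(ids) + "\n")
    return ids
```

    Calling merge_splits("data/nyud") from the repository root would write data/nyud/trainval.txt, assuming the folder layout shown above.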

    Training the FCN-32s network:

    • Copy the required .py files into the nyud-fcn32s-color folder:
    cd fcn.berkeleyvision.org
    cp *.py nyud-fcn32s-color/
    cd nyud-fcn32s-color
    rm pascalcontext_layers.py
    rm voc_helper.py
    rm voc_layers.py
    rm siftflow_layers.py
    
    • The solver.prototxt file:
    train_net: "trainval.prototxt"
    test_net: "test.prototxt"
    test_iter: 200
    # make test net, but don't invoke it from the solver itself
    test_interval: 999999999
    display: 20
    average_loss: 20
    lr_policy: "fixed"
    # lr for unnormalized softmax
    base_lr: 1e-10
    # high momentum
    momentum: 0.99
    # no gradient accumulation
    iter_size: 1
    max_iter: 300000
    weight_decay: 0.0005
    snapshot: 5000
    snapshot_prefix: "snapshot/train"
    test_initialization: false
    
    • Modify solve.py:
      To be clear: when training the fcn32s model, you must modify solve.py so that the VGG16 weights are loaded via transplant. Concretely:
    import caffe
    import surgery, score
    
    import numpy as np
    import os
    import sys
    
    try:
        import setproctitle
        setproctitle.setproctitle(os.path.basename(os.getcwd()))
    except:
        pass
    
    vgg_weights = '../ilsvrc-nets/VGG_ILSVRC_16_layers.caffemodel'
    vgg_proto = '../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt'
    
    # init
    #caffe.set_device(int(sys.argv[1]))
    #caffe.set_device(0)
    #caffe.set_mode_gpu()
    caffe.set_mode_cpu()
    
    # solver = caffe.SGDSolver('solver.prototxt')
    # solver.net.copy_from(weights)
    solver = caffe.SGDSolver('solver.prototxt')
    vgg_net = caffe.Net(vgg_proto, vgg_weights, caffe.TRAIN)
    surgery.transplant(solver.net, vgg_net)
    del vgg_net
    
    # surgeries
    interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
    surgery.interp(solver.net, interp_layers)
    
    # scoring
    test = np.loadtxt('../data/nyud/test.txt', dtype=str)
    
    for _ in range(50):
        solver.step(2000)
        score.seg_tests(solver, False, test, layer='score')
    

    As you can see, I commented out:

    #weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
    ...
    ...
    # solver = caffe.SGDSolver('solver.prototxt')
    # solver.net.copy_from(weights)
    

    and added:

    vgg_weights = '../ilsvrc-nets/VGG_ILSVRC_16_layers.caffemodel'
    vgg_proto = '../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt'
    ...
    ...
    solver = caffe.SGDSolver('solver.prototxt')
    vgg_net = caffe.Net(vgg_proto, vgg_weights, caffe.TRAIN)
    surgery.transplant(solver.net, vgg_net)
    del vgg_net
    
    • An explanation of the transplant function can be found in surgery.py:


      [Figure: 2.png]
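    As a rough illustration of what transplant does, here is a simplified sketch that uses plain dicts of NumPy arrays in place of caffe's net.params. The function name transplant_params and the toy layer layout are assumptions for illustration only; the authoritative implementation is the one in surgery.py.

```python
import numpy as np


def transplant_params(new_params, old_params):
    """Copy parameters by layer name from old_params into new_params.

    Both arguments map a layer name to a list of NumPy arrays
    (weights first, then an optional bias). They are stand-ins for
    caffe's net.params, not the real caffe objects.
    """
    for name, blobs in old_params.items():
        if name not in new_params:
            print("dropping", name)  # layer does not exist in the new net
            continue
        for i, blob in enumerate(blobs):
            if i >= len(new_params[name]):
                print("dropping", name, i)  # e.g. new layer has no bias
                break
            target = new_params[name][i]
            if blob.shape != target.shape:
                print("coercing", name, i, "from", blob.shape, "to", target.shape)
            # Flat assignment copies element by element, so a fully connected
            # weight matrix can fill a convolution kernel of equal size.
            target.flat = blob.flat
    return new_params
```

    This shape coercion is why the approach works for VGG16: the fully connected fc6 weights have the same number of elements as the 7x7 convolutional fc6 of the FCN, even though the shapes differ.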
    • Also, because of the path change, the body of the load_label function in nyud_layers.py needs to be modified:
    #label = scipy.io.loadmat('{}/segmentation/img_{}.mat'.format(self.nyud_dir, idx))['segmentation'].astype(np.uint8)
    label = scipy.io.loadmat('{}/segmentation/img_{}.mat'.format(self.nyud_dir, idx))['groundTruth'][0,0][0,0]['SegmentationClass'].astype(np.uint16)
    # note: self.class_map is assumed to be defined elsewhere in the layer; it is not shown in this post
    for (x,y), value in np.ndenumerate(label):
        label[x,y] = self.class_map[0][value-1]
    label = label.astype(np.uint8)
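    The per-pixel loop above is slow in pure Python. As a hedged alternative, the same lookup can be done in one vectorized step. In this sketch, class_map is a 1-D table playing the role of self.class_map[0] in the snippet; the function name remap_labels and the example table values are made up for illustration.

```python
import numpy as np


def remap_labels(label, class_map):
    """Map raw label ids to training classes with a single table lookup.

    Equivalent to label[x, y] = class_map[value - 1] applied to every pixel.
    """
    lut = np.asarray(class_map)
    # Integer-array indexing applies the table to all pixels at once.
    return lut[label.astype(np.int64) - 1].astype(np.uint8)
```

    Inside load_label this would replace the for loop with something like label = remap_labels(label, self.class_map[0]), assuming class_map has the layout used above.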
    
    • Configuration is now complete; start training the model:
    cd nyud-fcn32s-color
    mkdir snapshot
    python solve.py
    
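    • After roughly 150,000 iterations, training reaches the accuracy reported in the paper.
      Testing a single image
      In the FCN source folder, find infer.py, rename it to test.py, and modify it: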
    im = Image.open('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn32s-color/test.png')
    ...
    ...
    net = caffe.Net('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn32s-color/deploy.prototxt', '/home/li/Downloads/nyud-fcn32s-color-heavy.caffemodel', caffe.TEST)
    

    Here nyud-fcn32s-color-heavy.caffemodel is the trained model file and test.png is the test image.

    • My complete test.py, for reference:
    import numpy as np  
    from PIL import Image  
    import matplotlib.pyplot as plt  
    import sys     
    import caffe  
    # import cv  # legacy OpenCV binding; unused in this script
    import scipy.io
    # import pydensecrf.densecrf as dcrf 
    # from pydensecrf.utils import compute_unary, create_pairwise_bilateral,create_pairwise_gaussian, softmax_to_unary 
    import pdb
    # matplotlib inline  
    # load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe  
    im = Image.open('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn32s-color/test.png')  
    in_ = np.array(im, dtype=np.float32)  
    in_ = in_[:,:,::-1]  
    in_ -= np.array((104.00698793,116.66876762,122.67891434))  
    in_ = in_.transpose((2,0,1))  
      
    # load net  
    net = caffe.Net('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn32s-color/deploy.prototxt', '/home/li/Downloads/nyud-fcn32s-color-heavy.caffemodel', caffe.TEST)  
    # shape for input (data blob is N x C x H x W), set data  
    net.blobs['data'].reshape(1, *in_.shape)  
    net.blobs['data'].data[...] = in_  
    # run net and take argmax for prediction  
    net.forward()  
    # pdb.set_trace()
    out = net.blobs['score'].data[0].argmax(axis=0) 
    scipy.io.savemat('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn32s-color/out.mat',{'X':out}) 
    #print "hello,python!"  
      
    #plt.imshow(out,cmap='gray');  
    plt.imshow(out)  
    plt.axis('off')  
    plt.savefig('testout_32s.png')  
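    The preprocessing steps in test.py above can be isolated into a small function, which makes them easier to check without loading Caffe. This is a minimal sketch; the name preprocess is illustrative, and dropping a possible alpha channel is an added assumption that the original script does not make.

```python
import numpy as np

# Per-channel mean in BGR order, as used in test.py above.
MEAN_BGR = np.array((104.00698793, 116.66876762, 122.67891434), dtype=np.float32)


def preprocess(rgb):
    """Turn an H x W x 3 RGB array into the C x H x W BGR blob Caffe expects."""
    in_ = np.asarray(rgb, dtype=np.float32)
    in_ = in_[:, :, :3]              # assumption: discard an alpha channel if present
    in_ = in_[:, :, ::-1]            # RGB -> BGR
    in_ = in_ - MEAN_BGR             # subtract the per-channel mean
    return in_.transpose((2, 0, 1))  # H x W x C -> C x H x W
```

    The result can be passed to net.blobs['data'].reshape(1, *blob.shape) exactly as in_ is in the script.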
    
    
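    • If you do not have a deploy file, you can create one as follows:
      First, go to the folder of the model you are using, for example nyud-fcn32s-color, and open its trainval.prototxt. Select all, copy, and paste the contents into a new file named deploy.prototxt. Then search (Ctrl+F) for every layer named loss and delete it.
      Then add the following at the top of the file (in place of the Python data layer, which produces the same data top):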
    layer {
      name: "input"
      type: "Input"
      top: "data"
      input_param {
        # These dimensions are purely for sake of example;
        # see infer.py for how to reshape the net to the given input size.
        shape { dim: 1 dim: 3 dim: 425 dim: 540 }
      }
    }
    

    In shape { dim: 1 dim: 3 dim: 425 dim: 540 }, 425 and 540 are the height and width of the test image.
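    To avoid mixing up height and width when writing the shape line by hand, the Input layer text can be generated. The helper below is a hypothetical convenience, not part of Caffe or the FCN repository; it only formats the text shown above.

```python
def input_layer(height, width, channels=3):
    """Format the Input layer text for the top of deploy.prototxt.

    Caffe blobs are ordered N x C x H x W, so height comes before width.
    """
    return (
        'layer {\n'
        '  name: "input"\n'
        '  type: "Input"\n'
        '  top: "data"\n'
        '  input_param {\n'
        '    shape { dim: 1 dim: %d dim: %d dim: %d }\n'
        '  }\n'
        '}\n' % (channels, height, width)
    )
```

    Note that PIL's Image.size returns (width, height), so for an image im the call would be input_layer(im.size[1], im.size[0]).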

    [Figure: 4.png]

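    Training the FCN-16s network:

    • No source code is provided for FCN-16s on NYUD, so we have to build everything ourselves. Fortunately, the official repository provides code for other datasets, which we can use as a template for the training files. First compare the net.py files (which generate the .prototxt files) of voc-fcn16s and voc-fcn32s:
    [Figure: 3.png]

    The red boxes mark where the two files differ; comparing the two network structures makes the differences clear.
    To visualize a network, run draw_net.py in the /caffe/python folder; this is not covered here.
    Based on the comparison above, we can derive a new net.py from nyud-fcn32s-color/net.py: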

    cd fcn.berkeleyvision.org
    mkdir nyud-fcn16s-color
    cp nyud-fcn32s-color/net.py nyud-fcn16s-color/net.py
    
    • The modified net.py is:
    import caffe
    from caffe import layers as L, params as P
    from caffe.coord_map import crop
    
    def conv_relu(bottom, nout, ks=3, stride=1, pad=1):
        conv = L.Convolution(bottom, kernel_size=ks, stride=stride,
            num_output=nout, pad=pad,
            param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
        return conv, L.ReLU(conv, in_place=True)
    
    def max_pool(bottom, ks=2, stride=2):
        return L.Pooling(bottom, pool=P.Pooling.MAX, kernel_size=ks, stride=stride)
    
    def fcn(split, tops):
        n = caffe.NetSpec()
        n.data, n.label = L.Python(module='nyud_layers',
                layer='NYUDSegDataLayer', ntop=2,
                param_str=str(dict(nyud_dir='../data/nyud', split=split,
                    tops=tops, seed=1337)))
    
        # the base net
        n.conv1_1, n.relu1_1 = conv_relu(n.data, 64, pad=100)
        n.conv1_2, n.relu1_2 = conv_relu(n.relu1_1, 64)
        n.pool1 = max_pool(n.relu1_2)
    
        n.conv2_1, n.relu2_1 = conv_relu(n.pool1, 128)
        n.conv2_2, n.relu2_2 = conv_relu(n.relu2_1, 128)
        n.pool2 = max_pool(n.relu2_2)
    
        n.conv3_1, n.relu3_1 = conv_relu(n.pool2, 256)
        n.conv3_2, n.relu3_2 = conv_relu(n.relu3_1, 256)
        n.conv3_3, n.relu3_3 = conv_relu(n.relu3_2, 256)
        n.pool3 = max_pool(n.relu3_3)
    
        n.conv4_1, n.relu4_1 = conv_relu(n.pool3, 512)
        n.conv4_2, n.relu4_2 = conv_relu(n.relu4_1, 512)
        n.conv4_3, n.relu4_3 = conv_relu(n.relu4_2, 512)
        n.pool4 = max_pool(n.relu4_3)
    
        n.conv5_1, n.relu5_1 = conv_relu(n.pool4, 512)
        n.conv5_2, n.relu5_2 = conv_relu(n.relu5_1, 512)
        n.conv5_3, n.relu5_3 = conv_relu(n.relu5_2, 512)
        n.pool5 = max_pool(n.relu5_3)
    
        # fully conv
        n.fc6, n.relu6 = conv_relu(n.pool5, 4096, ks=7, pad=0)
        n.drop6 = L.Dropout(n.relu6, dropout_ratio=0.5, in_place=True)
        n.fc7, n.relu7 = conv_relu(n.drop6, 4096, ks=1, pad=0)
        n.drop7 = L.Dropout(n.relu7, dropout_ratio=0.5, in_place=True)
    
        n.score_fr = L.Convolution(n.drop7, num_output=40, kernel_size=1, pad=0,
            param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
        n.upscore2 = L.Deconvolution(n.score_fr,
            convolution_param=dict(num_output=40, kernel_size=4, stride=2,
                bias_term=False),
            param=[dict(lr_mult=0)])
    
        n.score_pool4 = L.Convolution(n.pool4, num_output=40, kernel_size=1, pad=0,
            param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
        n.score_pool4c = crop(n.score_pool4, n.upscore2)
        n.fuse_pool4 = L.Eltwise(n.upscore2, n.score_pool4c,
                operation=P.Eltwise.SUM)
        n.upscore16 = L.Deconvolution(n.fuse_pool4,
            convolution_param=dict(num_output=40, kernel_size=32, stride=16,
                bias_term=False),
            param=[dict(lr_mult=0)])
    
        n.score = crop(n.upscore16, n.data)
        n.loss = L.SoftmaxWithLoss(n.score, n.label,
                loss_param=dict(normalize=False, ignore_label=255))
    
        return n.to_proto()
    
    def make_net():
        tops = ['color', 'label']
        with open('trainval.prototxt', 'w') as f:
            f.write(str(fcn('trainval', tops)))
    
        with open('test.prototxt', 'w') as f:
            f.write(str(fcn('test', tops)))
    
    if __name__ == '__main__':
        make_net()
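
    The Deconvolution layers in net.py above use lr_mult=0, so their weights are never learned; solve.py fills them through surgery.interp, which sets bilinear interpolation kernels. The sketch below is modeled on that initialization using NumPy only. The helper name bilinear_weights is illustrative, and the real code operates on caffe parameter blobs rather than fresh arrays.

```python
import numpy as np


def upsample_filt(size):
    """Return a 2-D bilinear interpolation kernel of shape (size, size)."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)


def bilinear_weights(channels, size):
    """Build a (channels, channels, size, size) deconvolution weight array
    that upsamples every channel independently."""
    w = np.zeros((channels, channels, size, size), dtype=np.float32)
    # Only the "diagonal" (channel i -> channel i) gets the kernel.
    w[range(channels), range(channels)] = upsample_filt(size)
    return w
```

    For the 16s net this would correspond to bilinear_weights(40, 4) for upscore2 and bilinear_weights(40, 32) for upscore16, matching the kernel sizes in the listing.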
    
    
    • Run net.py to generate the .prototxt files:
    cd nyud-fcn16s-color/
    python net.py
    
    • Copy and modify solve.py:
    import caffe
    import surgery, score
    
    import numpy as np
    import os
    import sys
    
    try:
        import setproctitle
        setproctitle.setproctitle(os.path.basename(os.getcwd()))
    except:
        pass
    
    weights = '../nyud-fcn32s-color/nyud-fcn32s-color-heavy.caffemodel'
    #vgg_weights = '../ilsvrc-nets/VGG_ILSVRC_16_layers.caffemodel'
    #vgg_proto = '../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt'
    
    # init
    #caffe.set_device(int(sys.argv[1]))
    #caffe.set_device(0)
    #caffe.set_mode_gpu()
    caffe.set_mode_cpu()
    
    solver = caffe.SGDSolver('solver.prototxt')
    solver.net.copy_from(weights)
    #solver = caffe.SGDSolver('solver.prototxt')
    #vgg_net = caffe.Net(vgg_proto, vgg_weights, caffe.TRAIN)
    #surgery.transplant(solver.net, vgg_net)
    #del vgg_net
    
    # surgeries
    interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
    surgery.interp(solver.net, interp_layers)
    
    # scoring
    test = np.loadtxt('../data/nyud/test.txt', dtype=str)
    
    for _ in range(50):
        solver.step(5000)
        score.seg_tests(solver, False, test, layer='score')
    
    
    • Copy and modify solver.prototxt (the main change is base_lr, i.e. the learning rate):
    train_net: "trainval.prototxt"
    test_net: "test.prototxt"
    test_iter: 200
    # make test net, but don't invoke it from the solver itself
    test_interval: 999999999
    display: 20
    average_loss: 20
    lr_policy: "fixed"
    # lr for unnormalized softmax
    base_lr: 1e-12
    # high momentum
    momentum: 0.99
    # no gradient accumulation
    iter_size: 1
    max_iter: 300000
    weight_decay: 0.0005
    snapshot: 5000
    snapshot_prefix: "snapshot/train"
    test_initialization: false
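
    With snapshot: 5000 and snapshot_prefix: "snapshot/train", Caffe names snapshot files as <prefix>_iter_<N>.caffemodel. The sketch below lists which files a solve.py run of this form would be expected to write; the function name and parameters are illustrative, and it assumes snapshots are written whenever the iteration count is a multiple of the snapshot interval.

```python
def snapshot_files(prefix, snapshot_every, steps_per_round, rounds):
    """List the .caffemodel files expected from a solve.py style loop.

    steps_per_round and rounds mirror `solver.step(n)` inside
    `for _ in range(rounds)`.
    """
    total_iters = steps_per_round * rounds
    return [
        "%s_iter_%d.caffemodel" % (prefix, it)
        for it in range(snapshot_every, total_iters + 1, snapshot_every)
    ]


# Settings used for FCN-16s in this post: 50 rounds of solver.step(5000).
files_16s = snapshot_files("snapshot/train", 5000, 5000, 50)
```

    Under these assumptions the list contains snapshot/train_iter_170000.caffemodel, the kind of file later used to initialize the FCN-8s network.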
    
    • Then copy the required .py files into the nyud-fcn16s-color folder:
    cd fcn.berkeleyvision.org
    cp *.py nyud-fcn16s-color/
    cd nyud-fcn16s-color
    rm pascalcontext_layers.py
    rm voc_helper.py
    rm voc_layers.py
    rm siftflow_layers.py
    
    • Do not forget to modify nyud_layers.py.
      Run solve.py to start training:
    cd nyud-fcn16s-color
    mkdir snapshot
    python solve.py
    
    • Testing works the same way as for FCN-32s; the corresponding test.py is:
    import numpy as np
    from PIL import Image
    import matplotlib.pyplot as plt
    import sys
    import caffe
    # import cv  # legacy OpenCV binding; unused in this script
    import scipy.io
    # import pydensecrf.densecrf as dcrf
    # from pydensecrf.utils import compute_unary, create_pairwise_bilateral,create_pairwise_gaussian, softmax_to_unary
    import pdb
    # matplotlib inline
    # load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
    im = Image.open('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn16s-color/test2.png')
    in_ = np.array(im, dtype=np.float32)
    in_ = in_[:,:,::-1]
    in_ -= np.array((104.00698793,116.66876762,122.67891434))
    in_ = in_.transpose((2,0,1))
    # load net
    net = caffe.Net('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn16s-color/deploy.prototxt', '/home/li/Downloads/16_170000.caffemodel', caffe.TEST)
    # shape for input (data blob is N x C x H x W), set data
    net.blobs['data'].reshape(1, *in_.shape)
    net.blobs['data'].data[...] = in_
    # run net and take argmax for prediction
    net.forward()
    # pdb.set_trace()
    out = net.blobs['score'].data[0].argmax(axis=0)
    scipy.io.savemat('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn16s-color/out.mat',{'X':out})
    #print "hello,python!"
    #plt.imshow(out,cmap='gray');
    plt.imshow(out)
    plt.axis('off')
    plt.savefig('testout2_170000.png')
    
    [Figure: 5.png]

    Training the FCN-8s network:

    • The code changes are similar to those for FCN-16s; the corresponding net.py is:
    import caffe
    from caffe import layers as L, params as P
    from caffe.coord_map import crop
    
    def conv_relu(bottom, nout, ks=3, stride=1, pad=1):
        conv = L.Convolution(bottom, kernel_size=ks, stride=stride,
            num_output=nout, pad=pad,
            param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
        return conv, L.ReLU(conv, in_place=True)
    
    def max_pool(bottom, ks=2, stride=2):
        return L.Pooling(bottom, pool=P.Pooling.MAX, kernel_size=ks, stride=stride)
    
    def fcn(split, tops):
        n = caffe.NetSpec()
        n.data, n.label = L.Python(module='nyud_layers',
                layer='NYUDSegDataLayer', ntop=2,
                param_str=str(dict(nyud_dir='../data/nyud', split=split,
                    tops=tops, seed=1337)))
    
        # the base net
        n.conv1_1, n.relu1_1 = conv_relu(n.data, 64, pad=100)
        n.conv1_2, n.relu1_2 = conv_relu(n.relu1_1, 64)
        n.pool1 = max_pool(n.relu1_2)
    
        n.conv2_1, n.relu2_1 = conv_relu(n.pool1, 128)
        n.conv2_2, n.relu2_2 = conv_relu(n.relu2_1, 128)
        n.pool2 = max_pool(n.relu2_2)
    
        n.conv3_1, n.relu3_1 = conv_relu(n.pool2, 256)
        n.conv3_2, n.relu3_2 = conv_relu(n.relu3_1, 256)
        n.conv3_3, n.relu3_3 = conv_relu(n.relu3_2, 256)
        n.pool3 = max_pool(n.relu3_3)
    
        n.conv4_1, n.relu4_1 = conv_relu(n.pool3, 512)
        n.conv4_2, n.relu4_2 = conv_relu(n.relu4_1, 512)
        n.conv4_3, n.relu4_3 = conv_relu(n.relu4_2, 512)
        n.pool4 = max_pool(n.relu4_3)
    
        n.conv5_1, n.relu5_1 = conv_relu(n.pool4, 512)
        n.conv5_2, n.relu5_2 = conv_relu(n.relu5_1, 512)
        n.conv5_3, n.relu5_3 = conv_relu(n.relu5_2, 512)
        n.pool5 = max_pool(n.relu5_3)
    
        # fully conv
        n.fc6, n.relu6 = conv_relu(n.pool5, 4096, ks=7, pad=0)
        n.drop6 = L.Dropout(n.relu6, dropout_ratio=0.5, in_place=True)
        n.fc7, n.relu7 = conv_relu(n.drop6, 4096, ks=1, pad=0)
        n.drop7 = L.Dropout(n.relu7, dropout_ratio=0.5, in_place=True)
    
        n.score_fr = L.Convolution(n.drop7, num_output=40, kernel_size=1, pad=0,
            param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
        n.upscore2 = L.Deconvolution(n.score_fr,
            convolution_param=dict(num_output=40, kernel_size=4, stride=2,
                bias_term=False),
            param=[dict(lr_mult=0)])
    
        n.score_pool4 = L.Convolution(n.pool4, num_output=40, kernel_size=1, pad=0,
            param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
        n.score_pool4c = crop(n.score_pool4, n.upscore2)
        n.fuse_pool4 = L.Eltwise(n.upscore2, n.score_pool4c,
                operation=P.Eltwise.SUM)
        n.upscore_pool4 = L.Deconvolution(n.fuse_pool4,
            convolution_param=dict(num_output=40, kernel_size=4, stride=2,
                bias_term=False),
            param=[dict(lr_mult=0)])
    
        n.score_pool3 = L.Convolution(n.pool3, num_output=40, kernel_size=1, pad=0,
            param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
        n.score_pool3c = crop(n.score_pool3, n.upscore_pool4)
        n.fuse_pool3 = L.Eltwise(n.upscore_pool4, n.score_pool3c,
                operation=P.Eltwise.SUM)
        n.upscore8 = L.Deconvolution(n.fuse_pool3,
            convolution_param=dict(num_output=40, kernel_size=16, stride=8,
                bias_term=False),
            param=[dict(lr_mult=0)])
    
        n.score = crop(n.upscore8, n.data)
        n.loss = L.SoftmaxWithLoss(n.score, n.label,
                loss_param=dict(normalize=False, ignore_label=255))
    
        return n.to_proto()
    
    def make_net():
        tops = ['color', 'label']
        with open('trainval.prototxt', 'w') as f:
            f.write(str(fcn('trainval', tops)))
    
        with open('test.prototxt', 'w') as f:
            f.write(str(fcn('test', tops)))
    
    if __name__ == '__main__':
        make_net()
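
    To see why the crop layers in net.py above are needed, it helps to trace one spatial dimension through the network. The sketch below applies the standard Caffe output-size rules for convolution, pooling (which rounds up), and deconvolution to the layer settings in the listing; the function name fcn8s_sizes is illustrative.

```python
import math


def fcn8s_sizes(n):
    """Trace one spatial dimension of size n through the FCN-8s layers."""
    sizes = {"data": n}
    x = n + 2 * 100 - 3 + 1          # conv1_1: 3x3 kernel, pad=100
    for i in range(1, 6):
        x = int(math.ceil(x / 2.0))  # pool1..pool5: 2x2, stride 2, rounded up
        sizes["pool%d" % i] = x
    x = x - 7 + 1                    # fc6: 7x7 kernel, pad=0 (fc7, score_fr are 1x1)
    sizes["score_fr"] = x
    x = (x - 1) * 2 + 4              # upscore2: kernel 4, stride 2
    sizes["upscore2"] = x
    x = (x - 1) * 2 + 4              # upscore_pool4: kernel 4, stride 2
    sizes["upscore_pool4"] = x
    x = (x - 1) * 8 + 16             # upscore8: kernel 16, stride 8
    sizes["upscore8"] = x
    return sizes
```

    In each fusion step the skip branch (pool4 or pool3 scores) is larger than the upsampled branch, so it is cropped before the element-wise sum, and the final upscore8 output is larger than the input and is cropped back to the data size.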
    
    

    The solve.py file is:

    import caffe
    import surgery, score
    
    import numpy as np
    import os
    import sys
    
    try:
        import setproctitle
        setproctitle.setproctitle(os.path.basename(os.getcwd()))
    except:
        pass
    
    weights = '../nyud-fcn16s-color/snapshot/train_iter_170000.caffemodel'
    #vgg_weights = '../ilsvrc-nets/VGG_ILSVRC_16_layers.caffemodel'
    #vgg_proto = '../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt'
    
    # init
    #caffe.set_device(int(sys.argv[1]))
    caffe.set_device(0)
    caffe.set_mode_gpu()
    #caffe.set_mode_cpu()
    
    solver = caffe.SGDSolver('solver.prototxt')
    solver.net.copy_from(weights)
    #solver = caffe.SGDSolver('solver.prototxt')
    #vgg_net = caffe.Net(vgg_proto, vgg_weights, caffe.TRAIN)
    #surgery.transplant(solver.net, vgg_net)
    #del vgg_net
    
    # surgeries
    interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
    surgery.interp(solver.net, interp_layers)
    
    # scoring
    test = np.loadtxt('../data/nyud/test.txt', dtype=str)
    
    for _ in range(50):
        solver.step(5000)
        score.seg_tests(solver, False, test, layer='score')
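
    The score.seg_tests call in solve.py reports segmentation metrics such as pixel accuracy and mean IU. As a sketch of how such metrics follow from a confusion matrix, the NumPy code below is modeled on that approach; the function seg_metrics and its return layout are illustrative, not the score.py API.

```python
import numpy as np


def fast_hist(gt, pred, n_cl):
    """Accumulate an n_cl x n_cl confusion matrix (rows: truth, cols: prediction)."""
    keep = (gt >= 0) & (gt < n_cl)  # skip labels outside [0, n_cl), e.g. 255
    return np.bincount(n_cl * gt[keep].astype(int) + pred[keep],
                       minlength=n_cl ** 2).reshape(n_cl, n_cl)


def seg_metrics(hist):
    """Return (overall accuracy, mean per-class accuracy, mean IU)."""
    diag = np.diag(hist).astype(np.float64)
    with np.errstate(divide="ignore", invalid="ignore"):
        overall_acc = diag.sum() / hist.sum()
        mean_acc = np.nanmean(diag / hist.sum(1))
        iu = diag / (hist.sum(1) + hist.sum(0) - diag)
    return overall_acc, mean_acc, np.nanmean(iu)
```

    Here gt and pred are flattened integer arrays of ground-truth and predicted class ids; classes that never occur produce NaN entries, which nanmean ignores.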
    
    

    The test.py file is:

    import numpy as np  
    from PIL import Image  
    import matplotlib.pyplot as plt  
    import sys     
    import caffe  
    # import cv  # legacy OpenCV binding; unused in this script
    import scipy.io
    # import pydensecrf.densecrf as dcrf 
    # from pydensecrf.utils import compute_unary, create_pairwise_bilateral,create_pairwise_gaussian, softmax_to_unary 
    import pdb
    # matplotlib inline  
    # load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe  
    im = Image.open('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn8s-color/door/RGB_69388896_000000.jpg')  
    in_ = np.array(im, dtype=np.float32)  
    in_ = in_[:,:,::-1]  
    in_ -= np.array((104.00698793,116.66876762,122.67891434))  
    in_ = in_.transpose((2,0,1))  
      
    # load net  
    net = caffe.Net('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn8s-color/deploy.prototxt', '/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn8s-color/snapshot/train_iter_130000.caffemodel', caffe.TEST)  
    # shape for input (data blob is N x C x H x W), set data  
    net.blobs['data'].reshape(1, *in_.shape)  
    net.blobs['data'].data[...] = in_  
    # run net and take argmax for prediction  
    net.forward()  
    # pdb.set_trace()
    out = net.blobs['score'].data[0].argmax(axis=0) 
    scipy.io.savemat('/home/li/Documents/fcn.berkeleyvision.org/nyud-fcn8s-color/out.mat',{'X':out}) 
    #print "hello,python!"  
      
    #plt.imshow(out,cmap='gray');  
    plt.imshow(out)  
    plt.axis('off')  
    plt.savefig('./door/out/RGB_69388896_000000.png')  
    

    References:
    Training an FCN network on the SIFT Flow dataset (FCN网络训练 SIFTFLOW数据集)
    shelhamer/fcn.berkeleyvision.org
    Caffe Learning Series (7): The solver and its configuration - denny402 - 博客园
    https://gist.githubusercontent.com/ksimonyan/211839e770f7b538e2d8/raw/0067c9b32f60362c74f4c445a080beed06b07eb3/VGG_ILSVRC_16_layers_deploy.prototxt
