美文网首页
Keras 实例教程(三)- 使用VGG-16识别

Keras 实例教程(三)- 使用VGG-16识别

作者: chengjian666 | 来源:发表于2018-12-06 17:21 被阅读72次

Keras 作为当前深度学习框架中的热门之一,使用起来是极其简便的,它所提供的各种友好而灵活的API,即使对于新手而言,相比于TensorFlow也非常容易上手。更特别的是,Keras中还预置了多种已经训练好的、非常流行的神经网络模型:

Model Size Top-1 Accuracy Top-5 Accuracy Parameters Depth
Xception 88 MB 0.790 0.945 22,910,480 126
VGG16 528MB 0.713 0.901 138,357,544 23
VGG19 549 MB 0.713 0.900 143,667,240 26
ResNet50 99 MB 0.749 0.921 25,636,712 168
InceptionV3 92 MB 0.779 0.937 23,851,784 159
InceptionResNetV2 215 MB 0.803 0.953 55,873,736 572
MobileNet 16 MB 0.704 0.895 4,253,864 88
MobileNetV2 14 MB 0.713 0.901 3,538,984 88
DenseNet121 33 MB 0.750 0.923 8,062,504 121
DenseNet169 57 MB 0.762 0.932 14,307,880 169
DenseNet201 80 MB 0.773 0.936 20,242,984 201
NASNetMobile 23 MB 0.744 0.919 5,326,716 -
NASNetLarge 343 MB 0.825 0.960 88,949,818 -

VGG 结构简介

使用者可以非常方便地以他山之石来解决自己的问题。本文将以VGG16为例来演示,如何在Keras中执行物体识别(Object Recognization)任务。VGG16是由来自牛津大学的研究团队涉及并实现的一个基于CNN的深度学习网络,它的深度为23(包括16个layers),所有的权重总计超过500M,下图给出了它的一个基本结构(参考D列):

image

通过下图可以更加清晰了解:

image

简单概括其结构为:
VGG-16,输入层224x224x3,经过两层相同的卷积,卷积filter为3*3,stride为1,filter数为64,然后经过一层pooling。接着按照相同的方式,让宽和高越来越小,而通道数逐倍增加,直到512。最后用两层相同全连接加一个softmax。使用流程图即为:

image

这里有更加清楚的VGG结构图

VGG-16使用

可以使用下面的命令直接导入已经训练好的VGG16网络,注意因为全部的参数总计超过500M,因此当你首次使用下面的命令时,Keras需要从网上先下载这些参数,这可能需要耗用一些时间。

from keras.applications.vgg16 import VGG16
model = VGG16()
print(model.summary())

最后一句会输入VGG16网络的层级结构,不仅如此,VGG()这个类中还提供了一些参数,这些参数可以令你非常方便地定制个性化的网络结构,这一点在迁移学习(Transfer Learning)中尤其有用,摘列部分参数如下:

  • include_top (True): Whether or not to include the output layers for the model. You don’t need these if you are fitting the model on your own problem.
  • weights (‘imagenet‘): What weights to load. You can specify None to not load pre-trained weights if you are interested in training the model yourself from scratch.
  • input_tensor (None): A new input layer if you intend to fit the model on new data of a different size.
  • input_shape (None): The size of images that the model is expected to take if you change the input layer.
  • pooling (None): The type of pooling to use when you are training a new set of output layers.
  • classes (1000): The number of classes (e.g. size of output vector) for the model.

当你需要直接使用VGG-16输出识别结果时,需要enable include_top来包含output layer。

加载图片及处理

准确好一张待识别的图片,其内容为一只金毛犬(golden_retriever):


image
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions

from keras.preprocessing.image import load_img, img_to_array
import numpy as np

image = load_img('C:/Pictures/test_imgs/golden.jpg', target_size=(224, 224))
image_data = img_to_array(image)

# reshape it into the specific format
image_data = image_data.reshape((1,) + image_data.shape)
print(image_data.shape)

# prepare the image data for VGG
image_data = preprocess_input(image_data)

注意点:

  • 输入图片的dim是224x224;
  • 需要reshape为(samples,dims),即dim为(224,224,3)的若干输入样本;
  • 最后还需要preprocess_input()是将其转化为VGG-16能够接受的输入,实际上为每个像素减去均值(见原文描述):
The only preprocessing we do is subtracting the mean RGB value, computed on the training set, from each pixel.
进行预测和解析
# using the pre-trained model to predict
prediction = model.predict(image_data)

# decode the prediction results
results = decode_predictions(prediction, top=3)

print(results)

我们将得到可能性最高的前三个识别结果:

[[('n02099601', 'golden_retriever', 0.9698627), ('n04409515', 'tennis_ball', 0.008626293), ('n02100877', 'Irish_setter', 0.004562445)]]

可见与结果试一致的,97%预测是金毛。

完整代码:
from keras.applications.vgg16 import VGG16, preprocess_input, decode_predictions

from keras.preprocessing.image import load_img, img_to_array
import numpy as np
# VGG-16 instance
model = VGG16(weights='imagenet', include_top=True)

image = load_img('C:/Pictures/Pictures/test_imgs/golden.jpg', target_size=(224, 224))
image_data = img_to_array(image)

# reshape it into the specific format
image_data = image_data.reshape((1,) + image_data.shape)
print(image_data.shape)

# prepare the image data for VGG
image_data = preprocess_input(image_data)

# using the pre-trained model to predict
prediction = model.predict(image_data)

# decode the prediction results
results = decode_predictions(prediction, top=3)

print(results)

相关文章

网友评论

      本文标题:Keras 实例教程(三)- 使用VGG-16识别

      本文链接:https://www.haomeiwen.com/subject/xynocqtx.html