Purpose: to help readers who have studied machine learning and deep learning algorithms, know the Keras framework, and are comfortable with Python, but have no idea where to start with real applications. Starting from simple traffic-sign recognition, we will work through applying deep learning to image recognition from end to end.
Full project on GitHub: ImageClassification
Note: the network architectures currently supported are:
- LeNet
- AlexNet
- VGGNet
- ZFNet
- GoogLeNet
- ResNet_18/34/50/101/152
- DenseNet_161
1. Project setup
Dataset download address: traffic sign dataset
1.1 Directory layout
Create the following three folders: data holds the training data and the test data, log stores the models trained by Keras (so they can be loaded later when predicting image classes), and src holds all the Python files.
Complete directory layout
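If you prefer to script this step, the folders above (including the data/train and data/test subfolders described next) can be created with a few lines of Python. The paths follow the layout shown here:

```python
import os

# Folders required by the project: "data" holds the dataset,
# "log" stores trained models, "src" holds the Python sources.
folders = ["data/train", "data/test", "log", "src"]

for folder in folders:
    os.makedirs(folder, exist_ok=True)  # no error if it already exists
```

`exist_ok=True` makes the script safe to re-run after the project already exists.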
1.2 The data folder
Its structure is shown below; put the images to be recognized directly under the test folder:
├─data
│ ├─test
│ └─train
1.3 The train folder
Its structure is shown below. Each folder name represents one class (the code relies on this), and each folder holds a number of images of the corresponding class:
└─train
├─00000
├─00001
├─00002
├─00003
├─00004
├─00005
├─00006
├─00007
├─00008
└─00009
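As a quick sanity check (not part of the project code), a short script can verify this layout by counting the images in each class folder, traversing the tree the same way the later load_data() function does. The placeholder files created here are only so the example is self-contained; with real data, skip that part:

```python
import os

train_path = "data/train/"  # adjust if your path differs

# Create the ten class folders with one placeholder file each,
# purely so this example runs on its own; real data replaces this.
for i in range(10):
    d = os.path.join(train_path, "%05d" % i)
    os.makedirs(d, exist_ok=True)
    open(os.path.join(d, "demo.ppm"), "w").close()

# Count files per class folder, mirroring the os.listdir() traversal
counts = {}
for category in sorted(os.listdir(train_path)):
    category_path = os.path.join(train_path, category)
    counts[category] = len(os.listdir(category_path))

print(counts)
```

Every class should report a non-zero count before you start training.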
2. Building the network with Keras
With the project folders and data in place, we can build the network. After this step, switching between different network architectures for training is just a matter of changing one parameter.
2.1 Create src/config.py, which defines the parameters we need
class DefaultConfigs(object):
    """docstring for DefaultConfigs"""
    train_data_path = "../data/train/"  # path to the training data
    test_data_path = "../data/test/"    # path to the images to be recognized
    weights_path = "../log/"            # where trained models are saved
    normal_size = 64                    # images are resized to this size before entering the network
    channels = 3                        # number of RGB channels
    epochs = 60                         # number of training epochs
    batch_size = 64                     # training batch size
    classes = 10                        # number of classes to recognize
    data_augmentation = True            # whether to use Keras's data augmentation
    model_name = "ResNet_18"            # name of the network architecture to use

config = DefaultConfigs()
2.2 Create src/utils.py, which handles data loading and construction of the different network models
from keras.preprocessing.image import img_to_array
from keras.utils import to_categorical
from models import AlexNet, resnet
import cv2
import os
import numpy as np

np.random.seed(42)

def load_data(config):
    labels = []        # label of each image
    images_data = []   # pixel data of each image
    print("loading dataset......")
    data_path = config.train_data_path  # i.e. the train folder
    category_paths = os.listdir(data_path)  # one folder per class, returned as a list
    category_paths = list(map(lambda x: data_path + x, category_paths))  # build full paths such as ../data/train/00000
    np.random.shuffle(category_paths)
    for category_path in category_paths:
        images_files_list = os.listdir(category_path)  # image file names in this class
        print(category_path)
        for image_file in images_files_list:
            file_name = category_path + "/" + image_file  # full path of one image, for reading
            label = int(category_path[-2:])  # extract the class index from the folder name
            labels.append(label)
            image = cv2.imread(file_name)  # read the image with OpenCV
            image = cv2.resize(image, (config.normal_size, config.normal_size))
            image = img_to_array(image)  # convert the image to an array
            images_data.append(image)
    # scale the pixel values to [0, 1]
    images_data = np.array(images_data, dtype="float") / 255.0
    labels = np.array(labels)  # convert labels to np.array
    labels = to_categorical(labels, num_classes=config.classes)
    return images_data, labels

def build_model(config):
    # build the network selected in the config
    if config.model_name == "AlexNet":
        model = AlexNet.AlexNet(config)
    elif config.model_name == "ResNet_18":
        model = resnet.ResnetBuilder.build_resnet_18(config)
    elif config.model_name == "ResNet_34":
        model = resnet.ResnetBuilder.build_resnet_34(config)
    elif config.model_name == "ResNet_50":
        model = resnet.ResnetBuilder.build_resnet_50(config)
    elif config.model_name == "ResNet_101":
        model = resnet.ResnetBuilder.build_resnet_101(config)
    elif config.model_name == "ResNet_152":
        model = resnet.ResnetBuilder.build_resnet_152(config)
    else:
        raise ValueError("The model you have selected doesn't exist!")
    return model
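The if/elif chain in build_model() grows with every new architecture. A common alternative is a dictionary that maps model_name to a constructor callable. The sketch below uses placeholder lambdas standing in for the real builders (AlexNet.AlexNet, resnet.ResnetBuilder.build_resnet_18, and so on); only the dispatch pattern is the point:

```python
# Hypothetical sketch: map model names to constructor callables.
# The lambdas stand in for the real builders such as
# AlexNet.AlexNet or resnet.ResnetBuilder.build_resnet_18.
MODEL_BUILDERS = {
    "AlexNet":   lambda config: "AlexNet-model",
    "ResNet_18": lambda config: "ResNet_18-model",
    "ResNet_34": lambda config: "ResNet_34-model",
}

def build_model(config):
    try:
        return MODEL_BUILDERS[config.model_name](config)
    except KeyError:
        raise ValueError("Unknown model name: %s" % config.model_name)

class Cfg:
    model_name = "ResNet_18"

print(build_model(Cfg()))  # ResNet_18-model
```

Adding a new architecture then only means adding one dictionary entry.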
2.3 Create src/train.py, the main script where training is configured
from utils import build_model, load_data
from sklearn.model_selection import train_test_split
from keras.callbacks import ReduceLROnPlateau, CSVLogger, EarlyStopping, ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
from config import config

def train(config, train_x, train_y, dev_x, dev_y):
    model = build_model(config)
    lr_reducer = ReduceLROnPlateau(factor=0.005, cooldown=0, patience=5, min_lr=0.5e-6, verbose=1)  # learning-rate decay
    early_stopper = EarlyStopping(min_delta=0.001, patience=10, verbose=1)  # early stopping
    checkpoint = ModelCheckpoint(config.weights_path + config.model_name + "_model.h5",
                                 monitor="val_acc", verbose=1,
                                 save_best_only=True, save_weights_only=True, mode="max")  # keep the model that does best on the validation set
    # use data augmentation
    if config.data_augmentation:
        print("using data augmentation method")
        data_aug = ImageDataGenerator(
            rotation_range=90,       # rotation angle range in degrees
            width_shift_range=0.2,   # horizontal shift range
            height_shift_range=0.2,  # vertical shift range
            zoom_range=0.3,          # random zoom in or out
            horizontal_flip=True,    # random horizontal flip
        )
        data_aug.fit(train_x)
        model.fit_generator(
            data_aug.flow(train_x, train_y, batch_size=config.batch_size),
            steps_per_epoch=train_x.shape[0] // config.batch_size,
            validation_data=(dev_x, dev_y),
            shuffle=True,
            epochs=config.epochs, verbose=1, max_queue_size=100,
            callbacks=[lr_reducer, early_stopper, checkpoint]
        )
    else:
        print("don't use data augmentation method")
        model.fit(train_x, train_y, batch_size=config.batch_size,
                  epochs=config.epochs,
                  validation_data=(dev_x, dev_y),
                  shuffle=True,
                  callbacks=[lr_reducer, early_stopper, checkpoint]
                  )

if __name__ == "__main__":
    images_data, labels = load_data(config)
    train_x, dev_x, train_y, dev_y = train_test_split(images_data, labels, test_size=0.25)  # random split into training and validation sets
    train(config, train_x, train_y, dev_x, dev_y)
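Each augmentation option in ImageDataGenerator corresponds to a simple array transform. As a plain NumPy illustration of what horizontal_flip does, assuming an image shaped (height, width, channels):

```python
import numpy as np

# A tiny 2x3 single-channel "image", so the flip is easy to see
img = np.arange(6, dtype=np.float32).reshape(2, 3, 1)

# horizontal_flip mirrors the image along the width axis (axis 1)
flipped = img[:, ::-1, :]

print(img[0, :, 0])      # [0. 1. 2.]
print(flipped[0, :, 0])  # [2. 1. 0.]
```

The generator applies such transforms randomly per batch, so the network never sees exactly the same image twice.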
2.4 Create src/models/AlexNet.py, which builds the AlexNet architecture; the other architectures are not listed one by one here
from keras.models import Sequential
from keras.layers.core import Flatten, Dense
from keras.layers.convolutional import Convolution2D
from keras.layers.pooling import MaxPooling2D
from keras.layers import Dropout

def AlexNet(config):
    model = Sequential()
    input_shape = (config.normal_size, config.normal_size, config.channels)
    model.add(Convolution2D(96, (11, 11), strides=(4, 4), input_shape=input_shape, padding='valid', activation='relu',
                            kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(Convolution2D(256, (5, 5), strides=(1, 1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
    model.add(Convolution2D(384, (3, 3), strides=(1, 1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Convolution2D(384, (3, 3), strides=(1, 1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(Convolution2D(256, (3, 3), strides=(1, 1), padding='same', activation='relu', kernel_initializer='uniform'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
    model.add(Flatten())
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(config.classes, activation='softmax'))
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
    return model
Note 1: The data must be stored in the directory layout described earlier because load_data() in utils.py walks the hierarchy with os.listdir() to find the folders and files.
P.S.: Once you are comfortable with this, you naturally won't adjust the folders to fit the code; you'll adjust the code to fit the data.
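Note 1 can be made concrete: the class label comes from the last two characters of each category path, which is why the folder names must keep the zero-padded numeric form shown earlier. A minimal demonstration of the same slicing that load_data() uses:

```python
# The same label extraction used in load_data(): the last two
# characters of the category path give the class index.
category_paths = ["../data/train/00000", "../data/train/00003", "../data/train/00009"]

labels = [int(path[-2:]) for path in category_paths]
print(labels)  # [0, 3, 9]
```

Note that the [-2:] slice only works for up to 100 classes; with more classes you would parse the full folder name instead, e.g. int(os.path.basename(path)).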
3. Training log
Epoch 00019: val_acc improved from 0.96401 to 0.96429, saving model to ../log/ResNet_18_model.h5
Epoch 20/60
1/171 [..............................] - ETA: 16s - loss: 0.3963 - acc: 0.9531
2/171 [..............................] - ETA: 15s - loss: 0.4064 - acc: 0.9453
3/171 [..............................] - ETA: 15s - loss: 0.4375 - acc: 0.9323
ResNet_18 was used here; accuracy was still improving, but training was not continued. If you are interested, try the other network architectures.
4. TODO
Add more network models. The project currently only supports image classification; a detection project will follow later.
5. Full project address
6. References
[1] LeCun Y, Bottou L, Bengio Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11):2278-2324.
[2] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]// International Conference on Neural Information Processing Systems. Curran Associates Inc. 2012:1097-1105.
[3] Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. Computer Science, 2014.
[4] Zeiler M D, Fergus R. Visualizing and Understanding Convolutional Networks[J]. 2014, 8689:818-833.
[5] He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[C]// IEEE Conference on Computer Vision and Pattern Recognition. IEEE Computer Society, 2016:770-778.
[6] Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]// IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2015:1-9.
[7] Huang G, Liu Z, Laurens V D M, et al. Densely Connected Convolutional Networks[J]. 2016:2261-2269.
[8] Introducing the CNNs from LeNet to DenseNet
[9] DenseNet-Keras
[10] keras-resnet