
Overfitting commonly occurs when:
- the model is trained for too many epochs
- the model is too large
- there is too little training data
Let's start with an example:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, Dense, Flatten, Dropout

# Load the data
fashion_mnist = keras.datasets.fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train = x_train[..., tf.newaxis]  # add a channels dimension
x_test = x_test[..., tf.newaxis]    # add a channels dimension

train_ds = tf.data.Dataset.from_tensor_slices(
    (x_train, y_train)
).shuffle(10000).batch(32)
test_ds = tf.data.Dataset.from_tensor_slices(
    (x_test, y_test)
).batch(32)
# Build the model
class MyModel(keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')
        self.d2 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        return self.d2(x)

model = MyModel()
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
model.fit(
    train_ds,
    epochs=10
)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_ds, verbose=2)
print('\nTest accuracy:', test_acc)
Output:
Epoch 10/10
1875/1875 [==============================] - 5s 3ms/step - loss: 0.0424 - accuracy: 0.9851
Test accuracy: 0.8999999761581421
The model reaches 98.5% accuracy on the training set but only 90.0% on the test set, a clear sign of overfitting.
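You can also watch the gap develop during training by giving `model.fit` held-out data and comparing `accuracy` with `val_accuracy` per epoch. A minimal sketch using random stand-in arrays (same shapes as Fashion-MNIST, but fake data for brevity):

```python
import numpy as np
import tensorflow as tf

# Random stand-in data with Fashion-MNIST shapes (not the real dataset)
x = np.random.rand(200, 28, 28, 1).astype("float32")
y = np.random.randint(0, 10, size=200)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# validation_split holds out 20% of the data for per-epoch evaluation
history = model.fit(x, y, validation_split=0.2, epochs=2, verbose=0)

# A growing gap between accuracy and val_accuracy signals overfitting
print(history.history["accuracy"], history.history["val_accuracy"])
```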
Two common remedies:
- add a Dropout layer
- add weight regularization
Adding a Dropout layer, the model becomes:
# Build the model
class MyModel(keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu')
        self.dropout = Dropout(0.5)
        self.d2 = Dense(10, activation='softmax')

    def call(self, x, training=False):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        # Dropout is only active during training
        x = self.dropout(x, training=training)
        return self.d2(x)
Output:
Epoch 10/10
1875/1875 [==============================] - 6s 3ms/step - loss: 0.1577 - accuracy: 0.9392
Test accuracy: 0.9052000045776367
Now the training accuracy drops to 93.9% while the test accuracy rises to 90.5%: adding Dropout has clearly reduced the overfitting.
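Note that Dropout behaves differently in training and inference: during training it zeroes a random fraction of activations and scales the survivors by 1/(1-rate), while at inference it is the identity. A quick standalone demonstration:

```python
import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dropout(0.5)
x = np.ones((1, 10), dtype="float32")

# Inference mode: Dropout passes inputs through unchanged
out_infer = layer(x, training=False).numpy()

# Training mode: each unit is either zeroed or scaled by 1/(1-0.5) = 2
out_train = layer(x, training=True).numpy()
```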
Adding weight regularization as well, the model becomes:
# Build the model
class MyModel(keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.conv1 = Conv2D(32, 3, activation='relu')
        self.flatten = Flatten()
        self.d1 = Dense(128, activation='relu',
                        kernel_regularizer=keras.regularizers.L2(0.001))
        self.dropout = Dropout(0.5)
        self.d2 = Dense(10, activation='softmax')

    def call(self, x, training=False):
        x = self.conv1(x)
        x = self.flatten(x)
        x = self.d1(x)
        x = self.dropout(x, training=training)
        return self.d2(x)
Output:
Epoch 10/10
1875/1875 [==============================] - 6s 3ms/step - loss: 0.5609 - accuracy: 0.8631
Test accuracy: 0.8743000030517578
With Dropout + L2 the training accuracy is 86.3% and the test accuracy is 87.4%: the overfitting is gone, but the model is now underfitting.
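The L2 regularizer works by adding a penalty of 0.001 * sum(w²) over the layer's kernel to the training loss, pushing the weights toward small values. The penalty can be inspected directly through the layer's `losses` attribute; a small sketch on a standalone Dense layer:

```python
import tensorflow as tf

dense = tf.keras.layers.Dense(
    4, kernel_regularizer=tf.keras.regularizers.L2(0.001))
dense(tf.ones((1, 3)))  # call once so the layer builds its kernel

# The regularization penalty the layer contributes to the training loss
penalty = tf.add_n(dense.losses)

# It equals 0.001 * sum of squared kernel weights
manual = 0.001 * tf.reduce_sum(tf.square(dense.kernel))
```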
Increasing the epochs to 20 gives:
Epoch 20/20
469/469 [==============================] - 2s 5ms/step - loss: 0.4477 - accuracy: 0.8802
Test accuracy: 0.8870999813079834
The model now reaches 88.02% accuracy on the training set and 88.7% on the test set.
Conclusions:
- A model with more parameters has more "memorization capacity": it can simply memorize the mapping between training samples and their labels, and such rote memorization has no generalization power, so it is useless on test data.
- Deep learning models are good at fitting the training data; the real challenge is generalization, not fitting.
- When overfitting occurs, the remedies are: 1. get more training data; 2. reduce network capacity; 3. add weight regularization and/or Dropout; 4. data augmentation; 5. Batch Normalization.
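Of these, data augmentation (remedy 4) can be sketched with Keras preprocessing layers; the transforms and parameters below are illustrative choices, not from the experiments above:

```python
import numpy as np
import tensorflow as tf

# Illustrative augmentation pipeline; transforms and parameters are examples
augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])

x = np.random.rand(4, 28, 28, 1).astype("float32")
out = augment(x, training=True)  # random transforms apply only in training mode
```

These layers can be placed at the front of the model itself, in which case they run during `fit` and become the identity at inference time, just like Dropout.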