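The test set contains images similar to those in the training set. Typically, we hold out 10-20% of the original dataset for testing and validation and use the rest for training.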
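As a minimal sketch of such a hold-out split, the snippet below uses `torch.utils.data.random_split`. The random tensors and the 20% fraction are illustrative assumptions, not the data or split used in this post (Fashion-MNIST already ships with separate training and test sets).

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical stand-in dataset: 1000 fake 28x28 grayscale images, 10 classes.
images = torch.randn(1000, 1, 28, 28)
labels = torch.randint(0, 10, (1000,))
dataset = TensorDataset(images, labels)

# Hold out 20% for validation (illustrative choice within the 10-20% range).
val_percent = 20
n_val = len(dataset) * val_percent // 100
n_train = len(dataset) - n_val

# A seeded generator makes the split reproducible.
generator = torch.Generator().manual_seed(0)
train_set, val_set = random_split(dataset, [n_train, n_val], generator=generator)

print(len(train_set), len(val_set))  # 800 200
```

Each resulting subset can be wrapped in its own `DataLoader`, in the same way as `trainset` and `testset` later in this post.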

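The purpose of validation is to measure how well the model performs on data that is not part of the training set. What counts as "performing well" is up to the developer. It is usually expressed as accuracy, the percentage of classes the network predicted correctly. Other options include precision and recall, and top-5 error rate. We will focus on accuracy. First, we do a forward pass with one batch from the test set.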
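A small sketch of how accuracy can be computed from one batch of outputs. The log-probabilities and labels here are made-up values standing in for `model(images)` and the real labels; the steps mirror the ones used in the validation loop later in this post.

```python
import torch

# Hypothetical log-probabilities for a batch of 4 images over 3 classes,
# standing in for the output of model(images).
log_ps = torch.log(torch.tensor([[0.7, 0.2, 0.1],
                                 [0.1, 0.8, 0.1],
                                 [0.3, 0.3, 0.4],
                                 [0.6, 0.3, 0.1]]))
labels = torch.tensor([0, 1, 2, 1])

# Convert log-probabilities to probabilities and take the most likely class.
ps = torch.exp(log_ps)
top_p, top_class = ps.topk(1, dim=1)  # both have shape (4, 1)

# Reshape labels to (4, 1) so the comparison is element-wise, not broadcast.
equals = top_class == labels.view(*top_class.shape)

# Accuracy is the mean of the boolean matches, cast to float first.
accuracy = equals.float().mean().item()
print(f"Accuracy: {accuracy:.2f}")  # Accuracy: 0.75
```

The `view` call matters: comparing a `(4, 1)` tensor with a `(4,)` tensor would broadcast to `(4, 4)` and silently give the wrong accuracy.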

1. Overfitting



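The network gets better and better at learning the patterns in the training set, so the training loss keeps falling. However, it starts to struggle when generalizing to data outside the training set, so the validation loss rises. The ultimate goal of any deep learning model is to make predictions on new data, so we want the validation loss to be as low as possible. One option is to use the version of the model with the lowest validation loss, which in this example is the model after roughly 8-10 training epochs. This strategy is called early stopping. In practice, you save the model frequently during training so that you can later pick the checkpoint with the lowest validation loss.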
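The selection logic behind early stopping can be sketched in plain Python. The helper names `best_epoch` and `should_stop`, the `patience` parameter, and the loss values are all hypothetical and chosen for illustration; they are not part of PyTorch.

```python
def best_epoch(val_losses):
    """Return the index of the epoch with the lowest validation loss."""
    return min(range(len(val_losses)), key=lambda i: val_losses[i])


def should_stop(val_losses, patience=3):
    """Return True once the last `patience` epochs all failed to beat
    the best validation loss seen before them."""
    if len(val_losses) <= patience:
        return False
    best_before = min(val_losses[:-patience])
    return all(loss >= best_before for loss in val_losses[-patience:])


# Made-up validation losses: they fall at first, then rise as overfitting sets in.
val_losses = [0.52, 0.45, 0.41, 0.39, 0.40, 0.42, 0.44]

print(best_epoch(val_losses))   # 3
print(should_stop(val_losses))  # True
```

In a real training loop, one would typically save a checkpoint (for example with `torch.save(model.state_dict(), path)` and a path of your choice) whenever a new lowest validation loss is observed, then reload that checkpoint after stopping.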

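The most common way to reduce overfitting (other than early stopping) is dropout, which randomly drops input units. This forces the network to share information between weights, so it generalizes better to new data. Adding dropout in PyTorch is straightforward: use the nn.Dropout module.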

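During training we want dropout to prevent overfitting, but during inference we want to use the whole network. So when validating, testing, or using the network to make predictions, we need to turn dropout off. You can do this with model.eval(), which puts the model in evaluation mode so that dropout layers no longer drop anything (effectively a drop probability of 0). You can turn dropout back on with model.train(), which puts the model in training mode. In general, the validation loop follows this pattern: turn off gradients, set the model to evaluation mode, compute the validation loss and metrics, then set the model back to training mode.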

# turn off gradients
with torch.no_grad():
    # set model to evaluation mode
    model.eval()
    # validation pass here
    for images, labels in testloader:
        ...

# set model back to train mode
model.train()
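To make the effect of the two modes concrete, here is a small sketch on a standalone `nn.Dropout` layer. The drop probability of 0.5 and the all-ones input are illustrative choices, picked so that the scaling is easy to see.

```python
import torch
from torch import nn

dropout = nn.Dropout(p=0.5)
x = torch.ones(1000)

# Training mode: each element is zeroed with probability p, and the
# survivors are scaled by 1 / (1 - p) = 2 to keep the expected value unchanged.
dropout.train()
train_out = dropout(x)

# Evaluation mode: dropout is a no-op and the input passes through unchanged.
dropout.eval()
eval_out = dropout(x)

print(set(train_out.tolist()) <= {0.0, 2.0})  # True
print(torch.equal(eval_out, x))               # True
```

Calling `model.eval()` or `model.train()` on a parent module switches the mode of every submodule, so the dropout layers inside a larger network follow along.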

2. Inference

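Once the model is trained, we can use it for inference. We have done this before, but now we need to set the model to inference mode with model.eval(). You should also turn off autograd by running the forward pass inside a torch.no_grad() context.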
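A brief sketch of what `torch.no_grad()` changes during a forward pass. The single `nn.Linear` layer is a hypothetical stand-in for a trained model, used only to keep the example small.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)  # hypothetical tiny stand-in for a trained model
x = torch.randn(1, 4)

model.eval()

# Inside no_grad, autograd does not record operations, which saves memory and time.
with torch.no_grad():
    out_no_grad = model(x)

# Outside the context, the output is still tracked for backpropagation.
out_with_grad = model(x)

print(out_no_grad.requires_grad)    # False
print(out_with_grad.requires_grad)  # True
```

The numerical result of the forward pass is the same in both cases; only the gradient bookkeeping differs.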

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

import matplotlib.pyplot as plt

import torch
from torchvision import datasets, transforms

# Define a transform to normalize the data
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,))])
# Download and load the training data
trainset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# Download and load the test data
testset = datasets.FashionMNIST('~/.pytorch/F_MNIST_data/', download=True, train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=True)

from torch import nn, optim
import torch.nn.functional as F
##  Define your model with dropout added
class Classifier(nn.Module):
    def __init__(self):
        # nn.Module must be initialized before submodules are assigned
        super().__init__()
        self.fc1 = nn.Linear(784, 256)
        self.fc2 = nn.Linear(256, 128)
        self.fc3 = nn.Linear(128, 64)
        self.fc4 = nn.Linear(64, 10)
        #Dropout module with 0.2 drop probability
        self.dropout = nn.Dropout(p = 0.2)
    def forward(self, x):
        x = x.view(x.shape[0], -1)
        #with dropout
        x = self.dropout(F.relu(self.fc1(x)))
        x = self.dropout(F.relu(self.fc2(x)))
        x = self.dropout(F.relu(self.fc3(x)))
        x = F.log_softmax(self.fc4(x), dim=1)
        return x
## Train your model with dropout, and monitor the training progress with the validation loss and accuracy
model = Classifier()
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=0.003)

epochs = 30
steps = 0

train_losses, test_losses = [], []
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # clear gradients left over from the previous step
        optimizer.zero_grad()
        log_ps = model(images)
        loss = criterion(log_ps, labels)
        # backpropagate and update the weights
        loss.backward()
        optimizer.step()
        running_loss += loss.item()

    # validation pass, run once per epoch after the training batches
    test_loss = 0
    accuracy = 0
    with torch.no_grad():
        # evaluation mode turns dropout off
        model.eval()
        for images, labels in testloader:
            log_ps = model(images)
            test_loss += criterion(log_ps, labels).item()
            ps = torch.exp(log_ps)
            top_p, top_class = ps.topk(1, dim=1)
            equals = top_class == labels.view(*top_class.shape)
            accuracy += torch.mean(equals.type(torch.FloatTensor)).item()
    # back to training mode so dropout is active in the next epoch
    model.train()

    # record per-epoch averages so they can be plotted below
    train_losses.append(running_loss/len(trainloader))
    test_losses.append(test_loss/len(testloader))
    print("Epoch: {}/{}..".format(e+1, epochs),
          "Training Loss: {:.3f}..".format(running_loss/len(trainloader)),
          "Test Loss: {:.3f}..".format(test_loss/len(testloader)),
          "Test Accuracy: {:.3f}".format(accuracy/len(testloader)))

plt.plot(train_losses, label='Training loss')
plt.plot(test_losses, label='Validation loss')
plt.legend(frameon=False)
# Import helper module (should be in the repo)
import helper

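# Test out your network!

# evaluation mode turns dropout off for inference
model.eval()

dataiter = iter(testloader)
# use the built-in next(); DataLoader iterators no longer provide a .next() method
images, labels = next(dataiter)
img = images[0]
# Convert 2D image to 1D vector
img = img.view(1, 784)

# Calculate the class probabilities (softmax) for img
with torch.no_grad():
    output = model(img)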

ps = torch.exp(output)

# Plot the image and probabilities
helper.view_classify(img.view(1, 28, 28), ps, version='Fashion')

