Chapter 3: Building a Deep Neural Network with PyTorch
From the previous chapters, we know that a complete neural network model involves many hyperparameters, such as the batch size, the learning rate, and the choice of optimizer.
- Hyperparameters: values set by hand when training the model; they are not learned or updated by the model itself.
- Parameters: the values inside the model (weights, biases, etc.) that are learned and updated during training.
Hyperparameter settings therefore directly affect how good a model turns out to be. This chapter trains models under different hyperparameter settings to show how important each one is: changing different aspects of each hyperparameter is likely to affect the accuracy or speed of training a neural network. It also introduces a few additional techniques, such as scaling, batch normalization, and regularization, that help improve the performance of a neural network.
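As a minimal sketch of this distinction (the layer size here is arbitrary), hyperparameters are plain values we choose by hand, while parameters are tensors inside the model that the optimizer updates:
import torch.nn as nn

# Hyperparameters: chosen by hand before training
batch_size = 32
learning_rate = 1e-2

# Parameters: live inside the model and are updated during training
layer = nn.Linear(28*28, 1000)
print(sum(p.numel() for p in layer.parameters()))  # 28*28*1000 weights + 1000 biases = 785000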
Understanding Images
A digital image file is typically stored in JPEG or PNG format, but either way it is just an array of pixels. A pixel is the smallest building block of an image.
- In a grayscale image, each pixel takes a value from 0 to 255: 0 represents black, 255 represents white, and the values in between are shades of gray.
(Figure: a grayscale image and its pixel values)
- In a color image, each pixel is a 3-dimensional vector holding the values of the three RGB channels.
(Figure: a color image and its RGB channel values)
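A quick NumPy sketch of these representations (the shapes are illustrative, matching a 28x28 image):
import numpy as np

# Grayscale: a 2-D array with one intensity value (0-255) per pixel
gray = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)
# Color: a 3-D array with an extra axis holding the R, G, B channel values
color = np.random.randint(0, 256, size=(28, 28, 3), dtype=np.uint8)
print(gray.shape, color.shape)  # (28, 28) (28, 28, 3)
print(color[0, 0])              # one pixel: three channel values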
Why do neural networks work so well for image analysis?
- Traditional approaches depend on hand-crafted features extracted from the input images, such as histogram features, edge and corner features, color-separation features, and image-gradient features (a histogram feature is sketched below). The main drawback of creating these features is that you need to be an expert in image and signal analysis and must fully understand which features are best suited to solving a given problem.
- Neural networks, by contrast, learn feature extraction and classification automatically, which gives them a significant advantage in image analysis.
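For instance, a histogram feature can be hand-crafted in a few lines (a sketch, assuming the gray array from the snippet above); the point is that someone has to decide up front that this is the right feature for the task:
# Hand-crafted histogram feature: pixel-intensity counts in 16 bins
hist, _ = np.histogram(gray, bins=16, range=(0, 256))
hist = hist / hist.sum()  # normalize so images of different sizes are comparable
print(hist.shape)         # (16,) - a fixed-length feature vector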
Training a Neural Network
Building a neural network model generally involves the following steps:
- Import the required packages
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
from torch.optim import SGD, Adam
from torchvision import datasets
device = 'cuda' if torch.cuda.is_available() else 'cpu'
- Build the dataset and define how to fetch it (this chapter uses FashionMNIST (FMNIST) as its example data)
data_folder = './data/FMNIST'
fmnist = datasets.FashionMNIST(data_folder, download=True, train=True)
train_data = fmnist.data
train_labels = fmnist.targets
# Inspect the data
unique_values = train_labels.unique()
print(f'tr_images & tr_targets:\n\tX -{train_data.shape}\n\tY \
-{train_labels.shape}\n\tY-Unique Values : {unique_values}')
print(f'TASK:\n\t{len(unique_values)} class Classification')
print(f'UNIQUE CLASSES:\n\t{fmnist.classes}')
"""
tr_images & tr_targets:
X -torch.Size([60000, 28, 28])
Y -torch.Size([60000])
Y-Unique Values : tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
TASK:
10 class Classification
UNIQUE CLASSES:
['T-shirt/top', 'Trouser', 'Pullover', 'Dress', 'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot']
"""
Visualize sample images from each class
# One row per class, ten columns
rows, columns = len(train_labels.unique()), 10
# A 10x10 grid of subplots
fig, ax = plt.subplots(rows, columns, figsize=(10, 10))
# For each class, show ten randomly chosen images in one row
for label_class, plot_row in enumerate(ax):
    label_x_rows = np.where(train_labels == label_class)[0]
    for plot_cell in plot_row:
        plot_cell.grid(False)
        plot_cell.axis('off')
        ix = np.random.choice(label_x_rows)
        x, y = train_data[ix], train_labels[ix]
        plot_cell.imshow(x, cmap='gray')
plt.tight_layout()
(Figure: a 10x10 grid of FashionMNIST samples, one class per row)
- Wrap the data in a DataLoader
# Build the Dataset
class FMNISTDataset(Dataset):
    def __init__(self, x, y):
        # Scale pixel values from 0-255 to 0-1
        x = x.float() / 255
        # Flatten each 28x28 image into a 784-dimensional vector
        x = x.view(-1, 28*28)
        self.x, self.y = x, y
    def __getitem__(self, ix):
        x, y = self.x[ix], self.y[ix]
        return x.to(device), y.to(device)
    def __len__(self):
        return len(self.x)

def get_data():
    train = FMNISTDataset(train_data, train_labels)
    trn_dl = DataLoader(train, batch_size=32, shuffle=True)
    return trn_dl
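To sanity-check the loader, pull one batch and inspect its shapes; each batch should hold 32 flattened 28x28 images and their 32 labels:
trn_dl = get_data()
x, y = next(iter(trn_dl))
print(x.shape, y.shape)  # torch.Size([32, 784]) torch.Size([32])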
- Define the network architecture, the loss function, and the optimizer
def get_model():
    model = nn.Sequential(
        nn.Linear(28*28, 1000),
        nn.ReLU(),
        nn.Linear(1000, 10)
    ).to(device)
    loss_fn = nn.CrossEntropyLoss()
    optimizer = SGD(model.parameters(), lr=1e-2)
    return model, loss_fn, optimizer
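The two Linear layers hold all of the learnable parameters discussed at the start of the chapter; for illustration, we can count them directly:
model, loss_fn, optimizer = get_model()
print(sum(p.numel() for p in model.parameters()))
# (28*28*1000 + 1000) + (1000*10 + 10) = 795010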
- Define the function that trains the model on one batch of data
def train_batch(x, y, model, optimizer, loss_fn):
    model.train()
    # Forward pass
    prediction = model(x)
    # Compute the loss for this batch
    batch_loss = loss_fn(prediction, y)
    # Backpropagate, update the weights, then reset the gradients
    batch_loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return batch_loss.item()
- Define the model-evaluation (accuracy) function
@torch.no_grad()
def accuracy(x, y, model):
    # @torch.no_grad() disables gradient computation inside this
    # function; it is equivalent to wrapping the body in a
    # `with torch.no_grad():` block
    model.eval()
    prediction = model(x)
    # The class with the highest score is the predicted label
    max_values, argmaxs = prediction.max(-1)
    is_correct = argmaxs == y
    return is_correct.cpu().numpy().tolist()
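As a quick smoke test before the full training loop, we can run both functions on a single batch (reusing the trn_dl and model created in the snippets above):
x, y = next(iter(trn_dl))
print(train_batch(x, y, model, optimizer, loss_fn))  # loss on one batch
print(np.mean(accuracy(x, y, model)))                # accuracy on one batch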
- Train the model, updating the weights batch by batch in each epoch
trn_dl = get_data()
model, loss_fn, optimizer = get_model()
losses, accuracies = [], []
for epoch in range(5):
    print(epoch)
    epoch_losses, epoch_accuracies = [], []
    # Train on every batch once
    for ix, batch in enumerate(iter(trn_dl)):
        x, y = batch
        batch_loss = train_batch(x, y, model, optimizer, loss_fn)
        epoch_losses.append(batch_loss)
    epoch_loss = np.array(epoch_losses).mean()
    # Measure accuracy over the whole training set
    for ix, batch in enumerate(iter(trn_dl)):
        x, y = batch
        is_correct = accuracy(x, y, model)
        epoch_accuracies.extend(is_correct)
    epoch_accuracy = np.mean(epoch_accuracies)
    losses.append(epoch_loss)
    accuracies.append(epoch_accuracy)
- Visualize the loss and accuracy after each epoch
epochs = np.arange(5)+1
plt.figure(figsize=(20,5))
plt.subplot(121)
plt.title('Loss value over increasing epochs')
plt.plot(epochs, losses, label='Training Loss')
plt.legend()
plt.subplot(122)
plt.title('Accuracy value over increasing epochs')
plt.plot(epochs, accuracies, label='Training Accuracy')
plt.gca().set_yticklabels(['{:.0f}%'.format(x*100) for x in plt.gca().get_yticks()])
plt.legend()
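Note that calling set_yticklabels on auto-generated ticks can raise a warning in recent matplotlib versions; matplotlib.ticker.PercentFormatter is a more robust way to get the same percentage labels:
from matplotlib.ticker import PercentFormatter
plt.gca().yaxis.set_major_formatter(PercentFormatter(xmax=1))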
(Figure: training loss and accuracy curves over five epochs)