
Getting Started with Deep Learning for Computer Vision

Author: 10_83ce | Published on 2021-06-10 01:48

    My process for getting started with computer vision (about two months in total):

    I. Theory

    1. Reviewed linear algebra and probability theory

    2. Learned to use Python's NumPy library and the PyTorch library

    3. Fei-Fei Li's CS231n course videos

    4. Andrew Ng's deep learning course videos

    II. Practice

    1. Set up a deep learning environment (a quick environment check is sketched right after this list)

    2. Reached up to 98.2% accuracy on MNIST and up to 94.7% on CIFAR
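
    Before any training, it is worth confirming the environment works. Below is a minimal sanity check (my own sketch, not from the original write-up), assuming PyTorch, torchvision, and NumPy are installed:

    import numpy as np
    import torch
    import torchvision

    # Print library versions to confirm the installation
    print("numpy:", np.__version__)
    print("torch:", torch.__version__)
    print("torchvision:", torchvision.__version__)

    # Check whether CUDA is available; training falls back to the CPU if not
    if torch.cuda.is_available():
        print("CUDA available:", torch.cuda.get_device_name(0))
    else:
        print("CUDA not available, using CPU")

    # A tiny tensor operation to make sure the basic API works
    x = torch.randn(2, 3)
    print((x @ x.t()).shape)  # expected: torch.Size([2, 2])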


    Because Fei-Fei Li's CS231n course and Andrew Ng's deep learning course are both taught in English, I found the corresponding notes on CSDN and read them alongside the videos, which made studying a bit easier.

    (The course videos can be found on Bilibili.)

    The note links are as follows:

    1. Fei-Fei Li's CS231n course: https://blog.csdn.net/qq_34611579/article/details/81072920?utm_source=app&app_version=4.8.0&code=app_1562916241&uLinkId=usr1mkqgl919blen

    2. Andrew Ng's deep learning course: https://blog.csdn.net/wuzhongqiang/article/details/89702268?utm_source=app&app_version=4.8.0&code=app_1562916241&uLinkId=usr1mkqgl919blen

    Here are also some answers I found by searching for questions that came up along the way (a small example tying several of them together follows this list):

    1. Shuffling data randomly in NumPy with np.random.shuffle

    https://blog.csdn.net/weixin_43896259/article/details/106116955

    2. Image preprocessing with torchvision Transforms and Normalize

    https://blog.csdn.net/aidanmo/article/details/104059612

    3. About the transforms.Normalize() function

    https://blog.csdn.net/jzwong/article/details/104272600

    4. numpy.floor(): rounds down to the nearest integer

    5. torch.utils.data.DataLoader() explained

    https://blog.csdn.net/qq_40520596/article/details/106981039

    6. What model.cuda() does in PyTorch

    https://www.cnblogs.com/pogeba/p/13890846.html

    7. What nn.CrossEntropyLoss means in PyTorch

    https://blog.csdn.net/lang_yubo/article/details/105108174

    8. model.train() vs. model.eval(): usage and differences

    https://zhuanlan.zhihu.com/p/357075502

    9. PyTorch optimizers, illustrated with optim.SGD

    https://www.sogou.com/link?url=hedJjaC291OV7dVab-QfvHtdr0qpeLU_JZ6a8oyfxdi0c29X6nLNTA..
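
    To tie several of these together, here is a small self-contained sketch (my own example, not taken from any of the posts above) showing np.random.shuffle, np.floor, transforms.Normalize, and DataLoader in one place:

    import numpy as np
    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from torchvision import transforms

    # np.random.shuffle shuffles an array in place; np.floor rounds down
    indices = np.arange(10)
    np.random.shuffle(indices)
    split = int(np.floor(0.2 * len(indices)))      # 20% of 10 samples -> 2
    valid_idx, train_idx = indices[:split], indices[split:]
    print("valid:", valid_idx, "train:", train_idx)

    # transforms.Normalize maps each channel x to (x - mean) / std, so a
    # [0, 1] image produced by ToTensor() lands in [-1, 1] when mean = std = 0.5
    normalize = transforms.Normalize((0.5,), (0.5,))
    img = torch.rand(1, 4, 4)                      # fake single-channel image in [0, 1]
    out = normalize(img)
    print(out.min().item() >= -1.0, out.max().item() <= 1.0)   # True True

    # DataLoader wraps a dataset and yields (shuffled) mini-batches
    dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
    loader = DataLoader(dataset, batch_size=16, shuffle=True)
    xb, yb = next(iter(loader))
    print(xb.shape, yb.shape)                      # torch.Size([16, 3]) torch.Size([16])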


    Now for the training process on the MNIST and CIFAR datasets.

    1. MNIST dataset (k-nearest-neighbor classifier)

    from torchvision import datasets, transforms

    import numpy as np

    from sklearn.metrics import accuracy_score

    import torch

    # from tqdm import tqdm

    import time

    # k-NN classification using a full pairwise distance matrix

    def knn(train_x, train_y, test_x, test_y, k):

        since = time.time()    # record the start time

        m = test_x.size(0)    # test_x is a torch.Tensor; m is the number of test samples

        n = train_x.size(0)    # n is the number of training samples

        # Compute squared Euclidean distances: an m*n matrix whose (i, j) entry is the squared distance between test image i and training image j

        print("cal dist matrix")

        xx = (test_x ** 2).sum(dim=1, keepdim=True).expand(m, n)

        # ** 2 squares every element; .sum(dim=1) sums over each row; keepdim=True keeps the result 2-D (m*1), and .expand broadcasts it to m*n

        yy = (train_x ** 2).sum(dim=1, keepdim=True).expand(n, m).transpose(0, 1)

        dist_mat = xx + yy - 2 * test_x.matmul(train_x.transpose(0, 1))

        mink_idxs = dist_mat.argsort(dim=-1)    # for each test sample, sort training indices by increasing distance

        res = []

        for idxs in mink_idxs:

            # voting: the most frequent label among the k nearest neighbors wins

            res.append(np.bincount(np.array([train_y[idx] for idx in idxs[:k]])).argmax())

        assert len(res) == len(test_y)

        print("Accuracy:", accuracy_score(test_y, res))

        time_elapsed = time.time() - since

        print('KNN mat training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))

    if __name__ == "__main__":

        train_dataset = datasets.MNIST(root="./data2", transform=transforms.ToTensor(), train=True, download=True)

        # Parameters: - root: directory that holds processed/training.pt and processed/test.pt

        # - train: True = training set, False = test set

        # - download: True = download the dataset from the internet and place it under root.

                    # If it was downloaded before, the processed data (see the helper functions in mnist.py) is kept in the processed folder.

        test_dataset = datasets.MNIST(root="./data2", transform=transforms.ToTensor(), train=False, download=True)

        # build train&test data

        train_x = []

        train_y = []

        for i in range(len(train_dataset)):  # i is an int running from 0 to len(train_dataset) - 1

            img, target = train_dataset[i]    # train_dataset[i] is an (image, label) tuple

            train_x.append(img.view(-1))

            # view(-1) flattens the multi-dimensional image tensor into a 1-D tensor,

            # so train_x becomes a list of 1-D tensors (one per image),

            # which torch.stack later combines into a single 2-D tensor

            train_y.append(target)

            if i > 50000:    # cap the number of training samples used

                break

        # print(set(train_y))

        test_x = []

        test_y = []

        for i in range(len(test_dataset)):

            img, target = test_dataset[i]

            test_x.append(img.view(-1))

            test_y.append(target)

            if i > 9000:    # cap the number of test samples used

                break

        print("classes:", set(train_y))      # print every label class; converting to a set removes duplicates

        knn(torch.stack(train_x), train_y, torch.stack(test_x), test_y, 7)  # torch.stack turns the list of 1-D tensors into one 2-D tensor

        # knn_by_iter(torch.stack(train_x), train_y, torch.stack(test_x), test_y, 10)
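
    The distance matrix inside knn() relies on the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b instead of looping over every test/train pair. As a quick check of that identity (my own sketch, not part of the original code), the result can be compared against torch.cdist:

    import torch

    a = torch.randn(5, 784)   # 5 fake "test" vectors (784 = 28*28, as for flattened MNIST images)
    b = torch.randn(8, 784)   # 8 fake "train" vectors

    # Same expansion as in knn(): ||a||^2 + ||b||^2 - 2 a.b, broadcast to a 5x8 matrix
    aa = (a ** 2).sum(dim=1, keepdim=True).expand(5, 8)
    bb = (b ** 2).sum(dim=1, keepdim=True).expand(8, 5).transpose(0, 1)
    dist_sq = aa + bb - 2 * a.matmul(b.transpose(0, 1))

    # torch.cdist returns the (non-squared) Euclidean distances directly
    reference = torch.cdist(a, b) ** 2
    print(torch.allclose(dist_sq, reference, atol=1e-3))   # expected: True
    print((dist_sq - reference).abs().max())               # small floating-point error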

    2. CIFAR-10 dataset (convolutional neural network)

    import torch

    import numpy as np

    from torchvision import datasets

    import torchvision.transforms as transforms

    from torch.utils.data.sampler import SubsetRandomSampler

    import torch.nn as nn

    import torch.nn.functional as F

    import torch.optim as optim

    # Convert the data to torch.FloatTensor and normalize it

    # ToTensor() scales pixel values from the 0-255 range to 0-1, and transforms.Normalize() then maps 0-1 to (-1, 1).

    transform = transforms.Compose([

        transforms.ToTensor(),

        transforms.Normalize((0.5,0.5,0.5),(0.5,0.5,0.5))

    ])

    # Load the training and test sets

    train_data = datasets.CIFAR10('data', train=True, download=True, transform=transform)

    test_data = datasets.CIFAR10('data', train=False, download=True, transform=transform)

    # percentage of training set to use as validation

    valid_size = 0.2

    # obtain training indices that will be used for validation (split the training data into training and validation parts)

    num_train = len(train_data)

    indices = list(range(num_train))

    np.random.shuffle(indices)

    split = int(np.floor(valid_size * num_train))

    train_idx,valid_idx = indices[split:],indices[:split]

    #define samplers for obtaining training and validation batches

    train_sampler = SubsetRandomSampler(train_idx)

    valid_sampler = SubsetRandomSampler(valid_idx)

    # Load the data

    num_workers = 0

    # load 16 images per batch

    batch_size = 16

    # prepare data loaders (combine dataset and sampler)

    train_loader = torch.utils.data.DataLoader(train_data,batch_size=batch_size,

                                              sampler=train_sampler,num_workers=num_workers)

    valid_loader = torch.utils.data.DataLoader(train_data,batch_size=batch_size,

                                              sampler=valid_sampler,num_workers=num_workers)

    test_loader = torch.utils.data.DataLoader(test_data,batch_size=batch_size,

                                              num_workers=num_workers)

    # 10 classes

    classes = ['airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck']

    # Define the convolutional neural network architecture

    class Net(nn.Module):

        def __init__(self):

            super(Net,self).__init__()

            # convolutional layer (input: 32x32x3 image)

            self.conv1 = nn.Conv2d(3,16,3,padding=1)

            # convolutional layer (input: 16x16x16 feature map, after pooling)

            self.conv2 = nn.Conv2d(16,32,3,padding=1)

            # convolutional layer (input: 8x8x32 feature map)

            self.conv3 = nn.Conv2d(32,64,3,padding=1)

            # max pooling layer (2x2, stride 2)

            self.pool = nn.MaxPool2d(2,2)

            # linear layer (64*4*4 -> 500)

            self.fc1 = nn.Linear(64*4*4,500)

            # linear layer (500 -> 10)

            self.fc2 = nn.Linear(500,10)

            #dropout(p=0.3)

            self.dropout = nn.Dropout(0.3)

        def forward(self,x):

            # add a sequence of convolutional and max pooling layers

            x = self.pool(F.relu(self.conv1(x)))

            x = self.pool(F.relu(self.conv2(x)))

            x = self.pool(F.relu(self.conv3(x)))

            #flatten image input

            x = x.view(-1,64*4*4)

            #add dropout layer

            x = self.dropout(x)

            # add 1st hidden layer,with relu activation function

            x = F.relu(self.fc1(x))

            # add dropout layer

            x = self.dropout(x)

            # add the output layer (no activation here; nn.CrossEntropyLoss applies log-softmax internally)

            x = self.fc2(x)

            return x

    #create a complete CNN

    model = Net()

    print (model)

    # check whether a GPU is available

    train_on_gpu = torch.cuda.is_available()

    #

    # if not train_on_gpu:

    #    print ('CUDA IS NOT AVAILABLE!')

    # else:

    #    print('CUDA IS AVAILABEL!')

    # if so, move the model to the GPU

    if train_on_gpu:

        model.cuda()

    # Choose the loss function and the optimizer

    # use the cross-entropy loss

    criterion = nn.CrossEntropyLoss()

    # use stochastic gradient descent with a learning rate of 0.01

    optimizer = optim.SGD(model.parameters(),lr=0.01)

    # number of training epochs

    n_epochs = 40

    valid_loss_min = np.inf  # track changes in the validation loss

    for epoch in range(1,n_epochs+1):

        # keep track of the training and validation loss

        train_loss = 0.0

        valid_loss = 0.0

        ##################

        # train the model #

        ##################

        model.train()

        for data,target in train_loader:

            #move tensors to gpu if cuda is available

            if train_on_gpu:

                data,target = data.cuda(),target.cuda()

            #clear the gradients of all optimized variables

            optimizer.zero_grad()

            #forward pass:compute predicted outputs by passing inputs to the model

            output = model(data)

            # calculate the batch loss

            loss = criterion(output,target)

            #backward pass:compute gradient of the loss with respect to model parameters

            loss.backward()

            # perform a single optimization step (parameter update)

            optimizer.step()

            # update the training loss

            train_loss += loss.item()*data.size(0)

        ##################

        # validate the model #

        ##################

        model.eval()

        for data,target in valid_loader:

            if train_on_gpu:

                data,target = data.cuda(),target.cuda()

            output = model(data)

            loss = criterion(output,target)

            valid_loss += loss.item()*data.size(0)

        # compute the average losses

        train_loss = train_loss/len(train_loader.sampler)

        valid_loss = valid_loss/len(valid_loader.sampler)

        # print the training and validation loss

        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(

            epoch,train_loss,valid_loss

        ))

        # save the model if the validation loss has decreased

        if valid_loss <= valid_loss_min:

            print('Validation loss decreased ({:.6f} --> {:.6f}). Saving model ...'.format(

                valid_loss_min,valid_loss

            ))

            torch.save(model.state_dict(),'model_cifar.pt')

            valid_loss_min = valid_loss

    model.load_state_dict(torch.load('model_cifar.pt',map_location=torch.device('cpu')))

    # track test loss

    test_loss = 0.0

    class_correct = list(0. for i in range(10))

    class_total = list(0. for i in range(10))

    model.eval()

    # iterate over test data

    for data, target in test_loader:

        # move tensors to GPU if CUDA is available

        if train_on_gpu:

            data, target = data.cuda(), target.cuda()

        # forward pass: compute predicted outputs by passing inputs to the model

        output = model(data)

        # calculate the batch loss

        loss = criterion(output, target)

        # update test loss

        test_loss += loss.item()*data.size(0)

        # convert output probabilities to predicted class

        _, pred = torch.max(output, 1)

        # compare predictions to true label

        correct_tensor = pred.eq(target.data.view_as(pred))

        correct = np.squeeze(correct_tensor.numpy()) if not train_on_gpu else np.squeeze(correct_tensor.cpu().numpy())

        # calculate test accuracy for each object class

        for i in range(target.size(0)):   # iterate over the actual batch size (the last batch can be smaller than batch_size)

            label = target.data[i]

            class_correct[label] += correct[i].item()

            class_total[label] += 1

    # average test loss

    test_loss = test_loss/len(test_loader.dataset)

    print('Test Loss: {:.6f}\n'.format(test_loss))

    for i in range(10):

        if class_total[i] > 0:

            print('Test Accuracy of %5s: %2d%% (%2d/%2d)' % (

                classes[i], 100 * class_correct[i] / class_total[i],

                np.sum(class_correct[i]), np.sum(class_total[i])))

        else:

            print('Test Accuracy of %5s: N/A (no training examples)' % (classes[i]))

    print('\nTest Accuracy (Overall): %2d%% (%2d/%2d)' % (

        100. * np.sum(class_correct) / np.sum(class_total),

        np.sum(class_correct), np.sum(class_total)))
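
    Once training finishes, the saved checkpoint can be reused for inference. The sketch below is my own example (assuming the model_cifar.pt file produced above and the Net, test_data, and classes definitions from the script); it classifies a single test image:

    # Reload the best checkpoint and classify one test image
    model = Net()
    model.load_state_dict(torch.load('model_cifar.pt', map_location=torch.device('cpu')))
    model.eval()                          # disable dropout for inference

    img, label = test_data[0]             # one (3, 32, 32) tensor and its integer label
    with torch.no_grad():                 # no gradients are needed for inference
        logits = model(img.unsqueeze(0))  # add a batch dimension -> (1, 3, 32, 32)
        pred = logits.argmax(dim=1).item()

    print('predicted:', classes[pred], '| ground truth:', classes[label])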
