cs231n assignment2(3)


Author: 没天赋的学琴 | Published 2020-05-29 22:44

   The third part of assignment2 is about getting familiar with a deep learning framework, either PyTorch or TensorFlow; here PyTorch is chosen. This part introduces PyTorch through three levels of abstraction: Barebones, Module API, and Sequential API.


    Barebones

   At this level, you build the network from the low-level functions PyTorch provides: you have to define the network architecture yourself and also write the forward pass and the training loop, while the gradients of the parameters are computed automatically by PyTorch.
    The following code defines a two-layer fully-connected network:

    import torch
    import torch.nn.functional as F  # useful stateless functions

    dtype = torch.float32

    def flatten(x):
        N = x.shape[0]  # read in N, C, H, W
        return x.view(N, -1)  # flatten the C*H*W values into a single vector per image
    
    def two_layer_fc(x, params):
        """
        A fully-connected neural network; the architecture is:
        fully-connected layer -> ReLU -> fully-connected layer.
        Note that this function only defines the forward pass; 
        PyTorch will take care of the backward pass for us.
        
        The input to the network will be a minibatch of data, of shape
        (N, d1, ..., dM) where d1 * ... * dM = D. The hidden layer will have H units,
        and the output layer will produce scores for C classes.
        
        Inputs:
        - x: A PyTorch Tensor of shape (N, d1, ..., dM) giving a minibatch of
          input data.
        - params: A list [w1, w2] of PyTorch Tensors giving weights for the network;
          w1 has shape (D, H) and w2 has shape (H, C).
        
        Returns:
        - scores: A PyTorch Tensor of shape (N, C) giving classification scores for
          the input data x.
        """
        # first we flatten the image
        x = flatten(x)  # shape: [batch_size, C x H x W]
        
        w1, w2 = params
        
        # Forward pass: compute predicted y using operations on Tensors. Since w1 and
        # w2 have requires_grad=True, operations involving these Tensors will cause
        # PyTorch to build a computational graph, allowing automatic computation of
        # gradients. Since we are no longer implementing the backward pass by hand we
        # don't need to keep references to intermediate values.
        # you can also use `.clamp(min=0)`, equivalent to F.relu()
        x = F.relu(x.mm(w1))
        x = x.mm(w2)
        return x
        
    
    def two_layer_fc_test():
        hidden_layer_size = 42
        x = torch.zeros((64, 50), dtype=dtype)  # minibatch size 64, feature dimension 50
        w1 = torch.zeros((50, hidden_layer_size), dtype=dtype)
        w2 = torch.zeros((hidden_layer_size, 10), dtype=dtype)
        scores = two_layer_fc(x, [w1, w2])
        print(scores.size())  # you should see [64, 10]
    
    two_layer_fc_test()
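
    In the test above the weights are all zeros, so the scores come out all zero. For PyTorch to compute gradients automatically, the weight tensors must also be created with requires_grad=True. Below is a minimal sketch continuing from the code above (the 1e-2 scale and the sizes are illustrative assumptions, not the assignment's exact settings):

    w1 = 1e-2 * torch.randn((3 * 32 * 32, 4000), dtype=dtype)
    w1.requires_grad_()  # track operations on w1 so its gradient can be computed
    w2 = 1e-2 * torch.randn((4000, 10), dtype=dtype)
    w2.requires_grad_()

    x = torch.randn((64, 3, 32, 32), dtype=dtype)  # a fake minibatch of CIFAR-10-sized images
    y = torch.randint(10, (64,))                   # fake integer class labels
    loss = F.cross_entropy(two_layer_fc(x, [w1, w2]), y)
    loss.backward()        # gradients are computed automatically...
    print(w1.grad.shape)   # ...and stored in .grad: torch.Size([3072, 4000])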
    

    The following trains the model with SGD:

    def train_part2(model_fn, params, learning_rate):
        """
        Train a model on CIFAR-10.
        
        Inputs:
        - model_fn: A Python function that performs the forward pass of the model.
          It should have the signature scores = model_fn(x, params) where x is a
          PyTorch Tensor of image data, params is a list of PyTorch Tensors giving
          model weights, and scores is a PyTorch Tensor of shape (N, C) giving
          scores for the elements in x.
        - params: List of PyTorch Tensors giving weights for the model
        - learning_rate: Python scalar giving the learning rate to use for SGD
        
        Returns: Nothing
        """
        for t, (x, y) in enumerate(loader_train):
            # Move the data to the proper device (GPU or CPU)
            x = x.to(device=device, dtype=dtype)
            y = y.to(device=device, dtype=torch.long)
    
            # Forward pass: compute scores and loss
            scores = model_fn(x, params)
            loss = F.cross_entropy(scores, y)
    
            # Backward pass: PyTorch figures out which Tensors in the computational
            # graph have requires_grad=True and uses backpropagation to compute the
            # gradient of the loss with respect to these Tensors, and stores the
            # gradients in the .grad attribute of each Tensor.
            loss.backward()
    
            # Update parameters. We don't want to backpropagate through the
            # parameter updates, so we scope the updates under a torch.no_grad()
            # context manager to prevent a computational graph from being built.
            with torch.no_grad():
                for w in params:
                    w -= learning_rate * w.grad
    
                    # Manually zero the gradients after running the backward pass
                    w.grad.zero_()
    
            if t % print_every == 0:
                print('Iteration %d, loss = %.4f' % (t, loss.item()))
                check_accuracy_part2(loader_val, model_fn, params)
                print()
    

   In fact, the training part can also be done with functions from torch.optim, so there is no need to write the parameter-update code by hand.
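
    For example, here is a minimal sketch of the same training loop driven by an optimizer over the raw weight tensors (train_part2_optim is a made-up name; loader_train, device, dtype, and the weights w1, w2 are assumed to be set up as above):

    import torch.optim as optim

    def train_part2_optim(model_fn, params, learning_rate):
        # optim.SGD accepts any iterable of Tensors that have requires_grad=True
        optimizer = optim.SGD(params, lr=learning_rate)
        for x, y in loader_train:
            x = x.to(device=device, dtype=dtype)
            y = y.to(device=device, dtype=torch.long)
            loss = F.cross_entropy(model_fn(x, params), y)
            optimizer.zero_grad()  # clear the gradients from the previous iteration
            loss.backward()        # compute fresh gradients
            optimizer.step()       # performs the update w -= lr * w.grad for us

    train_part2_optim(two_layer_fc, [w1, w2], learning_rate=1e-2)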


    Module API

   The Barebones level is much like working with numpy: you build the network out of basic tensor operations, and apart from not having to derive the gradients yourself, the workflow is essentially the same as with numpy. At the Module API level, you can instead build the network directly from the basic layers provided by nn.Module.
    The following builds the two-layer fully-connected network:

    class TwoLayerFC(nn.Module):
        def __init__(self, input_size, hidden_size, num_classes):
            super().__init__()
            # assign layer objects to class attributes
            self.fc1 = nn.Linear(input_size, hidden_size)
            # nn.init package contains convenient initialization methods
            # http://pytorch.org/docs/master/nn.html#torch-nn-init 
            nn.init.kaiming_normal_(self.fc1.weight)
            self.fc2 = nn.Linear(hidden_size, num_classes)
            nn.init.kaiming_normal_(self.fc2.weight)
        
        def forward(self, x):
            # forward always defines connectivity
            x = flatten(x)
            scores = self.fc2(F.relu(self.fc1(x)))
            return scores
    
    def test_TwoLayerFC():
        input_size = 50
        x = torch.zeros((64, input_size), dtype=dtype)  # minibatch size 64, feature dimension 50
        model = TwoLayerFC(input_size, 42, 10)
        scores = model(x)
        print(scores.size())  # you should see [64, 10]
    test_TwoLayerFC()
    

    In the definition above, you create your own network class that subclasses nn.Module. In the __init__ method you define the network structure out of the basic layers the framework provides, and in the forward method you describe the forward pass by feeding the input x through those layers in order, with no need to write out the matrix multiplications and other low-level operations as in the previous part. To actually build the model, you just pass in the network's dimensions.
    Training is also slightly different from the previous part; the network is trained via torch.optim, as follows:

    def train_part34(model, optimizer, epochs=1):
        """
        Train a model on CIFAR-10 using the PyTorch Module API.
        
        Inputs:
        - model: A PyTorch Module giving the model to train.
        - optimizer: An Optimizer object we will use to train the model
        - epochs: (Optional) A Python integer giving the number of epochs to train for
        
        Returns: Nothing, but prints model accuracies during training.
        """
        model = model.to(device=device)  # move the model parameters to CPU/GPU
        for e in range(epochs):
            for t, (x, y) in enumerate(loader_train):
                model.train()  # put model to training mode
                x = x.to(device=device, dtype=dtype)  # move to device, e.g. GPU
                y = y.to(device=device, dtype=torch.long)
    
                scores = model(x)
                loss = F.cross_entropy(scores, y)
    
                # Zero out all of the gradients for the variables which the optimizer
                # will update.
                optimizer.zero_grad()
    
                # This is the backwards pass: compute the gradient of the loss with
                # respect to each  parameter of the model.
                loss.backward()
    
                # Actually update the parameters of the model using the gradients
                # computed by the backwards pass.
                optimizer.step()
    
                if t % print_every == 0:
                    print('Iteration %d, loss = %.4f' % (t, loss.item()))
                    check_accuracy_part34(loader_val, model)
                    print()
    

    The first half of the code is the same as before: compute the scores and the loss, and use loss.backward() to compute the gradients. The parameter update, however, no longer needs a hand-written method; the single call optimizer.step() performs it, with the optimizer defined by the caller.
    Here is the complete code for this part:

    hidden_layer_size = 4000
    learning_rate = 1e-2
    model = TwoLayerFC(3 * 32 * 32, hidden_layer_size, 10)
    optimizer = optim.SGD(model.parameters(), lr=learning_rate)
    
    train_part34(model, optimizer)
    

    Sequential API

   At the Module API level, the overall workflow is roughly: define the network structure in __init__, describe the forward pass in forward, write the training part with optim, then fill in the parameters and train. At the Sequential API level, the "describe the forward pass in forward" step can be dropped: once the network structure is defined, the forward pass is handled automatically by the framework, while the remaining parts still need to be written.
    The following defines the two-layer network with the Sequential API:

    # We need to wrap `flatten` function in a module in order to stack it
    # in nn.Sequential
    class Flatten(nn.Module):
        def forward(self, x):
            return flatten(x)
    
    hidden_layer_size = 4000
    learning_rate = 1e-2
    
    model = nn.Sequential(
        Flatten(),
        nn.Linear(3 * 32 * 32, hidden_layer_size),
        nn.ReLU(),
        nn.Linear(hidden_layer_size, 10),
    )
    
    # you can use Nesterov momentum in optim.SGD
    optimizer = optim.SGD(model.parameters(), lr=learning_rate,
                          momentum=0.9, nesterov=True)
    
    train_part34(model, optimizer)
    

    Compared with the previous approaches, defining the network this way is more concise and readable.
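
    As a side note, newer versions of PyTorch ship a built-in nn.Flatten module, so the hand-written Flatten wrapper is not strictly necessary; a minimal sketch of the same model:

    model = nn.Sequential(
        nn.Flatten(),  # built-in; flattens every dimension except the batch dimension
        nn.Linear(3 * 32 * 32, hidden_layer_size),
        nn.ReLU(),
        nn.Linear(hidden_layer_size, 10),
    )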


    Summary

   The final part of assignment2 builds familiarity with the PyTorch framework; when building more complex neural-network projects later on, using PyTorch can improve efficiency.
