PyTorch Notes

Author: zidea | Published: 2020-06-24 20:22

Frameworks

Look at the documentation
Look at the ecosystem
Look at the debugging tools

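Recently, many network architectures have been reimplemented in PyTorch. Many fields settle into a three-way rivalry, as with the front-end frameworks React, Angular and Vue, which learn from one another and improve together. Each has its own strengths, which is how it managed to stand out among so many front-end projects.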

import torch
import numpy as np

Creating and initializing tensors

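Both torch and TensorFlow provide NumPy-like methods. Their tensor operations can by now replace most of the corresponding NumPy methods, yet many of us are so used to NumPy that we reach for it without thinking, much like still wanting to use jQuery when writing for the web.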
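
To make the similarity concrete, here is a minimal sketch that builds the same arrays with NumPy and with torch. The variable names are illustrative and do not come from the original notebook.

```python
import numpy as np
import torch

# The same constructors exist under (almost) the same names in both libraries.
np_zeros = np.zeros((3, 2))
pt_zeros = torch.zeros(3, 2)

np_range = np.arange(6).reshape(2, 3)
pt_range = torch.arange(6).reshape(2, 3)

# Reductions line up as well.
np_total = np_range.sum()
pt_total = pt_range.sum().item()

print(np_zeros.shape, tuple(pt_zeros.shape))  # both are (3, 2)
print(np_total, pt_total)                     # both are 15
```

One visible difference is that NumPy constructors take the shape as a tuple, while the torch constructors also accept it as separate arguments.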

x = torch.empty(3,2)
x

tensor([[ 0.0000e+00, -8.5899e+09],
        [ 0.0000e+00, -8.5899e+09],
        [ 1.1210e-44,  0.0000e+00]])

x = torch.rand(3,2)
x

tensor([[0.5045, 0.7438],
        [0.5103, 0.1921],
        [0.6033, 0.8493]])

Create a tensor filled with zeros

x = torch.zeros(3,2)
x

tensor([[0., 0.],
        [0., 0.],
        [0., 0.]])

# Check the element type

x.dtype

torch.float32

Changing the element type

x = torch.zeros(5,3,dtype=torch.long)
x

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])

x = torch.zeros(5,3).long()
x.dtype

torch.int64

x = torch.tensor([1,2,3])
x

tensor([1, 2, 3])

Reusing properties of an existing tensor: for example, a tensor created with new_ones has the same dtype as x

x = x.new_ones(2,3)
x

tensor([[1, 1, 1],
        [1, 1, 1]])
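
A short sketch of what is inherited and what can be overridden. The names `base`, `ones` and `noise` are made up for this illustration.

```python
import torch

base = torch.tensor([1, 2, 3])   # dtype is inferred as torch.int64
ones = base.new_ones(2, 3)       # inherits dtype and device from base

# randn_like copies the shape; the dtype is overridden here because
# normally distributed samples need a floating-point type.
noise = torch.randn_like(ones, dtype=torch.float)

print(ones.dtype)    # torch.int64
print(noise.dtype)   # torch.float32
print(noise.shape)   # torch.Size([2, 3])
```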

x = torch.randn_like(x,dtype=torch.float)
x

tensor([[-0.3904, -0.7965, -0.2722],
        [-0.2156, -0.0594,  0.6214]])

x.shape

torch.Size([2, 3])

Tensor operations

y = torch.rand(2,3)

torch.add(x,y)

tensor([[ 0.2551, -0.3587,  0.0470],
        [ 0.5914,  0.7677,  1.6127]])

# x + y is syntactic sugar for torch.add
x+y

tensor([[ 0.2551, -0.3587,  0.0470],
        [ 0.5914,  0.7677,  1.6127]])

result = torch.empty(2,3)
torch.add(x,y,out=result)
result

tensor([[ 0.2551, -0.3587,  0.0470],
        [ 0.5914,  0.7677,  1.6127]])

y.add_(x)
y

tensor([[ 0.2551, -0.3587,  0.0470],
        [ 0.5914,  0.7677,  1.6127]])
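
Methods whose names end in an underscore, such as add_, modify the tensor in place, while the plain version returns a new tensor. The sketch below contrasts the two, using fixed values rather than the random ones above so the result is predictable.

```python
import torch

x = torch.ones(2, 3)
y = torch.full((2, 3), 2.0)

z = y.add(x)      # out-of-place: returns a new tensor, y is unchanged
same = y.add_(x)  # in-place: y itself is modified

print(z.data_ptr() == y.data_ptr())     # False, z has its own memory
print(same.data_ptr() == y.data_ptr())  # True, same memory as y
print(y)                                # every element is now 3.0
```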

x[:,1:]

tensor([[-0.7965, -0.2722],
        [-0.0594,  0.6214]])

x

tensor([[-0.3904, -0.7965, -0.2722],
        [-0.2156, -0.0594,  0.6214]])

x = torch.randn(5,5)
y = x.view(25)
y

tensor([ 0.8127, -1.0394, -0.4423, -0.2006,  0.0952, -2.0985, -0.6619, -1.3043,
         1.4075, -0.4180,  2.3185,  0.8152, -0.6363, -1.7527,  0.3265, -0.6964,
        -1.7822,  1.1541, -0.3272,  1.1663,  0.3566, -0.4818,  2.0843, -0.5080,
         0.6954])

x = torch.randn(3,2)
z = x.view(2,-1)
z

tensor([[ 0.3892, -0.8355,  0.7208],
        [ 1.4621,  1.3130, -1.1406]])
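
Two details of view are worth spelling out: a -1 dimension is inferred from the number of elements, and the result shares memory with the original tensor. The sketch below uses the values 0 to 5 instead of random numbers so the layout is easy to follow.

```python
import torch

x = torch.arange(6.0).reshape(3, 2)  # values 0..5, shape (3, 2)
z = x.view(2, -1)                    # -1 is inferred as 6 / 2 = 3

print(z.shape)          # torch.Size([2, 3])

# A view shares storage with the original tensor, so writes show through.
z[0, 0] = 100.0
print(x[0, 0].item())   # 100.0
```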

x = torch.randn(1)
x.data

tensor([1.5721])

# Extract the value of a one-element tensor as a Python number
x.item()

1.5720665454864502

z.transpose(1,0)

tensor([[ 0.3892,  1.4621],
        [-0.8355,  1.3130],
        [ 0.7208, -1.1406]])

z.shape

torch.Size([2, 3])

Converting between NumPy and Tensor

Sometimes we need to convert a torch tensor into a NumPy ndarray. torch provides methods for switching freely between the two, and TensorFlow 2.0 provides such methods as well.

a = torch.ones(5)
a

tensor([1., 1., 1., 1., 1.])

Calling the numpy() method of a tensor converts the torch tensor into an ndarray. For a tensor on the CPU the two share the same memory, so changing a value in the resulting ndarray also changes the torch tensor.

b = a.numpy()
b

array([1., 1., 1., 1., 1.], dtype=float32)

b[1] = 2
a

tensor([1., 2., 1., 1., 1.])

a = np.ones(5)
b = torch.from_numpy(a)
np.add(a,1,out=a)
print(a)

[2. 2. 2. 2. 2.]
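
The conversion in the other direction shares memory too. This sketch prints the tensor as well as the array, and contrasts torch.from_numpy with torch.tensor, which copies the data. The variable `c` is added for this illustration.

```python
import numpy as np
import torch

a = np.ones(5)            # float64 by default
b = torch.from_numpy(a)   # shares memory with a, dtype is torch.float64

np.add(a, 1, out=a)       # in-place update of the NumPy array
print(a)                  # [2. 2. 2. 2. 2.]
print(b)                  # tensor([2., 2., 2., 2., 2.], dtype=torch.float64)

c = torch.tensor(a)       # torch.tensor copies, so c is independent of a
a[0] = 10.0
print(b[0].item(), c[0].item())  # 10.0 2.0
```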

# Check whether CUDA is available
torch.cuda.is_available()
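
The availability check is usually combined with a device object, so the same code runs with or without a GPU. A minimal sketch, assuming at most one CUDA device and falling back to the CPU:

```python
import torch

# Fall back to the CPU when no CUDA device is present.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.ones(2, 3, device=device)  # create directly on the chosen device
y = torch.rand(2, 3).to(device)      # or move an existing tensor
z = x + y                            # operands must live on the same device

print(z.device.type)
# A GPU tensor has to be moved to the CPU before calling numpy().
print(z.cpu().numpy().shape)         # (2, 3)
```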

Warm-up: a two-layer neural network in NumPy

ReLU

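With bias terms, a two-layer network with a ReLU activation is:

$h = W_1 x + b_1$
$a = \max(0, h)$
$\hat{y} = W_2 a + b_2$

The example below drops the biases:

$h = w_1 x$
$h_{relu} = \max(0, h)$
$\hat{y} = w_2 h_{relu} = w_2 \, \mathrm{relu}(w_1 x)$

The loss is the sum of squared errors:

$L = \sum_{i=1}^N (\hat{y}_i - y_i)^2$

The gradient with respect to $w_2$ follows from the chain rule:

$\frac{dL}{d\hat{y}} = 2(\hat{y} - y)$
$\frac{d\hat{y}}{dw_2} = h_{relu}$
$\frac{dL}{dw_2} = \frac{dL}{d\hat{y}} \frac{d\hat{y}}{dw_2} = 2(\hat{y} - y) h_{relu}$

The gradient with respect to $w_1$ passes through the ReLU, whose derivative is 1 where $h > 0$ and 0 elsewhere:

$\frac{dh}{dw_1} = x$
$\frac{dL}{dw_1} = \frac{dL}{d\hat{y}} \frac{d\hat{y}}{dh_{relu}} \frac{dh_{relu}}{dh} \frac{dh}{dw_1}$

In the matrix form used by the code, this last product is x.T.dot(grad_h).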

Forward pass

Backward pass

We need to compute $\frac{dL}{dw_1}$ and $\frac{dL}{dw_2}$.

This material is fairly basic, but only with a solid grasp of this part ...

N, D_in, H, D_out = 64,1000,100,10
# N is the number of samples; x holds the data and y the labels.
# The task is to find the mapping from x to y.
# D_in is the dimension of each input sample.
x = np.random.randn(N,D_in)
y = np.random.randn(N,D_out)

# Initialize the weights
w1 = np.random.randn(D_in,H)
w2 = np.random.randn(H,D_out)

learning_rate = 1e-6
for t in range(500):
    # Forward pass
    h = x.dot(w1)
    h_relu = np.maximum(h,0)
    y_pred = h_relu.dot(w2)

    # Compute the loss
    loss = np.square(y_pred - y).sum()
    print(t,loss)

    # Backward pass: compute the gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    # ReLU passes no gradient where h < 0
    grad_h[h<0] = 0
    grad_w1 = x.T.dot(grad_h)

    # Update the weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

h = x.dot(w1)
h_relu = np.maximum(h,0)
y_pred = h_relu.dot(w2)

(y_pred - y).mean()

5.423194979076235e-07
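
One way to gain confidence in a hand-written backward pass is a finite-difference gradient check. The sketch below is not part of the original notebook: it repeats the same forward and backward formulas on a much smaller network, and the helper names `loss_fn`, `analytic_grads` and `numeric_grad` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N, D_in, H, D_out = 4, 5, 3, 2   # small sizes keep the check fast
x = rng.standard_normal((N, D_in))
y = rng.standard_normal((N, D_out))
w1 = rng.standard_normal((D_in, H))
w2 = rng.standard_normal((H, D_out))

def loss_fn(w1, w2):
    """Sum-of-squares loss of the two-layer ReLU network."""
    h_relu = np.maximum(x.dot(w1), 0)
    return np.square(h_relu.dot(w2) - y).sum()

def analytic_grads(w1, w2):
    """Backward pass using the same formulas as the training loop."""
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    grad_y_pred = 2.0 * (h_relu.dot(w2) - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h = grad_y_pred.dot(w2.T)
    grad_h[h < 0] = 0
    return x.T.dot(grad_h), grad_w2

def numeric_grad(f, w, eps=1e-6):
    """Central finite differences, perturbing one weight at a time."""
    grad = np.zeros_like(w)
    for idx in np.ndindex(*w.shape):
        old = w[idx]
        w[idx] = old + eps
        f_plus = f()
        w[idx] = old - eps
        f_minus = f()
        w[idx] = old          # restore the original weight
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

grad_w1, grad_w2 = analytic_grads(w1, w2)
num_w1 = numeric_grad(lambda: loss_fn(w1, w2), w1)
num_w2 = numeric_grad(lambda: loss_fn(w1, w2), w2)

err_w1 = np.abs(grad_w1 - num_w1).max()
err_w2 = np.abs(grad_w2 - num_w2).max()
print(err_w1, err_w2)  # both should be tiny compared with the gradients
```

The check can fail spuriously if a hidden activation lies almost exactly at zero, because ReLU is not differentiable there. With random inputs this is very unlikely.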

N, D_in, H, D_out= 64,1000,100,10
# Create some random training data
x = torch.randn(N,D_in)
y = torch.randn(N,D_out)

w1 = torch.randn(D_in,H)
w2 = torch.randn(H,D_out)

learning_rate = 1e-6
for t in range(500):
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    
    # compute loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t,loss)
    
    # backward pass
    # compute the gradient
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    
    grad_h[h<0] = 0
    grad_w1 = x.t().mm(grad_h)
    
    
    # update weights of w1 and w2
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

x = torch.tensor(1.,requires_grad=True)
w = torch.tensor(2.,requires_grad=True)
b = torch.tensor(3.,requires_grad=True)

y = w*x + b
y.backward()
print(w.grad)
print(x.grad)

tensor(1.)
tensor(2.)
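
A point that matters for any training loop is that backward() adds to the stored .grad rather than overwriting it. The sketch below reuses the same scalar example to show the accumulation and the reset. The names `first`, `accumulated` and `after_reset` are illustrative.

```python
import torch

x = torch.tensor(1.0, requires_grad=True)
w = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(3.0, requires_grad=True)

# First backward pass: dy/dw = x = 1.
y = w * x + b
y.backward()
first = w.grad.item()

# A second backward pass on a fresh graph adds to the stored gradient.
y = w * x + b
y.backward()
accumulated = w.grad.item()

# Zero the gradient in place before the next pass to avoid the accumulation.
w.grad.zero_()
y = w * x + b
y.backward()
after_reset = w.grad.item()

print(first, accumulated, after_reset)  # 1.0 2.0 1.0
```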

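N, D_in, H, D_out = 64,1000,100,10
# Create some random training data
x = torch.randn(N,D_in)
y = torch.randn(N,D_out)

# requires_grad=True tells autograd to track operations on the weights
w1 = torch.randn(D_in,H,requires_grad=True)
w2 = torch.randn(H,D_out,requires_grad=True)

learning_rate = 1e-6
for t in range(500):
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # compute loss, keeping it as a tensor so that backward() can be called
    loss = (y_pred - y).pow(2).sum()
    print(t,loss.item())

    # backward pass: autograd fills w1.grad and w2.grad
    loss.backward()

    # update weights of w1 and w2 without tracking the update itself
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        # reset the gradients, otherwise they accumulate across iterations
        w1.grad.zero_()
        w2.grad.zero_()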
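
As a sanity check, the gradients that autograd produces can be compared with the manual backward formulas used earlier. This is a sketch on a smaller network, using float64 and a fixed seed (both assumptions made for this illustration) so the comparison is tight and repeatable.

```python
import torch

torch.manual_seed(0)
N, D_in, H, D_out = 8, 10, 6, 3
x = torch.randn(N, D_in, dtype=torch.float64)
y = torch.randn(N, D_out, dtype=torch.float64)
w1 = torch.randn(D_in, H, dtype=torch.float64, requires_grad=True)
w2 = torch.randn(H, D_out, dtype=torch.float64, requires_grad=True)

# Forward pass, then let autograd compute w1.grad and w2.grad.
h = x.mm(w1)
h_relu = h.clamp(min=0)
y_pred = h_relu.mm(w2)
loss = (y_pred - y).pow(2).sum()
loss.backward()

# Manual backward pass, outside of autograd tracking.
with torch.no_grad():
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h = grad_y_pred.mm(w2.t())
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)

print(torch.allclose(w1.grad, grad_w1))  # True
print(torch.allclose(w2.grad, grad_w2))  # True
```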
