After a first hands-on look at its automatic differentiation, I found that this package really does merge two steps into one: binding the computation graph's symbolic placeholder variables to their concrete values. This makes it noticeably more concise than other packages and flattens the learning curve considerably.
Here are a few automatic differentiation examples:
Case 1: pure scalars
import torch
from torch.autograd import Variable

x = Variable(torch.ones(1) * 3, requires_grad=True)
y = Variable(torch.ones(1) * 4, requires_grad=True)
z = x.pow(2) + 3 * y.pow(2)  # z = x^2 + 3y^2, so dz/dx = 2x, dz/dy = 6y
z.backward()  # a pure scalar result needs no gradient argument
print(x.grad)  # at x = 3: dz/dx = 2x = 2*3 = 6
print(y.grad)  # at y = 4: dz/dy = 6y = 6*4 = 24
Result:
Variable containing:
6
[torch.FloatTensor of size 1]
Variable containing:
24
[torch.FloatTensor of size 1]
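As a sanity check on these derivatives, a central finite difference in plain Python gives the same numbers (no torch needed; `central_diff` and the step `h` are my own illustrative choices, not part of the original example):

```python
def central_diff(f, v, h=1e-6):
    """Approximate df/dv at v with a central finite difference."""
    return (f(v + h) - f(v - h)) / (2 * h)

# z = x^2 + 3*y^2 evaluated around x = 3, y = 4
dz_dx = central_diff(lambda x: x**2 + 3 * 4.0**2, 3.0)
dz_dy = central_diff(lambda y: 3.0**2 + 3 * y**2, 4.0)
print(dz_dx, dz_dy)  # very close to 6 and 24
```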
Case 2: all-ones vectors
x = Variable(torch.ones(2) * 3, requires_grad=True)
y = Variable(torch.ones(2) * 4, requires_grad=True)
z = x.pow(2) + 3 * y.pow(2)
z.backward(torch.ones(2))  # a non-scalar z needs a gradient argument of the same shape
print(x.grad)
print(y.grad)
Result:
Variable containing:
6
6
[torch.FloatTensor of size 2]
Variable containing:
24
24
[torch.FloatTensor of size 2]
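The `torch.ones(2)` passed to `backward` is the vector `v` in a vector-Jacobian product: for elementwise `z_i = x_i^2 + 3*y_i^2`, autograd accumulates `x.grad = v * 2x` and `y.grad = v * 6y`. A minimal pure-Python sketch of that arithmetic (variable names are mine):

```python
x = [3.0, 3.0]
y = [4.0, 4.0]
v = [1.0, 1.0]  # the torch.ones(2) handed to backward()

# vector-Jacobian product for an elementwise function: multiply componentwise
x_grad = [vi * 2 * xi for vi, xi in zip(v, x)]
y_grad = [vi * 6 * yi for vi, yi in zip(v, y)]
print(x_grad, y_grad)  # [6.0, 6.0] [24.0, 24.0]
```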
Case 3: vectors with distinct values
x = Variable(torch.Tensor([1, 2, 3]), requires_grad=True)
y = Variable(torch.Tensor([4, 5, 6]), requires_grad=True)
z = x.pow(2) + 3 * y.pow(2)
z.backward(torch.ones(3))
print(x.grad)
print(y.grad)
Result:
Variable containing:
2
4
6
[torch.FloatTensor of size 3]
Variable containing:
24
30
36
[torch.FloatTensor of size 3]
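Because `backward(torch.ones(3))` computes the gradient of `sum(z)`, each component should equal the partials `2*x_i` and `6*y_i`. A finite-difference check in plain Python confirms the values above (the helper `grad_fd` is mine, not from the post):

```python
def grad_fd(f, v, h=1e-6):
    """Finite-difference gradient of f: R^n -> R at the point v."""
    g = []
    for i in range(len(v)):
        up, dn = v[:], v[:]
        up[i] += h
        dn[i] -= h
        g.append((f(up) - f(dn)) / (2 * h))
    return g

# loss implied by z.backward(torch.ones(3)): sum_i (x_i^2 + 3*y_i^2)
loss_in_x = lambda x: sum(xi**2 for xi in x)      # y-terms are constant in x
loss_in_y = lambda y: sum(3 * yi**2 for yi in y)  # x-terms are constant in y
gx = grad_fd(loss_in_x, [1.0, 2.0, 3.0])
gy = grad_fd(loss_in_y, [4.0, 5.0, 6.0])
print(gx, gy)  # close to [2, 4, 6] and [24, 30, 36]
```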
Case 4: matrix multiplication
x = Variable(torch.Tensor([[1, 2, 3], [4, 5, 6]]), requires_grad=True)
y = Variable(torch.Tensor([[1, 2, 3], [4, 5, 6]]), requires_grad=True)
z = x.mm(y.t())
z.backward(torch.ones(2, 2))
print(x.grad)
print(y.grad)
Result:
Variable containing:
5 7 9
5 7 9
[torch.FloatTensor of size 2x3]
Variable containing:
5 7 9
5 7 9
[torch.FloatTensor of size 2x3]
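With `G = torch.ones(2, 2)` fed to `backward`, the gradients of `z = x.mm(y.t())` follow the matrix rules `x.grad = G @ y` and `y.grad = G^T @ x`, which is exactly where the repeated rows `5 7 9` come from. A pure-Python check (the `matmul` and `transpose` helpers are mine):

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
y = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
G = [[1.0, 1.0], [1.0, 1.0]]  # the torch.ones(2, 2) handed to backward()

x_grad = matmul(G, y)             # dL/dx = G @ y
y_grad = matmul(transpose(G), x)  # dL/dy = G^T @ x
print(x_grad, y_grad)  # both [[5.0, 7.0, 9.0], [5.0, 7.0, 9.0]]
```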
Case 5: matrix-vector multiplication
x = Variable(torch.Tensor([[1, 2, 3], [4, 5, 6]]), requires_grad=True)
y = Variable(torch.Tensor([1, 3, 5]), requires_grad=True)  # a single pair of brackets; double brackets would make a matrix, which mv() cannot take
z = x.mv(y)
z.backward(torch.ones(2))
print(x.grad)
print(y.grad)
Result:
Variable containing:
1 3 5
1 3 5
[torch.FloatTensor of size 2x3]
Variable containing:
5
7
9
[torch.FloatTensor of size 3]
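For `z = x.mv(y)` with `g = torch.ones(2)` passed to `backward`, the rules are `x.grad = outer(g, y)` and `y.grad = x^T @ g`: every row of `x.grad` is a copy of `y`, and `y.grad` holds the column sums of `x`. Sketched in plain Python (names are mine):

```python
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
y = [1.0, 3.0, 5.0]
g = [1.0, 1.0]  # the torch.ones(2) handed to backward()

x_grad = [[gi * yj for yj in y] for gi in g]         # outer product g y^T
y_grad = [sum(gi * row[j] for gi, row in zip(g, x))  # x^T g
          for j in range(len(y))]
print(x_grad, y_grad)  # [[1.0, 3.0, 5.0], [1.0, 3.0, 5.0]] and [5.0, 7.0, 9.0]
```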
Case 6: CUDA. The first run seems to trigger compilation and takes quite a while, roughly 3 to 5 minutes; subsequent runs are fast.
import torch
from torch.autograd import Variable

print(' -------- cuda scalar --------')
x = Variable((torch.ones(1) * 3).cuda(), requires_grad=True)
y = Variable((torch.ones(1) * 4).cuda(), requires_grad=True)
z = x.pow(2) + 3 * y.pow(2)
print('z without cuda: {0}'.format(type(z)))  # z is still a plain Variable even when its data lives on the GPU
z.backward()
print('\nx.type: {0}, x.grad: {1}'.format(type(x), x.grad))
print('y.type: {0}, y.grad: {1}'.format(type(y), y.grad))
print('z.type: {0}, z: {1}'.format(type(z), z))

print(' -------- cuda matrix --------')
x = Variable(torch.Tensor([[1, 2, 3], [4, 5, 6]]).cuda(), requires_grad=True)
y = Variable(torch.Tensor([[1, 2, 3], [4, 5, 6]]).cuda(), requires_grad=True)
z = x.mm(y.t())  # .cuda(0, async=True)
z.backward(torch.ones(2, 2).cuda(0, async=True))
print('\nx.type: {0}, x.grad: {1}'.format(type(x), x.grad))
print('y.type: {0}, y.grad: {1}'.format(type(y), y.grad))
print('z.type: {0}, z: {1}'.format(type(z), z))
Result:
-------- cuda scalar --------
z without cuda: <class 'torch.autograd.variable.Variable'>
x.type: <class 'torch.autograd.variable.Variable'>, x.grad: Variable containing:
6
[torch.cuda.FloatTensor of size 1 (GPU 0)]
y.type: <class 'torch.autograd.variable.Variable'>, y.grad: Variable containing:
24
[torch.cuda.FloatTensor of size 1 (GPU 0)]
z.type: <class 'torch.autograd.variable.Variable'>, z: Variable containing:
57
[torch.cuda.FloatTensor of size 1 (GPU 0)]
-------- cuda matrix --------
x.type: <class 'torch.autograd.variable.Variable'>, x.grad: Variable containing:
5 7 9
5 7 9
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
y.type: <class 'torch.autograd.variable.Variable'>, y.grad: Variable containing:
5 7 9
5 7 9
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
z.type: <class 'torch.autograd.variable.Variable'>, z: Variable containing:
14 32
32 77
[torch.cuda.FloatTensor of size 2x2 (GPU 0)]