Assumptions:
Input layer L0 (input matrix X, 4×3: 4 samples with 3 features each)
Hidden layer L1 (4 units)
Output layer L2 (1 unit)
Initialization
Input data:
X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
Labels:
y = np.array([[0],
              [1],
              [1],
              [0]])
Initial weights W, drawn uniformly from [-1, 1):
w0 = 2 * np.random.random((3, 4)) - 1  # w0: shape (3, 4)
w1 = 2 * np.random.random((4, 1)) - 1  # w1: shape (4, 1)
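As a quick sanity check of the initialization, the sketch below confirms that `2 * np.random.random(shape) - 1` maps samples from [0, 1) onto [-1, 1), so the weights start centered on zero. The seed is an assumption made here only so the check is repeatable.

```python
import numpy as np

np.random.seed(1)  # assumed seed, only to make the check repeatable

w0 = 2 * np.random.random((3, 4)) - 1  # input -> hidden weights
w1 = 2 * np.random.random((4, 1)) - 1  # hidden -> output weights

print(w0.shape, w1.shape)
# Every weight should lie in the half-open interval [-1, 1)
print(bool((w0 >= -1).all() and (w0 < 1).all()))
print(bool((w1 >= -1).all() and (w1 < 1).all()))
```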
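Procedure
The sigmoid activation function makes each layer nonlinear.
Sigmoid function:
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$
Derivative of the sigmoid:
$$\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$$
Sigmoid function and its derivative in code:
def nonlin(x, deriv=False):
    if deriv:
        # x must already be a sigmoid output: for x = sigma(z) this returns sigma'(z)
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))  # sigmoid function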
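One detail of `nonlin` is easy to miss: with `deriv=True` it expects a value that has already been passed through the sigmoid, not the raw input. The sketch below checks this numerically against a central finite difference. The test points `z` and the step `eps` are arbitrary choices made for illustration.

```python
import numpy as np


def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # derivative, given x = sigmoid(z)
    return 1 / (1 + np.exp(-x))  # sigmoid


z = np.linspace(-3, 3, 7)         # arbitrary raw inputs
s = nonlin(z)                     # sigmoid outputs
analytic = nonlin(s, deriv=True)  # derivative from the outputs

# Central finite difference of the sigmoid at the same points
eps = 1e-6
numeric = (nonlin(z + eps) - nonlin(z - eps)) / (2 * eps)

max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)
```

Calling `nonlin(z, deriv=True)` on the raw inputs instead would compute `z * (1 - z)`, which is not the sigmoid derivative.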
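1. Forward propagation
Starting from the randomly initialized weights W, compute the prediction.
Layer L1:
$$l_1 = \sigma(l_0 w_0)$$
Layer L2:
$$l_2 = \sigma(l_1 w_1)$$
L1 and L2 in code:
l1 = nonlin(np.dot(l0, w0))  # l0: shape (4, 3), l1: shape (4, 4)
l2 = nonlin(np.dot(l1, w1))  # l2: shape (4, 1)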
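The forward pass can be run on its own to confirm the shapes noted in the comments. This is a minimal sketch: the helper name `sigmoid` and the seed are assumptions, and `@` is used as shorthand for `np.dot` on 2-D arrays.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

np.random.seed(1)  # assumed seed for repeatability
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

l0 = X                 # (4, 3): one row per sample
l1 = sigmoid(l0 @ w0)  # (4, 3) @ (3, 4) -> (4, 4)
l2 = sigmoid(l1 @ w1)  # (4, 4) @ (4, 1) -> (4, 1)

print(l1.shape, l2.shape)
```

Because the sigmoid squashes its input, every entry of `l1` and `l2` lies strictly between 0 and 1.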
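2. Compute the error
The error compares the labels y with the prediction l2. The training loop reports the mean squared error over the n = 4 samples:
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - l_{2,i})^2$$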
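A small worked example of the mean squared error, as a sketch: `l2` here is a made-up prediction of 0.5 for every sample, not the output of a trained network.

```python
import numpy as np

y = np.array([[0], [1], [1], [0]])

# Hypothetical prediction: the network outputs 0.5 for every sample
l2 = np.full((4, 1), 0.5)

# Each squared difference is 0.25, so the mean is also 0.25
mse = np.mean((y - l2) ** 2)
print(mse)  # 0.25
```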
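3. Backpropagation
Use the error to update the weights W repeatedly.
Gradient with respect to w1, for the summed squared error $E = \sum (y - l_2)^2$:
$$\delta_2 = 2\,(y - l_2) \odot l_2 \odot (1 - l_2), \qquad -\frac{\partial E}{\partial w_1} = l_1^{T} \delta_2$$
Gradient with respect to w0:
$$\delta_1 = (\delta_2 w_1^{T}) \odot l_1 \odot (1 - l_1), \qquad -\frac{\partial E}{\partial w_0} = l_0^{T} \delta_1$$
Weight update (the deltas already carry the minus sign, so adding them descends the gradient with an implicit learning rate of 1):
l2_error = 2 * (y - l2)  # l2_error: shape (4, 1)
l2_delta = l2_error * nonlin(l2, deriv=True)  # l2_delta: shape (4, 1)
l1_error = l2_delta.dot(w1.T)  # l1_error: shape (4, 4)
l1_delta = l1_error * nonlin(l1, deriv=True)  # l1_delta: shape (4, 4)
w1 += l1.T.dot(l2_delta)
w0 += l0.T.dot(l1_delta)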
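The update terms above can be verified with a numerical gradient check. The sketch below assumes the loss is the summed squared error, which is what the factor `2 * (y - l2)` corresponds to. The helpers `sigmoid`, `loss` and `numeric_grad` and the seed are illustrative names and choices, not part of the original code.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def loss(w0, w1, X, y):
    # Summed squared error (assumed loss behind the 2 * (y - l2) term)
    l2 = sigmoid(sigmoid(X @ w0) @ w1)
    return np.sum((y - l2) ** 2)


def numeric_grad(f, w, eps=1e-6):
    # Central finite differences, perturbing one weight at a time in place
    grad = np.zeros_like(w)
    for idx in np.ndindex(w.shape):
        old = w[idx]
        w[idx] = old + eps
        up = f()
        w[idx] = old - eps
        down = f()
        w[idx] = old
        grad[idx] = (up - down) / (2 * eps)
    return grad


X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0], [1], [1], [0]])
np.random.seed(0)  # assumed seed
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

# Analytic update terms, as in the text
l1 = sigmoid(X @ w0)
l2 = sigmoid(l1 @ w1)
l2_delta = 2 * (y - l2) * l2 * (1 - l2)
l1_delta = (l2_delta @ w1.T) * l1 * (1 - l1)
step_w1 = l1.T @ l2_delta  # should equal -dE/dw1
step_w0 = X.T @ l1_delta   # should equal -dE/dw0

num_w1 = numeric_grad(lambda: loss(w0, w1, X, y), w1)
num_w0 = numeric_grad(lambda: loss(w0, w1, X, y), w0)

# The update term plus the numerical gradient should be close to zero
err_w1 = np.max(np.abs(step_w1 + num_w1))
err_w0 = np.max(np.abs(step_w0 + num_w0))
print(err_w1, err_w0)
```

Both errors should be tiny, which confirms that `w += step` moves against the gradient, i.e. it is plain gradient descent.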
Code implementation
import numpy as np


def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # sigmoid derivative; x must already be a sigmoid output
    return 1 / (1 + np.exp(-x))  # sigmoid function


X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])
y = np.array([[0],
              [1],
              [1],
              [0]])
np.random.seed(1)
# randomly initialize our weights with mean 0
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1
for j in range(60000):
    # input layer
    l0 = X
    # hidden layer
    l1 = nonlin(np.dot(l0, w0))
    # output layer
    l2 = nonlin(np.dot(l1, w1))
    # negative gradient of the squared error with respect to l2
    l2_error = 2 * (y - l2)
    if (j % 10000) == 0:
        # report the mean squared error every 10000 iterations
        print("Error:" + str(np.mean((y - l2) * (y - l2))))
    l2_delta = l2_error * nonlin(l2, deriv=True)
    # propagate the output delta back through w1 before w1 is updated
    l1_error = l2_delta.dot(w1.T)
    l1_delta = l1_error * nonlin(l1, deriv=True)
    w1 += l1.T.dot(l2_delta)
    w0 += l0.T.dot(l1_delta)
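To reuse the trained network, the loop can be wrapped in functions. The following is one possible sketch rather than the original author's code: `train`, `predict`, the `lr` parameter and the shorter default of 10000 iterations are all assumptions introduced here. With `lr=1.0` each update is the same as in the script above.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def train(X, y, n_iter=10000, lr=1.0, seed=1):
    """Train the 3-4-1 network; return both weight matrices and the MSE history."""
    np.random.seed(seed)
    w0 = 2 * np.random.random((3, 4)) - 1
    w1 = 2 * np.random.random((4, 1)) - 1
    history = []
    for _ in range(n_iter):
        # Forward pass
        l1 = sigmoid(X @ w0)
        l2 = sigmoid(l1 @ w1)
        history.append(float(np.mean((y - l2) ** 2)))
        # Backward pass: deltas computed before either weight matrix changes
        l2_delta = 2 * (y - l2) * l2 * (1 - l2)
        l1_delta = (l2_delta @ w1.T) * l1 * (1 - l1)
        w1 += lr * (l1.T @ l2_delta)
        w0 += lr * (X.T @ l1_delta)
    return w0, w1, history


def predict(X, w0, w1):
    """Forward pass only, using trained weights."""
    return sigmoid(sigmoid(X @ w0) @ w1)


X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
y = np.array([[0], [1], [1], [0]])

w0, w1, history = train(X, y)
print("first MSE:", history[0], "last MSE:", history[-1])
print(predict(X, w0, w1).round(3))
```

Keeping the loss history makes it easy to confirm that training actually reduces the error before trusting the predictions.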