A Hand-Written Neural Network

Author: azorazz | Published 2019-09-29 15:25

Setup:
Input layer L0 (the input matrix X, shape 4×3: 4 samples, 3 features)
Hidden layer L1 (4 units)
Output layer L2 (1 unit)



Initialization

Input data:

X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])

Labels (the XOR of the first two input columns):

y = np.array([[0],
              [1],
              [1],
              [0]])

Initial weights W:

w0 = 2 * np.random.random((3, 4)) - 1   # w0: shape (3, 4), uniform in [-1, 1)
w1 = 2 * np.random.random((4, 1)) - 1   # w1: shape (4, 1)

Procedure

Apply the sigmoid activation function to introduce non-linearity.

Sigmoid function:
f(x) = \frac{1}{1+exp(-x)}
Sigmoid derivative (from f'(x) = \frac{exp(-x)}{(1+exp(-x))^{2}}):
f'(x) = f(x)\cdot (1-f(x))

The sigmoid function and its derivative in code:

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # sigmoid derivative; assumes x is already a sigmoid output
    return 1 / (1 + np.exp(-x))  # sigmoid
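A quick sanity check of the `deriv` branch: since `nonlin(x, deriv=True)` expects a sigmoid *output*, a central finite difference of the sigmoid itself should match it. A minimal sketch (variable names are illustrative):

```python
import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # assumes x is already a sigmoid output
    return 1 / (1 + np.exp(-x))

x = np.linspace(-5, 5, 11)
s = nonlin(x)                                         # sigmoid outputs
h = 1e-5
numeric = (nonlin(x + h) - nonlin(x - h)) / (2 * h)   # central difference of sigmoid
analytic = nonlin(s, deriv=True)                      # pass the *output* s, not x
print(np.max(np.abs(numeric - analytic)))             # should be ~1e-11, i.e. a match
```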

1. Forward propagation

Randomly initialize the weights W, then compute the prediction.
L1 layer:
L_{1} = \frac{1}{1+exp(-w_{0}x)}
L2 layer:
L_{2} = \frac{1}{1+exp(-w_{1}L_{1})} = \frac{1}{1+exp(-w_{1}\frac{1}{1+exp(-w_{0}x)})}

In code, L1 and L2 are:

l1 = nonlin(np.dot(l0, w0))  # l0:  shape(4,3)    l1: shape(4,4)
l2 = nonlin(np.dot(l1, w1))   # l2:  shape(4,1)
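The shapes can be verified with a quick forward pass using the definitions above (a self-contained sketch):

```python
import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

np.random.seed(1)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

l0 = X
l1 = nonlin(np.dot(l0, w0))   # (4, 3) @ (3, 4) -> (4, 4)
l2 = nonlin(np.dot(l1, w1))   # (4, 4) @ (4, 1) -> (4, 1)
print(l1.shape, l2.shape)     # (4, 4) (4, 1)
```

Every entry of l1 and l2 lies in (0, 1), since sigmoid squashes its input into that interval.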

2. Compute the loss

loss = (y-L_{2})^{2}

3. Backpropagation

Use the loss to iteratively update the weights W.

Gradient of w1 (note the minus sign that comes from \partial (y-L_{2})/\partial L_{2} = -1):
\nabla w_{1}=\frac{\partial (loss)}{\partial (w_{1})} = \frac{\partial (loss)}{\partial (L_{2})}\cdot \frac{\partial (L_{2})}{\partial (w_{1})} = -2(y-L_{2})\cdot L_{2}(1-L_{2})\cdot L_{1}
Gradient of w0:
\nabla w_{0}=\frac{\partial (loss)}{\partial (w_{0})} = -2(y-L_{2})\cdot L_{2}(1-L_{2})\cdot w_{1}\cdot L_{1}(1-L_{1})\cdot L_{0}
Gradient-descent update (subtracting the negative gradient is why the code uses +=):
w_{1} = w_{1} - \nabla w_{1}
w_{0} = w_{0} - \nabla w_{0}

l2_error = 2 * (y - l2)                       # = -d(loss)/d(l2),  shape (4, 1)

l2_delta = l2_error * nonlin(l2, deriv=True)  # shape (4, 1)

l1_error = l2_delta.dot(w1.T)                 # shape (4, 4)

l1_delta = l1_error * nonlin(l1, deriv=True)  # shape (4, 4)

# += implements gradient descent: the minus sign is already inside l2_error
w1 += l1.T.dot(l2_delta)
w0 += l0.T.dot(l1_delta)
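The backprop formulas can be spot-checked numerically: compare the analytic gradient of the loss with respect to one entry of w1 against a central finite difference. A sketch, taking the loss as the sum of squared errors over the four samples:

```python
import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

np.random.seed(1)
X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

def loss(w0, w1):
    l2 = nonlin(nonlin(X.dot(w0)).dot(w1))
    return np.sum((y - l2) ** 2)

# analytic gradient of the loss w.r.t. w1 (note the minus sign)
l1 = nonlin(X.dot(w0))
l2 = nonlin(l1.dot(w1))
grad_w1 = -l1.T.dot(2 * (y - l2) * nonlin(l2, deriv=True))

# central-difference check on entry (0, 0)
h = 1e-6
w1p, w1m = w1.copy(), w1.copy()
w1p[0, 0] += h
w1m[0, 0] -= h
numeric = (loss(w0, w1p) - loss(w0, w1m)) / (2 * h)
print(abs(numeric - grad_w1[0, 0]))  # should be tiny
```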

Full implementation

import numpy as np


def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # sigmoid derivative; assumes x is already a sigmoid output
    return 1 / (1 + np.exp(-x))  # sigmoid


X = np.array([[0, 0, 1],
              [0, 1, 1],
              [1, 0, 1],
              [1, 1, 1]])


y = np.array([[0],
              [1],
              [1],
              [0]])

np.random.seed(1)

# randomly initialize our weights with mean 0
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

for j in range(60000):
    # input layer
    l0 = X

    # hidden layer
    l1 = nonlin(np.dot(l0, w0))

    # output layer
    l2 = nonlin(np.dot(l1, w1))

    l2_error = 2*(y - l2)

    if j % 10000 == 0:
        print("Error: " + str(np.mean((y - l2) ** 2)))

    l2_delta = l2_error * nonlin(l2, deriv=True)

    l1_error = l2_delta.dot(w1.T)

    l1_delta = l1_error * nonlin(l1, deriv=True)

    w1 += l1.T.dot(l2_delta)
    w0 += l0.T.dot(l1_delta)
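After training, the network should reproduce the XOR of the first two input columns; thresholding l2 at 0.5 gives the predicted labels. A minimal check, re-running the loop above in condensed form:

```python
import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

np.random.seed(1)
w0 = 2 * np.random.random((3, 4)) - 1
w1 = 2 * np.random.random((4, 1)) - 1

for _ in range(60000):
    l1 = nonlin(X.dot(w0))
    l2 = nonlin(l1.dot(w1))
    l2_delta = 2 * (y - l2) * nonlin(l2, deriv=True)
    l1_delta = l2_delta.dot(w1.T) * nonlin(l1, deriv=True)
    w1 += l1.T.dot(l2_delta)
    w0 += X.T.dot(l1_delta)

# final forward pass with the trained weights
l2 = nonlin(nonlin(X.dot(w0)).dot(w1))
pred = (l2 > 0.5).astype(int)
print(pred.ravel())  # should match y after training
```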

Source: https://www.haomeiwen.com/subject/wqizuctx.html