How to Build a Simple Neural Network


Author: 小聪明李良才 | Published 2017-02-06 22:02


    I recently enrolled in Udacity's Deep Learning Foundations program. This is the second post in that series, an introduction to neural networks; the first covered the math behind linear regression.
    The notebook for this article is at: https://github.com/zhuanxuhit/nd101/blob/master/1.Intro_to_Deep_Learning/2.How_to_Make_a_Neural_Network/python-network.ipynb

    1. The model

    Suppose we have the following data:

    Input 1  Input 2  Input 3  Output
    0        0        1        0
    1        1        1        1
    1        0        1        1
    0        1        1        0

    From the table above we can spot one pattern:

    The first input column is identical to the output.

    With 3 input columns, and each input taking the value 0 or 1, there are \(2^3=8\) possible input combinations, but only 4 of them appear here. With such limited data, how do we predict the outputs for the inputs we have never seen?
    This is where a neural network comes in!
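
    To make the combinatorics concrete, here is a small sketch (not part of the original notebook) that enumerates all \(2^3=8\) possible inputs and marks the four that appear in the table:

    from itertools import product
    
    # The four examples from the table above, stored as tuples for easy lookup.
    seen = {(0, 0, 1), (1, 1, 1), (1, 0, 1), (0, 1, 1)}
    
    # Enumerate every possible combination of three binary inputs.
    for combo in product([0, 1], repeat=3):
        status = "in training set" if combo in seen else "unseen, must be predicted"
        print(combo, status)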

    Let's look at the code:

    %matplotlib inline
    %config InlineBackend.figure_format = 'retina'
    
    from numpy import exp, array, random, dot
    
    class NeuralNetwork():
        def __init__(self):
            # Seed the random number generator, so it generates the same numbers
            # every time the program runs.
            random.seed(1)
    
            # We model a single neuron, with 3 input connections and 1 output connection.
            # We assign random weights to a 3 x 1 matrix, with values in the range -1 to 1
            # and mean 0.
            self.synaptic_weights = 2 * random.random((3, 1)) - 1
            # Expose the derivative so it can be called from outside the class
            # (it is used for debugging in a later cell).
            self.sigmoid_derivative = self.__sigmoid_derivative
    
        # The Sigmoid function, which describes an S shaped curve.
        # We pass the weighted sum of the inputs through this function to
        # normalise them between 0 and 1.
        def __sigmoid(self, x):
            return 1 / (1 + exp(-x))
        
       
    
        # The derivative of the Sigmoid function.
        # This is the gradient of the Sigmoid curve.
        # It indicates how confident we are about the existing weight.
        def __sigmoid_derivative(self, x):
            return x * (1 - x)
    
        # We train the neural network through a process of trial and error.
        # Adjusting the synaptic weights each time.
        def train(self, training_set_inputs, training_set_outputs, number_of_training_iterations):
            for iteration in range(number_of_training_iterations):
                # Pass the training set through our neural network (a single neuron).
                output = self.think(training_set_inputs)
    
                # Calculate the error (The difference between the desired output
                # and the predicted output).
                error = training_set_outputs - output
    
                # Multiply the error by the input and again by the gradient of the Sigmoid curve.
                # This means less confident weights are adjusted more.
                # This means inputs, which are zero, do not cause changes to the weights.
                adjustment = dot(training_set_inputs.T, error * self.__sigmoid_derivative(output))
    
                # Adjust the weights.
                self.synaptic_weights += adjustment
    
        # The neural network thinks.
        def think(self, inputs):
            # Pass inputs through our neural network (our single neuron).
            return self.__sigmoid(dot(inputs, self.synaptic_weights))    
    
    # Initialise a single-neuron neural network.
    neural_network = NeuralNetwork()
    
    print("Random starting synaptic weights: ")
    print(neural_network.synaptic_weights)
    
    # The training set. We have 4 examples, each consisting of 3 input values
    # and 1 output value.
    training_set_inputs = array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
    training_set_outputs = array([[0, 1, 1, 0]]).T
    
    # Train the neural network using a training set.
    # Do it 10,000 times and make small adjustments each time.
    neural_network.train(training_set_inputs, training_set_outputs, 10000)
    
    print("New synaptic weights after training: ")
    print(neural_network.synaptic_weights)
    
    # Test the neural network with a new situation.
    print("Considering new situation [1, 0, 0] -> ?: ")
    print(neural_network.think(array([1, 0, 0])))
    
    Random starting synaptic weights: 
    [[-0.16595599]
     [ 0.44064899]
     [-0.99977125]]
    New synaptic weights after training: 
    [[ 9.67299303]
     [-0.2078435 ]
     [-4.62963669]]
    Considering new situation [1, 0, 0] -> ?: 
    [ 0.99993704]
    

    The code above comes from: https://github.com/llSourcell/Make_a_neural_network
    Now let's analyse what is actually going on.
    The first thing to look at is the sigmoid function, plotted below:

    import matplotlib.pyplot as plt
    import numpy as np
    
    def sigmoid(x):
        a = []
        for item in x:
            a.append(1/(1+np.exp(-item)))
        return a
        
    x = np.arange(-6., 6., 0.2)
    sig = sigmoid(x)
    plt.plot(x,sig)
    plt.grid()
    plt.show()
    
    [Figure: plot of the sigmoid function]
    We can see that the sigmoid squashes its input into a value between 0 and 1. Its derivative, written in terms of the sigmoid output y (which is what the train method passes in), is:
    

    def __sigmoid_derivative(self, y):
        return y * (1 - y)
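
    Why y * (1 - y)? Writing y for the sigmoid output, a one-line derivation gives:

    \[
    \sigma(x) = \frac{1}{1 + e^{-x}}, \qquad
    \sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^2} = \sigma(x)\bigl(1 - \sigma(x)\bigr) = y\,(1 - y)
    \]

    This is why the method takes the neuron's output rather than its raw input: the derivative can be computed directly from the value we already have.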

    Its meaning is easiest to see from a plot of the tangent lines:
    
    def sigmoid_derivative(x):
        y = 1/(1+np.exp(-x))
        return y * (1-y)
    
    def derivative(point):
        # Return a short segment of the tangent line to the sigmoid at `point`,
        # whose slope is the sigmoid's derivative there.
        dx = np.arange(-0.5, 0.5, 0.1)
        slope = sigmoid_derivative(point)
        return [point + dx, slope * dx + 1/(1 + np.exp(-point))]
    
    x = np.arange(-6., 6., 0.1)
    
    sig = sigmoid(x)
    point1 = 2
    slope1 = sigmoid_derivative(point1)
    plt.plot(x,sig)
    x1,y1 = derivative(point1)
    plt.plot(x1,y1,linewidth=5)
    x2,y2 = derivative(0)
    plt.plot(x2,y2,linewidth=5)
    x3,y3 = derivative(-4)
    plt.plot(x3,y3,linewidth=5)
    plt.grid()
    plt.show()
    
    [Figure: the sigmoid curve with tangent lines drawn at x = 2, x = 0, and x = -4]

    Now let's interpret the plot:

    1. The output lies between 0 and 1, so we can read it as a confidence: 0 means not at all confident, 1 means fully confident.
    2. When the input is 0, the output is 0.5, which means the network is completely undecided.

    With those two points in mind, look at this one line from the training loop above:

    adjustment = dot(training_set_inputs.T, error * self.__sigmoid_derivative(output))
    

    Now the meaning of the adjustment is clear. When the output is close to 0 or 1, the prediction is already confident, the sigmoid is nearly flat there, and the adjustment is close to 0.
    When the output is around 0.5, the prediction is essentially a guess, so we want to correct it quickly; this is exactly where the sigmoid's derivative is largest (the green tangent line at x = 0 in the plot above has the steepest slope).
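
    As a concrete illustration (a minimal sketch, not part of the original notebook), here is a single update step computed by hand with numpy, using the same rule as in train above and the initial weights printed earlier:

    import numpy as np
    
    X = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])   # training inputs
    y = np.array([[0, 1, 1, 0]]).T                                # training outputs
    w = np.array([[-0.16595599], [0.44064899], [-0.99977125]])    # initial weights from the run above
    
    output = 1 / (1 + np.exp(-X.dot(w)))                  # forward pass through the single neuron
    error = y - output                                    # how far off each prediction is
    adjustment = X.T.dot(error * output * (1 - output))   # error weighted by the sigmoid's slope
    print(adjustment)                                     # examples with confident outputs contribute less
    w += adjustment                                       # one step; train repeats this 10,000 times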

    From this discussion we can also draw two conclusions:

    1. When an input is 1 and the target output is 0, we need to keep decreasing that input's weight, so that the weighted sum becomes strongly negative and the sigmoid output approaches 0.
    2. When an input is 1 and the target output is 1, we need to keep increasing that input's weight, so that the weighted sum becomes strongly positive and the sigmoid output approaches 1.

    With that in mind, look at the original data again:

    Input 1  Input 2  Input 3  Output
    0        0        1        0
    1        1        1        1
    1        0        1        1
    0        1        1        0

    We can therefore predict that the weight for input 1 will grow large, while the weights for inputs 2 and 3 will shrink.
    The training results above support this prediction:

    Random starting synaptic weights: 
    [[-0.16595599]
     [ 0.44064899]
     [-0.99977125]]
    New synaptic weights after training: 
    [[ 9.67299303]
     [-0.2078435 ]
     [-4.62963669]]
    

    2. Extension

    Let's make the problem slightly harder. Suppose the data is now:

    Input 1      Input 2  Input 3  Output
    0            0        1        0
    0 (changed)  1        1        1
    1            0        1        1
    1 (changed)  1        1        0

    All we have done is flip the value of input 1 in two of the rows. What happens if we train on this?
    Looking at the new data, it is no longer possible to read off a simple relationship like input 1 == output; we have to dig a little deeper:

    • Input 3 is always 1, so it appears to have no effect on the output.
    • Looking at inputs 1 and 2, the output seems to be 1 exactly when the two of them differ.

    Based on these observations there is no longer a one-to-one relationship like input 1 == output (inputs 1 and 2 now combine as an XOR). What can we do?
    This is where a hidden layer comes in, as in the following table:

    Input 1  Input 2  Input 3  w1   w2   w3    Hidden output
    0        0        1        0.1  0.2   0.0   0
    0        1        1        0.2  0.6   0.4   1
    1        0        1        0.3  0.2   0.7   1
    1        1        1        0.1  0.5  -0.6   0

    Now the hidden output and the final output are back to a simple one-to-one relationship, just like the original input 1 == output.
    This approach is the simplest form of deep learning.

    Deep learning adds layers that progressively reshape the relationship between input and output, until a relationship that at first looked tangled and unrelated becomes one you can read off at a glance.
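
    To see why a hidden layer helps, here is a hand-built sketch (the weights below are chosen by hand rather than learned, and are not part of the original notebook): one hidden unit acts like OR(input 1, input 2), the other like AND(input 1, input 2), and the output unit combines them so that it fires exactly when the two inputs differ:

    import numpy as np
    
    def sigmoid(x):
        return 1 / (1 + np.exp(-x))
    
    X = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    
    # Hidden layer: unit 1 ~ OR(input 1, input 2), unit 2 ~ AND(input 1, input 2).
    # Large weights make the sigmoid behave almost like a step function;
    # input 3 gets weight 0 because it carries no information here.
    W1 = np.array([[20.0,  20.0],
                   [20.0,  20.0],
                   [ 0.0,   0.0]])
    b1 = np.array([-10.0, -30.0])
    
    # Output layer: OR and not AND, i.e. XOR of the first two inputs.
    W2 = np.array([[20.0], [-20.0]])
    b2 = np.array([-10.0])
    
    hidden = sigmoid(X.dot(W1) + b1)
    output = sigmoid(hidden.dot(W2) + b2)
    print(np.round(output, 3))    # approximately [[0], [1], [1], [0]]

    Once the hidden layer has turned the inputs into "either is on" and "both are on" features, the final layer again sees a simple, almost one-to-one relationship.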

    First, let's try training the same single-neuron network as before on the new data:

    # Initialise a single-neuron neural network.
    neural_network = NeuralNetwork()
    
    print("Random starting synaptic weights: ")
    print(neural_network.synaptic_weights)
    
    # The training set. We have 4 examples, each consisting of 3 input values
    # and 1 output value.
    training_set_inputs = array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    training_set_outputs = array([[0, 1, 1, 0]]).T
    
    # Train the neural network using a training set.
    # Do it 10,000 times and make small adjustments each time.
    neural_network.train(training_set_inputs, training_set_outputs, 10000)
    
    print("New synaptic weights after training: ")
    print(neural_network.synaptic_weights)
    
    # Mean squared error; defined here so this cell runs on its own (also defined in a later cell).
    def MSE(y, Y):
        return np.mean((y - Y)**2)
    
    train_loss = MSE(neural_network.think(training_set_inputs), training_set_outputs)
    print("Training loss: " + str(train_loss)[:5])
    # Test the neural network with a new situation.
    print("Considering new situation [1, 0, 0] -> ?: ")
    print(neural_network.think(array([1, 0, 0])))
    
    print("Debug...")
    output = neural_network.think(training_set_inputs)
    print(output)
    # print(dot(training_set_inputs, neural_network.synaptic_weights))
    
    error = training_set_outputs - output
    # print(error)  # error is +/-0.5 for every example
    print(error * neural_network.sigmoid_derivative(output))
    print(training_set_inputs.T)
    adjustment = dot(training_set_inputs.T, error * neural_network.sigmoid_derivative(output))
    # print(adjustment)
    
    Random starting synaptic weights: 
    [[-0.16595599]
     [ 0.44064899]
     [-0.99977125]]
    New synaptic weights after training: 
    [[  2.08166817e-16]
     [  2.22044605e-16]
     [ -3.05311332e-16]]
    Training loss: 0.25
    Considering new situation [1, 0, 0] -> ?: 
    [ 0.5]
    Debug...
    [[ 0.5]
     [ 0.5]
     [ 0.5]
     [ 0.5]]
    [[-0.125]
     [ 0.125]
     [ 0.125]
     [-0.125]]
    [[0 0 1 1]
     [0 1 0 1]
     [1 1 1 1]]
    

    After this training run the loss is stuck at about 0.25, and the prediction is essentially useless: 0.5 tells us nothing!
    Looking at the numbers, the weights have collapsed to essentially zero, so every output is 0.5.
    The printed values show the network has reached an equilibrium: the per-example corrections cancel each other out, the adjustment is zero, and the weights stop changing.
    A single-layer network simply cannot fit this data; we have to add complexity.
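
    We can verify this dead end directly (a short check, not part of the original notebook): with every output stuck at 0.5, the per-example corrections cancel column by column, so the adjustment is exactly zero and training stalls:

    from numpy import array, dot
    
    X = array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    y = array([[0, 1, 1, 0]]).T
    
    output = array([[0.5], [0.5], [0.5], [0.5]])   # what the collapsed network predicts
    error = y - output                              # [-0.5, 0.5, 0.5, -0.5]
    slope = output * (1 - output)                   # the sigmoid derivative at 0.5 is 0.25
    print(dot(X.T, error * slope))                  # all zeros: no further adjustment
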
    Let's add a hidden layer and see what happens:

    class TwoLayerNeuralNetwork(object):
        def __init__(self, input_nodes, hidden_nodes, output_nodes, learning_rate):
            # Set number of nodes in input, hidden and output layers.
            self.input_nodes = input_nodes
            self.hidden_nodes = hidden_nodes
            self.output_nodes = output_nodes
    
            np.random.seed(1)
            # Initialize weights
            self.weights_0_1 = np.random.normal(0.0, self.hidden_nodes**-0.5, 
                                           (self.input_nodes, self.hidden_nodes)) # n * 2
    
            self.weights_1_2 = np.random.normal(0.0, self.output_nodes**-0.5, 
                                           (self.hidden_nodes, self.output_nodes)) # 2 * 1
            self.lr = learning_rate
            
            # Activation function: the sigmoid
            self.activation_function = self.__sigmoid
        
        def __sigmoid(self, x):
            return 1 / (1 + np.exp(-x))
        
        def __sigmoid_derivative(self, x):
            return x * (1 - x)
        
        def train(self, inputs_list, targets_list):
            # Convert inputs list to 2d array
            inputs = np.array(inputs_list,ndmin=2) # 1 * n
            layer_0 = inputs
            targets = np.array(targets_list,ndmin=2) # 1 * 1
    
            
            # Forward pass
            layer_1 = self.activation_function(layer_0.dot(self.weights_0_1)) # 1 * 2
            layer_2 = self.activation_function(layer_1.dot(self.weights_1_2)) # 1 * 1
            # Backward pass
            # Output error and delta: error scaled by the sigmoid's slope at the output
            layer_2_error = targets - layer_2
            layer_2_delta = layer_2_error * self.__sigmoid_derivative(layer_2)

            # Propagate the error back to the hidden layer
            layer_1_error = layer_2_delta.dot(self.weights_1_2.T)
            layer_1_delta = layer_1_error * self.__sigmoid_derivative(layer_1)

            # Update the weights
            self.weights_1_2 += self.lr * layer_1.T.dot(layer_2_delta) # update hidden-to-output weights with gradient descent step
            self.weights_0_1 += self.lr * layer_0.T.dot(layer_1_delta)  # update input-to-hidden weights with gradient descent step
     
            
        def run(self, inputs_list):
            # Run a forward pass through the network
            inputs = np.array(inputs_list,ndmin=2)
            
            # Forward pass
            layer_1 = self.activation_function(inputs.dot(self.weights_0_1)) # 1 * 2
            layer_2 = self.activation_function(layer_1.dot(self.weights_1_2)) # 1 * 1
            
            return layer_2

    def MSE(y, Y):
        # Mean squared error between predictions y and targets Y
        return np.mean((y - Y)**2)
    
    import sys
    
    training_set_inputs = array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]])
    training_set_outputs = array([[0, 1, 1, 0]]).T
    
    ### Set the hyperparameters here ###
    epochs = 20000
    learning_rate = 0.1
    hidden_nodes = 4
    output_nodes = 1
    
    N_i = 3
    network = TwoLayerNeuralNetwork(N_i, hidden_nodes, output_nodes, learning_rate)
    
    losses = {'train':[]}
    for e in range(epochs):
        # Go through each of the four training examples, one at a time
        for record, target in zip(training_set_inputs, 
                                  training_set_outputs):
    #         print(target)
            network.train(record, target)
        
        train_loss = MSE(network.run(training_set_inputs), training_set_outputs)
        sys.stdout.write("\rProgress: " + str(100 * e/float(epochs))[:4] \
                         + "% ... Training loss: " + str(train_loss)[:7])
        
        losses['train'].append(train_loss)
        
    print(" ")    
    print("After train,layer_0_1: ")
    print(network.weights_0_1)
    print("After train,layer_1_2: ")
    print(network.weights_1_2)
    # Test the neural network with a new situation.
    print("Considering new situation [1, 0, 0] -> ?: ")
    print(network.run(array([1, 0, 0])))
    
    Progress: 99.9% ... Training loss: 0.00078 
    After train,layer_0_1: 
    [[ 4.4375838  -3.87815184  1.74047905 -5.12726884]
     [ 4.43114847 -3.87644617  1.71905492 -5.10688387]
     [-6.80858063  0.76685389  1.89614363  1.61202043]]
    After train,layer_1_2: 
    [[-9.21973137]
     [-3.84985864]
     [ 4.75257888]
     [-6.36994226]]
    Considering new situation [1, 0, 0] -> ?: 
    [[ 0.00557239]]
    
    # Inspect the hidden activations and final outputs for the whole training set
    layer_1 = network.activation_function(training_set_inputs.dot(network.weights_0_1))
    print(layer_1)
    layer_2 = network.activation_function(layer_1.dot(network.weights_1_2))
    print(layer_2)
    
    [[  2.20482250e-01   9.33639853e-01   6.30402293e-01   6.24775766e-02]
     [  1.77659862e-02   9.99702482e-01   8.64290928e-01   9.26611880e-01]
     [  6.94975743e-01   8.90040645e-02   8.51261229e-01   2.06917379e-04]
     [  1.27171786e-01   9.58904341e-01   9.55296949e-01   3.77322214e-02]]
    [[ 0.02374213]
     [ 0.97285992]
     [ 0.97468116]
     [ 0.02714965]]
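
    As a quick check on the trained two-layer network (assuming network, training_set_inputs, training_set_outputs and MSE from the cells above are still in scope), we can round the forward-pass outputs and compare them with the targets:

    predictions = network.run(training_set_inputs)
    print(np.round(predictions))                    # [[0.], [1.], [1.], [0.]] -- matches the targets
    print(MSE(predictions, training_set_outputs))   # small training loss, on the order of 1e-3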
    

    To sum up: in the extension we changed just two input values, yet the single-layer network could no longer fit the data. The only remedy was to make the network deeper, and that deepening is really a process of digging further into the relationships hidden in the data; here the 2-layer network handles the problem far better than the 1-layer one.

    This article draws on: A Neural Network in 11 lines of Python (Part 1)

    Reference code: https://github.com/llSourcell/Make_a_neural_network

    Reference video: https://www.youtube.com/watch?v=p69khggr1Jo
