Fun Machine Learning: How to Implement a Simple Neural Network

Author: 来个芒果 | Published 2017-05-05 21:34

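For problems that are not linearly separable, we need a neural network (which is still, in essence, logistic regression), as illustrated below:

Our goal is to find a function that successfully separates the X's from the O's.
There are two main kinds of activation function:

the step function
the sigmoid function
The sigmoid function is σ(z) = 1 / (1 + e^(-z)); it is the one used in the code in this article.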
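To make the difference between the two activation functions concrete, here is a minimal sketch. The function names `step` and `sigmoid` and the choice of 0 as the step threshold are assumptions made for illustration; they do not come from the original figures.

```python
import numpy as np

def step(z):
    """Step activation: 1 where z >= 0, else 0 (the threshold at 0 is an assumption)."""
    return np.where(np.asarray(z) >= 0, 1, 0)

def sigmoid(z):
    """Sigmoid activation: squashes any real input into the open interval (0, 1)."""
    return 1 / (1 + np.exp(-np.asarray(z, dtype=float)))

z = np.array([-2.0, 0.0, 2.0])
print(step(z))     # hard 0/1 decisions
print(sigmoid(z))  # smooth values; sigmoid(0) is exactly 0.5
```

Unlike the step function, the sigmoid is differentiable everywhere, which is what gradient descent relies on.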

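I. The single-layer neural network

Neuron:

Computing the error:

Just as when gradient descent is applied to linear problems, for this non-linearly-separable problem we want the prediction error to be as small as possible, so we use gradient descent to search for the minimum error. The error in the neural network is computed as follows:

The gradient gives the weight-update formula:

Figure: weight-update formula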
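The formula images are missing from this copy. Read off from the `cal_cost` and `gradient` functions further down, the cost and the update rule are presumably along the following lines. The symbols n (number of samples), x_i, y_i, h_i and α (learning rate) are taken from that code, not from the original figures.

```latex
h_i = \sigma(\theta^{\top} x_i) = \frac{1}{1 + e^{-\theta^{\top} x_i}}

J(\theta) = -\frac{1}{n} \sum_{i=1}^{n} \left[ y_i \log h_i + (1 - y_i) \log (1 - h_i) \right]

\theta \leftarrow \theta + \frac{\alpha}{n} \sum_{i=1}^{n} (y_i - h_i)\, h_i (1 - h_i)\, x_i
```

The factor h_i(1 - h_i) is the derivative of the sigmoid. Differentiating the cross-entropy cost J exactly cancels this factor and leaves (y_i - h_i) x_i. The update with the extra factor is the one the code uses, and it matches differentiating a squared-error loss ½(y_i - h_i)² through the sigmoid.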

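Next we implement the single-layer neural network shown in the figure above (only one layer has a transfer function):

First, prepare the data: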

# Import the dataset
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# In a Jupyter notebook, run `%matplotlib inline` first so plots render inline.
iris = pd.read_csv('./data/iris.csv')
# Shuffle the rows so the classes are not grouped together
shuffled_rows = np.random.permutation(iris.index)
iris = iris.loc[shuffled_rows, :]
print(iris.shape)

print(iris.species.unique())
iris.hist(["sepal_length", "sepal_width", "petal_length", "petal_width"])
plt.show()

# Add a column of ones as the bias term
iris['ones'] = np.ones(iris.shape[0])
X = iris[['ones', 'sepal_length', 'sepal_width', 'petal_length', 'petal_width']].values
# Binary target: 1 for Iris-versicolor, 0 otherwise
Y = (iris['species'] == 'Iris-versicolor').values.astype(int)

Implementing the network:
Define the activation function, the error (cost) function, and the gradient function.

def sigmoid_activation(x, theta):
    x = np.asarray(x)
    theta = np.asarray(theta)
    return 1 / (1 + np.exp(-np.dot(x, theta)))

def cal_cost(x, y, theta):
    # h must have shape (1, n) so that y * h is a vector rather than a matrix
    h = sigmoid_activation(x, theta).T
    cost = -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))
    return cost  # mean error over all samples

# Compute the gradient of the error
def gradient(X, Y, theta):
    grads = np.zeros(theta.shape)  # 5x1
    for i, obs in enumerate(X):
        # Compute the gradient for one sample, accumulate over all samples, and average.
        # Computing all samples at once and then averaging gives the same result.
        h = sigmoid_activation(obs, theta)
        delta = (Y[i] - h) * h * (1 - h) * obs  # array of shape (5,)
        grads += delta[:, np.newaxis] / len(X)  # add an axis so delta can be added to grads
    return grads
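The comment in `gradient` mentions that the per-sample loop can be replaced by computing all samples at once. A minimal sketch of that idea follows. `gradient_vectorized` is a hypothetical name, and the data is synthetic rather than the iris set; the loop version is repeated only so the two can be compared.

```python
import numpy as np

def sigmoid_activation(x, theta):
    return 1 / (1 + np.exp(-np.dot(np.asarray(x), np.asarray(theta))))

def gradient_loop(X, Y, theta):
    """Per-sample loop, same arithmetic as the gradient function above."""
    grads = np.zeros(theta.shape)
    for i, obs in enumerate(X):
        h = sigmoid_activation(obs, theta)
        delta = (Y[i] - h) * h * (1 - h) * obs
        grads += delta[:, np.newaxis] / len(X)
    return grads

def gradient_vectorized(X, Y, theta):
    """Hypothetical vectorized variant: all samples in one matrix product."""
    h = sigmoid_activation(X, theta)              # shape (n, 1)
    err = (Y[:, np.newaxis] - h) * h * (1 - h)    # shape (n, 1)
    return X.T.dot(err) / len(X)                  # shape (5, 1)

rng = np.random.default_rng(0)
X_demo = np.column_stack([np.ones(20), rng.normal(size=(20, 4))])  # synthetic data, not iris
Y_demo = rng.integers(0, 2, size=20)
theta_demo = rng.normal(0, 0.01, size=(5, 1))

same = np.allclose(gradient_loop(X_demo, Y_demo, theta_demo),
                   gradient_vectorized(X_demo, Y_demo, theta_demo))
print(same)
```

The vectorized form avoids the Python-level loop, which usually matters once the sample count grows.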

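Notes:

Activation function: when the whole sample set is passed in, it returns an n×1 array holding the predicted probabilities of the n samples.
Computing the error: y*np.log(h)+(1-y)*np.log(1-h),
y is [1,0,1,...], with shape (n,);
h must then be a 1×n array: only (n,) * (1×n) yields a 1×n array, whereas (n,) * (n,1) yields an n×n matrix! http://www.cnblogs.com/Rambler1995/p/5581582.html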
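The shape rule in the note above is NumPy broadcasting at work. A small sketch with made-up values (the arrays `h_row` and `h_col` are illustrative only) shows the two cases:

```python
import numpy as np

n = 4
y = np.array([1, 0, 1, 1])        # shape (n,)
h_row = np.full((1, n), 0.5)      # shape (1, n)
h_col = np.full((n, 1), 0.5)      # shape (n, 1)

row_product = y * h_row   # element-wise, stays a single row
col_product = y * h_col   # broadcasts to every (i, j) pair

print(row_product.shape)  # (1, 4)
print(col_product.shape)  # (4, 4)
```

With the column shape, `np.mean` would average over all n×n pairings of labels and predictions instead of the n matched ones, so the cost would be wrong without raising any error.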

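Below is the complete procedure for adjusting theta to reduce the error. Once theta has been adjusted, you can feed in data and make predictions: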

# The complete theta-fitting procedure for the single-layer network:
theta_init = np.random.normal(0, 0.01, size=(5, 1))
def learn(X, Y, theta, alpha=0.1):

    counter = 0
    max_iteration = 1000

    convergence_thres = 0.000001
    c = cal_cost(X, Y, theta)
    cost_pre = c + convergence_thres + 0.01
    costs = [c]
    # Stop when the cost change falls below the threshold or the iteration cap is reached
    while np.abs(c - cost_pre) > convergence_thres and counter < max_iteration:
        grads = gradient(X, Y, theta)
        theta += alpha * grads  # updates theta in place
        cost_pre = c
        c = cal_cost(X, Y, theta)
        costs.append(c)
        counter += 1
    return theta, costs

theta, costs = learn(X, Y, theta_init)
plt.plot(costs)
plt.title("Convergence of the Cost Function")
plt.ylabel(r"J($\Theta$)")
plt.xlabel("Iteration")
plt.show()
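Once theta has been fitted, prediction amounts to running the activation function and thresholding the result. The following is a minimal sketch, not the author's code: `predict_labels` is a hypothetical helper, the 0.5 threshold is an assumption, and the weights and samples are made up for illustration.

```python
import numpy as np

def sigmoid_activation(x, theta):
    return 1 / (1 + np.exp(-np.dot(np.asarray(x), np.asarray(theta))))

def predict_labels(X, theta, threshold=0.5):
    """Hypothetical helper: turn predicted probabilities into 0/1 labels."""
    probs = sigmoid_activation(X, theta).ravel()   # shape (n,)
    return (probs >= threshold).astype(int)

# Made-up weights and samples purely for illustration (bias column first).
theta_demo = np.array([[0.0], [2.0], [0.0], [0.0], [0.0]])
X_demo = np.array([[1.0,  1.5, 0.0, 0.0, 0.0],
                   [1.0, -1.5, 0.0, 0.0, 0.0]])
labels = predict_labels(X_demo, theta_demo)
print(labels)
```

With the fitted model, the same call would take the `theta` returned by `learn` and rows laid out like `X`.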

II. Multi-layer neural network (with a hidden layer)

Below is an introduction to multi-layer neural networks:

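From its structure we can see that:
The input to the output layer is the output of the hidden layer, which in turn is computed from the data fed to the input layer. There are therefore two sets of weights: those between the input layer and the hidden layer, and those between the hidden layer and the output layer. In other words, there are two sets of theta to adjust.

Multi-layer feedforward
A multi-layer feedforward neural network is one whose topology contains no cycles or loops.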
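To see how the two sets of weights fit together, the forward pass can be traced shape by shape. This is a sketch under assumed sizes (6 samples, 5 inputs including the bias, 4 hidden units) with random data. It keeps the output as a column of shape (n, 1), whereas the class later in this article returns it as a row.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
n, n_inputs, n_hidden = 6, 5, 4                            # illustrative sizes
X_demo = np.column_stack([np.ones(n), rng.normal(size=(n, n_inputs - 1))])
theta0 = rng.normal(0, 0.01, size=(n_inputs, n_hidden))    # input -> hidden weights
theta1 = rng.normal(0, 0.01, size=(n_hidden + 1, 1))       # hidden (+ bias) -> output weights

hidden = sigmoid(X_demo.dot(theta0))                       # (n, n_hidden)
hidden = np.column_stack([np.ones(n), hidden])             # prepend bias unit: (n, n_hidden + 1)
output = sigmoid(hidden.dot(theta1))                       # (n, 1)

print(hidden.shape, output.shape)
```

Data only flows from input to hidden to output, which is the "no cycles or loops" property described above.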

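The error function is the same as for the single-layer network:

For the weight-update formulas, see Zhou Zhihua's 机器学习 (Machine Learning), p. 102.
Some of the factors are as follows:

Figure: some of the factors in the weight-update formulas for the multi-layer network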
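The figure with these factors is missing from this copy. Read off from the backpropagation step in the `learn` method of the complete code, the factors for a single sample and a single output unit are presumably as follows. The notation is an assumption chosen for illustration: ŷ is the network output (`l2`), b_h the output of hidden unit h (`l1`), w_h the weight from hidden unit h to the output (`theta1`), v_{ih} the weight from input i to hidden unit h (`theta0`), and η the learning rate.

```latex
g = (y - \hat{y})\,\hat{y}\,(1 - \hat{y})

e_h = b_h (1 - b_h)\, w_h\, g

\Delta w_h = \eta\, g\, b_h

\Delta v_{ih} = \eta\, e_h\, x_i
```

The code additionally averages these updates over all samples by dividing by `nobs`.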

Next we implement the algorithm. To make it reusable, we define an NNet class with the following structure:

class NNet
- __init__
  - learning_rate
  - maxepochs
  - convergence_thres
  - hidden_layer
- protected methods
  - _sigmoid_activation
  - _multi_cost
  - _feedforward
- learn
- predict

class NNet:
      def __init__(self):
          pass
      def _sigmoid_activation(self):
          pass
      def _multi_cost(self):
          pass
      def _feedforward(self):
          pass
      def predict(self):
          pass
      def learn(self):
          pass

The complete code is as follows:

class NNet3:
    def __init__(self, learning_rate=0.5, maxepochs=1e4, convergence_thres=1e-5, hidden_layer=4):
        self.learning_rate = learning_rate
        self.maxepochs = int(maxepochs)
        self.convergence_thres = convergence_thres
        self.hidden_layer = int(hidden_layer)

    def _sigmoid_activation(self, x, theta):
        x = np.asarray(x)
        theta = np.asarray(theta)
        return 1 / (1 + np.exp(-np.dot(theta.T, x)))

    def _multiplecost(self, X, y):
        l1, l2 = self._feedforward(X)
        # Cross-entropy error averaged over all samples
        inner = y * np.log(l2) + (1 - y) * np.log(1 - l2)
        return -np.mean(inner)

    def _feedforward(self, X):
        # Hidden-layer activations, shape (nobs, hidden_layer)
        l1 = self._sigmoid_activation(X.T, self.theta0).T
        # Prepend a bias column of ones
        l1 = np.column_stack([np.ones(l1.shape[0]), l1])
        # Output-layer activations, shape (1, nobs)
        l2 = self._sigmoid_activation(l1.T, self.theta1)
        return l1, l2

    def predict(self, X):
        _, y = self._feedforward(X)
        return y

    def learn(self, X, y):
        nobs, ncols = X.shape
        self.theta0 = np.random.normal(0, 0.01, size=(ncols, self.hidden_layer))
        self.theta1 = np.random.normal(0, 0.01, size=(self.hidden_layer + 1, 1))
        self.costs = []
        cost = self._multiplecost(X, y)
        self.costs.append(cost)

        # Loop until convergence or until maxepochs is reached
        for epoch in range(1, self.maxepochs + 1):
            l1, l2 = self._feedforward(X)

            # Start backpropagation
            # Compute gradients
            l2_delta = (y - l2) * l2 * (1 - l2)
            l1_delta = l2_delta.T.dot(self.theta1.T) * l1 * (1 - l1)

            # Update parameters
            # theta1 is a 5x1 array before and after the update
            self.theta1 += l1.T.dot(l2_delta.T) / nobs * self.learning_rate
            self.theta0 += X.T.dot(l1_delta)[:, 1:] / nobs * self.learning_rate

            costprev = cost  # store the previous cost
            cost = self._multiplecost(X, y)  # cost after the update
            self.costs.append(cost)

            # Require at least 500 epochs before testing for convergence
            if np.abs(costprev - cost) < self.convergence_thres and epoch > 500:
                break
learning_rate = 0.5
maxepochs = 10000
convergence_thres = 0.00001
hidden_units = 4
# Initialize model
model = NNet3(learning_rate=learning_rate, maxepochs=maxepochs,
              convergence_thres=convergence_thres, hidden_layer=hidden_units)
# Train model (X and Y are the arrays built in the data-preparation step)
model.learn(X, Y)
prediction = model.predict(X)
# Threshold the predicted probabilities at 0.5 to get 0/1 labels
prediction = np.array([i >= 0.5 for i in prediction]).astype(int)
print(prediction)
print(prediction == Y)
# Plot costs
plt.plot(model.costs)
plt.title("Convergence of the Cost Function")
plt.ylabel(r"J($\Theta$)")
plt.xlabel("Iteration")
plt.show()
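Printing the element-wise comparison gives a long array of booleans. A single accuracy figure is often easier to read. The sketch below uses a hypothetical helper `accuracy` and made-up probabilities shaped (1, n), the same layout that `model.predict` returns.

```python
import numpy as np

def accuracy(probabilities, labels, threshold=0.5):
    """Hypothetical helper: fraction of samples whose thresholded prediction matches the label."""
    predicted = (np.asarray(probabilities).ravel() >= threshold).astype(int)
    return float(np.mean(predicted == np.asarray(labels).ravel()))

# Made-up probabilities shaped (1, n), like the output of model.predict above.
probs_demo = np.array([[0.9, 0.2, 0.6, 0.4]])
labels_demo = np.array([1, 0, 0, 0])
print(accuracy(probs_demo, labels_demo))  # 0.75
```

With the trained model this would be called as `accuracy(model.predict(X), Y)`.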

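The error decreases during training as follows:

Note on the number of layers in a neural network:
The network can be described as having either two or three layers:
2 layers: counting by the sigmoid transfer function. Only the hidden layer and the output layer have a transfer function; in this view the inputs are drawn as plain lines rather than as neurons.
3 layers: counting by neurons, since the inputs can also be represented as neurons.
The commonly used neural network has a three-layer structure, that is, a single hidden layer. (Counted by transfer functions, it can also be called two layers.)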

Article link: https://www.haomeiwen.com/subject/cpwstxtx.html
