Logistic Regression

Author: 原上的小木屋 | Published 2020-09-19 12:47

    Generalized linear regression

    • y=g^{-1}(wx+b), where g(\cdot) denotes any monotonic, differentiable function

    Logistic regression: building a classifier

    1. Prepare training samples
    2. Train the classifier
    3. Classify new samples

    Unit step function (unit-step function)

    y= \begin{cases} 0& z<0\\ 1& z \geq 0 \end{cases}

    Binary classification: 1/0, i.e., positive or negative examples

    • Linear regression z=wx+b \Longrightarrow a 0/1 classifier

    Log-odds function (logistic function)

    y= \frac {1}{1+e^{-z}} \Longrightarrow \ln \frac {y}{1-y}=z
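
    Solving the first equation for z shows where the name "log-odds" comes from:

    1+e^{-z}= \frac {1}{y} \Longrightarrow e^{-z}= \frac {1-y}{y} \Longrightarrow z=\ln \frac {y}{1-y}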

    1. Monotonically increasing, continuous, and smooth
    2. Differentiable to any order
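
    A minimal NumPy sketch (purely illustrative) contrasting the unit step function with the logistic function:

    import numpy as np
    import matplotlib.pyplot as plt

    z = np.linspace(-10, 10, 200)
    step = np.where(z < 0, 0, 1)       # unit step: discontinuous at z = 0
    sigmoid = 1 / (1 + np.exp(-z))     # logistic function: smooth everywhere

    plt.plot(z, step, label="unit step")
    plt.plot(z, sigmoid, label="logistic")
    plt.legend()
    plt.show()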

    Log-odds regression / logistic regression

    y=\frac {1}{1+e^{-(wx+b)}}

    1. Predicts probabilities directly
    2. The logistic function is the canonical example of a sigmoid function

    Sigmoid function

    y=g^{-1}(z)=\sigma(z)=\sigma(wx+b),\sigma(z)= \frac {1}{1+e^{-z}} = \frac {1}{1+e^{-(wx+b)}}

    Multivariate model

    y= \frac {1}{1+e^{-(W^{T}X)}},W=(w_0,w_1,...,w_m)^T,X=(x^0,x^1,...,x^m)^T,x^0=1
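
    A minimal sketch of this vectorized form, absorbing the bias b into W by prepending the constant attribute x^0=1 (array values are illustrative):

    import numpy as np

    X_raw = np.array([[137.97], [104.50], [100.00]])  # 3 samples, m = 1 attribute
    ones = np.ones((X_raw.shape[0], 1))               # constant attribute x^0 = 1
    X = np.concatenate((ones, X_raw), axis=1)         # shape (3, 2): rows [1, x^1]

    W = np.array([[0.5], [0.02]])                     # [w_0, w_1]^T, arbitrary values
    y = 1 / (1 + np.exp(-X @ W))                      # sigmoid(W^T X) for every sample
    print(y.shape)                                    # (3, 1)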

    Cross-entropy loss function

    1. Logistic regression: y=g^{-1}(z)=\sigma (z)=\sigma (wx+b)
      \sigma (z)= \frac {1}{1+e^{-z}} = \frac {1}{1+e^{-(wx+b)}}
    2. Squared loss: Loss=\frac {1}{2} \sum^{n}_{i=1}(y_i-\hat{y}_i)^2= \frac {1}{2}\sum^{n}_{i=1}(y_i-\sigma(wx_i+b))^2= \frac {1}{2}\sum^{n}_{i=1}(y_i-\frac {1}{1+e^{-(wx_i+b)}})^2
      Gradient-descent updates, using \sigma '(z)=\sigma(z)(1-\sigma(z)):
      w^{(k+1)}=w^{(k)}-\eta \frac {\partial Loss}{\partial w},\frac {\partial Loss}{\partial w}=\sum^{n}_{i=1}(y_i-\sigma(wx_i+b))(-\sigma '(wx_i+b))x_i
      b^{(k+1)}=b^{(k)}-\eta \frac {\partial Loss}{\partial b},\frac {\partial Loss}{\partial b}=\sum^{n}_{i=1}(y_i-\sigma(wx_i+b))(-\sigma '(wx_i+b))

    In logistic regression, the cross-entropy loss is used instead of the squared loss: with the sigmoid, the squared loss is non-convex, and the factor \sigma '(wx_i+b) in its gradient vanishes for saturated predictions, which slows training

    Cross-entropy loss expression

    Loss=-\sum^{n}_{i=1}[y_i \ln \hat{y}_i+(1-y_i)\ln(1-\hat{y}_i)], where y_i is the label of the i-th sample and \hat{y}_i=\sigma(wx_i+b)

    Average cross-entropy loss

    Loss=- \frac {1}{n} \sum^{n}_{i=1}[y_i \ln \hat{y}_i+(1-y_i)\ln(1-\hat{y}_i)]
    It has the two basic properties of a loss function: non-negativity and consistency
    \frac {\partial Loss}{\partial w}=\frac {1}{n} \sum^{n}_{i=1}x_i(\hat{y}_i-y_i),\frac {\partial Loss}{\partial b}=\frac {1}{n} \sum^{n}_{i=1}(\hat {y}_i-y_i)
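
    A minimal NumPy check (illustrative data) that the closed-form gradient \frac {1}{n}\sum x_i(\hat{y}_i-y_i) matches a finite-difference estimate of the average cross-entropy:

    import numpy as np

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    def loss(w, b, x, y):
        p = sigmoid(w * x + b)
        return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

    x = np.array([1.0, -2.0, 0.5])
    y = np.array([1.0, 0.0, 1.0])
    w, b, eps = 0.3, -0.1, 1e-6

    grad_w = np.mean(x * (sigmoid(w * x + b) - y))   # closed-form dLoss/dw
    num_w = (loss(w + eps, b, x, y) - loss(w - eps, b, x, y)) / (2 * eps)
    print(grad_w, num_w)                             # the two estimates agree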

    Accuracy: the number of correctly classified samples divided by the total number of samples

    The cross-entropy loss reflects the error between predicted and true probabilities well, which makes it an important criterion when training a classifier
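
    A quick numeric illustration (values are arbitrary): for a positive sample (y_i=1), the cross-entropy grows without bound as the predicted probability approaches 0, while the squared error stays bounded by 1:

    import numpy as np

    y_hat = np.array([0.9, 0.5, 0.1, 0.01])
    ce = -np.log(y_hat)                # cross-entropy term for a sample with y = 1
    se = (1 - y_hat) ** 2              # squared-error term for the same sample
    for p, c, s in zip(y_hat, ce, se):
        print(f"y_hat={p:.2f}  cross-entropy={c:.2f}  squared-error={s:.2f}")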

    Implementing logistic regression in TensorFlow

    1. Load the data
    import tensorflow as tf
    print("tensorflow version:",tf.__version__)
    
    tensorflow version: 2.2.0
    
    import numpy as np
    import matplotlib.pyplot as plt
    
    x=np.array([137.97,104.50,100.00,126.32,79.20,99.00,124.00,114.00,106.69,140.05,53.75,46.91,68.00,63.02,81.26,86.21])
    y=np.array([1,1,0,1,0,1,1,0,0,1,0,0,0,0,0,0])
    
    2. Preprocess the data
    x_train=x-np.mean(x)    # center the attribute around zero
    y_train=y
    
    3. Set the hyperparameters
    learn_rate=0.005
    iter=5
    
    display_step=1
    
    4. Initialize the model variables
    np.random.seed(612)
    w=tf.Variable(np.random.randn())
    b=tf.Variable(np.random.randn())
    
    5. Train the model
    x_=range(-80,80)
    y_=1/(1+tf.exp(-(w*x_+b)))               # sigmoid curve under the initial weights
    plt.scatter(x_train,y_train)             # training samples
    plt.plot(x_,y_,color="red",linewidth=3)
    
    cross_train=[]
    acc_train=[]
    
    for i in range(0,iter+1):
        
        with tf.GradientTape() as tape:
            pred_train = 1/(1+tf.exp(-(w*x_train+b)))  # sigmoid predictions
            Loss_train=-tf.reduce_mean(y_train*tf.math.log(pred_train)+(1-y_train)*tf.math.log(1-pred_train))  # average cross-entropy
            Accuracy_train=tf.reduce_mean(tf.cast(tf.equal(tf.where(pred_train<0.5,0,1),y_train),tf.float32))  # 0.5 threshold
            
        cross_train.append(Loss_train)
        acc_train.append(Accuracy_train)
        
        dL_dw,dL_db=tape.gradient(Loss_train,[w,b])    # gradients of the loss
        
        w.assign_sub(learn_rate*dL_dw)                 # gradient-descent updates
        b.assign_sub(learn_rate*dL_db)
        
        if i%display_step==0:
            print("i:%i,Train Loss:%f, Accuracy: %f" % (i,Loss_train,Accuracy_train))
            y_=1/(1+tf.exp(-(w*x_+b)))
            plt.plot(x_,y_)
    
    i:0,Train Loss:0.852807, Accuracy: 0.625000
    i:1,Train Loss:0.400259, Accuracy: 0.875000
    i:2,Train Loss:0.341504, Accuracy: 0.812500
    i:3,Train Loss:0.322571, Accuracy: 0.812500
    i:4,Train Loss:0.313972, Accuracy: 0.812500
    i:5,Train Loss:0.309411, Accuracy: 0.812500
    
    [Figure: training samples and the sigmoid curve after each displayed iteration]

    Linear classifiers

    • The line f(x_1,x_2)=w_1x_1+w_2x_2+b \Longrightarrow w_1x_1+w_2x_2+b=0 splits the data set into two classes: f(x_1,x_2)>0 on one side and f(x_1,x_2)<0 on the other
    • In higher-dimensional space the same role is played by a hyperplane, e.g. w_1x_1+w_2x_2+w_3x_3+b=0
    • Decision boundary: in m-dimensional space, the hyperplane W^TX=0
    • Logistic regression essentially builds a linear classifier that separates a linearly separable data set; its linear model is the decision boundary
    • Among the logic operations, AND, OR, and NOT are linearly separable, while XOR is not (see the sketch below)
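
    A minimal sketch (weights chosen by hand, purely illustrative) of linear classifiers for AND and OR over the four Boolean inputs; no choice of (w_1,w_2,b) reproduces XOR:

    import numpy as np

    X = np.array([[0,0],[0,1],[1,0],[1,1]])

    def linear_classify(w1, w2, b):
        # classify each row of X with the line w1*x1 + w2*x2 + b = 0
        return (w1*X[:,0] + w2*X[:,1] + b > 0).astype(int)

    print(linear_classify(1, 1, -1.5))   # AND: [0 0 0 1]
    print(linear_classify(1, 1, -0.5))   # OR:  [0 1 1 1]
    # XOR would need [0 1 1 0]: no single line separates these four points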

    Multivariate logistic regression on the iris data set

    150 samples

    4 attributes

    • Sepal length
    • Sepal width
    • Petal length
    • Petal width

    1 label

    • Iris setosa
    • Iris versicolor
    • Iris virginica
    1. Load the data
    import tensorflow as tf
    import pandas as pd
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    cm_pt = mpl.colors.ListedColormap(['red', 'green'])
    
    TRAIN_URL="http://download.tensorflow.org/data/iris_training.csv"
    train_path=tf.keras.utils.get_file(TRAIN_URL.split('/')[-1],TRAIN_URL)
    
    df_iris=pd.read_csv(train_path,header=0)
    
    df_iris[:2]
    
    2. Preprocess the data
    iris=np.array(df_iris)
    iris.shape
    
    (120, 5)
    
    # use only sepal length and sepal width to train the model
    train_x=iris[:,0:2]
    train_y=iris[:,4]
    train_x.shape,train_y.shape
    
    ((120, 2), (120,))
    
    # keep only two of the three iris classes for the binary classifier
    x_train=train_x[train_y<2]
    y_train=train_y[train_y<2]
    x_train.shape,y_train.shape
    
    ((78, 2), (78,))
    
    num=len(x_train)
    
    3. Center the attributes and visualize the samples
    x_train=x_train-np.mean(x_train,axis=0)
    plt.scatter(x_train[:,0],x_train[:,1],c=y_train,cmap=cm_pt)
    plt.show()
    
    [Figure: centered training samples, colored by class]
    4. Build the attribute matrix and the label column vector for the multivariate model
    x0_train=np.ones(num).reshape(-1,1)
    
    X=tf.cast(tf.concat((x0_train,x_train),axis=1),tf.float32)
    Y=tf.cast(y_train.reshape(-1,1),tf.float32)
    X.shape,Y.shape
    
    (TensorShape([78, 3]), TensorShape([78, 1]))
    
    5. Set the hyperparameters and initialize the model parameters
    learn_rate=0.2
    iter=180
    
    display_step=30
    
    np.random.seed(612)
    W=tf.Variable(np.random.randn(3,1),dtype=tf.float32)
    
    6. Train the model
    ce=[]
    acc=[]
    
    for i in range(0,iter+1):
        with tf.GradientTape() as tape:
            PRED=1/(1+tf.exp(-tf.matmul(X,W)))    # sigmoid(XW), one probability per sample
            Loss=-tf.reduce_mean(Y*tf.math.log(PRED)+(1-Y)*tf.math.log(1-PRED))    # average cross-entropy
            
        accuracy=tf.reduce_mean(tf.cast(tf.equal(tf.where(PRED.numpy()<0.5,0.,1.),Y),tf.float32))    # 0.5 threshold
        ce.append(Loss)
        acc.append(accuracy)
        
        dL_dW=tape.gradient(Loss,W)
        W.assign_sub(learn_rate*dL_dW)
        
        if i%display_step==0:
            print("i: %i, Acc: %f, Loss: %f" % (i,accuracy,Loss))
    
    i: 0, Acc: 0.230769, Loss: 0.994269
    i: 30, Acc: 0.961538, Loss: 0.481892
    i: 60, Acc: 0.987179, Loss: 0.319128
    i: 90, Acc: 0.987179, Loss: 0.246626
    i: 120, Acc: 1.000000, Loss: 0.204982
    i: 150, Acc: 1.000000, Loss: 0.177490
    i: 180, Acc: 1.000000, Loss: 0.157764
    
    7. Visualize the loss and accuracy curves
    plt.figure(figsize=(5,3))
    plt.plot(ce,color="blue",label="Loss")
    plt.plot(acc,color="red",label="acc")
    plt.legend()
    plt.show()
    
    [Figure: loss and accuracy versus iteration]
    8. Plot the decision boundary
      w_0+w_1x_1+w_2x_2=0 \Longrightarrow x_2=-\frac {w_1x_1+w_0}{w_2}
    plt.scatter(x_train[:,0],x_train[:,1],c=y_train,cmap=cm_pt)
    x_=[-1.5,1.5]
    y_=-(W[1]*x_+W[0])/W[2]
    plt.plot(x_,y_,color='g')
    plt.show()
    
    [Figure: decision boundary separating the two classes]

    Multi-class classification

    Natural ordinal codes

    • 0 - Iris setosa
    • 1 - Iris versicolor
    • 2 - Iris virginica

    One-hot encoding

    • Gives categorical values codes that carry no implied ordering
    • All codes are equidistant from the origin
    • (0,0,1) - Iris setosa
    • (0,1,0) - Iris versicolor
    • (1,0,0) - Iris virginica
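
    A minimal TensorFlow sketch converting the natural ordinal codes to one-hot vectors (tf.one_hot places the 1 at the label index, so its mapping is the reverse of the table above):

    import tensorflow as tf

    labels = tf.constant([0, 1, 2, 1])       # natural ordinal class codes
    one_hot = tf.one_hot(labels, depth=3)    # one row of length 3 per label
    print(one_hot.numpy())
    # [[1. 0. 0.]
    #  [0. 1. 0.]
    #  [0. 0. 1.]
    #  [0. 1. 0.]]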

    Applications

    • Discrete features
    • Class labels in multi-class problems

    There is also one-cold encoding, the complement of one-hot encoding

    The softmax() function: maps scores to probabilities, markedly amplifying the differences between them
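
    A minimal sketch (scores are arbitrary) of softmax and the way it widens the gaps between values:

    import numpy as np

    def softmax(z):
        e = np.exp(z - np.max(z))    # subtract the max for numerical stability
        return e / e.sum()

    scores = np.array([2.0, 1.0, 0.1])
    print(softmax(scores))           # ≈ [0.659 0.242 0.099], sums to 1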
