(9) Logistic Regression (for Classification)

Author: 羽天驿 | Published 2020-04-07 08:36

I. Logistic Regression

(1) Building the prediction function

  • Step 1: build a prediction function (the classification problem becomes a probability problem)
    • h_{\theta}(x) = g(\theta^Tx) = \frac{1}{1 + e^{-\theta^Tx}}
    • f(x) = \theta^Tx is the linear regression function
    • f(x) = x_1\theta_1 + x_2\theta_2 + \dots + x_n\theta_n + b
    • f(x) = x_0\theta_0 + x_1\theta_1 + x_2\theta_2 + \dots + x_n\theta_n, where x_0 = 1
    • f(x) = \sum\limits_{i = 0}^n x_i\theta_i
    • \theta and x denote vectors
    • a vector is one-dimensional; a matrix is two-dimensional
    • by convention, a vector is a column vector by default
    • \theta is originally a column vector, so \theta^T (its transpose) is a row vector
    • \theta^Tx is a row vector dotted with a column vector: f(x) = x_0\theta_0 + x_1\theta_1 + x_2\theta_2 + \dots + x_n\theta_n
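
A minimal numpy sketch of that dot product, with x_0 = 1 absorbing the intercept (all the numbers here are made up for illustration):

import numpy as np

theta = np.array([-8.8, 0.5, -0.3, 1.8, 0.8])  # coefficient vector; theta_0 plays the role of the intercept b
x = np.array([1.0, 5.1, 3.5, 1.4, 0.2])        # one sample, with x_0 = 1 prepended
z = theta.dot(x)                               # theta^T x = sum_i x_i * theta_i
h = 1 / (1 + np.exp(-z))                       # h_theta(x): a value in (0, 1)
print(z, h)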

(2) Building the loss function


[Figure: the loss function]
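In standard form, the loss that logistic regression minimizes is the negative log-likelihood (log-loss), where m is the number of samples:

J(\theta) = -\frac{1}{m}\sum\limits_{i=1}^{m}\left[y^{(i)}\log h_{\theta}(x^{(i)}) + (1-y^{(i)})\log\left(1 - h_{\theta}(x^{(i)})\right)\right]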

(3) Minimizing with gradient descent


[Figure: finding the minimum]
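In standard form, the gradient-descent update for each coefficient is, with learning rate \alpha:

\theta_j := \theta_j - \alpha\frac{\partial J(\theta)}{\partial \theta_j}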

Although the name says "regression", this is an algorithm for classification, not for regression.
The loss function of logistic regression is based on maximum likelihood.
What is maximum likelihood?
An example:


  • Suppose we have a jar containing black and white balls. We do not know how many balls there are, nor the ratio of the two colors. We want to know the proportion of white balls to black balls in the jar, but we cannot take all the balls out and count them. What we can do is repeatedly draw one ball at a time from the well-shaken jar, record its color, and put it back. Repeating this, we can use the recorded colors to estimate the proportion of black and white balls. Suppose that in the first one hundred draws, seventy were white. What is the most likely proportion of white balls in the jar?

  • Maximum likelihood estimation, step by step

  • Let the probability of a white ball be p; a black ball is then 1-p (every ball is either black or white)

  • Draw one ball from the jar. What is the probability that it is white?

    • p
  • Draw two balls. What is the probability that both are white?

    • p^2
  • Draw five balls. What is the probability that all five are white?

    • p^5
  • Draw ten balls. What is the probability of nine white and one black?

    • p^9(1-p)
  • Draw one hundred balls. What is the probability of seventy white and thirty black?

    • C_{100}^{30} = C_{100}^{70} is a constant (a large one, but still a constant)
    • P = C_{100}^{30}p^{70}(1-p)^{30}
  • Maximum likelihood estimation: for which p is P largest?

  • P' = C_{100}^{30} \cdot 70p^{69}(1-p)^{30} + C_{100}^{30} \cdot p^{70} \cdot 30(1-p)^{29} \cdot (-1)

  • Set the derivative to zero:

  • 0 = C_{100}^{30} \cdot 70p^{69}(1-p)^{30} - C_{100}^{30} \cdot p^{70} \cdot 30(1-p)^{29}

  • 0 = 70p^{69}(1-p)^{30} - 30p^{70}(1-p)^{29}

  • 0 = 70(1-p) - 30p

  • 0 = 70 - 100p

  • p = 70%
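
A quick numerical check of this result (a minimal sketch; the grid of candidate p values is an arbitrary choice, and the binomial coefficient is dropped because it does not affect where the maximum lies):

import numpy as np

p = np.linspace(0.01, 0.99, 981)       # candidate proportions of white balls
likelihood = p**70 * (1 - p)**30       # proportional to P = C(100,70) p^70 (1-p)^30
print(p[likelihood.argmax()])          # ~0.7, matching the calculation above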


From the example above we obtain the likelihood function: it is simply the probability function, i.e. the probability of each sample taking its true label.

Likelihood function: [Figure: the likelihood function]
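Written out for m samples, each with label y^{(i)} ∈ {0, 1}, the likelihood function is:

L(\theta) = \prod\limits_{i=1}^{m} P(y^{(i)}|x^{(i)};\theta) = \prod\limits_{i=1}^{m}\left(h_{\theta}(x^{(i)})\right)^{y^{(i)}}\left(1 - h_{\theta}(x^{(i)})\right)^{1-y^{(i)}}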

II. Principles and Formulas

Logistic regression

  • Logistic regression, despite its name, is a linear model for classification rather than regression.

  • P(t) = \frac{KP_0e^{rt}}{K + P_0(e^{rt} - 1)}

  • Let P_0 = 1, K = 2, r = 1

  • P(t) = \frac{2e^t}{1 + e^t}

  • P(t) = \frac{2}{1 + e^{-t}}

  • Halving the vertical axis gives

  • P(t) = \frac{1}{1 + e^{-t}}

  • The Sigmoid function

    • S(x) = \frac{1}{1 + e^{-x}}
  • The logistic function and the Sigmoid function coincide
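
A minimal numerical check that the rescaled logistic growth curve coincides with the sigmoid, using the parameter values P_0 = 1, K = 2, r = 1 chosen above:

import numpy as np

t = np.linspace(-10, 10, 201)
K, P0, r = 2, 1, 1
P = K * P0 * np.exp(r * t) / (K + P0 * (np.exp(r * t) - 1))  # logistic growth curve
S = 1 / (1 + np.exp(-t))                                     # sigmoid
print(np.allclose(P / 2, S))                                 # True: halving the curve gives the sigmoid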

Using logistic regression:

  • Import the package
  • Declare an estimator object lr
  • Train with lr.fit(X_train,y_train)
  • Use the algorithm with lr.predict(X_test)
  • In industry, whether the business problem is complex or simple, the workflow looks much like this

The principle of logistic regression

Step 1: the prediction function h(x)

  • from sklearn.linear_model import LogisticRegression

  • Linear regression (a first-degree equation in four variables, i.e. four features)

    • f(X) = Xw + b
    • f(X) = \sum\limits_{i = 1}^n x_iw_i + b
  • The Sigmoid function

    • S(x) = \frac{1}{1 + e^{-x}}
    • [Figure: the sigmoid curve]
  • Logistic regression combines linear regression with the sigmoid

    • A composite function: plug linear regression into the logistic function
    • h_w(x) = \frac{1}{1 + e^{-f(x)}}
    • Prediction function: h(x) = \frac{1}{1 + e^{-f(x)}}
    • This prediction function is a probability function with range 0 ~ 1
    • For a classification problem, the computer (which only follows rules) classifies by comparing probabilities
  • Why plug linear regression into the logistic function? (see the sketch after this list)

    • How do we classify in a classification problem?
    • Turn the classification problem into a probability problem
    • To hand a classification problem to a computer, we find a way to turn it into a probability problem and compare magnitudes
    • The logistic function is exactly such a probability function: whatever the input value, large or small, it maps it into 0 ~ 1, a probability
    • That is the elegance of the logistic (Sigmoid) function
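
A minimal sketch of that composition; the weights w and intercept b are made-up illustration values, not fitted coefficients:

import numpy as np

def f(X, w, b):                     # linear regression part: f(X) = Xw + b
    return X.dot(w) + b

def sigmoid(z):                     # squashes any real value into (0, 1)
    return 1 / (1 + np.exp(-z))

X = np.array([[5.1, 3.5, 1.4, 0.2],             # two samples with four features each
              [6.7, 3.0, 5.2, 2.3]])
w = np.array([0.5, -0.3, 1.8, 0.8])             # hypothetical weights
b = -8.8                                        # hypothetical intercept
print(sigmoid(f(X, w, b)))                      # h(x): one probability per sample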

Step 2: the cost (loss) function

  • Linear regression earlier: least squares

  • Logistic regression: maximum likelihood

  • Prediction function

  • h_{\theta}(X) = g(X\theta) = \frac{1}{1 + e^{-X\theta}}

  • The prediction function h_{\theta}(X) is a probability function with range 0 ~ 1

  • Likelihood function = probability function:

    • The likelihood function is the probability function: the probability of each sample taking its true label
    • P(y|x;\theta) = (h_{\theta}(x))^{y}(1 - h_{\theta}(x))^{1-y}
    • What is the likelihood function?
    • Logistic regression first solves binary classification, with classes 0 and 1
    • It can then be extended to multi-class classification.
    • Case: binary classification
    • Case 1: y = 1
      • P(1|x;\theta) = (h_{\theta}(x))^{1}(1 - h_{\theta}(x))^{1-1}
      • P(1|x;\theta) = h_{\theta}(x)
    • Case 2: y = 0
      • P(0|x;\theta) = (h_{\theta}(x))^{0}(1 - h_{\theta}(x))^{1-0}
      • P(0|x;\theta) = 1 - h_{\theta}(x)
    • As with the jar: define the probability of a white ball as p; a black ball is 1-p
    • The target value y for black/white balls
      • white ball: y = 1
      • black ball: y = 0
    • P(y|x) = p^y(1-p)^{1-y}
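
A minimal sketch of this per-sample likelihood; the probabilities h and labels y below are made up for illustration:

import numpy as np

h = np.array([0.9, 0.2, 0.7, 0.4])   # hypothetical h_theta(x) for four samples
y = np.array([1, 0, 1, 0])           # their true labels (0 or 1)

# P(y|x;theta) = h^y * (1-h)^(1-y): the probability assigned to each sample's true label
per_sample = h**y * (1 - h)**(1 - y)
print(per_sample)                    # [0.9 0.8 0.7 0.6]
print(per_sample.prod())             # joint likelihood of the whole sample (assuming independence)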

III. Simple Usage of Logistic Regression in Code

(1) Basic usage

import numpy as np

from sklearn.linear_model import LogisticRegression

from sklearn import datasets

from sklearn import metrics
from sklearn.model_selection import train_test_split
# iris: the iris flower, subdivided into several species
# different natural environments give rise to different subspecies
# different iris species have different sepal and petal lengths and widths
iris = datasets.load_iris()
iris
# four features: sepal length, sepal width, petal length, petal width
X = iris['data']
# classification target
y = iris['target']
display(X.shape,y.shape)
(150, 4)
(150,)
# test_size = 0.2 means 20%
# test data is 20% of the samples, training data 80%
# build a model, learn the data's pattern, then predict from it
# train_test_split shuffles the data randomly, so everyone's split will differ
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 0.2,random_state = 512)
lr = LogisticRegression(max_iter=1000)

# learn from X_train and y_train
# X_train is the data ------> y_train is the target
# the algorithm searches for the relationship, i.e. the equation
lr.fit(X_train,y_train)
LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
                   intercept_scaling=1, l1_ratio=None, max_iter=1000,
                   multi_class='auto', n_jobs=None, penalty='l2',
                   random_state=None, solver='lbfgs', tol=0.0001, verbose=0,
                   warm_start=False)
# X_test is the test data; y_test is the answer key
# to our algorithm, X_test is completely new data
y_ = lr.predict(X_test)
print('Ground truth:\n',y_test)
print('Predictions:\n',y_)
# the algorithm predicted 30 samples: 28 correct, 2 wrong
print('Accuracy:',28/30)
Ground truth:
 [0 1 1 1 2 0 0 2 0 2 1 1 1 0 2 0 2 0 1 1 0 1 0 1 0 2 1 1 1 2]
Predictions:
 [0 1 1 2 2 0 0 2 0 2 1 1 1 0 2 0 2 0 1 1 0 1 0 2 0 2 1 1 1 2]
Accuracy: 0.9333333333333333
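
The accuracy does not have to be typed in by hand; a minimal sketch using the metrics module imported above, continuing from y_test and y_ in this cell:

print('Accuracy:', metrics.accuracy_score(y_test, y_))  # same value as 28/30
print('Accuracy:', (y_test == y_).mean())               # equivalent manual computation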

(2) Implementing the principle in code

import numpy as np

import matplotlib.pyplot as plt

S(x) = \frac{1}{1 + e^{-x}}

x = np.linspace(-10,10)

sigmoid = lambda x : 1/(1 + np.e**(-x))

y = sigmoid(x)

plt.plot(x,y)
plt.title('Sigmoid-Logistic')
Text(0.5, 1.0, 'Sigmoid-Logistic')
[Figure: the Sigmoid/logistic curve]

(3) Recognizing handwritten digits with logistic regression

Import packages

import numpy as np

import matplotlib.pyplot as plt

from sklearn.linear_model import LogisticRegression

import pandas as pd

from sklearn.model_selection import train_test_split

Load and split the data

data = pd.read_csv('./digits.csv')
data
X = data.iloc[:,1:]
y = data['label']
X.shape
(42000, 784)

Visualize a handwritten digit

# each image is 28 pixels high and 28 pixels wide
28*28
784
plt.imshow(X.loc[1024].values.reshape(28,28))
<matplotlib.image.AxesImage at 0x1ab22bd1b88>
[Figure: the handwritten digit at index 1024]

Split into training and test sets

X_train,X_test,y_train,y_test = train_test_split(X,y,test_size = 1000)
print(X_train.shape,X_test.shape)
(41000, 784) (1000, 784)
# reduce the data: the slices below replace the random split above;
# 5000 training samples is still plenty to learn the pattern
X_train = X.iloc[:5000]
y_train = y.iloc[:5000]

X_test = X.iloc[-1000:]
y_test = y.iloc[-1000:]

Train and predict with the algorithm

import warnings
warnings.filterwarnings('ignore')
%%time
lr = LogisticRegression(max_iter=200)

lr.fit(X_train,y_train)

y_ = lr.predict(X_test)
print('True digits:\n',y_test[:50].values)
print('Logistic regression predictions:\n',y_[:50])
True digits:
 [2 8 1 8 0 1 3 8 1 0 8 1 5 7 3 8 7 6 9 7 2 5 8 4 1 6 4 2 4 9 4 1 1 2 7 7 3
 6 1 3 5 0 9 5 2 9 1 5 9 4]
Logistic regression predictions:
 [2 8 1 8 0 1 3 8 1 0 8 1 5 7 3 3 9 6 9 7 5 5 8 4 1 6 4 2 4 9 0 1 1 2 7 4 3
 6 1 8 5 0 9 5 2 9 1 2 9 4]
Wall time: 3.62 s

Compute the accuracy

(y_test == y_).mean()
0.851
lr.score(X_test,y_test)
0.851

Logistic regression turns the classification problem into a probability problem

lr.predict(X_test)[:5]
array([2, 8, 1, 8, 0], dtype=int64)
lr.predict_proba(X_test)[:5]
array([[3.99944549e-56, 3.13802567e-40, 9.99999991e-01, 8.11029507e-33,
        9.55758950e-31, 1.33407904e-36, 1.63559956e-35, 1.36167810e-34,
        3.94873126e-13, 9.17542857e-09],
       [3.47691937e-32, 9.45659216e-12, 2.15691146e-15, 7.76272853e-17,
        4.60336112e-47, 1.95280205e-14, 2.86509150e-14, 3.42799723e-47,
        1.00000000e+00, 8.09887934e-30],
       [2.95355219e-48, 1.00000000e+00, 3.78128277e-14, 9.53249317e-37,
        3.37867878e-48, 1.33987574e-35, 1.18501330e-30, 9.42662502e-55,
        7.71758466e-16, 7.02293528e-47],
       [1.04645342e-27, 2.44542774e-06, 5.74072289e-19, 6.70739001e-29,
        7.67253219e-51, 4.90497849e-13, 8.16572961e-27, 6.46372239e-62,
        9.99997555e-01, 4.55192317e-40],
       [1.00000000e+00, 1.47237626e-77, 1.61003550e-34, 9.10565041e-46,
        8.07064824e-64, 1.78965234e-36, 1.98884531e-31, 6.01113728e-52,
        3.11702498e-25, 3.17750910e-48]])
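
A quick check that the predicted class is just the column with the largest probability (continuing with lr and X_test from the cells above; here the class labels are exactly 0-9, so the column index is the label):

proba = lr.predict_proba(X_test)
print((proba.argmax(axis=1) == lr.predict(X_test)).all())  # True: argmax over probabilities gives the class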

(4) Binary classification with logistic regression

import numpy as np

from sklearn.linear_model import LogisticRegression

from sklearn import datasets

Load the data and filter the classes, keeping two of them for a binary problem

X,y = datasets.load_iris(return_X_y=True)
# 150 samples, 4 features (attributes)
# data mining / machine learning / AI -------> turn a real problem into mathematics (an equation to solve)
print(X.shape,y.shape)
# there are three classes
# the derivation of logistic regression assumes binary classification: 0, 1
# one option: drop class 2 and keep classes 0, 1
# another option: drop class 0 and keep classes 1, 2
# here cond = y != 1 drops class 1, keeping classes 0 and 2
cond = y!=1
X = X[cond]
y = y[cond]
print(X.shape)
print(y)
(150, 4) (150,)
(100, 4)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]

Train and predict

Once the algorithm has found the pattern, use it to make predictions

lr = LogisticRegression()
lr.fit(X,y)# train on the data: find the relationship (equation) between X and y
y_ = lr.predict(X)# once the pattern is found, use it to compute predictions
# are the computed y_ and the true y exactly the same???
# here they are: this pair of classes is easy to separate, so the accuracy is 100%
# (keeping classes 0 and 1 also gives 100%; on the harder pair, classes 1 and 2,
# the same code gives about 96%, i.e. 4 wrong predictions; some samples are simply hard to classify)
(y_ == y).mean()
1.0

Probability computed by the algorithm

# the computed probabilities
proba_ = lr.predict_proba(X) # probability
# convert the probabilities into classes
proba_[:10]
array([[0.99153216, 0.00846784],
       [0.9908928 , 0.0091072 ],
       [0.99355185, 0.00644815],
       [0.99086387, 0.00913613],
       [0.99219814, 0.00780186],
       [0.98268555, 0.01731445],
       [0.99252002, 0.00747998],
       [0.9899949 , 0.0100051 ],
       [0.99259338, 0.00740662],
       [0.99028393, 0.00971607]])

Computing the probabilities by hand, in code

\theta is the coefficient vector of the linear equation
h_{\theta}(x) = \frac{1}{1 + e^{-\theta^Tx}}

'''calculate the probability
of each class assuming it to be positive using the logistic function.
and normalize these values across all the classes.'''
# note: w_ = lr.coef_ is assigned in the "Inspect the fitted equation" cell further down;
# the three expressions below all compute the same linear scores Xw
w_[0].dot(X.T)
array([ 4.00608449,  4.07951914,  3.73156718,  4.08271965,  3.92349967,
        4.73031193,  3.88104351,  4.17445465,  3.871113  ,  4.14484998,
        4.26618964,  4.26023998,  3.94765549,  3.15681607,  3.80957767,
        4.33981114,  3.99918266,  4.08944965,  4.82652859,  4.16997299,
        4.73401326,  4.28742447,  2.99837642,  4.87269957,  4.80858694,
        4.49358227,  4.52396728,  4.2373653 ,  4.08866931,  4.27991414,
        4.36249896,  4.53517893,  3.94948219,  3.96147417,  4.22821513,
        3.69428034,  4.01729615,  3.79163602,  3.65424436,  4.22295314,
        3.85816884,  4.0247123 ,  3.5860717 ,  4.65661126,  4.98446742,
        4.1143858 ,  4.26939015,  3.86585101,  4.21769114,  4.02575865,
       14.98162437, 12.79841805, 14.95562835, 13.8032843 , 14.56522023,
       16.47759705, 11.16668003, 15.56774547, 14.49918822, 15.49863414,
       13.05084102, 13.45497365, 14.07900359, 12.71867505, 13.1811575 ,
       13.61800264, 13.68341264, 16.5195524 , 17.37751813, 12.54960373,
       14.59162438, 12.38513525, 16.69368536, 12.59198072, 14.29381076,
       14.86864104, 12.32661358, 12.39272475, 14.13596459, 14.40451874,
       15.36813081, 15.90147212, 14.21932975, 12.67336356, 13.47508567,
       15.77891426, 14.1330436 , 13.60082782, 12.16144394, 13.91063344,
       14.42929656, 13.52901679, 12.79841805, 14.90869053, 14.62727138,
       13.64888845, 12.92630085, 13.301796  , 13.63561532, 12.6612924 ])
X.dot(w_[0].T)
X.dot(w_[0])
(both calls return exactly the same array as shown above)
### Inspect the fitted equation
w_ = lr.coef_
b_ = lr.intercept_
print('coefficients',lr.coef_)
print('intercept',lr.intercept_)
def fun(X):# the linear equation, evaluated for all rows at once (matrix form)
    return X.dot(w_[0]) + b_[0]
def sigmoid(x):# x is the value returned by fun, i.e. the linear score
    return 1/(1+np.e**-x)
coefficients [[ 0.48498493 -0.34086327  1.8278232   0.83365156]]
intercept [-8.76905997]
proba_[:5]
array([[0.99153216, 0.00846784],
       [0.9908928 , 0.0091072 ],
       [0.99355185, 0.00644815],
       [0.99086387, 0.00913613],
       [0.99219814, 0.00780186]])
f = fun(X)
p_1 = sigmoid(f)
p_0 = 1 - p_1
p_ = np.c_[p_0,p_1]
p_[:10]
array([[0.99153216, 0.00846784],
       [0.9908928 , 0.0091072 ],
       [0.99355185, 0.00644815],
       [0.99086387, 0.00913613],
       [0.99219814, 0.00780186],
       [0.98268555, 0.01731445],
       [0.99252002, 0.00747998],
       [0.9899949 , 0.0100051 ],
       [0.99259338, 0.00740662],
       [0.99028393, 0.00971607]])
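
A quick check that the manual computation reproduces sklearn's probabilities (using proba_ and p_ from the cells above):

print(np.allclose(proba_, p_))   # True: sigmoid(Xw + b) matches predict_proba for this binary model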

(5) Multi-class classification with logistic regression

import numpy as np

from sklearn.linear_model import LogisticRegression

from sklearn import datasets

Load the data (a three-class problem) and shuffle the order; everyone's shuffle will differ

# y is a three-class target
X,y = datasets.load_iris(return_X_y=True)
index = np.arange(150)# 0, 1, 2, ..., 149
np.random.shuffle(index)
X = X[index]
y = y[index]
print(y)
[2 1 2 1 2 0 2 1 0 2 0 0 0 2 0 0 1 0 1 1 2 0 1 0 1 1 0 0 2 0 2 0 0 1 0 0 2
 2 1 2 0 0 1 1 1 1 0 0 1 0 1 2 0 2 2 2 1 0 0 2 2 1 0 0 0 2 1 0 1 1 2 2 0 0
 2 2 0 2 2 1 0 2 1 2 0 0 0 2 0 2 0 0 1 1 2 0 1 2 2 0 2 2 1 2 2 2 1 1 1 1 1
 0 2 1 2 2 1 2 1 0 0 2 0 1 1 2 0 1 1 1 2 1 1 2 0 1 1 1 2 1 2 2 2 0 1 0 1 1
 2 0]

Train on the data, build the model, and obtain the equation's coefficients

lr = LogisticRegression(max_iter = 200)

lr.fit(X,y)

print('coefficients (slopes)\n',lr.coef_)
print('intercepts\n',lr.intercept_)
w_ = lr.coef_
b_ = lr.intercept_
coefficients (slopes)
 [[-0.42423735  0.9676256  -2.51686784 -1.07948854]
 [ 0.5345059  -0.32152121 -0.2063666  -0.94389713]
 [-0.11026856 -0.64610438  2.72323444  2.02338567]]
intercepts
 [  9.85170372   2.23620276 -12.08790648]

Predict classes and probabilities with the model

y_ = lr.predict(X) # predicted classes
proba_ = lr.predict_proba(X)# predicted probabilities
print(y_[:10])
print(proba_[:10])
print(proba_[:10].argmax(axis = 1)) # convert probabilities into classes
[2 1 2 1 2 0 2 1 0 2]
[[6.26707685e-05 1.88637260e-01 8.11300070e-01]
 [1.47766771e-01 8.49178375e-01 3.05485403e-03]
 [1.62200903e-03 4.40387095e-01 5.57990896e-01]
 [3.71141547e-02 9.55351158e-01 7.53468681e-03]
 [2.48080106e-06 2.55761731e-02 9.74421346e-01]
 [9.68771637e-01 3.12283317e-02 3.17614803e-08]
 [9.96351290e-05 1.20620400e-01 8.79279965e-01]
 [9.07597639e-03 9.76586059e-01 1.43379645e-02]
 [9.86783707e-01 1.32162727e-02 1.99839847e-08]
 [3.73964002e-06 1.74498680e-02 9.82546392e-01]]
[2 1 2 1 2 0 2 1 0 2]

Compute the probabilities by hand

''' softmax function is used to find the predicted probability of
each class.'''
# softmax turns the scores into probabilities (all the probabilities sum to 1)

\text{softmax}(x_j) = \frac{e^{x_j}}{\sum\limits_{i=1}^n e^{x_i}}

a = np.array([-3,1,3])
# softmax converts scores into probabilities: large values dominate, small values shrink
np.e**a/((np.e**a).sum())
array([0.00217852, 0.11894324, 0.87887824])
def softmax(x):
    return np.e**x/((np.e**x).sum(axis = 1).reshape(-1,1))
c = np.random.randint(1,10,size = (3,4))
c
array([[3, 7, 6, 1],
       [7, 6, 8, 5],
       [8, 8, 5, 7]])
c_s = c.sum(axis = 1)
# c_s.reshape(-1,1)
# divide each entry of c by the sum of its own row
c/c_s
---------------------------------------------------------------------------

ValueError                                Traceback (most recent call last)

<ipython-input-10-efea26b8ac04> in <module>
      1 # divide each entry of c by the sum of its own row
----> 2 c/c_s


ValueError: operands could not be broadcast together with shapes (3,4) (3,) 
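
The fix is to reshape the row sums into a column so that the shapes (3, 4) and (3, 1) broadcast row by row; this is exactly what the softmax helper above does:

print(c / c_s.reshape(-1, 1))   # each entry of c divided by the sum of its own row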
# w_ and b_ are the slopes and intercepts of the fitted equations
def linear(x):
    y = x.dot(w_.T) + b_ # matrix operation: the shapes must line up!
    return y
# y_pred holds the raw linear values predicted by the linear functions
# convert the linear values into probabilities
y_pred = linear(X)
y_proba = softmax(y_pred) # softmax converts them into probabilities
y_proba[:10]
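
A quick check that the hand-computed softmax probabilities match sklearn's (using y_proba and proba_ from the cells above; with the default multinomial/lbfgs setting the two should agree):

print(np.allclose(y_proba, proba_))   # True: softmax(X w^T + b) reproduces predict_proba
print(y_proba[:10].argmax(axis=1))    # same classes as lr.predict(X)[:10]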
