林轩田 Machine Learning Foundations course - a Python implementation of the PLA algorithm


Author: Spareribs | Published 2019-02-08 19:41

    Problem 1: counting updates

    Q1. Implement a version of PLA by visiting examples in the naive cycle using the order of examples in the data set.
    Run the algorithm on the data set.
    What is the number of updates before the algorithm halts?

        def pla_1(self, X, Y):
            """
            Count the number of weight updates before PLA halts.
            :param X: feature matrix (one example per row, constant feature x0 = 1 included)
            :param Y: label vector (+1 / -1)
            :return: number of updates
            """
            # Initialize the weights W as a zero vector with the same dimension as a row of X
            W = np.zeros(X.shape[1])

            # PLA iteration: visit the examples in their given order (naive cycle),
            # and halt only after a full pass with no mistakes
            halt = 0  # number of updates before halting
            mistake_free = False
            while not mistake_free:
                mistake_free = True
                for i in range(X.shape[0]):  # one pass over all examples
                    score = np.dot(X[i, :], W)  # current score for example i
                    if score * Y[i] <= 0:  # misclassified (sign(0) counts as a mistake)
                        W = W + Y[i] * X[i, :]  # update rule: W <- W + y_n * x_n
                        halt = halt + 1  # count this update
                        mistake_free = False

            return halt
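The method above assumes `X` already carries the constant feature x0 = 1 and is meant to live in a class with the course data loaded. A minimal standalone sketch of the same naive-cycle loop on synthetic data (function and variable names here are illustrative, not from the post):

```python
import numpy as np

def pla_naive_cycle(X, Y):
    """Naive-cycle PLA: visit examples in order, repeat until a mistake-free pass."""
    W = np.zeros(X.shape[1])
    updates = 0
    mistake_free = False
    while not mistake_free:
        mistake_free = True
        for i in range(X.shape[0]):
            if np.dot(X[i], W) * Y[i] <= 0:  # sign(0) treated as a mistake
                W = W + Y[i] * X[i]          # W <- W + y_n * x_n
                updates += 1
                mistake_free = False
    return W, updates

# Tiny linearly separable toy set: label = sign(x1 - x2), with x0 = 1 prepended
rng = np.random.default_rng(0)
raw = rng.uniform(-1, 1, size=(20, 2))
Y = np.where(raw[:, 0] - raw[:, 1] > 0, 1, -1)
X = np.hstack([np.ones((20, 1)), raw])  # prepend the constant feature x0 = 1

W, updates = pla_naive_cycle(X, Y)
print(updates, np.all(np.sign(X @ W) == Y))
```

After halting, every example satisfies `score * y > 0`, so the final weights separate the training set perfectly.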
    

    Problem 2:

    Q2. Implement a version of PLA by visiting examples in fixed, pre-determined random cycles throughout the algorithm.
    Run the algorithm on the data set. Please repeat your experiment for 2000 times, each with a different random seed.
    What is the average number of updates before the algorithm halts?
    Plot a histogram ( https://en.wikipedia.org/wiki/Histogram ) to show the number of updates versus frequency.

        def pla_2(self, X, Y):
            """
            Same as pla_1, but average the number of updates over repeated
            experiments, each using a different fixed random cycle.
            :param X: feature matrix
            :param Y: label vector
            :return: average number of updates and average accuracy
            """
            n_experiments = 2000  # number of repeated experiments
            halts = []            # number of updates in each experiment
            accuracies = []       # accuracy in each experiment

            for exp in range(n_experiments):
                np.random.seed(exp)  # a different random seed per experiment

                # Fix a random visiting order for this experiment
                permutation = np.random.permutation(X.shape[0])
                X = X[permutation]  # shuffle X
                Y = Y[permutation]  # shuffle Y with the same permutation

                # Same as pla_1: cycle through X in the fixed random order,
                # updating W and counting updates until a mistake-free pass
                W = np.zeros(X.shape[1])  # weight initialization
                halt = 0                  # number of updates before halting
                mistake_free = False
                while not mistake_free:
                    mistake_free = True
                    for i in range(X.shape[0]):
                        score = np.dot(X[i, :], W)
                        if score * Y[i] <= 0:  # classification error
                            W = W + Y[i] * X[i, :]
                            halt = halt + 1
                            mistake_free = False

                # Predicted labels: +1 if the score is positive, otherwise -1
                Y_pred = np.where(np.dot(X, W) > 0, 1, -1)
                accuracy = np.mean(Y_pred == Y)

                # store this experiment's results
                halts.append(halt)
                accuracies.append(accuracy)

            # averages over all experiments
            halt_mean = np.mean(halts)
            accuracy_mean = np.mean(accuracies)

            return halt_mean, accuracy_mean
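The problem also asks for a histogram of updates versus frequency, which the code above does not produce. A minimal matplotlib sketch, where `halts` stands in for the per-experiment update counts collected above (the Poisson draw is only placeholder data so the snippet runs on its own):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Placeholder for the per-experiment update counts collected in pla_2
halts = np.random.default_rng(0).poisson(40, size=2000)

counts, bin_edges, _ = plt.hist(halts, bins=30, edgecolor="black")
plt.xlabel("number of updates before halting")
plt.ylabel("frequency")
plt.title("PLA with random cycles: updates vs. frequency")
plt.savefig("pla_histogram.png")
```

In practice you would return or store the `halts` list from `pla_2` and pass it to `plt.hist` directly.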
    

    Problem 3:

    Q3. Implement a version of PLA by visiting examples in fixed, pre-determined random cycles throughout the algorithm, while changing the update rule to be:
    W_{t+1} <- W_t + η * y_{n(t)} * x_{n(t)} with η = 0.5. Note that your PLA in the previous problem corresponds to η = 1.
    Please repeat your experiment for 2000 times, each with a different random seed. What is the average number of updates before the algorithm halts?
    Plot a histogram to show the number of updates versus frequency. Compare your result to the previous problem and briefly discuss your findings.

    
        def pla_3(self, X, Y):
            """
            Same as pla_2, but with learning rate eta = 0.5 in the update rule.
            :param X: feature matrix
            :param Y: label vector
            :return: average number of updates and average accuracy
            """
            n_experiments = 2000  # number of repeated experiments
            halts = []            # number of updates in each experiment
            accuracies = []       # accuracy in each experiment

            for exp in range(n_experiments):
                np.random.seed(exp)  # a different random seed per experiment
                permutation = np.random.permutation(X.shape[0])
                X = X[permutation]  # shuffle X
                Y = Y[permutation]  # shuffle Y with the same permutation

                # Cycle through the data in the fixed random order until a
                # mistake-free pass, updating with eta = 0.5
                W = np.zeros(X.shape[1])  # weight initialization
                halt = 0                  # number of updates before halting
                mistake_free = False
                while not mistake_free:
                    mistake_free = True
                    for i in range(X.shape[0]):
                        score = np.dot(X[i, :], W)
                        if score * Y[i] <= 0:  # classification error
                            W = W + 0.5 * Y[i] * X[i, :]
                            halt = halt + 1
                            mistake_free = False

                # Predicted labels: +1 if the score is positive, otherwise -1
                Y_pred = np.where(np.dot(X, W) > 0, 1, -1)
                accuracy = np.mean(Y_pred == Y)

                # store this experiment's results
                halts.append(halt)
                accuracies.append(accuracy)

            # averages over all experiments
            halt_mean = np.mean(halts)
            accuracy_mean = np.mean(accuracies)
            return halt_mean, accuracy_mean
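On the comparison the problem asks for: because W starts at the zero vector, the η = 0.5 trajectory is exactly 0.5 times the η = 1 trajectory at every step, so every score has the same sign, the same examples are misclassified, and the update counts are identical. A quick check on synthetic data (a sketch; the names here are illustrative, not from the post):

```python
import numpy as np

def pla_updates(X, Y, eta):
    """Run PLA with a fixed visiting order and learning rate eta; return the update count."""
    W = np.zeros(X.shape[1])
    updates = 0
    mistake_free = False
    while not mistake_free:
        mistake_free = True
        for i in range(X.shape[0]):
            if np.dot(X[i], W) * Y[i] <= 0:
                W = W + eta * Y[i] * X[i]
                updates += 1
                mistake_free = False
    return updates

# Separable toy data with the constant feature x0 = 1 prepended
rng = np.random.default_rng(1)
raw = rng.uniform(-1, 1, size=(30, 2))
Y = np.where(raw @ np.array([2.0, -1.0]) > 0.1, 1, -1)
X = np.hstack([np.ones((30, 1)), raw])

u1 = pla_updates(X, Y, 1.0)
u2 = pla_updates(X, Y, 0.5)
print(u1, u2)  # the two counts coincide, since W0 = 0 makes eta a pure rescaling
```

So with zero initialization, halving η changes nothing about convergence; η would only matter if W were initialized to a nonzero vector.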
    
