Notes on Professor Hung-yi Lee's Machine Learning Course_ML Lecture 1: ML L

Author: leogoforit | Posted: 2020-03-21 20:11

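    Introduction:

    I recently started studying machine learning. I had long heard of Professor Hung-yi Lee of Taiwan but never found the time to watch his lecture series. I watched one lecture today and thought it was excellent: easy to follow, focused on the key points, and sprinkled with entertaining examples that help the material stick.
    Video link (bilibili): Hung-yi Lee Machine Learning (2017)
    Some dedicated classmates have already written up detailed notes and published them on GitHub: Hung-yi Lee Machine Learning Notes (LeeML-Notes)
    So my notes below only record my own summaries and the points that confused me during the lecture. If you can help answer any of them, I would welcome your advice.

    A good way to start learning machine learning is by running demos. The demo below is a complete reproduction of Professor Lee's demo.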

    import numpy as np
    import matplotlib.pyplot as plt
    x_data = [ 338., 333., 328., 207., 226., 25., 179., 60., 208.,  606. ]
    y_data = [ 640., 633., 619., 393., 428., 27., 193., 66., 226., 1591. ]
    # y_data = b + w * x_data
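    x = np.arange(-200, -100, 1)  # candidate bias values
    y = np.arange(-5, 5, 0.1)  # candidate weight values
    # Rows index w and columns index b, the layout plt.contour(x, y, Z) expects
    Z = np.zeros((len(y), len(x)))
    X, Y = np.meshgrid(x, y)
    for i in range(len(x)):
        for j in range(len(y)):
            b = x[i]
            w = y[j]
            # Mean squared error of the line y = b + w * x over the data
            Z[j][i] = 0
            for n in range(len(x_data)):
                Z[j][i] = Z[j][i] + (y_data[n] - b - w*x_data[n])**2
            Z[j][i] = Z[j][i]/len(x_data)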
    b = -129  # initialize b
    w = -4  # initialize w
    lr = 0.0000001  # learning rate
    iteration = 100000
    
    # Store initial values for plotting
    b_history = [b]
    w_history = [w]
    
    # Iteration
    for i in range(iteration):
        b_grad = 0.0
        w_grad = 0.0
        for n in range(len(x_data)):
            b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
            w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]
        
        # Update parameters
        b = b - lr * b_grad
        w = w - lr * w_grad
        
        # Store the parameters for plotting
        b_history.append(b)
        w_history.append(w)
    
    # plot the figure
    plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
    plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
    plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
    plt.xlim(-200, -100)
    plt.ylim(-5,5)
    plt.xlabel(r'$b$', fontsize=16)
    plt.ylabel(r'$w$', fontsize=16)
    plt.show()
    

    The output is:


    Figure 1
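
    The × that the plotting code places at (-188.4, 2.67) can be checked independently of gradient descent. The sketch below is not part of Professor Lee's demo: it assumes `np.polyfit` with degree 1 as a direct least-squares solver, and the names `w_opt` and `b_opt` are made up for illustration.

```python
import numpy as np

x_data = np.array([338., 333., 328., 207., 226., 25., 179., 60., 208., 606.])
y_data = np.array([640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.])

# Fit y = b + w * x by ordinary least squares.
# np.polyfit returns coefficients from highest degree to lowest: [w, b].
w_opt, b_opt = np.polyfit(x_data, y_data, 1)

print(f"b = {b_opt:.1f}, w = {w_opt:.2f}")
```

    Rounded to the precision used in the plot, the fitted values should coincide with the marker, so the × is the true minimum of the squared error rather than a hand-picked point.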

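    The horizontal axis is b and the vertical axis is w; the × marks the optimal solution. Clearly, this run did not reach the optimum and stopped far away from it. So we increase the learning rate to lr = 0.000001 (10 times larger), which gives the result in Figure 2.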

    #### change the lr to 0.000001
    
    b = -129  # initialize b
    w = -4  # initialize w
    lr = 0.000001  # learning rate
    iteration = 100000
    
    # Store initial values for plotting
    b_history = [b]
    w_history = [w]
    
    # Iteration
    for i in range(iteration):
        b_grad = 0.0
        w_grad = 0.0
        for n in range(len(x_data)):
            b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
            w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]
        
        # Update parameters
        b = b - lr * b_grad
        w = w - lr * w_grad
        
        # Store the parameters for plotting
        b_history.append(b)
        w_history.append(w)
    
    # plot the figure
    plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
    plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
    plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
    plt.xlim(-200, -100)
    plt.ylim(-5,5)
    plt.xlabel(r'$b$', fontsize=16)
    plt.ylabel(r'$w$', fontsize=16)
    plt.show()
    
    Figure 2

    We increase the learning rate again, to lr = 0.00001 (another factor of 10), which gives the result in Figure 3.

    #### change the lr to 0.00001
    
    b = -129  # initialize b
    w = -4  # initialize w
    lr = 0.00001  # learning rate
    iteration = 100000
    
    # Store initial values for plotting
    b_history = [b]
    w_history = [w]
    
    # Iteration
    for i in range(iteration):
        b_grad = 0.0
        w_grad = 0.0
        for n in range(len(x_data)):
            b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
            w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]
        
        # Update parameters
        b = b - lr * b_grad
        w = w - lr * w_grad
        
        # Store the parameters for plotting
        b_history.append(b)
        w_history.append(w)
    
    # plot the figure
    plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
    plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
    plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
    plt.xlim(-200, -100)
    plt.ylim(-5,5)
    plt.xlabel(r'$b$', fontsize=16)
    plt.ylabel(r'$w$', fontsize=16)
    plt.show()
    
    Figure 3

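    We first set the learning rate to 0.0000001. After 100,000 iterations the parameters were still far from the optimum, which shows that the learning rate was too small. We then raised it tenfold to 0.000001. The trajectory now oscillates, but the result is somewhat better and ends closer to the optimum. Raising it tenfold once more, to 0.00001, makes the updates oscillate so violently that the trajectory leaves the plot entirely and never finds the optimum.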
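
    One way to see why the three learning rates behave so differently: the loss summed in the update loop is quadratic in (b, w), so its Hessian is constant, and plain gradient descent with a single step size is stable only while that step stays below 2 divided by the largest Hessian eigenvalue. The sketch below is an added check, not something from the lecture, and names such as `hessian` and `max_stable_lr` are illustrative.

```python
import numpy as np

x_data = np.array([338., 333., 328., 207., 226., 25., 179., 60., 208., 606.])

# The update loop uses L(b, w) = sum_n (y_n - b - w * x_n)^2 (a sum, not a mean).
# Its second derivatives do not depend on b, w, or y:
#   d2L/db2 = 2 * N,  d2L/dbdw = 2 * sum(x),  d2L/dw2 = 2 * sum(x^2)
n = len(x_data)
hessian = 2.0 * np.array([[n, x_data.sum()],
                          [x_data.sum(), (x_data ** 2).sum()]])

# eigvalsh is meant for symmetric matrices; eigenvalues come back in ascending order.
lam_min, lam_max = np.linalg.eigvalsh(hessian)

# For a quadratic loss, gradient descent with one fixed step size diverges
# once lr exceeds 2 / lam_max.
max_stable_lr = 2.0 / lam_max

print("eigenvalues:", lam_min, lam_max)
print("largest stable lr:", max_stable_lr)
print("curvature ratio:", lam_max / lam_min)
```

    On this data the threshold falls between 0.000001 and 0.00001, which is consistent with the figures: the second setting still converges while oscillating, and the third diverges. The two eigenvalues also differ by several orders of magnitude, so any step small enough for the steep direction is tiny for the flat one.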
    The fix is to give b and w their own learning rates; this method is called AdaGrad.
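
    Written out, the update that the AdaGrad code below applies to each parameter is the following, where $g^t$ is that parameter's gradient at iteration $t$ and $\eta$ is the shared base rate `lr`. The symbols $G_b$ and $G_w$ are shorthand introduced here for the code's `b_lr` and `w_lr`; the code adds no small constant to the denominator, so the formula has none either.

```latex
\begin{aligned}
G_b^{t} &= \sum_{i=0}^{t} \left(g_b^{i}\right)^2, &
b^{t+1} &= b^{t} - \frac{\eta}{\sqrt{G_b^{t}}}\, g_b^{t}, \\
G_w^{t} &= \sum_{i=0}^{t} \left(g_w^{i}\right)^2, &
w^{t+1} &= w^{t} - \frac{\eta}{\sqrt{G_w^{t}}}\, g_w^{t}.
\end{aligned}
```

    On the first iteration the accumulated sum equals the squared gradient itself, so each parameter moves by exactly $\eta$ in magnitude regardless of how large its gradient is.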

    #### using adagrad to solve this problem
    
    b = -129  # initialize b
    w = -4  # initialize w
    lr = 1  # learning rate
    iteration = 100000
    
    # Running sums of squared gradients, one per parameter
    b_lr = 0.0
    w_lr = 0.0
    
    # Store initial values for plotting
    b_history = [b]
    w_history = [w]
    
    # Iteration
    for i in range(iteration):
        b_grad = 0.0
        w_grad = 0.0
        for n in range(len(x_data)):
            b_grad = b_grad - 2.0*(y_data[n] - b - w*x_data[n])*1.0
            w_grad = w_grad - 2.0*(y_data[n] - b - w*x_data[n])*x_data[n]
        
        b_lr = b_lr + b_grad**2
        w_lr = w_lr + w_grad**2
        
        # Update parameters
        b = b - lr/np.sqrt(b_lr) * b_grad
        w = w - lr/np.sqrt(w_lr) * w_grad
        
        # Store the parameters for plotting
        b_history.append(b)
        w_history.append(w)
    
    # plot the figure
    plt.contour(x, y, Z, 50, alpha=0.5, cmap=plt.get_cmap('jet'))
    plt.plot([-188.4], [2.67], 'x', ms=12, markeredgewidth=3, color='orange')
    plt.plot(b_history, w_history, 'o-', ms=3, lw=1.5, color='black')
    plt.xlim(-200, -100)
    plt.ylim(-5,5)
    plt.xlabel(r'$b$', fontsize=16)
    plt.ylabel(r'$w$', fontsize=16)
    plt.show()
    

    The final result is shown in Figure 4:


    Figure 4
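
    To compare the runs numerically instead of by eye, the repeated loop can be wrapped in one function. This is a minimal sketch under my own assumptions: the function name `descend` and its `adagrad` flag are hypothetical, and the gradients are computed with NumPy array operations instead of the inner Python loop, which should give the same sums.

```python
import numpy as np

x_data = np.array([338., 333., 328., 207., 226., 25., 179., 60., 208., 606.])
y_data = np.array([640., 633., 619., 393., 428., 27., 193., 66., 226., 1591.])


def descend(lr, iterations=100000, adagrad=False, b=-129.0, w=-4.0):
    """Run the demo's update loop and return the final (b, w)."""
    b_lr = 0.0  # running sum of squared b gradients (used by AdaGrad only)
    w_lr = 0.0  # running sum of squared w gradients (used by AdaGrad only)
    for _ in range(iterations):
        residual = y_data - b - w * x_data
        b_grad = -2.0 * residual.sum()
        w_grad = -2.0 * (residual * x_data).sum()
        if adagrad:
            b_lr += b_grad ** 2
            w_lr += w_grad ** 2
            b -= lr / np.sqrt(b_lr) * b_grad
            w -= lr / np.sqrt(w_lr) * w_grad
        else:
            b -= lr * b_grad
            w -= lr * w_grad
    return b, w


b_plain, w_plain = descend(lr=0.0000001)
b_ada, w_ada = descend(lr=1.0, adagrad=True)
print("plain GD, lr=1e-7:", b_plain, w_plain)
print("AdaGrad,  lr=1:   ", b_ada, w_ada)
```

    With the same starting point and iteration count as the demo, the AdaGrad run should finish much closer to the × than the run with the smallest fixed learning rate.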


Article link: https://www.haomeiwen.com/subject/eeiuyhtx.html