Basic Usage of scikit-learn (6): Cross-Validation 2

Author: SnailTyan | Published 2017-05-04 21:59, read 332 times

    Author: Tyan
    Blog: noahsnail.com | CSDN | 简书

    This article introduces cross-validation in scikit-learn.

    • Demo
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_digits
    from sklearn.svm import SVC
    # learning_curve now lives in sklearn.model_selection;
    # the old sklearn.learning_curve and sklearn.cross_validation
    # modules were deprecated and removed in scikit-learn 0.20.
    from sklearn.model_selection import learning_curve


    # Load the dataset
    digits = load_digits()
    X = digits.data
    y = digits.target

    # Train an SVM at increasing training-set sizes and record the loss
    train_sizes, train_loss, test_loss = learning_curve(
        SVC(gamma=0.001), X, y, cv=10, scoring='neg_mean_squared_error',
        train_sizes=[0.1, 0.25, 0.5, 0.75, 1])

    # Mean training error (negated, since the scorer returns negative MSE)
    train_loss_mean = -np.mean(train_loss, axis=1)
    # Mean cross-validation error
    test_loss_mean = -np.mean(test_loss, axis=1)

    # Plot the error curves
    plt.plot(train_sizes, train_loss_mean, 'o-', color='r', label='Training')
    plt.plot(train_sizes, test_loss_mean, 'o-', color='g', label='Cross-Validation')

    plt.xlabel('Training data size')
    plt.ylabel('Loss')
    plt.legend(loc='best')
    plt.show()
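The original imports also pull in `cross_val_score` without using it. As a minimal sketch of what that function does, the snippet below scores the same SVC model with 10-fold cross-validation directly, returning one accuracy per fold instead of a full learning curve (the model and data match the demo above; the choice of the default `'accuracy'` scorer here is my own).

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

digits = load_digits()
X, y = digits.data, digits.target

# 10-fold cross-validation: fit the model on 9 folds, score on the
# held-out fold, repeat for each fold; returns an array of 10 scores.
scores = cross_val_score(SVC(gamma=0.001), X, y, cv=10, scoring='accuracy')

print(scores.mean())  # mean accuracy across the 10 folds
```

This is the quickest way to get a single cross-validated performance number; `learning_curve`, as in the demo, additionally varies the training-set size to diagnose over- or under-fitting.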
    
    • Result

    [Learning curve plot: training and cross-validation loss vs. training data size]


        Title: Basic Usage of scikit-learn (6): Cross-Validation 2

        Link: https://www.haomeiwen.com/subject/kvgdtxtx.html