3.1.6.5. Multiple Regression


Author: 榴莲气象 | Published: 2019-01-03 11:21

    Use statsmodels to compute either just the best fit or the full set of associated statistical parameters.

    Also shows how to make 3D plots.

    Original author: Thomas Haslwanter

    import numpy as np
    import matplotlib.pyplot as plt
    import pandas
    
    # For 3d plots. This import is necessary to have 3D plotting below
    from mpl_toolkits.mplot3d import Axes3D
    
    # For statistics. Requires statsmodels 0.5.0 or later
    from statsmodels.formula.api import ols
    # Analysis of Variance (ANOVA) on linear models
    from statsmodels.stats.anova import anova_lm
    

    Generate and show the data

    # Sample points along each axis (assumed here: 21 evenly spaced values in [-5, 5])
    x = np.linspace(-5, 5, 21)
    # We generate a 2D grid
    X, Y = np.meshgrid(x, x)
    
    # To get reproducible values, provide a seed value
    np.random.seed(1)
    
    # Z is the elevation of this 2D grid: a plane plus Gaussian noise
    Z = -5 + 3*X - 0.5*Y + 8 * np.random.normal(size=X.shape)
    
    # Plot the data
    fig = plt.figure()
    # fig.gca(projection='3d') no longer works in recent Matplotlib releases
    ax = fig.add_subplot(projection='3d')
    surf = ax.plot_surface(X, Y, Z, cmap=plt.cm.coolwarm,
                           rstride=1, cstride=1)
    ax.view_init(20, -120)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')
    
    [Figure: 3D surface plot of the generated data Z over the X-Y grid]
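As a minimal sketch of what the grid construction does, the snippet below runs `np.meshgrid` on a much smaller axis. The three-point axis is a hypothetical choice made only to keep the printout short; it is not taken from the original example.

```python
import numpy as np

# Hypothetical small axis, chosen only for illustration
x = np.linspace(-1, 1, 3)
X, Y = np.meshgrid(x, x)

# With the default indexing, each row of X is a copy of x,
# and each column of Y is a copy of x
print(X)
print(Y)

# Flattening both arrays pairs them up: one (x, y) point per grid node
points = np.column_stack([X.flatten(), Y.flatten()])
print(points.shape)  # (9, 2)
```

This pairing is what later allows the flattened X, Y and Z arrays to be used as columns of a table with one observation per row.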

    Multilinear regression model, calculating fit, P-values, confidence intervals etc.

    # Convert the data into a Pandas DataFrame to use the formulas framework
    # in statsmodels
    
    # First we need to flatten the data: its 2D layout is not relevant.
    X = X.flatten()
    Y = Y.flatten()
    Z = Z.flatten()
    
    data = pandas.DataFrame({'x': X, 'y': Y, 'z': Z})
    
    # Fit the model
    model = ols("z ~ x + y", data).fit()
    
    # Print the summary
    print(model.summary())
    
    print("\nRetrieving manually the parameter estimates:")
    # Order of the estimates: Intercept, x, y
    print(model.params.values)
    # should be close to the true coefficients [-5, 3, -0.5], up to noise in the estimates
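The parameter estimates can be cross-checked without statsmodels by solving the same least-squares problem directly with NumPy. This is only a sketch under assumptions: the axis `x` (21 points on [-5, 5]) and the names `A` and `beta` are illustrative choices, not part of the original listing.

```python
import numpy as np

# Same kind of synthetic data as in the listing; the axis x is an assumption
x = np.linspace(-5, 5, 21)
X, Y = np.meshgrid(x, x)
np.random.seed(1)
Z = -5 + 3*X - 0.5*Y + 8 * np.random.normal(size=X.shape)

# Design matrix with columns [1, x, y], one row per grid point
A = np.column_stack([np.ones(X.size), X.ravel(), Y.ravel()])

# Solve min ||A @ beta - z||^2 for beta = (intercept, slope_x, slope_y)
beta, residual_ss, rank, _ = np.linalg.lstsq(A, Z.ravel(), rcond=None)
print(beta)
```

The formula `z ~ x + y` adds the intercept column implicitly, which is why the design matrix here needs an explicit column of ones to describe the same model.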
    
    # Perform analysis of variance on the fitted linear model
    anova_results = anova_lm(model)
    
    print('\nANOVA results')
    print(anova_results)
    
    plt.show()
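Besides the printed summary, the individual statistics can be read from the fitted results object. The following is a hedged sketch of that: it rebuilds the data under the assumption that `x` spans 21 points on [-5, 5], and the variable name `ci` is introduced only for illustration.

```python
import numpy as np
import pandas
from statsmodels.formula.api import ols

# Synthetic data as in the listing; the axis x is an assumption
x = np.linspace(-5, 5, 21)
X, Y = np.meshgrid(x, x)
np.random.seed(1)
Z = -5 + 3*X - 0.5*Y + 8 * np.random.normal(size=X.shape)
data = pandas.DataFrame({'x': X.flatten(), 'y': Y.flatten(), 'z': Z.flatten()})

model = ols("z ~ x + y", data).fit()

print(model.params)    # point estimates, indexed by term name
print(model.bse)       # standard errors of the estimates
print(model.pvalues)   # two-sided p-values, one per coefficient

ci = model.conf_int(alpha=0.05)  # 95% confidence intervals, one row per term
print(ci)
print(model.rsquared)  # coefficient of determination
```

Reading these attributes is convenient when the numbers feed into further computation, whereas `model.summary()` is meant for reading by eye.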
    
