Analysis of Variance (ANOVA) in R: A Worked Example

Author: Cache_wood | Published: 2021-04-30 07:38

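    At its core, analysis of variance (ANOVA) studies the effect of a categorical variable on a numerical variable.

    Total sum of squares (SST) = within-group error (SSE) + between-group error (SSA)

    Within-group error: the error sum of squares. Between-group error: the treatment sum of squares.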

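    With k groups, n_i observations in group i, group mean \bar{x}_i and grand mean \bar{\bar{x}} (k = 4 in the first example below):

    SST = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{\bar{x}})^2\\
    SSE = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(x_{ij}-\bar{x}_i)^2\\
    SSA = \sum_{i=1}^{k}\sum_{j=1}^{n_i}(\bar{x}_i-\bar{\bar{x}})^2 = \sum_{i=1}^{k}n_i(\bar{x}_i-\bar{\bar{x}})^2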
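The decomposition can be checked numerically. The following is a minimal sketch in base R that computes the three sums of squares directly from their definitions, using the complaint counts from the first example in this article. The English group labels and the variable names (`grand_mean`, `group_means`) are assumptions made for illustration.

```r
# Complaint counts copied from the first example in this article.
# The English group labels are an assumption made for readability.
vol <- c(57, 66, 49, 40, 34, 53, 44,
         68, 39, 29, 45, 56, 51,
         31, 49, 21, 34, 40,
         44, 51, 65, 77, 58)
group <- factor(rep(c("retail", "tourism", "airline", "appliance"),
                    times = c(7, 6, 5, 5)))

grand_mean  <- mean(vol)        # grand mean over all observations
group_means <- ave(vol, group)  # each observation's own group mean

SST <- sum((vol - grand_mean)^2)          # total sum of squares
SSE <- sum((vol - group_means)^2)         # within-group (error) sum of squares
SSA <- sum((group_means - grand_mean)^2)  # between-group (treatment) sum of squares

print(c(SST = SST, SSE = SSE, SSA = SSA))
print(all.equal(SST, SSE + SSA))          # checks SST = SSE + SSA
```

`ave()` repeats each group mean once per observation, so summing over observations reproduces the double sum in the definition of SSA. The SSE and SSA values should agree, after rounding, with the `Sum Sq` column that `summary(fit)` reports for this data.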

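    Construct the F statistic, where k is the number of groups and n is the total number of observations. Under the null hypothesis that all group means are equal:
    F = \frac{\frac{SSA}{k-1}}{\frac{SSE}{n-k}}\sim F(k-1,n-k)
    Define the mean squares, so that F = \frac{MSA}{MSE}:
    MSA = \frac{SSA}{k-1}\\
    MSE = \frac{SSE}{n-k}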
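To make the formulas concrete, here is a rough sketch that computes MSA, MSE, the F statistic and its p-value by hand. `anova_f()` is a hypothetical helper name introduced only for this illustration and is not part of base R; the data are the complaint counts from the first example, with English group labels assumed for readability.

```r
# anova_f() is a hypothetical helper, not a base R function.
# x: numeric response; g: grouping variable.
anova_f <- function(x, g) {
  g <- factor(g)
  k <- nlevels(g)                  # number of groups
  n <- length(x)                   # total number of observations
  group_means <- ave(x, g)         # group mean repeated per observation
  SSA <- sum((group_means - mean(x))^2)
  SSE <- sum((x - group_means)^2)
  MSA <- SSA / (k - 1)
  MSE <- SSE / (n - k)
  F_value <- MSA / MSE
  # Upper-tail probability of the F(k - 1, n - k) distribution
  p_value <- pf(F_value, df1 = k - 1, df2 = n - k, lower.tail = FALSE)
  c(MSA = MSA, MSE = MSE, F_value = F_value, p_value = p_value)
}

vol <- c(57, 66, 49, 40, 34, 53, 44,
         68, 39, 29, 45, 56, 51,
         31, 49, 21, 34, 40,
         44, 51, 65, 77, 58)
group <- rep(c("retail", "tourism", "airline", "appliance"),
             times = c(7, 6, 5, 5))

res <- anova_f(vol, group)
print(round(res, 4))
```

In practice `aov()` does all of this in one call; the hand computation is only meant to show where each number in its output table comes from.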

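    # 行业 = industry: 零售业 (retail), 旅游业 (tourism), 航空公司 (airlines), 家电制造业 (home-appliance manufacturing)
    # vol = number of complaints received
    eg <- data.frame(行业 = rep(c('零售业','旅游业','航空公司','家电制造业'), c(7,6,5,5)),
                     vol = c(57,66,49,40,34,53,44,   # 零售业
                             68,39,29,45,56,51,      # 旅游业
                             31,49,21,34,40,         # 航空公司
                             44,51,65,77,58))        # 家电制造业
    # One-way ANOVA: does the mean of vol differ across 行业?
    fit <- aov(vol ~ 行业, data = eg)
    summary(fit)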
    
                Df Sum Sq Mean Sq F value Pr(>F)  
    行业         3   1457   485.5   3.407 0.0388 *
    Residuals   19   2708   142.5                 
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    

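    Interpreting the result:
    SSA = 1457, Df = 3, mean square MSA = 485.5
    SSE = 2708, Df = 19, mean square MSE = 142.5
    F test statistic = 3.407
    p-value = 0.0388 < 0.05
    So, at the 0.05 significance level, we can conclude that industry has an effect on the number of complaints received.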
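The same decision can be reached by comparing the observed F value with a critical value instead of reading the p-value. The sketch below assumes a significance level of 0.05 and plugs in the F value and degrees of freedom reported in the output table; the variable names are illustrative.

```r
alpha <- 0.05   # significance level (assumed, matching the 0.05 threshold used above)
F_obs <- 3.407  # F value reported by summary(fit)
df1   <- 3      # between-group degrees of freedom (k - 1)
df2   <- 19     # within-group degrees of freedom (n - k)

# Critical value: the (1 - alpha) quantile of the F(df1, df2) distribution
F_crit <- qf(1 - alpha, df1, df2)

# p-value: probability of an F value at least as large as F_obs under H0
p_value <- pf(F_obs, df1, df2, lower.tail = FALSE)

# Reject H0 when the observed F exceeds the critical value
reject <- F_obs > F_crit
print(c(F_crit = F_crit, p_value = p_value))
print(reject)
```

The two rules are equivalent: `F_obs > F_crit` holds exactly when `p_value < alpha`.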

    # Second example: compare vol across three managers (m1, m2, m3)
    eg <- data.frame(manager = rep(c('m1','m2','m3'), c(5,7,6)),
                     vol = c(7,7,8,7,9,          # m1
                             8,9,8,10,9,10,8,    # m2
                             5,6,5,7,4,8))       # m3
    fit <- aov(vol ~ manager, data = eg)
    summary(fit)
    
                Df Sum Sq Mean Sq F value   Pr(>F)    
    manager      2  29.61  14.805   11.76 0.000849 ***
    Residuals   15  18.89   1.259                     
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
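This output reads the same way as the first example: the p-value of about 0.000849 is well below 0.05, so the manager factor appears to have a significant effect on vol. When the numbers are needed programmatically rather than read off the printout, a sketch along the following lines may help. It relies on `summary()` of an `aov` fit returning a list whose first element is the ANOVA table as a data frame; the variable names `tab`, `F_value` and `p_value` are illustrative.

```r
# Rebuild the second example so this block runs on its own
eg <- data.frame(manager = factor(rep(c("m1", "m2", "m3"), times = c(5, 7, 6))),
                 vol = c(7, 7, 8, 7, 9,
                         8, 9, 8, 10, 9, 10, 8,
                         5, 6, 5, 7, 4, 8))
fit <- aov(vol ~ manager, data = eg)

# First element of the summary is the ANOVA table (a data frame)
tab <- summary(fit)[[1]]

# Row 1 is the manager term, row 2 is Residuals
F_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

if (p_value < 0.05) {
  message("Reject H0: mean vol differs across managers (p = ", signif(p_value, 3), ")")
} else {
  message("Fail to reject H0 (p = ", signif(p_value, 3), ")")
}
```

Rows are selected by position here because the row names in the printed table are padded with spaces, which makes name-based lookup fragile.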
    


        Article link: https://www.haomeiwen.com/subject/oajwrltx.html