Elegantly Converting P-values in R

Author: R语言数据分析指南 | Published 2021-09-11 21:23

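    In data analysis we routinely run statistical tests, but the results often come back as long floating-point numbers that are hard to read at a glance. This post shows how to use the lucid function (from the lucid package) to reformat the output so that P-values are easier to read.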


    Install and load the R packages

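    # Packages used in this post
    package.list <- c("tidyverse", "lucid", "broom")
    
    # Load each package, installing it first if it is not available
    for (package in package.list) {
      if (!require(package, character.only = TRUE, quietly = TRUE)) {
        install.packages(package)
        library(package, character.only = TRUE)
      }
    }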
    

    A first look at the data

    Orange %>% group_by(Tree) %>% 
      do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame
    

    As you can see, the P-values come back in scientific notation, which is hard to read

       Tree        term    estimate    std.error statistic      p.value
    1     3 (Intercept) 19.20353638  5.863410215  3.275148 2.207255e-02
    2     3         age  0.08111158  0.005628105 14.411881 2.901046e-05
    3     1 (Intercept) 24.43784664  6.543311039  3.734783 1.350409e-02
    4     1         age  0.08147716  0.006280721 12.972581 4.851902e-05
    5     5 (Intercept)  8.75834459  8.176436207  1.071169 3.330518e-01
    6     5         age  0.11102891  0.007848307 14.146861 3.177093e-05
    7     2 (Intercept) 19.96090337  9.352361105  2.134317 8.593318e-02
    8     2         age  0.12506176  0.008977041 13.931291 3.425041e-05
    9     4 (Intercept) 14.63762022 11.233762751  1.303002 2.493507e-01
    10    4         age  0.13517222  0.010782940 12.535748 5.733090e-05
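
    R formats a numeric column as a whole, so a few very small values can push every entry into scientific notation. As a rough base-R sketch of the underlying idea (round to a few significant digits, then print in fixed notation; this is an illustration only, not lucid's actual implementation), using two P-values copied from the table above:

```r
# Two P-values copied from the table above
p <- c(2.207255e-02, 2.901046e-05)

# Round to 3 significant digits
p_rounded <- signif(p, 3)

# Format as fixed-point strings instead of scientific notation
p_text <- format(p_rounded, scientific = FALSE)
print(p_text)
```

    Note that format returns character strings, so the result is for display only.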
    

    Formatting with lucid

    Orange %>% group_by(Tree) %>% 
      do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame %>% lucid
    
       Tree  term        estimate  std.error  statistic p.value    
       <ord> <chr>       <chr>     <chr>      <chr>     <chr>      
     1 3     (Intercept) "19.2   " " 5.86   " " 3.28"   "0.0221   "
     2 3     age         " 0.0811" " 0.00563" "14.4 "   "0.000029 "
     3 1     (Intercept) "24.4   " " 6.54   " " 3.73"   "0.0135   "
     4 1     age         " 0.0815" " 0.00628" "13   "   "0.0000485"
     5 5     (Intercept) " 8.76  " " 8.18   " " 1.07"   "0.333    "
     6 5     age         " 0.111 " " 0.00785" "14.1 "   "0.0000318"
     7 2     (Intercept) "20     " " 9.35   " " 2.13"   "0.0859   "
     8 2     age         " 0.125 " " 0.00898" "13.9 "   "0.0000343"
     9 4     (Intercept) "14.6   " "11.2    " " 1.3 "   "0.249    "
    10 4     age         " 0.135 " " 0.0108 " "12.5 "   "0.0000573"
    

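    After passing through lucid, the numbers are much easier to read. Note, however, that the columns are now character strings, so they must be converted back to numeric before any further computation.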
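
    The round trip back to numeric can be sketched in base R. The strings below are copied from the p.value column of the lucid output above; as.numeric already tolerates the padding spaces, and trimws is used only to make that step explicit:

```r
# Character values as produced by lucid (copied from the output above)
p_chr <- c("0.0221   ", "0.000029 ", "0.333    ")

# Strip the alignment padding and convert back to numeric
p_num <- as.numeric(trimws(p_chr))

# Numeric comparisons now behave as expected
is_significant <- p_num < 0.05
print(data.frame(p_chr, p_num, is_significant))
```

    Keep in mind that the recovered numbers carry only the digits lucid kept, not the full precision of the original P-values.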

    Converting P-values

    Use the symnum function to turn P-values into significance stars

    Orange %>% group_by(Tree) %>% 
      do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame %>%
      lucid %>%
      # lucid returns character columns, so keep a numeric copy of p.value
      mutate(pvalue=as.numeric(p.value),
             # as.character drops the noquote class and legend that symnum adds
             signif=as.character(symnum(pvalue, 
                           cutpoints = c(0,0.001,0.01,0.05,1), 
                           symbols = c("***","**","*"," "))))
    
       Tree        term estimate std.error statistic   p.value   pvalue signif
    1     3 (Intercept)  19.2      5.86         3.28 0.0221    2.21e-02      *
    2     3         age   0.0811   0.00563     14.4  0.000029  2.90e-05    ***
    3     1 (Intercept)  24.4      6.54         3.73 0.0135    1.35e-02      *
    4     1         age   0.0815   0.00628     13    0.0000485 4.85e-05    ***
    5     5 (Intercept)   8.76     8.18         1.07 0.333     3.33e-01       
    6     5         age   0.111    0.00785     14.1  0.0000318 3.18e-05    ***
    7     2 (Intercept)  20        9.35         2.13 0.0859    8.59e-02       
    8     2         age   0.125    0.00898     13.9  0.0000343 3.43e-05    ***
    9     4 (Intercept)  14.6     11.2          1.3  0.249     2.49e-01       
    10    4         age   0.135    0.0108      12.5  0.0000573 5.73e-05    ***
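
    To see what symnum does on its own, here is a minimal base-R sketch on a plain vector, using three P-values taken from the table above and the same cutpoints and symbols as in the pipeline:

```r
# P-values taken from the table above
p <- c(0.0221, 0.000029, 0.333)

stars <- symnum(p,
                corr = FALSE,   # these are P-values, not correlations
                cutpoints = c(0, 0.001, 0.01, 0.05, 1),
                symbols = c("***", "**", "*", " "))

# symnum returns a "noquote" object with a legend attribute;
# as.character gives a plain character vector
stars_chr <- as.character(stars)
print(stars_chr)
```

    Each interval is closed on the right, so 0.0221 falls in (0.01, 0.05] and receives one star, while 0.333 falls in (0.05, 1] and receives a blank.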
    

    Converting P-values with a custom function and sapply

    # Map a single P-value to significance stars
    myfun <- function(pval) {
      if (pval <= 0.001) {
        "***"
      } else if (pval <= 0.01) {
        "**"
      } else if (pval <= 0.05) {
        "*"
      } else {
        ""
      }
    }
    
    Orange %>% group_by(Tree) %>% 
      do(tidy(lm(circumference ~ age, data=.))) %>% as.data.frame %>%
      lucid %>%
      mutate(pvalue=as.numeric(p.value)) %>% 
      # apply myfun to the numeric copy, not the character column from lucid
      mutate(signif = sapply(pvalue, myfun))
    
       Tree        term estimate std.error statistic   p.value   pvalue signif
    1     3 (Intercept)  19.2      5.86         3.28 0.0221    2.21e-02      *
    2     3         age   0.0811   0.00563     14.4  0.000029  2.90e-05    ***
    3     1 (Intercept)  24.4      6.54         3.73 0.0135    1.35e-02      *
    4     1         age   0.0815   0.00628     13    0.0000485 4.85e-05    ***
    5     5 (Intercept)   8.76     8.18         1.07 0.333     3.33e-01       
    6     5         age   0.111    0.00785     14.1  0.0000318 3.18e-05    ***
    7     2 (Intercept)  20        9.35         2.13 0.0859    8.59e-02       
    8     2         age   0.125    0.00898     13.9  0.0000343 3.43e-05    ***
    9     4 (Intercept)  14.6     11.2          1.3  0.249     2.49e-01       
    10    4         age   0.135    0.0108      12.5  0.0000573 5.73e-05    ***
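
    Because myfun handles one value at a time, it needs sapply. A vectorized alternative can be sketched with base R's cut, using the same thresholds; the helper name p_to_stars is hypothetical and not part of the original post:

```r
# Hypothetical helper: vectorized P-value to stars conversion
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 0.001, 0.01, 0.05, 1),
                   labels = c("***", "**", "*", ""),
                   include.lowest = TRUE))  # so that p = 0 is included
}

# Three P-values from the table above, plus a boundary value
p <- c(0.0221, 0.000029, 0.333, 0.001)
print(p_to_stars(p))
```

    Inside the pipeline this could replace the sapply call, for example mutate(signif = p_to_stars(pvalue)).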
    


        Article link: https://www.haomeiwen.com/subject/eiuawltx.html