ggpubr: Quickly Creating Publication-Ready Plots

Author: Boer223 | Published: 2020-02-06 22:36

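    ggplot2, developed by Hadley Wickham, is the mainstream tool for data visualization in R. Compared with R's base graphics system, ggplot2 (which is built on the grid graphics system) has a much more understandable syntax. Even so, producing figures for academic journals with ggplot2 still takes a fair number of plotting functions, or some pre-written template code. To address this, Alboukadel Kassambara developed ggpubr on top of ggplot2 and ggsci for drawing figures that meet publication requirements. The package wraps many ggplot2 plotting functions and bundles many of the journal color palettes from ggsci, so it is well worth learning.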

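    Key features of ggpubr:

    • It helps researchers quickly create publication-ready plots;
    • It can add p-values and significance levels to a plot automatically, with no manual post-editing;
    • It makes annotating and arranging plots easy;
    • It makes changing graphical parameters (such as colors and labels) easy.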

    Installation

    Install from CRAN:

    install.packages("ggpubr")
    

    Alternatively, install the latest version from GitHub:

    if(!require(devtools)) install.packages("devtools")
    devtools::install_github("kassambara/ggpubr")
    

    Load the ggpubr package:

    library("ggpubr")
    

    Plots you can draw with ggpubr

    Load the data

    library(ggpubr)
    
    set.seed(1234)
    wdata = data.frame(
       sex = factor(rep(c("F", "M"), each=200)),
       weight = c(rnorm(200, 55), rnorm(200, 58)))
    head(wdata, 4)
    
    ##   sex weight
    ## 1   F   53.8
    ## 2   F   55.3
    ## 3   F   56.1
    ## 4   F   52.7
    

    Density plot

    # Density plot with mean lines and a marginal rug
    ggdensity(wdata, x = "weight",
       add = "mean", rug = TRUE,
       color = "sex", fill = "sex",
       palette = c("#00AFBB", "#E7B800"))
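
    As a quick sanity check on where the mean lines should fall, the per-group means can be computed in base R. This is only a sketch: it assumes that add = "mean" draws one line per level of the variable mapped to color, and the name group_means is introduced here for illustration.

```r
# Recreate the simulated data from the "Load the data" step
set.seed(1234)
wdata <- data.frame(
  sex = factor(rep(c("F", "M"), each = 200)),
  weight = c(rnorm(200, 55), rnorm(200, 58)))

# Per-group means (group_means is an illustrative name, not part of ggpubr)
group_means <- aggregate(weight ~ sex, data = wdata, FUN = mean)
print(group_means)
```

    The two values should sit close to 55 (F) and 58 (M), the means used to simulate the data.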
    

    Histogram

    # Histogram with mean lines and a marginal rug
    gghistogram(wdata, x = "weight",
       add = "mean", rug = TRUE,
       color = "sex", fill = "sex",
       palette = c("#00AFBB", "#E7B800"))
    

    Box plots and violin plots

    # Load the data
    data("ToothGrowth")
    df <- ToothGrowth
    head(df, 4)
    
    ##    len supp dose
    ## 1  4.2   VC  0.5
    ## 2 11.5   VC  0.5
    ## 3  7.3   VC  0.5
    ## 4  5.8   VC  0.5
    
    # Box plot with jittered points
    p <- ggboxplot(df, x = "dose", y = "len",
                    color = "dose", palette =c("#00AFBB", "#E7B800", "#FC4E07"),
                    add = "jitter", shape = "dose")
    p
    
    # Add p-values
    my_comparisons <- list( c("0.5", "1"), c("1", "2"), c("0.5", "2") )
    p + stat_compare_means(comparisons = my_comparisons) + # add p-values for each pairwise comparison
      stat_compare_means(label.y = 50) # add the global p-value
    
    # Violin plot with box plots inside
    ggviolin(df, x = "dose", y = "len", fill = "dose",
             palette = c("#00AFBB", "#E7B800", "#FC4E07"),
             add = "boxplot", add.params = list(fill = "white")) +
      stat_compare_means(comparisons = my_comparisons, label = "p.signif") + # add significance levels
      stat_compare_means(label.y = 50) # add the global p-value
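
    To see roughly where these p-values come from, here is a base R sketch. It assumes the documented defaults of stat_compare_means: a Wilcoxon test for pairwise comparisons and a Kruskal-Wallis test across more than two groups. The helper p_to_stars is hypothetical and only mirrors the usual star notation; values printed on the plot may differ slightly depending on exact-test and continuity settings.

```r
df <- ToothGrowth
df$dose <- factor(df$dose)

# Global test across the three dose groups (Kruskal-Wallis)
global_p <- kruskal.test(len ~ dose, data = df)$p.value

# Pairwise Wilcoxon tests for the same pairs as my_comparisons
my_comparisons <- list(c("0.5", "1"), c("1", "2"), c("0.5", "2"))
pairwise_p <- sapply(my_comparisons, function(pair) {
  sub <- droplevels(df[df$dose %in% pair, ])
  # exact = FALSE avoids warnings caused by tied values
  wilcox.test(len ~ dose, data = sub, exact = FALSE)$p.value
})
names(pairwise_p) <- sapply(my_comparisons, paste, collapse = " vs ")

# Hypothetical helper: map p-values to significance symbols
p_to_stars <- function(p) {
  as.character(cut(p,
                   breaks = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
                   labels = c("****", "***", "**", "*", "ns"),
                   include.lowest = TRUE))
}

print(global_p)
print(data.frame(p = pairwise_p, signif = p_to_stars(pairwise_p)))
```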
    

    Bar plots

    Load the data

    data("mtcars")
    dfm <- mtcars
    dfm$cyl <- as.factor(dfm$cyl)
    dfm$name <- rownames(dfm)
    head(dfm[, c("name", "wt", "mpg", "cyl")])
    
    ##                                name   wt  mpg cyl
    ## Mazda RX4                 Mazda RX4 2.62 21.0   6
    ## Mazda RX4 Wag         Mazda RX4 Wag 2.88 21.0   6
    ## Datsun 710               Datsun 710 2.32 22.8   4
    ## Hornet 4 Drive       Hornet 4 Drive 3.21 21.4   6
    ## Hornet Sportabout Hornet Sportabout 3.44 18.7   8
    ## Valiant                     Valiant 3.46 18.1   6
    

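    Ordered bar plots

    Change the fill color by cyl and sort the whole data set globally rather than within groups.

    ggbarplot(dfm, x = "name", y = "mpg",
              fill = "cyl",               # fill color by cyl
              color = "white",            # white bar borders
              palette = "jco",            # "jco" journal color palette
              sort.val = "desc",          # sort values in descending order
              sort.by.groups = FALSE,     # do not sort within groups
              x.text.angle = 90           # rotate x-axis labels vertically
              )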
    

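    To sort the data within each group instead, set sort.by.groups = TRUE.

    ggbarplot(dfm, x = "name", y = "mpg",
              fill = "cyl",               # fill color by cyl
              color = "white",            # white bar borders
              palette = "jco",            # "jco" journal color palette
              sort.val = "asc",           # sort values in ascending order
              sort.by.groups = TRUE,      # sort within each group
              x.text.angle = 90           # rotate x-axis labels vertically
              )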
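
    The difference between the two sorting modes can be illustrated with order() in base R. This is a sketch of the idea only, not ggbarplot's internal implementation, which may for example handle ties differently; global_desc and grouped_asc are names introduced here.

```r
dfm <- mtcars
dfm$cyl <- as.factor(dfm$cyl)
dfm$name <- rownames(dfm)

# Global sort: every car ordered by mpg, highest first
# (analogous to sort.val = "desc", sort.by.groups = FALSE)
global_desc <- dfm$name[order(dfm$mpg, decreasing = TRUE)]

# Grouped sort: ordered by cyl first, then by ascending mpg within each cyl
# (analogous to sort.val = "asc", sort.by.groups = TRUE)
grouped_asc <- dfm$name[order(dfm$cyl, dfm$mpg)]

head(global_desc, 3)
head(grouped_asc, 3)
```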
    

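    Deviation graph

    A deviation graph shows how far each value deviates from a reference value. Below, the z-scores of mpg from the mtcars data set are used to draw one.
    Compute the z-score of mpg: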

    dfm$mpg_z <- (dfm$mpg -mean(dfm$mpg))/sd(dfm$mpg)
    dfm$mpg_grp <- factor(ifelse(dfm$mpg_z < 0, "low", "high"), 
                         levels = c("low", "high"))
    
    head(dfm[, c("name", "wt", "mpg", "mpg_z", "mpg_grp", "cyl")])
    
    ##                                name   wt  mpg  mpg_z mpg_grp cyl
    ## Mazda RX4                 Mazda RX4 2.62 21.0  0.151    high   6
    ## Mazda RX4 Wag         Mazda RX4 Wag 2.88 21.0  0.151    high   6
    ## Datsun 710               Datsun 710 2.32 22.8  0.450    high   4
    ## Hornet 4 Drive       Hornet 4 Drive 3.21 21.4  0.217    high   6
    ## Hornet Sportabout Hornet Sportabout 3.44 18.7 -0.231     low   8
    ## Valiant                     Valiant 3.46 18.1 -0.330     low   6
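
    For reference, the same z-scores can be obtained with base R's scale(), which centers by the mean and divides by the sample standard deviation. A minimal sketch (the variable names are illustrative):

```r
# Manual z-score, as in the code above
mpg_z_manual <- (mtcars$mpg - mean(mtcars$mpg)) / sd(mtcars$mpg)

# scale() returns a one-column matrix, so convert it to a plain vector
mpg_z_scale <- as.numeric(scale(mtcars$mpg))

# Both approaches should agree; z-scores have mean 0 and sd 1
all.equal(mpg_z_manual, mpg_z_scale)
round(c(mean = mean(mpg_z_manual), sd = sd(mpg_z_manual)), 3)
```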
    

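    Draw a sorted bar plot, colored by mpg group:

    ggbarplot(dfm, x = "name", y = "mpg_z",
              fill = "mpg_grp",           # fill color by mpg group
              color = "white",            # white bar borders
              palette = "jco",            # "jco" journal color palette
              sort.val = "asc",           # sort values in ascending order
              sort.by.groups = FALSE,     # do not sort within groups
              x.text.angle = 90,          # rotate x-axis labels vertically
              ylab = "MPG z-score",
              xlab = FALSE,
              legend.title = "MPG Group"
              )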
    

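    Rotate the plot:

    ggbarplot(dfm, x = "name", y = "mpg_z",
              fill = "mpg_grp",           # fill color by mpg group
              color = "white",            # white bar borders
              palette = "jco",            # "jco" journal color palette
              sort.val = "desc",          # sort values in descending order
              sort.by.groups = FALSE,     # do not sort within groups
              x.text.angle = 90,          # rotate x-axis labels vertically
              ylab = "MPG z-score",
              legend.title = "MPG Group",
              rotate = TRUE,              # flip the axes
              ggtheme = theme_minimal()
              )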
    

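    Dot charts

    Lollipop chart

    When you have a large number of values to display, a lollipop chart is an alternative to the bar plots shown above.

    The lollipop colors can be set by the grouping variable "cyl":

    ggdotchart(dfm, x = "name", y = "mpg",
               color = "cyl",                                # color by group
               palette = c("#00AFBB", "#E7B800", "#FC4E07"), # custom color palette
               sorting = "ascending",                        # sort values in ascending order
               add = "segments",                             # add segments from y = 0 to the dots
               ggtheme = theme_pubr()                        # ggplot2 theme
               )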
    

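    Rotate the plot, order within groups, enlarge the dots, and label them with the mpg values:

    ggdotchart(dfm, x = "name", y = "mpg",
               color = "cyl",                                # color by group
               palette = c("#00AFBB", "#E7B800", "#FC4E07"), # custom color palette
               sorting = "descending",                       # sort values in descending order
               add = "segments",                             # add segments from y = 0 to the dots
               rotate = TRUE,                                # flip the axes
               group = "cyl",                                # order within groups
               dot.size = 6,                                 # larger dots
               label = round(dfm$mpg),                       # use mpg values as dot labels
               font.label = list(color = "white", size = 9,
                                 vjust = 0.5),               # label appearance
               ggtheme = theme_pubr()                        # ggplot2 theme
               )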
    

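    Deviation graph

    ggdotchart(dfm, x = "name", y = "mpg_z",
               color = "cyl",                                # color by group
               palette = c("#00AFBB", "#E7B800", "#FC4E07"), # custom color palette
               sorting = "descending",                       # sort values in descending order
               add = "segments",                             # add segments from y = 0 to the dots
               add.params = list(color = "lightgray", size = 2), # segment color and size
               group = "cyl",                                # order within groups
               dot.size = 6,                                 # larger dots
               label = round(dfm$mpg_z, 1),                  # use z-scores as dot labels
               font.label = list(color = "white", size = 9,
                                 vjust = 0.5),               # label appearance
               ggtheme = theme_pubr()                        # ggplot2 theme
               ) +
      geom_hline(yintercept = 0, linetype = 2, color = "lightgray")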
    

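    Cleveland dot plot

    ggdotchart(dfm, x = "name", y = "mpg",
               color = "cyl",                                # color by group
               palette = c("#00AFBB", "#E7B800", "#FC4E07"), # custom color palette
               sorting = "descending",                       # sort values in descending order
               rotate = TRUE,                                # flip the axes
               dot.size = 2,                                 # small dots
               y.text.col = TRUE,                            # color the y-axis text by group
               ggtheme = theme_pubr()                        # ggplot2 theme
               ) +
      theme_cleveland()                                      # add dashed grid lines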
    

    Session info

    sessionInfo()
    
    R version 3.6.2 (2019-12-12)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 10 x64 (build 18363)
    
    Matrix products: default
    
    locale:
    [1] LC_COLLATE=Chinese (Simplified)_China.936 
    [2] LC_CTYPE=Chinese (Simplified)_China.936   
    [3] LC_MONETARY=Chinese (Simplified)_China.936
    [4] LC_NUMERIC=C                              
    [5] LC_TIME=Chinese (Simplified)_China.936    
    
    attached base packages:
    [1] stats     graphics  grDevices utils     datasets  methods   base     
    
    other attached packages:
    [1] ggpubr_0.2.4  magrittr_1.5  ggplot2_3.2.1
    
    loaded via a namespace (and not attached):
     [1] Rcpp_1.0.3       rstudioapi_0.10  tidyselect_0.2.5 munsell_0.5.0   
     [5] colorspace_1.4-1 R6_2.4.1         rlang_0.4.3      dplyr_0.8.3     
     [9] tools_3.6.2      grid_3.6.2       gtable_0.3.0     withr_2.1.2     
    [13] lazyeval_0.2.2   assertthat_0.2.1 digest_0.6.23    tibble_2.1.3    
    [17] lifecycle_0.1.0  ggsignif_0.6.0   crayon_1.3.4     ggsci_2.9       
    [21] purrr_0.3.3      farver_2.0.3     glue_1.3.1       labeling_0.3    
    [25] compiler_3.6.2   pillar_1.4.3     scales_1.1.0     pkgconfig_2.0.3 
    
