Learning to plot from Nature Genetics: R ggplot2 plain…

Author: 小明的数据分析笔记本 | Published: 2022-07-05 16:53

    Paper

    Plasma proteome analyses in individuals of European and African ancestry identify cis-pQTLs and models for proteome-wide association studies

    https://www.nature.com/articles/s41588-022-01051-w

    Local PDF: s41588-022-01051-w.pdf

    Code links

    https://zenodo.org/record/6332981#.YroV0nZBzic

    https://github.com/Jingning-Zhang/PlasmaProtein/tree/v1.2

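    Today's post reproduces Figure 3 of the paper. It involves four plots: a plain boxplot, a grouped boxplot, a faceted boxplot, and one more boxplot. The last topic is how to combine these four plots into a single figure.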

    First, define a ggplot2 theme that all the plots share

    library(ggplot2)
    
    My_Theme <- theme(
      panel.background = element_blank(), 
      title = element_text(size = 7),
      text = element_text(size = 6))
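
    Because `My_Theme` is an ordinary theme object, it can be added to any plot with `+`, and `theme()` calls added before or after it are merged element by element. A minimal sketch using the built-in `mtcars` data (the dataset and the object name `p_demo` are illustrative assumptions, not taken from the paper):

```r
library(ggplot2)

My_Theme <- theme(
  panel.background = element_blank(),
  title = element_text(size = 7),
  text = element_text(size = 6))

# The three theme pieces set different elements, so all of them
# end up in the merged theme of the plot
p_demo <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  theme(legend.position = "top") +
  My_Theme +
  theme(axis.line = element_line())

# Inspect one setting stored on the plot object
p_demo$theme$legend.position
```

    This is the same pattern the plots below follow: plot-specific `theme()` settings first, then `My_Theme`, then the axis lines.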
    

    The first plot: a plain boxplot

    Part of the example dataset (shown as a screenshot in the original post)

    Read the dataset

    library(readxl)
    dat01<-read_excel("data/20220627/Fig3.xlsx",
                      sheet = "3a")
    

    Plotting code

    p1 <- ggplot(data = dat01, aes(x = group)) + 
      geom_boxplot(alpha=0.6, 
                   notch = TRUE, 
                   notchwidth = 0.5, 
                   aes(y=hsq, fill=kind)) +
      coord_cartesian(ylim = c(0,0.5)) +  
      labs(y = expression(paste("cis-",h^2)),
           x=NULL, title=NULL) +
      theme(legend.position="top",
            legend.title=element_blank(), 
            axis.text.x = element_text(color = c("#4a1486", 
                                                 "#4a1486", 
                                                 "#cb181d",
                                                 "#cb181d"),
                                       vjust = 0.5, 
                                       hjust = 0.5, 
                                       angle = 15))+
      My_Theme+
      scale_fill_manual(values=c("#4a1486","#cb181d"))+
      theme(axis.line = element_line())
    p1
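
    The `notch = TRUE` argument draws a notch around each median. According to the ggplot2 documentation, the notch extends 1.58 * IQR / sqrt(n), which gives a roughly 95% confidence interval for comparing medians. A small sketch, assuming the built-in `mtcars` data as a stand-in for the paper's data, reproduces that arithmetic by hand and compares it with the values ggplot2 computes:

```r
library(ggplot2)

x <- mtcars$mpg

# Notch limits by hand: median +/- 1.58 * IQR / sqrt(n)
qs <- unname(quantile(x, c(0.25, 0.5, 0.75)))
half_width <- 1.58 * (qs[3] - qs[1]) / sqrt(length(x))
notch_manual <- c(lower = qs[2] - half_width, upper = qs[2] + half_width)

# Notch limits as computed by ggplot2 for a single notched box
p_notch <- ggplot(mtcars, aes(x = "all cars", y = mpg)) +
  geom_boxplot(notch = TRUE, notchwidth = 0.5)
notch_ggplot <- layer_data(p_notch)[, c("notchlower", "notchupper")]

notch_manual
notch_ggplot
```

    When the notches of two boxes do not overlap, this is commonly read as a sign that the medians differ.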
    

    Grouped boxplot

    Plotting code

    dat02<-read_excel("data/20220627/Fig3.xlsx",
                      sheet = "3b")
    head(dat02)
    
    p2 <- ggplot(data = dat02, aes(x = group)) +
      geom_boxplot(alpha=0.8, 
                   notch = TRUE, 
                   notchwidth = 0.5, 
                   aes(y=acc, fill=Model)) + 
      coord_cartesian(ylim = c(0,1.2)) +
      labs(title = NULL, x=NULL,
           y=expression(paste(R^2,"/cis-",h^2))) +
      theme(legend.position="top",
            axis.text.x = element_text(color = c("#4a1486", 
                                                 "#4a1486", 
                                                 "#cb181d",
                                                 "#cb181d"),
                                       vjust = 0.5, 
                                       hjust = 0.5, 
                                       angle = 15))+
      My_Theme+
      scale_fill_manual(values=c("#feb24c","#41b6c4"))+
      theme(axis.line = element_line())
    p2
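
    The y-axis labels in these plots are built with R's plotmath syntax: inside `expression()`, `paste()` joins quoted text with math such as `h^2`, which is drawn as a superscript. A minimal sketch on the built-in `mtcars` data (the data and the object names `y_label` and `p_label` are placeholders):

```r
library(ggplot2)

# Same label expression as in the grouped boxplot;
# R^2 and h^2 are rendered with superscripts
y_label <- expression(paste(R^2, "/cis-", h^2))

p_label <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(x = NULL, y = y_label)
p_label
```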
    

    Faceted boxplot

    dat03<-read_excel("data/20220627/Fig3.xlsx",
                      sheet = "3c")
    head(dat03)
    p3 <- ggplot(data = dat03, aes(x = model)) + 
      geom_boxplot(alpha=0.8, 
                   notch = TRUE, 
                   notchwidth = 0.5, 
                   aes(y=acc, fill=model)) + 
      facet_wrap(~race,  ncol=2)+
      labs(title = NULL, x=NULL,
           y=expression(paste(R^2,"/cis-",h^2))) +
      coord_cartesian(ylim = c(0,1.2))  +
      theme(axis.text.x = element_text(color = c("#238b45", 
                                                 "#2171b5"),
                                       vjust = 0.5, 
                                       hjust = 0.5, 
                                       angle = 15),
            legend.position="none") +
      My_Theme+
      scale_fill_manual(values=c("#238b45","#2171b5"))+
      theme(axis.line = element_line(),
            panel.spacing.x = unit(0,'lines'),
            strip.background = element_rect(color="white"))
    p3
    
    

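    Two small tips here:

    • By default there is a gap between the two facet panels. To remove it, set panel.spacing.x = unit(0,'lines') in the theme.

    • With no gap between the panels, the grey strip areas on top run together. To keep them visually separate, set the strip border colour to white with strip.background = element_rect(color="white").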
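
    Both tips can be tried without the paper's data. A minimal sketch using the built-in `ToothGrowth` dataset (the dataset and the name `p_facet` are assumptions made for illustration):

```r
library(ggplot2)

p_facet <- ggplot(ToothGrowth,
                  aes(x = factor(dose), y = len, fill = factor(dose))) +
  geom_boxplot() +
  facet_wrap(~supp, ncol = 2) +
  theme(legend.position = "none",
        # Tip 1: remove the horizontal gap between facet panels
        panel.spacing.x = unit(0, "lines"),
        # Tip 2: a white border keeps the adjacent strips visually separate
        strip.background = element_rect(color = "white"))
p_facet
```

    Commenting out either theme line and redrawing the plot shows what each setting contributes.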


    The last boxplot

    dat04<-read_excel("data/20220627/Fig3.xlsx",
                      sheet = "3d")
    head(dat04)
    gtex.colors <- read_excel("data/20220627/gtex_colors.xlsx")
    gtex.colors
    
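    # Named colour vector: values are the colours (V2), names are the labels (V1)
    # that scale_fill_manual() matches against the fill variable
    myColors <- gtex.colors$V2
    names(myColors) <- gtex.colors$V1
    colScale <- scale_fill_manual(name = "gtex.colors", values = myColors)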
    
    p4 <- ggplot(data = dat04, aes(x = tissue, fill=tissue)) +
      geom_boxplot(alpha=0.8, 
                   notch = TRUE, 
                   notchwidth = 0.5, 
                   aes(y=cor)) + 
      theme(axis.text.x = element_text(angle = 90, hjust = 1),
            legend.position="none",
            axis.title.y = element_text(hjust=1))+
      My_Theme+
      coord_cartesian(ylim = c(-0.25,1))+
      colScale +
      labs(x = "GTEx V7 tissue", 
           y = "Correlation between cis-regulated gene       \nexpression and plasma protein SOMAmers      ",
           title=NULL)+
      theme(axis.line = element_line())
    p4
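
    `scale_fill_manual()` matches a named colour vector to the levels of the fill variable by name, so the row order of the colour table does not matter. A minimal sketch, assuming a hypothetical two-column colour table shaped like `gtex_colors.xlsx` (`V1` = group name, `V2` = colour) and the built-in `iris` data in place of the tissue data:

```r
library(ggplot2)

# Hypothetical colour table; rows are deliberately not in factor-level order
color_table <- data.frame(
  V1 = c("virginica", "setosa", "versicolor"),
  V2 = c("#7570b3", "#1b9e77", "#d95f02"))

myColors <- color_table$V2
names(myColors) <- color_table$V1
colScale <- scale_fill_manual(name = "species", values = myColors)

p_col <- ggplot(iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_boxplot(alpha = 0.8) +
  colScale +
  theme(legend.position = "none")

# Fill colour actually assigned to each box, ordered along the x axis
box_data <- layer_data(p_col)
box_data <- box_data[order(box_data$x), ]
as.character(box_data$fill)
```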
    
    

    Combining the four plots into one figure

    library(ggpubr)
    p <- ggarrange(ggarrange(p1, p2,
                             p3,
                             ncol = 3, labels = c("a", "b","c"),
                             widths = c(0.29,0.4,0.31)),
                   p4,
                   nrow = 2, heights = c(0.5,0.5),
                   labels = c(NA,"d"))
    p
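
    The layout is a nested `ggarrange()` call: the inner call builds a three-panel top row, and the outer call stacks that row on top of the fourth plot. A minimal sketch with placeholder panels from the built-in `mtcars` data (the panel objects are illustrative, and the empty string used as the label of the top row is an assumption of this sketch):

```r
library(ggplot2)
library(ggpubr)

# Placeholder panels standing in for p1 to p4
pa <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
pb <- ggplot(mtcars, aes(x = factor(gear), y = mpg)) + geom_boxplot()
pc <- ggplot(mtcars, aes(x = factor(am), y = mpg)) + geom_boxplot()
pd <- ggplot(mtcars, aes(x = factor(carb), y = mpg)) + geom_boxplot()

# Inner call: three panels side by side with relative widths
top_row <- ggarrange(pa, pb, pc,
                     ncol = 3, labels = c("a", "b", "c"),
                     widths = c(0.29, 0.4, 0.31))

# Outer call: top row above the wide fourth panel, equal heights
fig <- ggarrange(top_row, pd,
                 nrow = 2, heights = c(0.5, 0.5),
                 labels = c("", "d"))
fig
```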
    
    

    The example data and code can be obtained from the paper's links above, or requested by leaving a comment on the original post.

