Learning to plot from Nature: frequency histograms and scatter plots with ggplot2 in R

Author: 小明的数据分析笔记本 | Published: 2022-12-06 05:37

    Paper

    A saturated map of common genetic variants associated with human height

    https://www.nature.com/articles/s41586-022-05275-y

    s41586-022-05275-y.pdf

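    The paper's code has not been released, but the source data for most of the figures is public, so I will try to reproduce each figure.

    Today's post reproduces Extended Data Fig. 4 from the paper: a frequency histogram, and a scatter plot with error bars.

    First, panel a, the frequency histogram.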

    library(readxl)
    library(ggplot2)
    library(ggh4x)

    # Read the source data for panel a
    dat <- read_excel("extendFig4.xlsx",
                      sheet = "Panel a")
    dat

    # The sheet has a single column; give it a short name
    colnames(dat) <- "Var1"

    # Assign the plot to p1 so it can be combined with panel b later
    p1 <- ggplot(data = dat, aes(x = Var1)) +
      geom_histogram(bins = 25,
                     color = "white",
                     fill = "#aadbe9") +
      scale_x_continuous(limits = c(0.5, 3),
                         breaks = seq(0.5, 3, by = 0.5)) +
      scale_y_continuous(limits = c(0, 300),
                         breaks = seq(0, 300, 50)) +
      # Dashed lines marking the span of the null distribution
      geom_vline(xintercept = 0.75, lty = "dashed", color = "#aadbe9") +
      geom_vline(xintercept = 2.25, lty = "dashed", color = "#aadbe9") +
      # annotate() draws each arrow once; geom_segment(aes(...)) with
      # constants would redraw it for every row of dat
      annotate(geom = "segment", x = 2.5, xend = 2.5, y = 50, yend = 0,
               arrow = arrow(),
               color = "red") +
      annotate(geom = "text", x = 2.5, y = 50, label = "Observed",
               vjust = -1) +
      annotate(geom = "segment", x = 0.75, xend = 2.25, y = 250, yend = 250,
               arrow = arrow(ends = "both",
                             angle = 20,
                             length = unit(3, "mm")),
               color = "#aadbe9") +
      annotate(geom = "text", x = 1.5, y = 250,
               label = "Null distribution (1,000 draws)",
               vjust = -1) +
      theme_classic() +
      # Truncate the axis lines to the range of the breaks
      guides(x = guide_axis_truncated(trunc_lower = 0.5,
                                      trunc_upper = 3),
             y = guide_axis_truncated(trunc_lower = 0,
                                      trunc_upper = 300)) +
      labs(y = "Frequency",
           x = "Enrichment fold of OMIM genes\nnear GWS SNPs with a density > 1")

    p1
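
One thing worth keeping in mind with the panel a code: `scale_x_continuous(limits = ...)` discards observations outside the limits before `geom_histogram()` bins them, whereas `coord_cartesian()` only zooms the view. The sketch below illustrates the difference on simulated values. The `sim` data frame and its normal distribution are assumptions for illustration only, not the paper's data.

```r
library(ggplot2)

# Simulated stand-in for the panel a column (assumption: not the paper's data)
set.seed(1)
sim <- data.frame(Var1 = rnorm(1000, mean = 1.5, sd = 0.6))

# Number of simulated values that fall inside the plotted range
n_inside <- sum(sim$Var1 >= 0.5 & sim$Var1 <= 3)

# Scale limits: out-of-range values become NA and are dropped before binning
p_scale <- ggplot(sim, aes(x = Var1)) +
  geom_histogram(bins = 25) +
  scale_x_continuous(limits = c(0.5, 3))

# Coordinate limits: every value is binned, only the view is zoomed
p_coord <- ggplot(sim, aes(x = Var1)) +
  geom_histogram(bins = 25) +
  coord_cartesian(xlim = c(0.5, 3))

# layer_data() returns the computed bins for a layer
counts_scale <- suppressWarnings(layer_data(p_scale, 1)$count)
counts_coord <- layer_data(p_coord, 1)$count

sum(counts_scale)  # matches n_inside
sum(counts_coord)  # matches nrow(sim)
```

If the two totals differ for your own data, some observations lie outside the scale limits and are silently missing from the histogram.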
    
    

    Next, panel b, the scatter plot with error bars.

    # Read the source data for panel b
    datb <- read_excel("extendFig4.xlsx",
                       sheet = "Panel b")
    datb

    # Assign the plot to p2 so it can be combined with panel a later
    p2 <- ggplot(data = datb, aes(x = `Minimum Signal Density`,
                                  y = `Enrichment statistic`)) +
      geom_point() +
      # Error bars span the estimate plus or minus one standard error
      geom_errorbar(aes(ymin = `Enrichment statistic` - `Standard Error of Enrichment Statistic`,
                        ymax = `Enrichment statistic` + `Standard Error of Enrichment Statistic`),
                    width = 0.4) +
      scale_x_continuous(limits = c(0.5, 10.5),
                         breaks = 1:10) +
      scale_y_continuous(limits = c(0, 9),
                         breaks = 0:8) +
      theme_classic() +
      guides(x = guide_axis_truncated(trunc_lower = 1,
                                      trunc_upper = 10),
             y = guide_axis_truncated(trunc_lower = 0,
                                      trunc_upper = 8)) +
      labs(x = "Minimum Signal Density",
           y = "Enrichment fold of OMIM genes\nnear GWS SNPs")

    p2
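
The long backtick-quoted column names make the `geom_errorbar()` mapping hard to read. A minimal sketch of one alternative is to precompute the lower and upper bounds as their own columns first. The data frame `sim_b` and its column names `density`, `estimate`, `se`, `lower` and `upper` are made up for illustration and are not the paper's values.

```r
library(ggplot2)

# Simulated stand-in for the panel b table (assumption: not the paper's data)
sim_b <- data.frame(
  density  = 1:10,
  estimate = 2 + 0.5 * (0:9),
  se       = (1:10) / 10
)

# Precompute the error-bar bounds as estimate +/- one standard error
sim_b$lower <- sim_b$estimate - sim_b$se
sim_b$upper <- sim_b$estimate + sim_b$se

p_err <- ggplot(sim_b, aes(x = density, y = estimate)) +
  geom_point() +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.4)

p_err
```

Having the bounds as columns also makes it easy to check that `max(sim_b$upper)` stays below the upper y-axis limit, since an error bar that extends past a scale limit is not drawn.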
    

    Finally, combine the two panels into one figure.

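    library(patchwork)
    # p1 and p2 must hold the panel a and panel b plots,
    # i.e. each ggplot() call above has to be assigned to these names first
    p1 + p2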
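
As a rough sketch of how the combined figure could be labelled and saved, the example below uses two placeholder plots built from simulated data. The names `pa`, `pb`, `combined` and `out_file`, the figure size, and the PDF output are all illustrative choices, not part of the original post.

```r
library(ggplot2)
library(patchwork)

# Placeholder plots standing in for panels a and b (assumption: simulated data)
set.seed(1)
pa <- ggplot(data.frame(x = rnorm(1000)), aes(x = x)) +
  geom_histogram(bins = 25) +
  theme_classic()
pb <- ggplot(data.frame(x = 1:10, y = (1:10) / 2), aes(x = x, y = y)) +
  geom_point() +
  theme_classic()

# Place the panels side by side and tag them "a" and "b"
combined <- pa + pb + plot_annotation(tag_levels = "a")

# Write the combined figure to a temporary PDF (the path is arbitrary)
out_file <- tempfile(fileext = ".pdf")
ggsave(out_file, combined, width = 10, height = 4)
```

With the real plots, replacing `pa` and `pb` by `p1` and `p2` should give the two-panel layout of the published figure.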
    

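    To get the example data and code, like this post on my WeChat official account, tap "Wow" (在看), and leave a comment.

    You are welcome to follow my WeChat official account:

    小明的数据分析笔记本

    The account mainly shares: 1. simple examples of data analysis and data visualization in R and Python; 2. reading notes on transcriptomics, genomics, and population genetics papers related to horticultural plants; 3. introductory bioinformatics learning materials and my own study notes.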
