Learning to plot from BMC Plant Biology: ggtree in R


Author: 小明的数据分析笔记本 | Published 2023-02-01 11:24

    Paper

    Comparative analysis of de novo genomes reveals dynamic intra‑species divergence of NLRs in pepper

    Data and code

    https://github.com/sdaf11111/NLR-map-in-pepper

    The authors put the example data and code for Figure 2 of the paper on GitHub, so we can study their code.

    The example data consist of a tree file in nwk (Newick) format and a grouping file in csv format.

    New things I learned

    The R package svglite: you may need it when saving a figure in svg format.
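As a minimal sketch of that use case: when svglite is installed, ggplot2's ggsave() uses it as the device for file names ending in .svg. The toy plot (built from the built-in mtcars dataset) and the temporary output path are illustrative assumptions, not part of the paper's code.

```r
library(ggplot2)

# Toy scatter plot standing in for the tree figure (assumption for illustration)
demo_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Write to a temporary file so the working directory stays clean
svg_path <- file.path(tempdir(), "demo_plot.svg")

# The .svg extension makes ggsave() pick the svglite device
ggsave(filename = svg_path, plot = demo_plot, width = 4, height = 3)

# TRUE if the file was written
file.exists(svg_path)
```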

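    The function split() divides a vector or data frame into a list according to a grouping column. A description in words may be hard to follow, so look at what it returns: for the toy data below, split(df$y, df$x) gives a list with two elements, A holding 1 and 4, and B holding 5, 2 and 7.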

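    # Toy data frame: x is the grouping column, y holds the values
    df <- data.frame(x = c("A", "A", "B", "B", "B"),
                     y = c(1, 4, 5, 2, 7))
    df
    # Split y by the groups in x; returns a named list, one element per group
    split(df$y, df$x)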
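To make the shape of the result concrete, the following small sketch inspects the list on the same toy data. The variable name groups is only for illustration. A named list of this kind (group label, then its members) is what gets passed to groupOTU() later in the post.

```r
df <- data.frame(x = c("A", "A", "B", "B", "B"),
                 y = c(1, 4, 5, 2, 7))

# One list element per distinct value of x, holding the matching y values
groups <- split(df$y, df$x)

names(groups)    # group labels: "A" and "B"
groups$A         # y values whose x is "A": 1 and 4
groups$B         # y values whose x is "B": 5, 2 and 7
lengths(groups)  # group sizes: 2 for A, 3 for B
```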
    

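    The function geom_aline(), from ggtree (not to be confused with ggplot2's geom_abline()): it draws a line from each tip out to the outer edge of the tree, so that the tip ends line up.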
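A small sketch of what geom_aline() adds, using a random tree instead of the paper's data. The tree from ape::rtree(), the seed, and the dashed line type are assumptions chosen for illustration.

```r
library(ape)
library(ggtree)

set.seed(42)
toy_tree <- rtree(8)   # random 8-tip tree with unequal branch lengths

# Plain tree: the tips end at different x positions
p_plain <- ggtree(toy_tree)

# Same tree with a line from each tip to the right-most x position
p_aligned <- ggtree(toy_tree) + geom_aline(linetype = "dashed")
```

Printing p_plain and p_aligned side by side shows the difference; saving them with ggsave() is an alternative if the lines are hard to see on screen.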

    The actual code follows.

    First, load the required R packages

    library(dplyr)
    library(ggtree)
    library(ggplot2)
    library(svglite)
    library(scales)
    

    Read the data

    info <- read.csv("data/20230202/Annuum.Intact.NBARC.group.trimal92.csv")
    head(info)
    tree <- read.tree("data/20230202/Annuum.Intact.NBARC.tree.trimal92.nwk.txt")
    

    Add group information to the tree

    groupInfo<-split(info$ID,info$Group)
    tree<- groupOTU(tree, groupInfo)
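The same two steps on a four-tip toy tree, as a sketch of what groupOTU() does with the list from split(). The tree, the tip names, and the toy_info table are hypothetical; only the column names ID and Group follow the post's code.

```r
library(ape)
library(ggtree)

toy_tree <- read.tree(text = "((t1,t2),(t3,t4));")

# Hypothetical annotation table with the same two columns used above
toy_info <- data.frame(ID    = c("t1", "t2", "t3", "t4"),
                       Group = c("G1", "G1", "G2", "G2"))

# Named list: group label -> tip labels in that group
toy_groups <- split(toy_info$ID, toy_info$Group)

# groupOTU() stores the assignment as a "group" attribute on the tree;
# ggtree exposes it as the `group` variable that aes(color = group) refers to
toy_tree <- groupOTU(toy_tree, toy_groups)
attr(toy_tree, "group")
```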
    

    Specify the colours

    heatmap.colours <- c("#be9fe1","#8ac6d1","#e1ccec","#fddb3a",
                         "#C0C0C0","#c9b6e4","#d5c455","#ffb6b9","#fae3d9",
                         "#9aceff","#d7cde6","#bbded6","#ede59a","#4f98ca",
                         "#4a69bb","#f5cdaa","NA",
                         "#c3d14a","#63b637","#FF0000","#008000","#FF1493","#FF4500")
    names(heatmap.colours) <- c("G1","G2","G3","G4",
                                "G5","G6","G7","G8","G9",
                                "G10","G11","G12","G13","GT",
                                "GR","G14","NG",
                                "CHIL","Know","CANN","CECW","CZUN","CASF")
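Because the palette is a named vector, scale_colour_manual() can match colours to group labels by name. A quick way to check that every label has a colour is sketched below; the shortened palette and the label vector are hypothetical.

```r
# Shortened, hypothetical palette in the same named-vector form
palette_demo <- c(G1 = "#be9fe1", G2 = "#8ac6d1", G3 = "#e1ccec")

# Hypothetical group labels as they might appear in the annotation table
labels_demo <- c("G1", "G3", "G3", "G2", "G4")

# Look up colours by name; labels without an entry come back as NA
palette_demo[labels_demo]

# Labels that still need a colour
missing_labels <- setdiff(unique(labels_demo), names(palette_demo))
missing_labels
```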
    

    Plotting code

    p <- ggtree(tree, layout='circular', size=0.2) %<+% info +
      geom_aline(linetype="solid", 
                 size=0.5, aes(color=group),
                 alpha=0.5) +   
      scale_colour_manual(values=heatmap.colours,
                          breaks=c("G1","G2","G3","G4","G5",
                                   "G6","G7","G8","G9","G10",
                                   "G11","G12","G13","GT","GR",
                                   "G14","CHIL","Know","CANN",
                                   "CECW","CZUN","CASF"), 
                          name="Group") +
      theme(legend.position="right")+
      geom_tippoint(aes(color=Species), size=0.2)+
      guides(color=guide_legend(ncol=6))
    q <- flip(p, 3001, 4090) %>% rotate(3507)
    
    ggsave(filename = "NLR_tree.92.pdf", plot = q,
           width = 9.4, height = 4)
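The node numbers 3001, 4090 and 3507 passed to flip() and rotate() are specific to this tree, and the post does not say how they were chosen. One plausible way to find such numbers, sketched here on a hypothetical four-tip tree, is to look up the most recent common ancestor of a set of tips with ape::getMRCA() and to print the node numbers on the plot.

```r
library(ape)
library(ggtree)

toy_tree <- read.tree(text = "((A,B),(C,D));")

# Node numbers of the two sibling clades, looked up via their tips
node_ab <- ape::getMRCA(toy_tree, c("A", "B"))
node_cd <- ape::getMRCA(toy_tree, c("C", "D"))

# Label every node with its number to double-check visually
p_toy <- ggtree(toy_tree) +
  geom_text(aes(label = node), hjust = -0.3)

# Swap the two sibling clades, then rotate the (C,D) clade by 180 degrees
q_toy <- ggtree::rotate(ggtree::flip(p_toy, node_ab, node_cd), node_cd)
```

flip() expects two nodes that share the same parent, which is why both clades here hang directly off the root.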
    

    One issue here: the effect of geom_aline() is not visible in the RStudio plot pane, but it does show up once the figure is saved as a pdf. I do not yet know why.

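    The code provided with the paper also has a part that computes the MRCA. I have not worked that part out yet and will write about it once I have.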

    The example data and code can be downloaded from the link given in the paper.

