美文网首页
微生物多样性qiime2分析流程(12) 数据可视化分析(中)

微生物多样性qiime2分析流程(12) 数据可视化分析(中)

作者: R语言数据分析指南 | 来源:发表于2020-11-17 21:49 被阅读0次
常用的排序方法如下:
间接排序法包括:

*   PCA(principal components analysis,主成分分析)
*   CA(correspondence analysis,对应分析)
*   DCA(Detrended correspondenceanalysis, 去趋势对应分析)
*   PCoA(principal coordinate analysis,主坐标分析)
*   NMDS(non-metric multi-dimensional scaling,非度量多维尺度分析);

直接排序法包括:
*   RDA(Redundancy analysis,冗余分析)
*   CCA(canonical correspondence analysis,典范对应分析)
*   dbRDA(distance based redundancy analysis,基于距离的冗余分析)
*   CAP(canonical analysis of principal coordinates,主要坐标的典型分析)

其中PCA和RDA是基于线性模型(linear model)的,
而CA、DCA、CCA、DCCA是基于单峰(unimodal)模型

选择单峰模型还是线性模型?

用DCA(vegan::decorana())先对数据(site-species)进行分析;
查看结果中的“Axis lengths”的第一轴DCA1的值,
根据该值判断该采用线性模型还是单峰模型:
如果大于4.0,就应该选单峰模型;
如果3.0-4.0之间,选线性模型或者单峰模型均可;
如果小于3.0, 线性模型的结果要好于单峰模型
rm(list = ls())
pacman::p_load(tidyverse,vegan,ggrepel)  
species <- read.delim("phylum.top10.xls",header = T,
                  row.names = 1,check.names = F,sep="\t") %>% 
  t() %>% decostand(method = "hellinger")

envi <- read.delim("env1.log10.txt",
header = T,row.names = 1,check.names = F,sep="\t")

grp <- read.delim("group.xls",header = T,check.names = F,sep="\t")

decorana(species)
#根据看分析结果中Axis Lengths的第一轴的大小
#如果大于4.0,就应选CCA(基于单峰模型,典范对应分析)
#如果在3.0-4.0之间,选RDA和CCA均可
#如果小于3.0, RDA的结果会更合理(基于线性模型,冗余分析)

p <- rda(species~.,envi) %>% summary()
sp <- as.data.frame(p$species[,1:2])*2
st <- as.data.frame(p$sites[,1:2])
yz <- as.data.frame(p$biplot[,1:2])
ggplot() + geom_point(data=st,aes(RDA1,RDA2,
                                  shape=grp$group,fill=grp$group),size=3)+
  scale_shape_manual(values =c(18,19,21))+
  geom_segment(data = sp,aes(x = 0, y = 0, xend = RDA1, yend = RDA2), 
               arrow = arrow(angle=22.5,length = unit(0.35,"cm"),
                             type = "closed"),linetype=1,
 size=0.6,colour = "#E69F00")+
  geom_text_repel(data = sp,aes(RDA1,RDA2),label=row.names(sp))+
  geom_segment(data = yz,aes(x = 0, y = 0,
xend = RDA1, yend = RDA2), 
               arrow = arrow(angle=45,length = unit(0.35,"cm"),
                             type = "closed"),linetype=1,
               size=0.6,colour = "steelblue")+
  geom_text_repel(data = yz,aes(RDA1,RDA2,label=row.names(yz)))+
  labs(x=paste("RDA1(", format(100*p$cont[[1]][2,1],
digits=4),"%)",sep=""),
       y=paste("RDA2 (", format(100*p$cont[[1]][2,2], 
digits=4),"%)", sep=""))+
  geom_hline(yintercept=0,linetype=3,size=1) + 
  geom_vline(xintercept=0,linetype=3,size=1)+
  guides(shape=guide_legend(title=NULL,color="black"),
         fill=guide_legend(title=NULL))+
  theme_bw()+theme(panel.grid=element_blank())
rda.jpeg
rm(list=ls())
pacman::p_load(tidyverse,vegan,ggrepel)  
sample <- read.table("otu_table.txt", header = T, 
                         row.names=1,sep="\t") %>% t()
env <- read.table("environment.txt", header=TRUE, row.names=1,sep="\t")
group <- read.table("groups.txt", header=T) %>% as.list()
decorana(sample)
cca <- cca(sample, env, scale = TRUE)
p <- scores(cca)
CCAE <- as.data.frame(cca$CCA$biplot[,1:2])
plotdata <- data.frame(rownames(p$sites), p$sites[,1],p$sites[,2], group$group)
colnames(plotdata) <- c("sample","CCAS1","CCAS2","group")
#----------------------------------------------------------------------------------
ggplot(plotdata, aes(CCAS1, CCAS2)) +
  geom_label_repel(aes(CCAS1, CCAS2, label = sample), fill = "white",
                   color = "black", box.padding = unit(0.6,"lines"),
                   segment.colour = "grey50",
                   label.padding = unit(0.35,"lines")) +  
  geom_point(aes(fill = group,color=group),size = 3) + 
  labs(title = "CCA Plot",
       x = round(cca$CCA$eig[1]/sum(cca$CCA$eig)*100,2) %>%
  paste0("CCA1 ( ",.,"%"," )"),
       y = round(cca$CCA$eig[2]/sum(cca$CCA$eig)*100,2) %>%
  paste0("CCA2 ( ",.,"%"," )")) +
  geom_segment(data = CCAE, aes(x = 0, y = 0, xend = CCAE[,1], yend = CCAE[,2]),
               colour = "black", size = 1.2,
               arrow = arrow(angle = 30, length = unit(0.4, "cm"))) +
  geom_label_repel(data = CCAE, fill = "grey90",segment.colour = "black",
                   aes(x = CCAE[,1], y = CCAE[,2], label = rownames(CCAE))) +
  geom_vline(aes(xintercept = 0), linetype = "dotted") +
  geom_hline(aes(yintercept = 0), linetype = "dotted") +
  theme(panel.background = element_rect(fill = "white", colour = "black"), 
        panel.grid = element_blank(),
        axis.title = element_text(color = "black", size = 12),
        axis.ticks.length = unit(0.4,"lines"),
        axis.ticks = element_line(color = "black"),
        axis.line = element_line(colour = "black"),
        axis.title.x = element_text(colour = "black", size = 12),
        axis.title.y = element_text(colour="black", size = 12),
        axis.text = element_text(colour = "black", size = 12),
        legend.title = element_blank(),
        legend.text = element_text(size = 12), legend.key = element_blank(),
        plot.title = element_text(size = 14, colour = "black", 
                                  face = "plain", hjust = 0.5))  
cca.jpeg

相关文章

网友评论

      本文标题:微生物多样性qiime2分析流程(12) 数据可视化分析(中)

      本文链接:https://www.haomeiwen.com/subject/qntebktx.html