Data Analysis: Amplicon Phylogenetic Tree

Author: 生信学习者2 | Published: 2020-12-29 07:45

This post builds a phylogenetic tree of OTUs, starting from QIIME 2 result files. More notes are shared at https://zouhua.top/

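Generating the tree file and the taxonomy file

The shell steps in this section produce two files for plotting:

  • rep_seq_k5_cln.tree (the cleaned Newick tree)

  • rep_seqs_k5.tax (taxonomy for the retained sequences)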

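# install software (the QIIME 1 scripts used in the steps that follow run under Python 2.7)
conda install -n qiime -c bioconda qiime -y
# activate the environment that QIIME was installed into
source activate qiime
conda install -c bioconda clustalo -y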

# step1: count -> relative abundance
biom normalize-table -i feature-table.biom -r -o feature-table_norm.biom

# step2: drop OTUs below 0.5% abundance (--min_count_fraction 0.005)
filter_otus_from_otu_table.py --min_count_fraction 0.005 -i feature-table_norm.biom -o feature-table_norm_k5.biom
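
To make the arithmetic of steps 1 and 2 concrete, here is a minimal shell sketch on a made-up tab-separated table. It is not the biom or QIIME implementation. It assumes, as the QIIME 1 documentation describes --min_count_fraction, that an OTU is kept when its total is at least that fraction of the table total; on a relative-abundance table this works out to a mean relative abundance of at least 0.5% across samples. The file names, sample names, and counts (toy_table.tsv, S1, S2) are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical count table: rows are OTUs, columns are samples
printf '#OTU ID\tS1\tS2\n'  > toy_table.tsv
printf 'asv_a\t900\t600\n' >> toy_table.tsv
printf 'asv_b\t97\t396\n'  >> toy_table.tsv
printf 'asv_c\t3\t4\n'     >> toy_table.tsv

# Pass 1 (first read of the file): per-sample totals.
# Pass 2 (second read): relative abundances, then keep OTUs whose
# mean relative abundance is at least `min` (0.5% here).
awk -v min=0.005 'BEGIN { FS = OFS = "\t" }
  NR == FNR { if (FNR > 1) for (i = 2; i <= NF; i++) total[i] += $i; next }
  FNR == 1  { print; next }
  {
    sum = 0; line = $1
    for (i = 2; i <= NF; i++) {
      rel = $i / total[i]
      sum += rel
      line = line OFS sprintf("%.4f", rel)
    }
    if (sum / (NF - 1) >= min) print line
  }' toy_table.tsv toy_table.tsv > toy_table_norm_k5.tsv

cat toy_table_norm_k5.tsv
```

In this toy table asv_c averages 0.35% and is dropped, while asv_a and asv_b are kept.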

# step3: get fasta
#qiime tools export --input-path rep-seqs.qza --output-path rep-seqs
filter_fasta.py -f rep_seq.fa -o rep_seq_k5.fa -b feature-table_norm_k5.biom

# step4: multiple sequence alignment
clustalo -i rep_seq_k5.fa -o rep_seq_k5_clus.fa --seqtype=DNA --full --force --threads=30

# step5: build the tree file with FastTree
make_phylogeny.py -i rep_seq_k5_clus.fa -o rep_seq_k5.tree

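# step6: remove single quotes (') from the tree so R ggtree reads clean tip labels
# (in GNU sed, \' is an end-of-buffer anchor, so the quote must not be escaped)
sed "s/'//g" rep_seq_k5.tree > rep_seq_k5_cln.tree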

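# step7: get annotation information
# collect the sequence IDs that remain in the alignment
grep ">" rep_seq_k5_clus.fa | sed 's/>//g' > rep_seq_k5_clus.id
# look up each ID in taxonomy.tsv, split the lineage on ";" and keep the ID plus the first four ranks
awk 'BEGIN{OFS="\t";FS="\t"} NR==FNR {a[$1]=$0} NR>FNR {print a[$1]}' taxonomy.tsv rep_seq_k5_clus.id | sed 's/; /\t/g' | cut -f 1-5 | sed 's/;/\t/g' | cut -f 1-5 > rep_seqs_k5.tax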
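
Steps 6 and 7 can be tried on toy inputs before running them on real data. The sketch below is a variant rather than the exact pipeline: it assumes taxonomy.tsv has the usual three QIIME 2 columns (Feature ID, Taxon, Confidence), splits the lineage inside awk, and pads missing ranks with NA so that a short lineage cannot let the Confidence value slide into a rank column. All file names and IDs (toy.tree, asv_a, ...) are made up for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Toy Newick tree with quoted tip labels
printf "%s\n" "(('asv_a':0.1,'asv_b':0.2):0.05,'asv_c':0.3);" > toy.tree

# Step 6 on the toy tree: drop every single quote (unescaped pattern)
sed "s/'//g" toy.tree > toy_cln.tree

# Toy taxonomy table: Feature ID, Taxon, Confidence
printf 'Feature ID\tTaxon\tConfidence\n' > toy_taxonomy.tsv
printf 'asv_a\tk__Bacteria; p__Firmicutes; c__Bacilli; o__Lactobacillales; f__Lactobacillaceae\t0.99\n' >> toy_taxonomy.tsv
printf 'asv_b\tk__Bacteria; p__Proteobacteria\t0.80\n' >> toy_taxonomy.tsv
printf 'asv_c\tk__Bacteria; p__Bacteroidetes; c__Bacteroidia; o__Bacteroidales; f__Bacteroidaceae; g__Bacteroides\t0.95\n' >> toy_taxonomy.tsv

# IDs in the order they appear in the (hypothetical) alignment
printf 'asv_c\nasv_a\nasv_b\n' > toy.id

# Step 7 variant: ID plus the first four ranks, NA where a rank is missing.
# IDs absent from the taxonomy table are skipped instead of printed as blank lines.
awk 'BEGIN { FS = OFS = "\t" }
  NR == FNR { tax[$1] = $2; next }
  ($1 in tax) {
    n = split(tax[$1], rank, /; */)
    line = $1
    for (i = 1; i <= 4; i++) line = line OFS (i <= n ? rank[i] : "NA")
    print line
  }' toy_taxonomy.tsv toy.id > toy.tax

cat toy_cln.tree
cat toy.tax
```

The padded table always has five tab-separated columns, which matches the four rank names assigned in the R code.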

Visualization

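QIIME 2 names each ASV with a long hash code, so the names need to be replaced with readable ones. Ideally, rename them consistently as soon as the OTU table is produced.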
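
One way to rename IDs consistently at an earlier stage is to build a single old-to-new map and apply it to every file. The sketch below is an illustration under assumptions, not part of the original workflow: the file names (toy_rep_seq.fa, id_map.tsv) and the hash-like IDs are hypothetical, and it numbers sequences in FASTA order, which need not match the tip order used in the R code.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
cd "$workdir"

# Toy representative sequences with hash-like IDs (made up)
cat > toy_rep_seq.fa <<'EOF'
>4b5eeb300368260019c1fbc7a3c718fc
ACGTACGTAC
>fe30ff0f71a38a39cf1717ec2be3a2fc
ACGTTCGTAC
EOF

# Toy taxonomy table keyed by the same IDs
printf 'Feature ID\tTaxon\tConfidence\n' > toy_taxonomy.tsv
printf 'fe30ff0f71a38a39cf1717ec2be3a2fc\tk__Bacteria; p__Firmicutes\t0.90\n' >> toy_taxonomy.tsv
printf '4b5eeb300368260019c1fbc7a3c718fc\tk__Bacteria; p__Proteobacteria\t0.85\n' >> toy_taxonomy.tsv

# Rename FASTA headers to OTU_1, OTU_2, ... and record old -> new in id_map.tsv
awk -v map="id_map.tsv" '
  /^>/ { n++; old = substr($1, 2); new_id = "OTU_" n
         print old "\t" new_id > map
         print ">" new_id; next }
  { print }
' toy_rep_seq.fa > toy_rep_seq_renamed.fa

# Apply the same map to the first column of the taxonomy table
awk 'BEGIN { FS = OFS = "\t" }
  NR == FNR { m[$1] = $2; next }
  ($1 in m) { $1 = m[$1] }
  { print }
' id_map.tsv toy_taxonomy.tsv > toy_taxonomy_renamed.tsv

cat id_map.tsv
```

Keeping id_map.tsv alongside the results makes it possible to trace an OTU label back to its original ASV hash.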

library(dplyr)
library(tibble)
library(ggtree)
library(ggplot2)

# load data
tree <- read.tree("rep_seq_k5_cln.tree")
# the .tax file is tab-separated: sequence ID, then four taxonomic ranks
tax <- read.table("rep_seqs_k5.tax", sep="\t", row.names=1)

# curation: name the rank columns and replace ASV hashes with OTU_1, OTU_2, ...
colnames(tax) <- c("kingdom","phylum","class","order")
otuid <- data.frame(OTUID=tree$tip.label, 
                    OTU_ID=paste0("OTU_", seq_along(tree$tip.label)))
tax <- inner_join(tax %>% rownames_to_column("OTUID"), otuid, by = "OTUID") %>%
  column_to_rownames("OTU_ID") %>%
  dplyr::select(-OTUID)
# relabel the tips in the same order used to build otuid
tree$tip.label <- paste0("OTU_", seq_along(tree$tip.label))

Plot 1: rectangular tree colored by phylum

groupInfo <- split(row.names(tax), tax$phylum) 
tree <- groupOTU(tree, groupInfo)
ggtree(tree, aes(color=group))+  
  theme(legend.position = "right")+
  geom_tiplab(size=3)

Plot 2: fan layout without branch lengths

ggtree(tree, layout="fan", ladderize = FALSE, size=1.2, 
       branch.length = "none", aes(color=group))+
  geom_tiplab2(size=4)+ 
  theme(legend.position = "right")

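References

  1. 扩增子进化树分析 (Amplicon phylogenetic tree analysis)

If the referenced article raises any copyright concerns, please contact me. Thank you.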
