PhyloPhlAn 3.0: https://huttenhower.sph.harvard.edu/phylophlan
GitHub: https://github.com/biobakery/phylophlan
1. Installation:
conda install -c bioconda phylophlan=3.0
This attempt failed.
conda create -n python3.7 -c bioconda python=3.7 # create a new environment
conda activate python3.7 # enter the environment
conda install phylophlan=3.0 # install
phylophlan --version # check
# PhyloPhlAn version 3.0.51 (11 May 2020)
# PhyloPhlAn version 3.0.60 (27 November 2020)
This succeeded.
Alternatively, installing MetaPhlAn 3 also installs PhyloPhlAn 3 as a dependency.
conda create -n metaphlan python=3.7
conda activate metaphlan
conda install tbb=2020.2
conda install bowtie2
conda install -c bioconda metaphlan
phylophlan --version
# PhyloPhlAn version 3.0.60 (27 November 2020)
This succeeded.
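To record from a script which release ended up in the environment, the version line shown in the comments above can be parsed. This is a minimal sketch that assumes the output keeps the `PhyloPhlAn version X.Y.Z (date)` shape; `parse_phylophlan_version` is a hypothetical helper name, not part of PhyloPhlAn.

```shell
# Hypothetical helper: extract "X.Y.Z" from a line such as
# "PhyloPhlAn version 3.0.60 (27 November 2020)".
# Prints nothing if the line does not have that shape.
parse_phylophlan_version() {
    printf '%s\n' "$1" | sed -n 's/^PhyloPhlAn version \([0-9][0-9.]*\).*/\1/p'
}

# Usage (only meaningful if phylophlan is on PATH):
# parse_phylophlan_version "$(phylophlan --version 2>&1)"
```

The function takes the line as an argument rather than calling `phylophlan` itself, so it can be tried without an installed environment.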
2. Obtaining the databases
PhyloPhlAn provides its own databases and also supports custom-built databases.
Address: http://cmprod1.cibio.unitn.it/databases/PhyloPhlAn/phylophlan_databases.txt
#database_name database_url database_md5
amphora2 http://cmprod1.cibio.unitn.it/databases/PhyloPhlAn/amphora2.tar http://cmprod1.cibio.unitn.it/databases/PhyloPhlAn/amphora2.md5
#amphora2 https://zenodo.org/record/4005745/files/amphora2.tar?download=1 https://zenodo.org/record/4005745/files/amphora2.md5?download=1
phylophlan http://cmprod1.cibio.unitn.it/databases/PhyloPhlAn/phylophlan.tar http://cmprod1.cibio.unitn.it/databases/PhyloPhlAn/phylophlan.md5
#phylophlan https://zenodo.org/record/4005620/files/phylophlan.tar?download=1 https://zenodo.org/record/4005620/files/phylophlan.md5?download=1
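The listing above can also be read by a script to pick out the download URL for one database. This is a minimal sketch, assuming the listing has been saved locally with one record per line in the three whitespace-separated columns named in its header; `db_url` is a hypothetical helper, not part of PhyloPhlAn.

```shell
# Hypothetical helper: print the tar URL (column 2) for a database name
# (column 1) from a listing with the columns
# "database_name database_url database_md5".
# Lines starting with '#' (header and commented-out mirrors) are skipped.
db_url() {
    awk -v name="$2" '!/^#/ && $1 == name { print $2; exit }' "$1"
}

# Usage (assumes wget is installed and the listing was saved as
# phylophlan_databases.txt):
# wget -c "$(db_url phylophlan_databases.txt amphora2)"
```

With `wget -c` an interrupted transfer can be resumed, which may help on a slow connection.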
The connection from Linux was poor, so I downloaded the tar archives on Windows:
# MD5 checksums
587698f1b8593daba2719d587ba43463 amphora2.tar
9b3ce73a1d4808620161c27d7a739b48 phylophlan.tar
# Verify the MD5 checksums; no output means they match
diff <(md5sum amphora2.tar) amphora2.md5
diff <(md5sum phylophlan.tar) phylophlan.md5
To test that diff really catches a mismatch, I changed one character of a checksum, and the change was detected.
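The same check and the tamper test can be scripted. The sketch below compares only the hash field, so it does not depend on the file name or spacing stored in the `.md5` file; `verify_md5` is a hypothetical helper. If the `.md5` file lists the file name in the usual `md5sum` format, `md5sum -c amphora2.md5` from GNU coreutils is an alternative.

```shell
# Hypothetical helper: compare the MD5 of a file ($1) with the first
# field of the first line of a checksum file ($2).
# Returns 0 on a match and non-zero on a mismatch.
verify_md5() {
    actual=$(md5sum "$1" | awk '{ print $1 }')
    expected=$(awk 'NR == 1 { print $1 }' "$2")
    [ "$actual" = "$expected" ]
}

# Usage:
# verify_md5 amphora2.tar amphora2.md5 && echo "amphora2.tar OK"
# verify_md5 phylophlan.tar phylophlan.md5 && echo "phylophlan.tar OK"
```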
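Extraction
tar -xf amphora2.tar # extract the directory
bzcat amphora2/*.bz2 > amphora2/amphora2.faa # decompress and concatenate the files
amphora2 has 136 marker genes in total, merged here into a single .faa protein sequence file.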
tar -xf phylophlan.tar # extract the directory
bunzip2 -k phylophlan/phylophlan.faa.bz2
phylophlan has only one protein sequence file, containing about 340,000 protein sequences.
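The sequence count can be checked by counting FASTA header lines. This is a minimal sketch; `count_fasta_records` is a hypothetical helper name.

```shell
# Hypothetical helper: count FASTA records by counting header lines,
# i.e. lines that start with '>'.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Usage:
# count_fasta_records phylophlan/phylophlan.faa
# count_fasta_records amphora2/amphora2.faa
```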
Finally, delete all the compressed files.
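One way to do the cleanup in a single step is sketched below. It assumes the tarballs sit in the working directory next to the extracted folders and that GNU or BSD `find` is available (`-maxdepth` and `-delete` are not POSIX). `remove_archives` is a hypothetical helper. Deletion is irreversible, so run it only after the `.faa` files have been checked.

```shell
# Hypothetical helper: delete .bz2 and .tar files in the given directory
# and its immediate subdirectories, printing each path as it is removed.
remove_archives() {
    find "$1" -maxdepth 2 -type f \( -name '*.bz2' -o -name '*.tar' \) -print -delete
}

# Usage, from the directory holding amphora2.tar and phylophlan.tar:
# remove_archives .
```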
3. Building the database index
DIAMOND index
diamond makedb --in amphora2/amphora2.faa --db amphora2/amphora2
diamond makedb --in phylophlan/phylophlan.faa --db phylophlan/phylophlan
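Both indexes can be built in a loop that skips a database whose index already exists. This is a minimal sketch using the same `diamond makedb` call as above and the directory layout from the extraction step; `index_if_missing` and the `DIAMOND` override variable are hypothetical names introduced here.

```shell
# Name or path of the diamond executable; can be overridden by the caller.
DIAMOND=${DIAMOND:-diamond}

# Hypothetical helper: build <db>/<db>.dmnd from <db>/<db>.faa unless
# the index file is already present.
index_if_missing() {
    db=$1
    if [ -f "$db/$db.dmnd" ]; then
        echo "skip $db: index already present"
    else
        "$DIAMOND" makedb --in "$db/$db.faa" --db "$db/$db"
    fi
}

# Usage:
# for db in amphora2 phylophlan; do index_if_missing "$db"; done
```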
4. Tutorials
Example tutorials: https://github.com/biobakery/biobakery/wiki/PhyloPhlAn3
- Example 1:
1 Obtain one S. aureus genome
2 Retrieve the UniRef90 core proteins of S. aureus online
phylophlan_setup_database -g s__Staphylococcus_aureus
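3 Build the phylogenetic tree
4 Obtain more S. aureus reference genomes
5 Rebuild the tree and visualize it with GraPhlAn
- Example 2: rebuild the phylogenetic tree without relying on UniRef90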
phylophlan -d phylophlan
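- Example 3: processing SGB data from metagenomic assemblies
1 Obtain the genome bins from the Ethiopian metagenomes
2 Annotate the bins using SGB.Jan19 as the reference
phylophlan_metagenomic -d SGB.Jan19
3 Heat maps showing the presence/absence of bins across samples, and the classification and counts of SGBs
The first heat map shows the presence/absence of the top 21 SGBs found in the Ethiopian cohort; the second shows how many uSGBs (unknown), kSGBs (known), and unassigned bins each metagenomic sample contains.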
- Example 4:
1 Obtain the E. coli bins
2 Retrieve the core set of UniRef90 proteins for E. coli
phylophlan_setup_database -g s__Escherichia_coli
3 Add E. coli reference genomes
phylophlan_get_reference -g s__Escherichia_coli
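4 Configure and build the tree
- Example 5: build a tree from a uSGB and references from closely related phyla
1 Obtain the uSGB
2 Obtain references for the class Epsilonproteobacteria (-g c__Epsilonproteobacteria)
3 Obtain references for a closely related phylum (-g p__Spirochaetes)
4 Configure and build the tree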
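The `-g` values used in the examples above follow a visible pattern: a one-letter rank prefix, two underscores, and the taxon name with spaces replaced by underscores. The sketch below builds such labels. `taxon_label` is a hypothetical helper, and it covers only the three ranks that appear in this post (species, class, phylum); other ranks are an assumption left out on purpose.

```shell
# Hypothetical helper: build a label such as s__Staphylococcus_aureus
# from a rank name ($1) and a taxon name ($2).
taxon_label() {
    case "$1" in
        species) prefix=s ;;
        class)   prefix=c ;;
        phylum)  prefix=p ;;
        *) echo "unsupported rank: $1" >&2; return 1 ;;
    esac
    # Replace spaces in the taxon name with underscores.
    printf '%s__%s\n' "$prefix" "$(printf '%s' "$2" | tr ' ' '_')"
}

# Usage:
# phylophlan_setup_database -g "$(taxon_label species 'Escherichia coli')"
```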