MetaPhlAn2: Taxonomic Profiling of Metagenomes

Author: 胡童远 | Published: 2020-08-10 09:17

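    Introduction

    The previous post introduced MetaPhlAn (a tutorial on taxonomic profiling of metagenomic microbial communities). This post covers how to use MetaPhlAn2.

    Bitbucket page: https://bitbucket.org/biobakery/biobakery/wiki/metaphlan2

    Dependencies: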
    Python (version >= 2.7)
    Bowtie2
    Numpy
    Pandas (optional, only required by utility scripts)
    BioPython (optional, only required by utility scripts)
    SciPy (optional, only required by utility scripts)
    Matplotlib (optional, only required by utility scripts)
    biom (optional, only required for biom format input/output)
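
Before installing, it can help to see which of the required tools are already available. The following is a minimal sketch, assuming a POSIX shell; `check_tool` is a hypothetical helper name and not part of MetaPhlAn2.

```shell
# check_tool is a hypothetical helper, not part of MetaPhlAn2.
# It reports whether an executable can be found on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Required executables from the dependency list above.
for tool in python bowtie2; do
    check_tool "$tool" || true
done

# Numpy is a Python module, so test it with an import instead.
if python -c 'import numpy' >/dev/null 2>&1; then
    echo "found: numpy"
else
    echo "missing: numpy"
fi
```

Anything reported as missing should be pulled in by the conda installation in the next step.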

    I. Installing with conda

    conda install -c bioconda metaphlan2
    

    II. Sequencing data

    Download on Windows:
    SRS014476-Supragingival_plaque.fasta.gz
    SRS014494-Posterior_fornix.fasta.gz
    SRS014459-Stool.fasta.gz
    SRS014464-Anterior_nares.fasta.gz
    SRS014470-Tongue_dorsum.fasta.gz
    SRS014472-Buccal_mucosa.fasta.gz

    Download on Linux (commands shown for three of the six files):

    curl -O https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input/SRS014476-Supragingival_plaque.fasta.gz
    curl -O https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input/SRS014494-Posterior_fornix.fasta.gz
    curl -O https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input/SRS014459-Stool.fasta.gz
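
The curl commands above cover only three of the six samples used later. The sketch below builds the URL for every sample from the same base path. That the other three files are hosted under that path is an assumption, and `sample_url` and `download_samples` are hypothetical helper names.

```shell
# Base URL copied from the curl commands above. Assumption: the remaining
# three files are hosted under the same path.
BASE_URL="https://bitbucket.org/biobakery/biobakery/raw/tip/demos/biobakery_demos/data/metaphlan2/input"

# The six sample names from the file list above, one per line.
SAMPLES="SRS014476-Supragingival_plaque
SRS014494-Posterior_fornix
SRS014459-Stool
SRS014464-Anterior_nares
SRS014470-Tongue_dorsum
SRS014472-Buccal_mucosa"

# Print the download URL for one sample name (hypothetical helper).
sample_url() {
    echo "${BASE_URL}/$1.fasta.gz"
}

# Download every sample into the current directory (hypothetical helper).
download_samples() {
    for s in $SAMPLES; do
        # -f: fail on HTTP errors instead of saving an error page
        # -L: follow redirects
        curl -fL -O "$(sample_url "$s")" || echo "download failed: $s" >&2
    done
}
```

Calling `download_samples` in the target directory would then attempt to fetch all six archives and report any that fail.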
    

    III. MetaPhlAn2 analysis

    1. Preparation

    mkdir metaphlan2_analysis
    mv ~/Downloads/SRS*.fasta.gz metaphlan2_analysis/
    cd metaphlan2_analysis
    ls
    

    2. Single-sample analysis

    # Profile the first sample
    metaphlan2.py SRS014476-Supragingival_plaque.fasta.gz --input_type fasta > SRS014476-Supragingival_plaque_profile.txt
    
    # Inspect the alignment results
    less -S SRS014476-Supragingival_plaque.fasta.gz.bowtie2out.txt
    
    # Inspect the single-sample species abundance table
    less -S SRS014476-Supragingival_plaque_profile.txt
    
    # Multithreaded mode, second sample
    metaphlan2.py SRS014459-Stool.fasta.gz --input_type fasta --nproc 4 > SRS014459-Stool_profile.txt
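
Paging through a profile with `less` shows every taxonomic level at once. As a rough sketch, the species-level rows can be pulled out and ranked by abundance. This assumes the profile is a two-column, tab-separated file (clade name, relative abundance); `top_species` is a hypothetical helper name.

```shell
# top_species is a hypothetical helper: print the N most abundant species
# from a two-column, tab-separated MetaPhlAn2 profile (default N = 10).
top_species() {
    profile="$1"
    n="${2:-10}"
    # Keep species-level rows (contain s__ but not strain-level t__),
    # strip the lineage prefix, then sort by abundance, highest first.
    grep 's__' "$profile" | grep -v 't__' \
        | sed 's/^.*s__//' \
        | sort -t "$(printf '\t')" -k2,2nr \
        | head -n "$n"
}
```

For example, `top_species SRS014459-Stool_profile.txt 5` would list the five most abundant species in the stool sample.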
    

    3. Multi-sample analysis

    # The remaining four samples
    metaphlan2.py SRS014464-Anterior_nares.fasta.gz --input_type fasta --nproc 4 > SRS014464-Anterior_nares_profile.txt
    metaphlan2.py SRS014470-Tongue_dorsum.fasta.gz --input_type fasta --nproc 4 > SRS014470-Tongue_dorsum_profile.txt
    metaphlan2.py SRS014472-Buccal_mucosa.fasta.gz --input_type fasta --nproc 4 > SRS014472-Buccal_mucosa_profile.txt
    metaphlan2.py SRS014494-Posterior_fornix.fasta.gz --input_type fasta --nproc 4 > SRS014494-Posterior_fornix_profile.txt
    

    Or:

    # Profile all six samples in a single loop
    for f in SRS*.fasta.gz
    do
        metaphlan2.py "$f" --input_type fasta --nproc 4 > "${f%.fasta.gz}_profile.txt"
    done
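
Each run leaves a `.bowtie2out.txt` file next to its input, and MetaPhlAn2 normally stops with an error when that file already exists, so repeating the loop after a partial run can fail. The sketch below reuses an existing alignment file instead of realigning. `profile_sample` is a hypothetical helper name, and the `METAPHLAN` variable exists only so the command can be swapped out for a dry run.

```shell
# Command to run; override with METAPHLAN=echo for a dry run.
METAPHLAN="${METAPHLAN:-metaphlan2.py}"

# profile_sample is a hypothetical helper: profile one sample, reusing the
# intermediate bowtie2 output from an earlier run when it is present.
profile_sample() {
    fasta="$1"
    nproc="${2:-4}"
    out="${fasta%.fasta.gz}_profile.txt"
    bt2="${fasta}.bowtie2out.txt"
    if [ -s "$bt2" ]; then
        # Alignment already done: start from the bowtie2 output.
        "$METAPHLAN" "$bt2" --input_type bowtie2out --nproc "$nproc" > "$out"
    else
        "$METAPHLAN" "$fasta" --input_type fasta --nproc "$nproc" > "$out"
    fi
}
```

The loop body above could then be replaced by `profile_sample "$f"`.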
    

    4. Species abundance tables for the six samples
    SRS014459-Stool_profile.txt
    SRS014464-Anterior_nares_profile.txt
    SRS014470-Tongue_dorsum_profile.txt
    SRS014472-Buccal_mucosa_profile.txt
    SRS014476-Supragingival_plaque_profile.txt
    SRS014494-Posterior_fornix_profile.txt

    5. Alignment results for the six samples
    SRS014459-Stool.fasta.gz.bowtie2out.txt
    SRS014464-Anterior_nares.fasta.gz.bowtie2out.txt
    SRS014470-Tongue_dorsum.fasta.gz.bowtie2out.txt
    SRS014472-Buccal_mucosa.fasta.gz.bowtie2out.txt
    SRS014476-Supragingival_plaque.fasta.gz.bowtie2out.txt
    SRS014494-Posterior_fornix.fasta.gz.bowtie2out.txt

    6. Merging the six species abundance tables

    merge_metaphlan_tables.py *_profile.txt > merged_abundance_table.txt
    

    This produces the merged table: merged_abundance_table.txt

    # Inspect the merged table
    less -S merged_abundance_table.txt
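
A quick sanity check on the merged table is to count its sample columns, which should be six here. This is a sketch that assumes the header row starts with `ID` (as the later `grep` for `^ID` implies) and that the table is tab-separated; `count_samples` is a hypothetical helper name.

```shell
# count_samples is a hypothetical helper: count the sample columns in the
# header row (the line whose first field is "ID") of a merged table.
count_samples() {
    awk -F '\t' '$1 == "ID" { print NF - 1; exit }' "$1"
}
```

Running `count_samples merged_abundance_table.txt` prints the number of samples that were merged.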
    

    IV. Drawing a heatmap with hclust2

    1. Install hclust2 with conda

    conda install -c biobakery hclust2
    

    2. Extract species-level abundances

    grep -E "(s__)|(^ID)" merged_abundance_table.txt | grep -v "t__" | sed 's/^.*s__//g' > merged_abundance_table_species.txt
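
To see what this pipeline does, it can be run on a toy table. The sketch below wraps the same three commands in a function and feeds it made-up rows; `extract_species` and `demo_table` are hypothetical names, and the values and strain label are invented for illustration.

```shell
# Same filter as above, reading from standard input (hypothetical helper):
#   1. keep the header (starts with ID) and any row containing s__
#   2. drop strain-level rows (contain t__)
#   3. strip everything up to and including the last s__
extract_species() {
    grep -E "(s__)|(^ID)" | grep -v "t__" | sed 's/^.*s__//g'
}

# Toy merged table with invented values.
demo_table() {
    printf 'ID\tsampleA\tsampleB\n'
    printf 'k__Bacteria\t100.0\t100.0\n'
    printf 'k__Bacteria|p__Firmicutes|s__Eubacterium_siraeum\t12.5\t0.0\n'
    printf 'k__Bacteria|p__Firmicutes|s__Eubacterium_siraeum|t__made_up_strain\t12.5\t0.0\n'
}

demo_table | extract_species
```

Only the header and the species row survive, and the species row is reduced to the bare name `Eubacterium_siraeum` followed by its abundances.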
    

    3. Draw the heatmap

    hclust2.py -i merged_abundance_table_species.txt -o abundance_heatmap_species.png --ftop 25 --f_dist_f braycurtis --s_dist_f braycurtis --cell_aspect_ratio 0.5 -l --flabel_size 6 --slabel_size 6 --max_flabel_len 100 --max_slabel_len 100 --minv 0.1 --dpi 300
    

    V. Drawing a tree (cladogram) with GraPhlAn

    1. Install GraPhlAn with conda

    conda install -c biobakery graphlan
    

    2. Prepare the input files

    This produces merged_abundance.tree.txt and merged_abundance.annot.txt

    export2graphlan.py --skip_rows 1,2 -i merged_abundance_table.txt --tree merged_abundance.tree.txt --annotation merged_abundance.annot.txt --most_abundant 100 --abundance_threshold 1 --least_biomarkers 10 --annotations 5,6 --external_annotations 7 --min_clade_size 1
    

    3. Draw the tree

    Outputs:
    merged_abundance.xml
    merged_abundance.png
    merged_abundance_legend.png
    merged_abundance_annot.png

    graphlan_annotate.py --annot merged_abundance.annot.txt merged_abundance.tree.txt merged_abundance.xml
    graphlan.py --dpi 300 merged_abundance.xml merged_abundance.png --external_legends
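
After the two commands finish, a small check can confirm that the listed output files were actually written. This is a generic sketch; `check_outputs` is a hypothetical helper name and not part of GraPhlAn.

```shell
# check_outputs is a hypothetical helper: report any expected file that is
# missing or empty, and return non-zero if at least one is.
check_outputs() {
    status=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

For this step it could be called as `check_outputs merged_abundance.xml merged_abundance.png merged_abundance_legend.png merged_abundance_annot.png`.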
    

    VI. Species-level heatmap with PanPhlAn

    PanPhlAn tutorial

    1. Input data

    MetaPhlAn intermediate bowtie2 output files

    13530241_SF05.fasta.gz.bowtie2out.txt
    13530241_SF06.fasta.gz.bowtie2out.txt
    19272639_SF05.fasta.gz.bowtie2out.txt
    19272639_SF06.fasta.gz.bowtie2out.txt
    40476924_SF05.fasta.gz.bowtie2out.txt
    40476924_SF06.fasta.gz.bowtie2out.txt

    2. Build the abundance table for the selected species

    Species: s__Eubacterium_siraeum
    Abundance: greater than 1%

    metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 13530241_SF05.fasta.gz.bowtie2out.txt > 13530241_SF05.siraeum.txt
    metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 13530241_SF06.fasta.gz.bowtie2out.txt > 13530241_SF06.siraeum.txt
    metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 19272639_SF05.fasta.gz.bowtie2out.txt > 19272639_SF05.siraeum.txt
    metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 19272639_SF06.fasta.gz.bowtie2out.txt > 19272639_SF06.siraeum.txt
    metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 40476924_SF05.fasta.gz.bowtie2out.txt > 40476924_SF05.siraeum.txt
    metaphlan2.py --input_type bowtie2out -t clade_specific_strain_tracker --clade s__Eubacterium_siraeum --min_ab 1.0 40476924_SF06.fasta.gz.bowtie2out.txt > 40476924_SF06.siraeum.txt
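
The six commands differ only in the file name, so they can be folded into a loop. The following is a sketch under that assumption; `track_clade` and `track_all` are hypothetical helper names, and the `METAPHLAN` variable exists only so the command can be swapped out for a dry run.

```shell
# Command to run; override with METAPHLAN=echo for a dry run.
METAPHLAN="${METAPHLAN:-metaphlan2.py}"

# track_clade is a hypothetical helper wrapping the command shown above
# for a single bowtie2out file.
track_clade() {
    bt2="$1"
    # 13530241_SF05.fasta.gz.bowtie2out.txt -> 13530241_SF05.siraeum.txt
    out="${bt2%.fasta.gz.bowtie2out.txt}.siraeum.txt"
    "$METAPHLAN" --input_type bowtie2out -t clade_specific_strain_tracker \
        --clade s__Eubacterium_siraeum --min_ab 1.0 "$bt2" > "$out"
}

# track_all is a hypothetical helper: process every bowtie2out file in the
# current directory.
track_all() {
    for f in *.fasta.gz.bowtie2out.txt; do
        [ -e "$f" ] || continue   # the glob matched nothing
        track_clade "$f"
    done
}
```

Running `track_all` in the directory holding the six bowtie2out files should produce the same `.siraeum.txt` files as the individual commands.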
    

    Results:
    13530241_SF05.siraeum.txt
    13530241_SF06.siraeum.txt
    19272639_SF05.siraeum.txt
    19272639_SF06.siraeum.txt
    40476924_SF05.siraeum.txt
    40476924_SF06.siraeum.txt

    3. Merge the results

    merge_metaphlan_tables.py *.siraeum.txt > siraeum_tracker.txt
    

    4. Draw the heatmap

    hclust2.py -i siraeum_tracker.txt -o siraeum_tracker.png --skip_rows 1 --f_dist_f hamming --no_flabels --dpi 300 --cell_aspect_ratio 0.01
    
