WGD (Whole Genome duplication)

Author: 生信小白2018 | Source: published 2019-05-17 17:23, read 93 times
    I recently needed to run a WGD analysis and found a handy tool for it: wgd - simple command line tools for the analysis of ancient whole genome duplications
    

    Runtime environment

    Python 3.5 and Python 3.6 on Ubuntu Linux

    Dependencies

    For wgd blast:
        BLAST, from which it uses the blastp and makeblastdb commands
        MCL (https://micans.org/mcl/index.html)
    For wgd ks:
        One of the following multiple sequence alignment programs: MUSCLE, MAFFT or PRANK
        PAML (http://abacus.gene.ucl.ac.uk/software/paml.html).
        PhyML and FastTree (Note that FastTree should be executable as FastTree and not fasttree, so please specify an alias or symlink from the latter to the former if necessary.)
    For wgd syn: 
        i-ADHoRe 3.0 suite (http://bioinformatics.psb.ugent.be/beg/tools/i-adhore30)
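
Before installing wgd it can be worth confirming that the external programs listed above are actually on PATH. The following is only a minimal sketch: check_tools is a hypothetical helper, and the executable names passed to it (for example codeml for PAML, phyml for PhyML and i-adhore for i-ADHoRe) are my assumptions about how these tools are usually installed, so adjust them to your system. Only one of the three aligners is needed, so the example checks MAFFT alone.

```shell
# Hypothetical helper: print every given executable that is missing from PATH.
# Returns 0 only when all of them are found.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# The names below are assumptions; FastTree is spelled with capitals on purpose,
# because the note above says wgd expects it under exactly that name.
check_tools blastp makeblastdb mcl mafft codeml phyml FastTree i-adhore \
    || echo "the tools printed above were not found on PATH" >&2
```

If FastTree is only installed under the lowercase name fasttree, a symlink named FastTree in a directory on PATH is one way to satisfy the requirement mentioned in the dependency list.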
    

    Installation

    conda create --name python36 python=3.6    ## create a python36 virtual environment with conda
    source activate python36    ## activate the python36 environment
    
    $ git clone https://github.com/arzwa/wgd.git
    $ cd wgd
    $ pip install .
    

    After a successful installation, running wgd prints the program's usage banner

    $ wgd
    Usage: wgd [OPTIONS] COMMAND [ARGS]...
    
      Welcome to the wgd command line interface!
    
                             _______
                             \  ___ `'.
             _     _ .--./)   ' |--.\  \
       /\    \\   ///.''\\    | |    \  '
       `\\  //\\ //| |  | |   | |     |  '
         \`//  \'/  \`-' /    | |     |  |
          \|   |/   /("'`     | |     ' .'
           '        \ '---.   | |___.' /'
                     /'""'.\ /_______.'/
                    ||     ||\_______|/
                    \'. __//
                     `'---'
    

    Download the test data and run the software

    wget ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_dicots_04/Fasta/cds.all_transcripts.ath.fasta.gz
    wget ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_dicots_04/GFF/ath/annotation.all_transcripts.all_features.ath.gff3.gz
    wget ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_dicots_04/Fasta/cds.all_transcripts.car.fasta.gz
    wget ftp://ftp.psb.ugent.be/pub/plaza/plaza_public_dicots_04/GFF/car/annotation.all_transcripts.all_features.car.gff3.gz
    gunzip cds.all_transcripts.ath.fasta.gz 
    gunzip annotation.all_transcripts.all_features.ath.gff3.gz
    gunzip cds.all_transcripts.car.fasta.gz
    gunzip annotation.all_transcripts.all_features.car.gff3.gz
    mv cds.all_transcripts.ath.fasta ath.fasta
    mv cds.all_transcripts.car.fasta car.fasta
    mv annotation.all_transcripts.all_features.ath.gff3 ath.gff
    mv annotation.all_transcripts.all_features.car.gff3 car.gff
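
Since the following steps pass --cds to wgd, a quick sanity check of the renamed FASTA files can catch a failed download or rename early. This is a rough sketch and not part of wgd: count_records and odd_length_cds are hypothetical helper names, and the check simply flags records whose sequence length is not a multiple of three, which is a common sign of an incomplete coding sequence.

```shell
# Count the sequence records (header lines starting with '>') in a FASTA file.
count_records() {
    grep -c '^>' "$1"
}

# Print the IDs of records whose total sequence length is not a multiple of 3.
odd_length_cds() {
    awk '
        function report() { if (id != "" && len % 3 != 0) print id }
        /^>/ { report(); id = substr($1, 2); len = 0; next }
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { report() }
    ' "$1"
}

# Summarize the two files used in this post, skipping any that are absent.
for f in ath.fasta car.fasta; do
    [ -f "$f" ] || continue
    echo "$f: $(count_records "$f") records," \
         "$(odd_length_cds "$f" | wc -l) with a length not divisible by 3"
done
```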
    
    source activate python36
    ### wgd mcl generates the .mcl files
    wgd mcl -s ath.fasta --cds --mcl -o ath_out
    wgd mcl -s car.fasta --cds --mcl -o car_out
    wgd mcl --cds --one_v_one -s ath.fasta,car.fasta -id ath,car -e 1e-8 -o ath_car_out
    
    mkdir ks_out   ## move the .mcl files produced in the previous step into the new folder ks_out
    mv ath_out/ath.fasta.blast.tsv.mcl ks_out/ath.mcl
    mv car_out/car.fasta.blast.tsv.mcl ks_out/car.mcl
    mv ath_car_out/ath_car.ovo.tsv ks_out/ath_car.mcl
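
To get a feel for the clustering before computing Ks, the family sizes in a .mcl file can be tabulated. The sketch below assumes the usual MCL cluster layout of one family per line with tab-separated gene IDs; I have not verified that every file written by wgd follows it (the one-vs-one file in particular may differ), and family_size_table is a hypothetical name.

```shell
# Print "<family size><TAB><number of families>" for a cluster file,
# assuming one family per line with tab-separated gene IDs.
family_size_table() {
    awk -F '\t' 'NF > 0 { n[NF]++ } END { for (s in n) print s "\t" n[s] }' "$1" \
        | sort -n
}

# Tabulate the paralog families from this post if the file is present.
if [ -f ks_out/ath.mcl ]; then
    family_size_table ks_out/ath.mcl
fi
```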
    
    ### wgd ksd computes the Ks distribution from the .mcl files
    wgd ksd ks_out/ath.mcl ath.fasta -n 8 -o ath_ks
    wgd ksd ks_out/car.mcl car.fasta -n 8 -o car_ks
    wgd ksd -o ath_car_ks ks_out/ath_car.mcl ath.fasta car.fasta -n 8
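
Before plotting, a coarse text histogram gives a quick look at a Ks table. This sketch assumes the .ks.tsv file is tab-separated with a header row containing a column literally named Ks; that matches my recollection of the wgd output but should be checked against your own file. ks_histogram is a hypothetical helper, the bin width is fixed at 0.5, and values outside the open interval (0, max) are ignored.

```shell
# Usage: ks_histogram FILE [MAX]   (MAX defaults to 5)
# Prints "<bin start>-<bin end><TAB><count>" for the column named "Ks".
ks_histogram() {
    awk -F '\t' -v max="${2:-5}" '
        NR == 1 {
            for (i = 1; i <= NF; i++) if ($i == "Ks") col = i
            if (!col) { print "no Ks column found" > "/dev/stderr"; exit 1 }
            next
        }
        $col ~ /^[0-9]/ {
            v = $col + 0                       # force numeric interpretation
            if (v > 0 && v < max) count[int(v / 0.5)]++
        }
        END {
            if (!col) exit 1
            for (b = 0; b * 0.5 < max; b++)
                printf "%.1f-%.1f\t%d\n", b * 0.5, (b + 1) * 0.5, count[b]
        }
    ' "$1"
}
```

Calling ks_histogram on a Ks table with a maximum of 3 would print six bins from 0.0-0.5 up to 2.5-3.0. A peak away from the first bin is the kind of signal the wgd viz plots are meant to show in more detail.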
    
    mkdir ksout ## move the .ks.tsv files produced in the previous step into the new folder ksout
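
The commands for gathering the .ks.tsv files are not shown above, so here is one possible way to do it. It is a sketch under assumptions: collect_ks is a hypothetical helper, it copies rather than moves the files, and it simply searches the given output directories because the exact file names written by wgd ksd are not stated in this post.

```shell
# Usage: collect_ks TARGET DIR...
# Copy every *.ks.tsv found under the given directories into TARGET.
collect_ks() {
    local target="$1" dir
    shift
    mkdir -p "$target"
    for dir in "$@"; do
        [ -d "$dir" ] || continue          # skip output folders that do not exist
        find "$dir" -type f -name '*.ks.tsv' -exec cp {} "$target"/ \;
    done
}
```

With the output folders used in this post, the call would be collect_ks ksout ath_ks car_ks ath_car_ks.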
    
    ### plotting with wgd viz
    # plot a single distribution
    wgd viz -ks ath.ks.tsv  
    
    wgd viz -ks ksout/ -c red,blue,yellow
    
    # combined plot
    bokeh serve &       ## & runs the command in the background
    wgd viz -i -ks ath.fasta.ks.tsv,ath.fasta_car.fasta.ks.tsv,car.fasta.ks.tsv
    
    

    Plot result

    [Figure: plot produced by wgd viz; image not included]

    Reference: https://github.com/arzwa/wgd
