Notes on the gene family analysis pipeline


Author: SnorkelingFan凡潜 | Published 2021-09-08 16:43
    perl /02_Cluster_stat_v1.1/bin/step3_Cluster_stat_family.pl category.txt all.cds cluster.stat-info --cluster_file all_orthomcl.out --type orthomcl --step 134 -q x.q
    perl /07_orthomcl_pipeline_v1.0/bin/obtain_4d_phase1.pl all.philip
    
    ## vi step3_Cluster_stat_family.pl
    
    This script collects statistics from the results of orthomcl or treefam.
    
    1. Collect cluster information from the cluster file.
            Files required: cluster_file category.txt.new cluster_stat_out;
            Output: cluster_stat_out;
    
    2. Summarize the cluster families from cluster_stat_out and draw veen_svg.
            Files required: category.txt.new all.cds cluster_stat_out;
            Output: 4spec_veen.input;
    
    3. Summarize gene family information, such as of_gene, unique_family, single_gene.
            Files required: category.txt.new all.cds cluster_stat_out;
            Output: family.stat.table;
    
    4. Filter the single-copy families from orthomcl.out and put the corresponding CDS together into the genefamily category,
    then translate them to pep and run muscle,
    and extract all.philip from the single-copy gene families.
            Files required: cluster_file all.cds category.txt.new;
            Output: ./singlecopy_genefamily/ ;
    
    

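The single-copy orthologous gene families extracted here are used only for building the tree. all.philip and every other philip file produced by the pipeline come from the single-copy orthologous gene families, and the later tree building is based on these files, so everything that goes into the tree is single-copy.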
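To make "single-copy" concrete, here is a minimal shell sketch. It is not the pipeline's own code (step 4 of step3_Cluster_stat_family.pl does this work in Perl). It assumes a whitespace-separated table in which column 1 is the family id, column 2 the total gene count, the last column the number of species represented, and the columns in between are per-species gene counts. That layout is only inferred from the shape of a cluster.stat-info line and has not been confirmed. The file name toy.stat-info and the function name single_copy_ids are hypothetical.

```shell
# Print the ids of families in which every species has exactly one gene.
# ASSUMED layout (not confirmed by the pipeline):
#   id  total_genes  per-species counts ...  species_count
single_copy_ids() {
    awk 'NF >= 4 {
        ok = 1
        # Columns 3 .. NF-1 are taken to be the per-species gene counts.
        for (i = 3; i < NF; i++) {
            if ($i != 1) { ok = 0; break }
        }
        if (ok) print $1
    }' "$1"
}

# Toy table with three species (made-up numbers, for illustration only).
cat > toy.stat-info <<'EOF'
1 3 1 1 1 3
2 4 2 1 1 3
3 2 0 2 0 1
4 3 1 1 1 3
EOF

single_copy_ids toy.stat-info
```

On the toy table this prints the ids 1 and 4. Family 2 has two genes in one species and family 3 is missing from two species, so neither counts as single-copy under this definition.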

The gene family clustering file cluster.stat-info contains the copy numbers for every family; each id is one gene family.

    $ tail -n 1 cluster.stat-info
    26006   2   0   2   0   0   0   0   0   0   0   0   0   0   0   1
    

In this example the id on the last line is 26006, so the analysis produced 26006 gene families.
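The count can also be cross-checked instead of being read off the last id alone. The sketch below assumes cluster.stat-info holds one family per line with a numeric id in column 1; lines whose first column is not a plain integer are skipped. The function name family_summary and the toy file are hypothetical.

```shell
# Report how many family lines there are and the largest id in column 1.
# If ids run consecutively from 1, the two numbers agree.
family_summary() {
    awk '$1 ~ /^[0-9]+$/ {
        n++
        if ($1 + 0 > max) max = $1 + 0
    }
    END { printf "families=%d max_id=%d\n", n, max }' "$1"
}

# Toy input (made-up numbers, for illustration only).
cat > toy_count.stat-info <<'EOF'
1 3 1 1 1 3
2 4 2 1 1 3
3 2 0 2 0 1
EOF

family_summary toy_count.stat-info
```

Running `family_summary cluster.stat-info` on the real file would show whether the line count and the largest id match. If they do, the last id is indeed the number of gene families.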

Article link: https://www.haomeiwen.com/subject/pibvwltx.html