Ka/Ks calculation

Author: 清水咸鱼 | Published 2020-04-02 22:34

    Introduction

    A simple, multi-threaded way to quickly compute Ka/Ks for pairs of homologous genes.

    Dependencies

    • ParaAT2.0
    • KaKs_Calculator2.0

    ParaAT usage

    export PATH=/storage_wut/user/software/ParaAT2.0:$PATH
    
    cd /storage_wut/user/software/ParaAT2.0
    
    ParaAT.pl -h test.homologs -n test.cds -a test.pep -p proc -o output -f axt
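    --------------------------------
    -h, file listing the homologous genes
    -n, nucleotide (CDS) sequence file
    -a, protein sequence file
    -p, file specifying the thread count     ## the number of threads is given inside the file; default is 6
    -m, alignment tool                       ## e.g. muscle
    -g, remove codons that contain gaps in the alignment
    -k, run KaKs_Calculator                  ## computes the Ka/Ks values
    -o, output directory
    -f, format of the output alignment files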
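The `-p` option takes a file rather than a number. A minimal sketch of preparing it, assuming (as described above) that the file holds nothing but the thread count; the file name `proc` matches the example command, and the value 6 is only an illustration:

```shell
# Number of threads ParaAT should use (example value; adjust to your machine).
threads=6

# Write the thread count into the file that is later passed via "-p proc".
echo "$threads" > proc

# Show what was written.
cat proc
```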
    

    Calculating Ka/Ks

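    # Log the start time.
    echo "start at time $(date '+%F  %H:%M:%S')"

    # Put ParaAT and KaKs_Calculator on PATH.
    export PATH=/storage_wut/user/software/ParaAT2.0:$PATH
    export PATH=/storage_wut/user/software/KaKs_Calculator2.0/bin/Linux/:$PATH

    cd /storage_wut/user/project/06lumeng_project/19.homologs_kaks/01.kaks

    # Align each homologous pair with muscle, drop gapped codons (-g), write AXT files, and run KaKs_Calculator (-k).
    ParaAT.pl -h ../00.data/A_CC.collinearity_one2one.dat -n ../00.data/homo.gene.cds.fa -a ../00.data/homo.gene.pep.fa -p proc -m muscle -f axt -g -k -o result_dir

    # Merge the per-pair results, keeping only the first header line.
    cat ./result_dir/*.kaks | awk 'NR==1 || $1!="Sequence"' > ../all.kaks.result.xls
    # Extract the Ka/Ks column (field 5), skipping the header line and NA values.
    awk -F'\t' 'NR>1 && $5!="NA" {print $5}' ../all.kaks.result.xls > kaks.list

    echo "finish at time $(date '+%F  %H:%M:%S')"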
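The merge-and-extract step can be tried on toy data before running it on real results. The sketch below uses made-up file names and values in a scratch directory (all hypothetical) and assumes tab-separated `.kaks` files whose first line is a header starting with `Sequence`:

```shell
# Scratch directory with two stand-in result files (hypothetical names and values).
workdir=$(mktemp -d)
mkdir -p "$workdir/result_dir"

printf 'Sequence\tMethod\tKa\tKs\tKa/Ks\n'     >  "$workdir/result_dir/pair1.axt.kaks"
printf 'geneA-geneB\tMA\t0.02\t0.17\t0.11\n'   >> "$workdir/result_dir/pair1.axt.kaks"
printf 'Sequence\tMethod\tKa\tKs\tKa/Ks\n'     >  "$workdir/result_dir/pair2.axt.kaks"
printf 'geneC-geneD\tMA\t0.01\tNA\tNA\n'       >> "$workdir/result_dir/pair2.axt.kaks"

# Concatenate all files but keep only the very first header line.
cat "$workdir"/result_dir/*.kaks \
  | awk -F'\t' 'NR==1 || $1!="Sequence"' > "$workdir/all.kaks.result.xls"

# Keep the Ka/Ks column (field 5) for data rows whose value is not NA.
awk -F'\t' 'NR>1 && $5!="NA" {print $5}' "$workdir/all.kaks.result.xls" > "$workdir/kaks.list"

cat "$workdir/all.kaks.result.xls"
cat "$workdir/kaks.list"
```

Filtering on the first field instead of piping through `grep -v Sequence` keeps the header and the data rows in a single, ordered output stream.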
    
    
    ### Format of all.kaks.result.xls
    Sequence              Method  Ka         Ks        Ka/Ks     P-Value(Fisher)  Length  S-Sites  N-Sites  Fold-Sites(0:2:4)  Substitutions  S-Substitutions  N-Substitutio
    Cg-F_10146-gene7838   MA      0.0194491  0.172237  0.112921  6.96313e-06      303     67.5573  235.443  NA                 14             10.0464          3.95362        NA
    Cg-F_11450-gene46992  MA      0.018447   0.18238   0.101146  8.74657e-22      1335    376.13   958.87   NA                 75             59.6254          15.3746        NA  NA
    Cg-F_11533-gene3021   MA      0.0364833  0.133713  0.272848  3.03892e-07      984     254.578  729.422  NA                 56             31.4295          24.5705        NA
    Cg-F_11705-gene4507   MA      0.043183   0.281557  0.153372  5.71615e-10      450     99.3644  350.636  NA                 37             24.007           12.993         NA
    Cg-F_11829-gene26952  MA      0.0670496  0.195014  0.343819  0.000123585      528     128.586  399.414  NA                 47             22.7275          24.2725        NA
    Cg-F_12075-gene67778  MA      0.163755   0.446331  0.366892  4.00233e-08      510     129.087  380.913  NA                 96             46.0956          49.9044        NA
    Cg-F_12095-gene37099  MA      0.0459748  0.131137  0.350585  3.28611e-05      1056    236.285  819.715  NA                 64             28.8778          35.1222        NA
    Cg-F_12212-gene32496  MA      0.0351454  0.113734  0.309015  0.000255903      639     182.649  456.351  NA                 34             19.1865          14.8135        NA
    Cg-F_12217-gene33956  MA      0.0545515  0.128713  0.423823  0.00831507       552     132.318  419.682  NA                 37             15.7831          21.2169        NA
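The fifth column is the third divided by the fourth. As a sanity check, the sketch below recomputes Ka/Ks for the first two rows shown above and compares the result with the reported value; the tolerance of 1e-4 is an arbitrary choice for illustration:

```shell
# First five columns of the first two rows from the table above, tab-separated.
rows=$(
  printf '%s\t%s\t%s\t%s\t%s\n' Cg-F_10146-gene7838  MA 0.0194491 0.172237 0.112921
  printf '%s\t%s\t%s\t%s\t%s\n' Cg-F_11450-gene46992 MA 0.018447  0.18238  0.101146
)

# Recompute Ka/Ks from Ka (field 3) and Ks (field 4) and compare it with field 5.
check=$(printf '%s\n' "$rows" | awk -F'\t' '{
  ratio = $3 / $4
  diff = ratio - $5
  if (diff < 0) diff = -diff
  printf "%s\t%.6f\t%s\t%s\n", $1, ratio, $5, (diff < 1e-4 ? "ok" : "mismatch")
}')

echo "$check"
```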
    
    

    Plotting the Ka/Ks histogram

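    rm(list = ls())
    library(ggplot2)
    # windowsFonts() exists only on Windows; skip the font registration elsewhere.
    if (.Platform$OS.type == "windows") {
      windowsFonts(myFont = windowsFont("Times New Roman"))
    }
    setwd("D:\\gooagle_data\\work_r\\kaks")
    # One Ka/Ks value per line; coerce to numeric and drop anything non-numeric (e.g. a header line).
    data <- read.table("kaks.list", sep = "\t", stringsAsFactors = FALSE)
    data$V1 <- suppressWarnings(as.numeric(data$V1))
    data <- data[!is.na(data$V1), , drop = FALSE]

    # Quick look with wide bins.
    ggplot(data, aes(V1)) + geom_histogram(color = '#39A0FE', fill = '#39A0FE', binwidth = 0.5)

    # Final figure: narrow bins, x axis limited to [-0.1, 5] (values outside are dropped with a warning).
    ggplot(data, aes(V1)) + geom_histogram(fill = '#39A0FE', binwidth = 0.03, color = 'white') +
      ylab(label = 'Number of gene pairs') + xlab(label = 'Ka/Ks') + theme_classic() +
      theme(axis.title = element_text(size = 20), axis.text = element_text(size = 18, color = "black")) +
      scale_x_continuous(limits = c(-0.1, 5), breaks = c(0, 1, 2, 3, 4, 5))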
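Because `scale_x_continuous(limits = ...)` discards values outside the limits, it can help to check how many Ka/Ks values would be lost before plotting. A rough sketch with awk, using a made-up list of values in place of `kaks.list`:

```shell
# Stand-in for kaks.list: one Ka/Ks value per line (hypothetical values).
list=$(mktemp)
printf '%s\n' 0.11 0.35 0.42 1.8 7.5 > "$list"

# Count all values and those falling outside the plotted x range [lo, hi].
summary=$(awk -v lo=-0.1 -v hi=5 '
  { n++; if ($1 < lo || $1 > hi) out++ }
  END { printf "total=%d outside=%d\n", n, out }
' "$list")

echo "$summary"
```

If many values fall outside the range, widening the limits may be preferable to silently dropping them.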
    
    
    [Figure: Ka/Ks histogram]

