Rfam Local Installation Tutorial


Author: 小杜的生信筆記 | Published: 2019-08-20 20:23

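    Introduction to Rfam

    Rfam is a database for identifying non-coding RNAs, commonly used to annotate new nucleotide sequences or genome sequences. Searching it locally requires the Infernal software: http://eddylab.org/infernal/
    Infernal user guide: http://eddylab.org/infernal/Userguide.pdf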

    1. Download and install Infernal

    # Download infernal-1.1.1.tar.gz
    # (first create an Rfam directory inside the directory where you install software, and work there)
    wget http://eddylab.org/software/infernal/infernal-1.1.1.tar.gz
    tar xf infernal-1.1.1.tar.gz
    cd infernal-1.1.1
    ./configure --prefix=`pwd`/../infernal_bin
    # Build and install
    make
    make install
    cd easel; make install
    cd ../../infernal_bin/bin
    ls
    # The installed programs are listed in this directory
    export PATH=${PATH}:`pwd`  # add this directory to PATH (current shell session only)
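
After the install step it can help to confirm that the programs used later are actually reachable through PATH. The sketch below is only an illustration: `missing_tools` is a hypothetical helper name, not part of Infernal, and it simply prints every command name it cannot find.

```shell
# Print each command name from the argument list that is NOT found on PATH.
# missing_tools is a hypothetical helper, not an Infernal program.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Empty output means all three programs used in this tutorial were found.
missing_tools cmscan cmpress esl-seqstat
```

If any names are printed, re-run the `export PATH=...` line from inside `infernal_bin/bin`, or call the programs by their full path as the later steps do.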
    

    2. Download the database

    wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/12.2/Rfam.cm.gz
    gunzip Rfam.cm.gz
    wget ftp://ftp.ebi.ac.uk/pub/databases/Rfam/12.2/Rfam12.2.claninfo
    # Index Rfam.cm with Infernal's cmpress
    ../infernal_bin/bin/cmpress Rfam.cm  # in my case this only worked when run from inside the directory containing Rfam.cm
    # Output
    Working...    done.
    Pressed and indexed 2588 CMs and p7 HMM filters (2588 names and 2588 accessions).
    Covariance models and p7 filters pressed into binary file:  Rfam.cm.i1m
    SSI index for binary covariance model file:                 Rfam.cm.i1i
    Optimized p7 filter profiles (MSV part)  pressed into:      Rfam.cm.i1f
    Optimized p7 filter profiles (remainder) pressed into:      Rfam.cm.i1p
    # This output means the indexing is complete
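
The cmpress output above names four index files (`.i1m`, `.i1i`, `.i1f`, `.i1p`). A small check like the following can confirm they all exist before running a search. This is a sketch; `cm_is_pressed` is a hypothetical helper name and only tests for the files listed in that output.

```shell
# Return success only if all four cmpress index files exist next to the CM file.
# cm_is_pressed is a hypothetical helper; the extensions come from the cmpress output above.
cm_is_pressed() {
    cm=$1
    for ext in i1m i1i i1f i1p; do
        [ -f "$cm.$ext" ] || return 1
    done
    return 0
}

if cm_is_pressed Rfam.cm; then
    echo "Rfam.cm is indexed"
else
    echo "Rfam.cm still needs cmpress"
fi
```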
    

    3. Check the size of the genome to be searched [required]

    ../infernal_bin/bin/esl-seqstat ~/M.truncatula/Medtr_v4_0v1/JCVI.Medtr.v4.20130313.fasta
    # Output
    Format:              FASTA
    Alphabet type:       DNA
    Number of sequences: 230
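    Total # residues:    532015 # This is the number we need. Because the genome is double-stranded and the -Z parameter in the next step is given in millions of residues, compute 532015 * 2 / 1000000 = 1.06403 and use that as the value of -Z.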
    Smallest:            202
    Largest:             21302
    Average length:      2313.1
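
The -Z arithmetic described above (total residues, times 2 for both strands, divided by one million) can be done with awk. This is a minimal sketch; `z_from_residues` is a hypothetical helper name introduced here for illustration.

```shell
# Convert a total residue count into a -Z value:
# residues * 2 (both strands) / 1,000,000 (units of millions of residues).
# z_from_residues is a hypothetical helper, not an Infernal program.
z_from_residues() {
    awk -v n="$1" 'BEGIN { print n * 2 / 1000000 }'
}

z_from_residues 532015   # the total from the esl-seqstat output above

# The same calculation taken straight from esl-seqstat
# (field 4 of the "Total # residues:" line):
#   esl-seqstat my-genome.fa | awk '/^Total/ { print $4 * 2 / 1000000 }'
```

Note that awk prints non-integer results with six significant digits by default, which should be enough precision for -Z.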
    

    Run

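    # Rfam12.2.claninfo: the downloaded claninfo file (give its path)
    # Rfam.cm: the downloaded CM file
    # my-genome.fa: the sequences to search
    # my-genome.cmscan: the main output
    # my-genome.tblout: a second, tabular output
    cmscan -Z `esl-seqstat my-genome.fa | awk '/^Total/ {print $4*2/1000000}'` --cut_ga --rfam --nohmmonly --tblout my-genome.tblout --fmt 2 --clanin Rfam12.2.claninfo Rfam.cm my-genome.fa > my-genome.cmscan
    # This follows the command from the reference blog, with two fixes: the stray extra quote after the awk program is removed, and -Z is computed as in step 3 (the blog's version used int($4/2000000), which truncates to 0 for a genome this small).
    # The blog's original command always failed with an error when I ran it and produced no results.
    # As far as I know, --fmt 2 and --clanin were only added in Infernal 1.1.2, so the 1.1.1 build installed above will likely reject them.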
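
Once a `--tblout` file has been produced, a quick sanity check is to count its hit rows. In Infernal's tabular output, header and footer lines start with `#`, so every other non-empty line is a hit. The helper name `count_tblout_hits` below is hypothetical.

```shell
# Count hit rows in a cmscan --tblout file: non-empty lines that do not start with '#'.
# count_tblout_hits is a hypothetical helper, not an Infernal program.
count_tblout_hits() {
    grep -c '^[^#]' "$1"
}

# Usage, once my-genome.tblout exists:
#   count_tblout_hits my-genome.tblout
```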
    

    From the user guide on the official site

    [image.png: screenshot from the user guide]

    Command run following the user guide

    ~/software/infernal_bin/bin/cmscan ~/software/Rfam/Rfam.cm ../candidate_fasta/CPC_fasta/u_cpc.fasta
    # cmscan :: search sequence(s) against a CM database
    # INFERNAL 1.1.1 (July 2014)
    # Copyright (C) 2014 Howard Hughes Medical Institute.
    # Freely distributed under the GNU General Public License (GPLv3).
    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    # query sequence file:                   ../candidate_fasta/CPC_fasta/u_cpc.fasta
    # target CM database:                    /root/software/Rfam/Rfam.cm
    # number of worker threads:              1
    # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    
    Query:       XLOC_000318::chr1:4780155-4784203()  [L=4048]
    
    Hit scores:
     rank     E-value  score  bias  modelname   start    end   mdl trunc   gc  description
     ----   --------- ------ -----  ---------- ------ ------   --- ----- ----  -----------
     ------ inclusion threshold ------
      (1) ?      0.14   15.6   1.0  snR73        2280   2346 + hmm     - 0.30  -
      (2) ?      0.17   18.2   0.2  sroH         2044   1992 -  cm    no 0.25  -
      (3) ?       1.5   12.2   0.2  Afu_328      3042   3012 - hmm     - 0.29  -
      (4) ?       5.5   10.7   1.6  adapt33_1    3099   3052 - hmm     - 0.23  -
      (5) ?       5.8   18.7   0.0  SNORD19      3298   3375 +  cm    no 0.40  -
      (6) ?       6.5   16.5   0.1  snoR66       2441   2506 +  cm    no 0.26  -
      (7) ?       7.6    9.3   2.3  DLX6-AS1_2    136    241 + hmm     - 0.33  -
      (8) ?       9.4   23.9   0.2  KRAS_3UTR    1432   1501 +  cm    no 0.26  -
    
    
    Hit alignments:
    >> snR73  
     rank     E-value  score  bias mdl mdl from   mdl to       seq from      seq to       acc trunc   gc
     ----   --------- ------ ----- --- -------- --------    ----------- -----------      ---- ----- ----
      (1) ?      0.14   15.6   1.0 hmm        1       67 [.        2280        2346 + .. 0.66     - 0.30
    
                                               ::::::::::::::::::::.::::::::::::::::::::::::::::::::::::::::.::::::: CS
                                    snR73    1 GUUUAUGAUGAuUucCacUU.aUCACGACGGUCAaCUGcGuUcuUCgAuUGUUUAuuuaaG.aACuUUG 67  
                                               GUU A GAUGAuUu  a+UU +UCA   C GUCAaCUG+G U+u C+  UG UUA   a+G +A uUU 
      XLOC_000318::chr1:4780155-4784203() 2280 GUUGAGGAUGAUUUUUAUUUaUUCAUAUCUGUCAACUGUGAUUUCCU--UGAUUAAACAGGuGAGUUUA 2346
                                               5778899******6666555499*****************9988774..55555544333323333333 PP
    ......................
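
To pull just the "Hit scores:" table out of a long cmscan report like the one above, a short awk filter is enough. This is a sketch that assumes the column layout shown in that output (rank, flag, E-value, score, bias, modelname, ...); `list_hit_scores` is a hypothetical helper name.

```shell
# Read cmscan standard output on stdin and print "modelname E-value score"
# for each row of the "Hit scores:" table. Rows in the later "Hit alignments:"
# section also start with a rank such as "(1)", so they are skipped explicitly.
# list_hit_scores is a hypothetical helper; column positions are assumed from the output above.
list_hit_scores() {
    awk '
        /^ *Hit scores:/     { in_table = 1; next }
        /^ *Hit alignments:/ { in_table = 0 }
        in_table && $1 ~ /^\([0-9]+\)$/ { print $6, $3, $4 }
    '
}

# Usage:
#   cmscan Rfam.cm query.fasta | list_hit_scores
```

All eight hits in the output above are flagged `?` and sit below the "inclusion threshold" line, meaning none of them passed cmscan's default inclusion threshold for that query.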
    

    I got stuck at this step.
    To be continued...
    [2019.8.20]

    Reference: 本地使用Rfam (Using Rfam locally)


          Article link: https://www.haomeiwen.com/subject/eskmsctx.html