RepeatMasker

Author: 6有才 | Published 2016-04-23 20:57

    What

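    • RepeatMasker is a library-based tool: it identifies repeats by similarity search against a library of known repeat sequences. It masks transposon-derived (interspersed) repeats and low-complexity sequences in the input, by default replacing them with N. It can be applied to almost any species and is an essential tool for genome and non-coding RNA work. In human genome analysis, over 56% of the sequence is masked. For the sequence comparison RepeatMasker can use one of several common search engines: nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher (several engines can be installed, but only one is used per run).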

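    • Repbase is created and maintained by the Genetic Information Research Institute (GIRI) in the United States. It collects transposable elements and other repeat sequences together with their annotations.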

    RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam ( profile HMM library derived from Repbase sequences ) and Repbase, a service of the Genetic Information Research Institute.
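As a rough way to see how much of a sequence was masked, one can count the N characters in the masked FASTA file that RepeatMasker writes next to the input (`<input>.masked`). The following is a minimal sketch: the function name `masked_fraction` is made up for illustration and is not part of RepeatMasker, and any Ns already present in the input are counted as well.

```shell
# masked_fraction FILE: print the percentage of bases that are N (or n) in a FASTA file.
# Hypothetical helper for illustration, not part of RepeatMasker.
masked_fraction() {
    awk '
        /^>/ { next }                       # skip FASTA header lines
        {
            gsub(/[ \t\r]/, "")             # drop stray whitespace and carriage returns
            total += length($0)             # count every base on this line
            masked += gsub(/[Nn]/, "")      # gsub returns how many Ns it replaced
        }
        END {
            if (total == 0) { print "0.00"; exit }
            printf "%.2f\n", 100 * masked / total
        }
    ' "$1"
}

# Usage (assumes RepeatMasker has already been run on test.fa):
#   masked_fraction test.fa.masked
```

The number only approximates the masked share when the original sequence contains few or no Ns of its own.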

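    Online service

    • RepeatMasker also offers an online service: upload a nucleotide sequence or FASTA file, choose the search engine, the speed/sensitivity setting, the species and the output format, and submit. Results arrive within a few minutes, which makes it very convenient.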
    • Search Engine
      • abblast
      • rmblast
      • hmmer
      • cross_match
    • Speed/Sensitivity
      • rush
      • quick
      • default
      • slow
    • DNA source
      • Human
      • Mouse
      • Arabidopsis
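The same choices exist on the command line of a local installation. The sketch below maps the web form options to flags as listed in the RepeatMasker help text (`-engine`, `-qq` for rush, `-q` for quick, `-s` for slow, `-species`). The helper name `rm_command` is hypothetical, it only prints the command instead of running it, and the engine names accepted by `-engine` can differ from the web labels (for example `crossmatch`), so it is worth checking `RepeatMasker -help` for the installed version.

```shell
# rm_command ENGINE SPEED SPECIES FASTA
# Print (not run) a RepeatMasker command line matching the web form choices.
# Hypothetical helper; verify the flags against `RepeatMasker -help`.
rm_command() {
    engine=$1 speed=$2 species=$3 fasta=$4
    case $speed in
        rush)    speed_flag="-qq" ;;
        quick)   speed_flag="-q" ;;
        slow)    speed_flag="-s" ;;
        default) speed_flag="" ;;      # default sensitivity needs no flag
        *) echo "unknown speed: $speed" >&2; return 1 ;;
    esac
    # ${speed_flag:+ ...} inserts the flag only when it is non-empty
    echo "RepeatMasker -engine $engine${speed_flag:+ $speed_flag} -species $species $fasta"
}

# Usage:
#   rm_command rmblast quick human test.fa
```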

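    Installing RepeatMasker locally

    Besides the RepeatMasker program itself, a local installation needs TRF (Tandem Repeats Finder), a sequence search engine (RMBlast is used as the example here) and the Repbase database.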

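    • TRF
    wget http://tandem.bu.edu/trf/downloads/trf407b.linux
    chmod +x trf407b.linux # the downloaded binary is not executable by default
    sudo mv trf407b.linux /usr/local/bin/trf # remember this path (path 1)
    /usr/local/bin/trf # run it once to check that it starts; sudo is not needed here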
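Since the configure step later asks for the TRF path, it can help to confirm that the file is really there and executable. A minimal sketch, assuming the install location used above; `check_executable` is a made-up helper name:

```shell
# check_executable PATH: succeed only if PATH is a regular file with the execute bit set.
# Hypothetical helper for illustration.
check_executable() {
    if [ -f "$1" ] && [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or not executable: $1" >&2
        return 1
    fi
}

# Usage:
#   check_executable /usr/local/bin/trf
```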
    
    • RMBlast
    wget ftp://ftp.ncbi.nlm.nih.gov/blast/executables/rmblast/2.2.28/ncbi-rmblastn-2.2.28-src.tar.gz
    tar -zxvf ncbi-rmblastn-2.2.28-src.tar.gz
    cd ncbi-rmblastn-2.2.28-src/c++
    ./configure --with-mt --prefix=/usr/local/rmblast --without-debug
    make
    sudo make install
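    # Remember the directory that holds the RMBlast binaries (path 2): with the prefix above this should be /usr/local/rmblast/bin after `make install`; the build directory is */ncbi-rmblastn-2.2.28-src/c++/GCC480-ReleaseMT64/bin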
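Because the binaries may sit either in the install prefix or in the build directory, a small search over candidate directories can settle which path to give to the RepeatMasker configure script. This is only a sketch: `find_rmblast_dir` is a hypothetical helper, and the candidate paths in the usage line are assumptions based on the commands above.

```shell
# find_rmblast_dir DIR...: print the first directory that contains an executable rmblastn.
# Hypothetical helper for illustration.
find_rmblast_dir() {
    for dir in "$@"; do
        if [ -f "$dir/rmblastn" ] && [ -x "$dir/rmblastn" ]; then
            echo "$dir"
            return 0
        fi
    done
    echo "rmblastn not found in: $*" >&2
    return 1
}

# Usage (candidate paths are assumptions, adjust to your system):
#   find_rmblast_dir /usr/local/rmblast/bin ./ncbi-rmblastn-2.2.28-src/c++/*/bin
```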
    
    • Repbase
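      Downloading Repbase requires registering on the official site. Commercial organizations have to pay, non-profit organizations can use it for free, and registrations are approved manually. Copies can also be found through Google or Baidu. Unpack the download and keep it for a later step.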

    • RepeatMasker

    wget http://www.repeatmasker.org/RepeatMasker-open-4-0-6.tar.gz
    tar -zxvf RepeatMasker-open-4-0-6.tar.gz # unpack; this creates the RepeatMasker directory
    cd RepeatMasker
    perl configure
    <PRESS ENTER TO CONTINUE> # press Enter to continue
    Enter path [ ]: # path to the perl interpreter
    Enter path [ ]: # path of the RepeatMasker installation directory
    Enter path [ ]: # path to TRF (path 1)
    
    Add a Search Engine: # choose a search engine (it must already be installed) and enter its path (path 2)
    1. CrossMatch: [ Un-configured ]
    2. RMBlast - NCBI Blast with RepeatMasker extensions: [ Un-configured ]
    3. WUBlast/ABBlast (required by DupMasker): [ Un-configured ]
    4. HMMER3.1 & DFAM: [ Un-configured ]
    5. Done
    Do you want RMBlast to be your default # set the default search engine
    search engine for Repeatmasker? (Y/N)  [ Y ]: 
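    # Several engines can be configured; enter 5 when done
    Congratulations!  RepeatMasker is now ready to use. # installation is complete
    # With RepeatMasker installed, copy the Repbase files unpacked earlier into the Libraries folder under the RepeatMasker installation path. The official instructions unpack the libraries before running configure, so re-running `perl configure` after copying may be needed.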
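The copy step can be sketched as a small function. Both directory arguments are assumptions (where the Repbase archive was unpacked and where RepeatMasker lives), `install_repbase` is a made-up name, and which files the Repbase archive contains depends on the release, so the sketch copies everything it finds.

```shell
# install_repbase SRC_DIR RM_HOME
# Copy the contents of an unpacked Repbase library directory into RM_HOME/Libraries.
# Hypothetical helper; existing files with the same name are overwritten.
install_repbase() {
    src=$1 rm_home=$2
    [ -d "$src" ] || { echo "no such directory: $src" >&2; return 1; }
    [ -d "$rm_home/Libraries" ] || { echo "no Libraries directory under $rm_home" >&2; return 1; }
    # "src/." copies the directory contents rather than the directory itself
    cp -R "$src"/. "$rm_home/Libraries/"
}

# Usage (paths are assumptions, adjust to your system):
#   install_repbase ./repbase_unpacked/Libraries /usr/local/RepeatMasker
```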
    
    • Simple use
    RepeatMasker -species human test.fa
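For anything beyond a quick test it is often handy to send the results to their own directory and use several CPU cores. The wrapper below is a sketch built on the command above; `run_repeatmasker` is a hypothetical name, while `-pa` (parallel jobs) and `-dir` (output directory) are RepeatMasker options that should be checked against `RepeatMasker -help` for the installed version.

```shell
# run_repeatmasker SPECIES FASTA OUTDIR [THREADS]
# Hypothetical wrapper around the RepeatMasker command shown above.
run_repeatmasker() {
    species=$1 fasta=$2 outdir=$3 threads=${4:-1}
    mkdir -p "$outdir" || return 1
    # -pa: number of parallel jobs; -dir: directory for the result files
    RepeatMasker -species "$species" -pa "$threads" -dir "$outdir" "$fasta"
}

# Usage:
#   run_repeatmasker human test.fa rm_out 4
#   ls rm_out   # typically test.fa.masked, test.fa.out and test.fa.tbl when repeats were found
```

The `.masked` file is the masked sequence, `.out` is the per-repeat annotation table and `.tbl` is the summary.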
    

    More details will follow in later updates!

