Filtering Ribosomal RNA

Author: 宗肃書 | Published 2021-01-05 15:05
    1. Download and install
    cd /public/jychu/soft/
    wget http://bioinfo.lifl.fr/RNA/sortmerna/code/sortmerna-2.1-linux-64-multithread.tar.gz
    tar -xvf sortmerna-2.1-linux-64-multithread.tar.gz
    mv sortmerna-2.1b sortmerna-2.1
    cd sortmerna-2.1
    cp indexdb_rna sortmerna ../bin/
    export PATH="$PATH:/public/jychu/soft/bin"
    sortmerna -h    # check that the installation succeeded
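Before moving on, it can help to confirm that both binaries copied into `bin/` are actually found on `PATH`. The following is a minimal sketch; the helper name `check_tools` is hypothetical and not part of SortMeRNA.

```shell
# check_tools: report which of the given commands are missing from PATH.
# Returns 0 if every command is found, 1 otherwise.
# (Hypothetical helper, for illustration only.)
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool -> $(command -v "$tool")"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# The two binaries copied into bin/ in the step above.
check_tools sortmerna indexdb_rna || echo "add /public/jychu/soft/bin to PATH and retry"
```

Note that the `export PATH=...` line only affects the current shell session; a new login shell would need it again.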
    
    2. Build indexes for the databases
    indexdb_rna --ref \
    ./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db:\
    ./rRNA_databases/silva-bac-23s-id98.fasta,./index/silva-bac-23s-db:\
    ./rRNA_databases/silva-arc-16s-id95.fasta,./index/silva-arc-16s-db:\
    ./rRNA_databases/silva-arc-23s-id98.fasta,./index/silva-arc-23s-db:\
    ./rRNA_databases/silva-euk-18s-id95.fasta,./index/silva-euk-18s-db:\
    ./rRNA_databases/silva-euk-28s-id98.fasta,./index/silva-euk-28s:\
    ./rRNA_databases/rfam-5s-database-id98.fasta,./index/rfam-5s-db:\
    ./rRNA_databases/rfam-5.8s-database-id98.fasta,./index/rfam-5.8s-db   # takes about 30 min
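The `--ref` argument is a single colon-separated list of `FASTA,INDEX` pairs, and the same list is needed again when running `sortmerna`. As a sketch, the string can be assembled once from a table of pairs and reused. The file and index names below are copied from the command above; the variable name `ref` is only an illustration.

```shell
# Build the colon-separated "FASTA,INDEX" list expected by --ref.
# Each input line holds: <FASTA file name> <index base name>.
ref=""
while read -r fasta index; do
    # ${ref:+$ref:} prepends the existing list plus ":" only when ref is non-empty.
    ref="${ref:+$ref:}./rRNA_databases/${fasta},./index/${index}"
done <<'EOF'
silva-bac-16s-id90.fasta silva-bac-16s-db
silva-bac-23s-id98.fasta silva-bac-23s-db
silva-arc-16s-id95.fasta silva-arc-16s-db
silva-arc-23s-id98.fasta silva-arc-23s-db
silva-euk-18s-id95.fasta silva-euk-18s-db
silva-euk-28s-id98.fasta silva-euk-28s
rfam-5s-database-id98.fasta rfam-5s-db
rfam-5.8s-database-id98.fasta rfam-5.8s-db
EOF

printf '%s\n' "$ref"
# From inside the sortmerna-2.1 directory, the string would then be used as:
#   indexdb_rna --ref "$ref"
```

Keeping the pairs in one place avoids the two commands drifting apart, for example if an index name such as `silva-euk-28s` is later renamed.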
    
    3. Merge the paired-end read files
    cd /public/jychu/Lishaomei/goose-cleandata-neck/norRNA
    gzip  -d *.fq.gz
    /public/jychu/soft/sortmerna-2.1/scripts/merge-paired-reads.sh neck-1-1.clean.fq neck-1-2.clean.fq merged_neck-1.clean.fq     # interleave the paired-end reads into one file
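"Merging" here means interleaving: the output file holds read 1 of each pair immediately followed by its read 2. The sketch below illustrates that layout with standard tools. It is not the bundled `merge-paired-reads.sh`; the function name `interleave_fastq` and the tiny demo reads are made up for illustration, and it assumes strict 4-line FASTQ records, no tab characters in the files, and identical read order in both inputs.

```shell
# interleave_fastq FORWARD REVERSE OUTPUT
# Writes pair 1 read 1, pair 1 read 2, pair 2 read 1, ... to OUTPUT.
# (Illustrative sketch, not the script shipped with SortMeRNA.)
interleave_fastq() {
    fwd_tmp=$(mktemp) && rev_tmp=$(mktemp) || return 1
    paste - - - - < "$1" > "$fwd_tmp"    # fold each 4-line record onto one tab-separated line
    paste - - - - < "$2" > "$rev_tmp"
    # Alternate forward/reverse records, then unfold the tabs back into lines.
    paste -d '\n' "$fwd_tmp" "$rev_tmp" | tr '\t' '\n' > "$3"
    rm -f "$fwd_tmp" "$rev_tmp"
}

# Demo on two made-up read pairs.
demo=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n@r2/1\nTTTT\n+\nIIII\n' > "$demo/s_1.fq"
printf '@r1/2\nGGCC\n+\nIIII\n@r2/2\nAAAA\n+\nIIII\n' > "$demo/s_2.fq"
interleave_fastq "$demo/s_1.fq" "$demo/s_2.fq" "$demo/merged.fq"
grep '^@r' "$demo/merged.fq"    # headers alternate: r1/1, r1/2, r2/1, r2/2
```

If the two input files do not contain the same number of reads, the pairing is broken from that point on, so it is worth checking the read counts before merging.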
    
    4. Identify and filter rRNA
    cd /public/jychu/soft/sortmerna-2.1
    nohup ./sortmerna \
    --ref rRNA_databases/silva-bac-16s-id90.fasta,index/silva-bac-16s-db:rRNA_databases/silva-bac-23s-id98.fasta,index/silva-bac-23s-db:rRNA_databases/silva-arc-16s-id95.fasta,index/silva-arc-16s-db:rRNA_databases/silva-arc-23s-id98.fasta,index/silva-arc-23s-db:rRNA_databases/silva-euk-18s-id95.fasta,index/silva-euk-18s-db:rRNA_databases/silva-euk-28s-id98.fasta,index/silva-euk-28s:rRNA_databases/rfam-5s-database-id98.fasta,index/rfam-5s-db:rRNA_databases/rfam-5.8s-database-id98.fasta,index/rfam-5.8s-db \
    --reads /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1.clean.fq \
    --aligned /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1_aligned_rRNA \
    --fastx --sam --num_alignments 1 \
    --other /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1_filtered_non_rRNA \
    --paired_in --log -v &
    

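    A successful run looks like the screenshot below.

    [screenshot: image.png]
    • After the run, the directory contains the following files: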
    (base) [jychu@localhost norRNA]$ ls
    merged_neck-1_aligned_rRNA.fq   merged_neck-1_aligned_rRNA.sam         merged_scale-10.clean.fq  merged_scale-12.clean.fq  merged_scale-8.clean.fq
    merged_neck-1_aligned_rRNA.log  merged_neck-1_filtered_non_rRNA.fq.gz  merged_scale-11.clean.fq  merged_scale-7.clean.fq   merged_scale-9.clean.fq
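The `filtered_non_rRNA` file is still interleaved, while most downstream tools expect separate forward and reverse files. Because `--paired_in` was used, both mates of a pair should end up in the same output file, so pairs in the filtered file stay adjacent. SortMeRNA 2.1 ships `scripts/unmerge-paired-reads.sh` for this step; the sketch below only illustrates the same idea with awk. The function name `deinterleave_fastq` and the demo reads are hypothetical, and a gzipped file (as in the listing above) would need to be decompressed first.

```shell
# deinterleave_fastq INTERLEAVED FORWARD REVERSE
# Splits an interleaved FASTQ file back into forward and reverse files.
# Assumes strict 4-line records with mates stored back to back.
# (Illustrative sketch, not the script shipped with SortMeRNA.)
deinterleave_fastq() {
    awk -v fwd="$2" -v rev="$3" '
        { record = int((NR - 1) / 4) }          # 0-based index of the current 4-line record
        record % 2 == 0 { print > fwd; next }   # even records are forward reads
        { print > rev }                         # odd records are reverse reads
    ' "$1"
}

# Demo on two made-up interleaved pairs.
split_demo=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n@r1/2\nGGCC\n+\nIIII\n@r2/1\nTTTT\n+\nIIII\n@r2/2\nAAAA\n+\nIIII\n' \
    > "$split_demo/merged.fq"
deinterleave_fastq "$split_demo/merged.fq" "$split_demo/out_1.fq" "$split_demo/out_2.fq"
grep -c '^@r' "$split_demo/out_1.fq" "$split_demo/out_2.fq"    # each file holds 2 reads
```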
    
    
    (base) [jychu@localhost norRNA]$ less -S merged_neck-1_aligned_rRNA.log
         Minimal SW score based on E-value = 61
        Index: index/rfam-5s-db
         Seed length = 18
         Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
         Gumbel lambda = 0.616694
         Gumbel K = 0.342032
         Minimal SW score based on E-value = 59
        Index: index/rfam-5.8s-db
         Seed length = 18
         Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
         Gumbel lambda = 0.617555
         Gumbel K = 0.343861
         Minimal SW score based on E-value = 57
        Number of seeds = 2
        Edges = 4 (as integer)
        SW match = 2
        SW mismatch = -3
        SW gap open penalty = 5
        SW gap extend penalty = 2
        SW ambiguous nucleotide = -3
        SQ tags are not output
        Number of threads = 1
        Reads file = /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1.clean.fq
    
     Results:
        Total reads = 37894820
        Total reads passing E-value threshold = 592626 (1.56%)
        Total reads failing E-value threshold = 37302194 (98.44%)
        Minimum read length = 150
        Maximum read length = 150
        Mean read length = 150
     By database:
        rRNA_databases/silva-bac-16s-id90.fasta             0.15%
        rRNA_databases/silva-bac-23s-id98.fasta             0.19%
        rRNA_databases/silva-arc-16s-id95.fasta             0.03%
        rRNA_databases/silva-arc-23s-id98.fasta             0.12%
        rRNA_databases/silva-euk-18s-id95.fasta             0.33%
        rRNA_databases/silva-euk-28s-id98.fasta             0.73%
        rRNA_databases/rfam-5s-database-id98.fasta          0.00%
        rRNA_databases/rfam-5.8s-database-id98.fasta        0.00%
    
     Wed Jan  6 12:35:55 2021
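The key figure in the log is the share of reads passing the E-value threshold, i.e. reads classified as rRNA: 592626 of 37894820, or about 1.56%, and the passing and failing counts add up to the total (592626 + 37302194 = 37894820). When many samples are processed, this number can be pulled out of each log automatically. The sketch below is one way to do it with awk; the function name `rrna_fraction` is hypothetical, and the sample log contains only lines copied from the output above.

```shell
# rrna_fraction LOGFILE
# Prints: <total reads> <reads classified as rRNA> <rRNA percentage>
# (Hypothetical helper; it relies on the "Total reads ..." lines shown in the log above.)
rrna_fraction() {
    awk -F' = ' '
        /Total reads = /                        { total = $2 + 0 }
        /Total reads passing E-value threshold/ { hits  = $2 + 0 }   # "+ 0" drops the trailing "(1.56%)"
        END {
            if (total > 0) printf "%d %d %.2f\n", total, hits, 100 * hits / total
        }
    ' "$1"
}

# Demo with the three result lines from the log above.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
 Results:
    Total reads = 37894820
    Total reads passing E-value threshold = 592626 (1.56%)
    Total reads failing E-value threshold = 37302194 (98.44%)
EOF
rrna_fraction "$sample_log"    # prints: 37894820 592626 1.56
```

Since the input was interleaved, "Total reads" counts individual reads, so the number of read pairs is half of that value.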
    
    


    Article link: https://www.haomeiwen.com/subject/lwxwoktx.html