1. Download and install
cd /public/jychu/soft/
wget http://bioinfo.lifl.fr/RNA/sortmerna/code/sortmerna-2.1-linux-64-multithread.tar.gz
tar -xvf sortmerna-2.1-linux-64-multithread.tar.gz
mv sortmerna-2.1b sortmerna-2.1
cd sortmerna-2.1
cp indexdb_rna sortmerna ../bin/
export PATH="$PATH:/public/jychu/soft/bin"
sortmerna -h # check that the installation succeeded
2. Build indexes for the rRNA databases
indexdb_rna --ref \
./rRNA_databases/silva-bac-16s-id90.fasta,./index/silva-bac-16s-db:\
./rRNA_databases/silva-bac-23s-id98.fasta,./index/silva-bac-23s-db:\
./rRNA_databases/silva-arc-16s-id95.fasta,./index/silva-arc-16s-db:\
./rRNA_databases/silva-arc-23s-id98.fasta,./index/silva-arc-23s-db:\
./rRNA_databases/silva-euk-18s-id95.fasta,./index/silva-euk-18s-db:\
./rRNA_databases/silva-euk-28s-id98.fasta,./index/silva-euk-28s:\
./rRNA_databases/rfam-5s-database-id98.fasta,./index/rfam-5s-db:\
./rRNA_databases/rfam-5.8s-database-id98.fasta,./index/rfam-5.8s-db # takes about 30 min
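The `--ref` argument is a colon-separated list of `fasta,index-prefix` pairs, and the same string has to be passed again to `sortmerna` later. As a minimal sketch, the string could be assembled once by a small helper. `build_ref` is a hypothetical function, not part of SortMeRNA, and it assumes every index prefix is the database name with the `-idNN` (and `-database`) suffix removed plus `-db`. Note that this differs slightly from the listing above, where the euk-28s index is named `silva-euk-28s` without `-db`.

```shell
# build_ref DB_DIR INDEX_DIR FASTA_NAME...
# Hypothetical helper: prints "db1.fasta,idx1:db2.fasta,idx2:..." for --ref.
build_ref() (
    db_dir=$1
    idx_dir=$2
    shift 2
    ref=""
    for fasta_name in "$@"; do
        base=${fasta_name%.fasta}      # drop the file extension
        base=${base%-id[0-9]*}         # drop the identity suffix, e.g. -id90
        base=${base%-database}         # rfam files carry an extra -database
        # join entries with ":"; each entry is "<fasta>,<index prefix>"
        ref="${ref:+$ref:}${db_dir}/${fasta_name},${idx_dir}/${base}-db"
    done
    printf '%s\n' "$ref"
)

# Example (assumed usage, run from the sortmerna-2.1 directory):
#   REF=$(build_ref ./rRNA_databases ./index silva-bac-16s-id90.fasta rfam-5s-database-id98.fasta)
#   indexdb_rna --ref "$REF"
```

Keeping the string in one variable avoids the two commands drifting apart, since `sortmerna` can only find an index under exactly the prefix that `indexdb_rna` wrote.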
3. Merge the paired-end read files
cd /public/jychu/Lishaomei/goose-cleandata-neck/norRNA
gzip -d *.fq.gz
/public/jychu/soft/sortmerna-2.1/scripts/merge-paired-reads.sh neck-1-1.clean.fq neck-1-2.clean.fq merged_neck-1.clean.fq # interleave the forward and reverse reads into one file
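"Merging" here means interleaving: each forward read is immediately followed by its mate, so that SortMeRNA sees the pair as two consecutive records. The sketch below illustrates the idea with standard tools. It is not the code of `merge-paired-reads.sh`; `interleave_fastq` is a hypothetical name, and it assumes strict 4-line FASTQ records, no tab characters in the read headers, and the same read order in both files.

```shell
# interleave_fastq FORWARD.fq REVERSE.fq  -> interleaved FASTQ on stdout
interleave_fastq() (
    tmp1=$(mktemp) || exit 1
    tmp2=$(mktemp) || exit 1
    trap 'rm -f "$tmp1" "$tmp2"' EXIT
    # Collapse every 4-line record into one tab-separated line.
    paste - - - - < "$1" > "$tmp1"
    paste - - - - < "$2" > "$tmp2"
    # Alternate the records of the two files, then restore the line breaks.
    paste -d '\n' "$tmp1" "$tmp2" | tr '\t' '\n'
)

# Example (assumed usage):
#   interleave_fastq neck-1-1.clean.fq neck-1-2.clean.fq > merged_neck-1.clean.fq
```

An interleaved file has exactly twice as many lines as either input, which gives a quick sanity check with `wc -l`.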
4. Identify and filter rRNA
cd /public/jychu/soft/sortmerna-2.1
nohup ./sortmerna --ref \
rRNA_databases/silva-bac-16s-id90.fasta,index/silva-bac-16s-db:\
rRNA_databases/silva-bac-23s-id98.fasta,index/silva-bac-23s-db:\
rRNA_databases/silva-arc-16s-id95.fasta,index/silva-arc-16s-db:\
rRNA_databases/silva-arc-23s-id98.fasta,index/silva-arc-23s-db:\
rRNA_databases/silva-euk-18s-id95.fasta,index/silva-euk-18s-db:\
rRNA_databases/silva-euk-28s-id98.fasta,index/silva-euk-28s:\
rRNA_databases/rfam-5s-database-id98.fasta,index/rfam-5s-db:\
rRNA_databases/rfam-5.8s-database-id98.fasta,index/rfam-5.8s-db \
--reads /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1.clean.fq \
--aligned /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1_aligned_rRNA \
--fastx --sam --num_alignments 1 \
--other /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1_filtered_non_rRNA \
--paired_in --log -v &
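When several samples have to be filtered, only the sample name changes between runs. The following sketch prints (rather than executes) the command for one sample so it can be reviewed first. `sortmerna_cmd` is a hypothetical helper; it assumes the inputs are named `merged_<sample>.clean.fq` as in step 3 and reuses exactly the options from the command above.

```shell
# sortmerna_cmd SAMPLE DATA_DIR REF
# Hypothetical helper: prints the sortmerna command line for one sample.
sortmerna_cmd() (
    sample=$1      # e.g. neck-1
    data_dir=$2    # directory that holds merged_<sample>.clean.fq
    ref=$3         # the same colon-separated --ref string used for indexdb_rna
    printf '%s\n' "./sortmerna --ref ${ref} --reads ${data_dir}/merged_${sample}.clean.fq --aligned ${data_dir}/merged_${sample}_aligned_rRNA --fastx --sam --num_alignments 1 --other ${data_dir}/merged_${sample}_filtered_non_rRNA --paired_in --log -v"
)

# Example (assumed usage): print one command per sample for inspection.
#   for s in neck-1 scale-7 scale-8; do
#       sortmerna_cmd "$s" /public/jychu/Lishaomei/goose-cleandata-neck/norRNA "$REF"
#   done
```

Printing first is a deliberate choice: each run takes a long time, so checking the paths before launching is cheaper than discovering a typo afterwards.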
A successful SortMeRNA run produces output like the following:
![](https://img.haomeiwen.com/i20853302/ca60dee4afb5c9fe.png)
- The output directory after the run looks like this:
(base) [jychu@localhost norRNA]$ ls
merged_neck-1_aligned_rRNA.fq merged_neck-1_aligned_rRNA.sam merged_scale-10.clean.fq merged_scale-12.clean.fq merged_scale-8.clean.fq
merged_neck-1_aligned_rRNA.log merged_neck-1_filtered_non_rRNA.fq.gz merged_scale-11.clean.fq merged_scale-7.clean.fq merged_scale-9.clean.fq
(base) [jychu@localhost norRNA]$ less -S merged_neck-1_aligned_rRNA.log
Minimal SW score based on E-value = 61
Index: index/rfam-5s-db
Seed length = 18
Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
Gumbel lambda = 0.616694
Gumbel K = 0.342032
Minimal SW score based on E-value = 59
Index: index/rfam-5.8s-db
Seed length = 18
Pass 1 = 18, Pass 2 = 9, Pass 3 = 3
Gumbel lambda = 0.617555
Gumbel K = 0.343861
Minimal SW score based on E-value = 57
Number of seeds = 2
Edges = 4 (as integer)
SW match = 2
SW mismatch = -3
SW gap open penalty = 5
SW gap extend penalty = 2
SW ambiguous nucleotide = -3
SQ tags are not output
Number of threads = 1
Reads file = /public/jychu/Lishaomei/goose-cleandata-neck/norRNA/merged_neck-1.clean.fq
Results:
Total reads = 37894820
Total reads passing E-value threshold = 592626 (1.56%)
Total reads failing E-value threshold = 37302194 (98.44%)
Minimum read length = 150
Maximum read length = 150
Mean read length = 150
By database:
rRNA_databases/silva-bac-16s-id90.fasta 0.15%
rRNA_databases/silva-bac-23s-id98.fasta 0.19%
rRNA_databases/silva-arc-16s-id95.fasta 0.03%
rRNA_databases/silva-arc-23s-id98.fasta 0.12%
rRNA_databases/silva-euk-18s-id95.fasta 0.33%
rRNA_databases/silva-euk-28s-id98.fasta 0.73%
rRNA_databases/rfam-5s-database-id98.fasta 0.00%
rRNA_databases/rfam-5.8s-database-id98.fasta 0.00%
Wed Jan 6 12:35:55 2021
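The key numbers in the log are the total read count and the count passing the E-value threshold, that is, the reads classified as rRNA. Here 592626 / 37894820 ≈ 1.56%, which matches the percentage SortMeRNA reports. As a minimal sketch, these values could be pulled out of each log for a per-sample summary. `rrna_summary` is a hypothetical helper that assumes the `Total reads = ...` and `Total reads passing E-value threshold = ...` lines appear exactly as in the log above.

```shell
# rrna_summary LOGFILE  -> prints "<total reads> <rRNA reads> <rRNA percent>"
rrna_summary() {
    awk -F'= *' '
        # "$2 + 0" keeps only the leading number, dropping e.g. " (1.56%)"
        /Total reads passing E-value threshold/ { pass  = $2 + 0 }
        /^[ \t]*Total reads =/                  { total = $2 + 0 }
        END {
            if (total > 0)
                printf "%d %d %.2f\n", total, pass, 100 * pass / total
        }
    ' "$1"
}

# Example (assumed usage):
#   rrna_summary merged_neck-1_aligned_rRNA.log
```

For the log shown above this prints `37894820 592626 1.56`.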