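In part two we used Trinity to assemble the transcriptome data. I had 6 samples and 6 G of data, and with my settings the assembly finished in roughly 30 hours.
Today's task is to run quality control and visualization on the assembly result with RSeQC.
RSeQC is aimed mainly at clinical RNA-seq data and at data that has a reference genome; for RNA-seq data without a reference genome, many of its tools cannot be used.
First, build a Bowtie2 index from the FASTA file that Trinity assembled.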
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ ll | sort -nk 7
total 86667684
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:07 left.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:07 right.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:08 both.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:19 .jellyfish_count.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:24 .jellyfish_dump.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:26 .jellyfish_histo.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 .iworm.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 .iworm_renamed.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 inchworm.K25.L25.DS.fa.finished
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 10:39 partitioned_reads.files.list.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 10:39 recursive_trinity.cmds.ok
-rw-rw-r-- 1 yeyt yeyt 9 Sep 14 01:08 both.fa.read_count
-rw-rw-r-- 1 yeyt yeyt 10 Sep 14 01:59 inchworm.kmer_count
-rw-rw-r-- 1 yeyt yeyt 2757 Sep 14 10:31 pipeliner.3855.cmds
-rw-rw-r-- 1 yeyt yeyt 22843 Sep 14 01:26 jellyfish.kmers.fa.histo
-rw-rw-r-- 1 yeyt yeyt 13366802 Sep 14 10:39 partitioned_reads.files.list
-rw-rw-r-- 1 yeyt yeyt 47753878 Sep 14 10:39 recursive_trinity.cmds
-rw-rw-r-- 1 yeyt yeyt 493022300 Sep 14 03:27 inchworm.K25.L25.DS.fa
-rw-rw-r-- 1 yeyt yeyt 11602365066 Sep 14 01:08 both.fa
-rw-rw-r-- 1 yeyt yeyt 26501596675 Sep 14 01:24 jellyfish.kmers.fa
-rw-rw-r-- 1 yeyt yeyt 49568938526 Sep 14 06:38 scaffolding_entries.sam
drwxrwxr-x 2 yeyt yeyt 4096 Sep 14 10:33 chrysalis/
drwxrwxr-x 3 yeyt yeyt 4096 Sep 14 01:03 insilico_read_normalization/
drwxrwxr-x 4 yeyt yeyt 4096 Sep 14 10:39 read_partitions/
-rw-rw-r-- 1 yeyt yeyt 0 Sep 15 20:51 align_stats.txt
-rw-rw-r-- 1 yeyt yeyt 62 Sep 15 20:51 bowtie2.bam
-rw-rw-r-- 1 yeyt yeyt 651 Sep 15 07:03 Trinity.timing
-rw-rw-r-- 1 yeyt yeyt 10213332 Sep 15 07:03 Trinity.fasta.gene_trans_map
-rw-rw-r-- 1 yeyt yeyt 47753878 Sep 15 07:03 recursive_trinity.cmds.completed
-rw-rw-r-- 1 yeyt yeyt 244740565 Sep 15 07:03 Trinity.fasta
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2-build Trinity.fasta Trinity.fasta
Settings:
Output files: "Trinity.fasta.*.bt2"
Line rate: 6 (line is 64 bytes)
Lines per side: 1 (side is 64 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
Trinity.fasta
Building a SMALL index
Reading reference sizes
Time reading reference sizes: 00:00:03
Calculating joined length
Writing header
Reserving space for joined string
Joining reference sequences
...
Exiting Ebwt::buildToDisk()
Returning from initFromVector
Wrote 103828770 bytes to primary EBWT file: Trinity.fasta.rev.1.bt2
Wrote 55572488 bytes to secondary EBWT file: Trinity.fasta.rev.2.bt2
Re-opening _in1 and _in2 as input streams
Returning from Ebwt constructor
Headers:
len: 222289920
bwtLen: 222289921
sz: 55572480
bwtSz: 55572481
lineRate: 6
offRate: 4
offMask: 0xfffffff0
ftabChars: 10
eftabLen: 20
eftabSz: 80
ftabLen: 1048577
ftabSz: 4194308
offsLen: 13893121
offsSz: 55572484
lineSz: 64
sideSz: 64
sideBwtSz: 48
sideBwtLen: 192
numSides: 1157761
numLines: 1157761
ebwtTotLen: 74096704
ebwtTotSz: 74096704
color: 0
reverse: 1
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ ll | sort -nk 7
total 86822504
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:07 left.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:07 right.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:08 both.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:19 .jellyfish_count.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:24 .jellyfish_dump.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:26 .jellyfish_histo.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 .iworm.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 .iworm_renamed.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 inchworm.K25.L25.DS.fa.finished
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 10:39 partitioned_reads.files.list.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 10:39 recursive_trinity.cmds.ok
-rw-rw-r-- 1 yeyt yeyt 9 Sep 14 01:08 both.fa.read_count
-rw-rw-r-- 1 yeyt yeyt 10 Sep 14 01:59 inchworm.kmer_count
-rw-rw-r-- 1 yeyt yeyt 2757 Sep 14 10:31 pipeliner.3855.cmds
-rw-rw-r-- 1 yeyt yeyt 22843 Sep 14 01:26 jellyfish.kmers.fa.histo
-rw-rw-r-- 1 yeyt yeyt 13366802 Sep 14 10:39 partitioned_reads.files.list
-rw-rw-r-- 1 yeyt yeyt 47753878 Sep 14 10:39 recursive_trinity.cmds
-rw-rw-r-- 1 yeyt yeyt 493022300 Sep 14 03:27 inchworm.K25.L25.DS.fa
-rw-rw-r-- 1 yeyt yeyt 11602365066 Sep 14 01:08 both.fa
-rw-rw-r-- 1 yeyt yeyt 26501596675 Sep 14 01:24 jellyfish.kmers.fa
-rw-rw-r-- 1 yeyt yeyt 49568938526 Sep 14 06:38 scaffolding_entries.sam
drwxrwxr-x 2 yeyt yeyt 4096 Sep 14 10:33 chrysalis/
drwxrwxr-x 3 yeyt yeyt 4096 Sep 14 01:03 insilico_read_normalization/
drwxrwxr-x 4 yeyt yeyt 4096 Sep 14 10:39 read_partitions/
-rw-rw-r-- 1 yeyt yeyt 0 Sep 15 20:51 align_stats.txt
-rw-rw-r-- 1 yeyt yeyt 62 Sep 15 20:51 bowtie2.bam
-rw-rw-r-- 1 yeyt yeyt 651 Sep 15 07:03 Trinity.timing
-rw-rw-r-- 1 yeyt yeyt 10213332 Sep 15 07:03 Trinity.fasta.gene_trans_map
-rw-rw-r-- 1 yeyt yeyt 47753878 Sep 15 07:03 recursive_trinity.cmds.completed
-rw-rw-r-- 1 yeyt yeyt 244740565 Sep 15 07:03 Trinity.fasta
drwxrwxr-x 3 yeyt yeyt 4096 Sep 15 18:42 ../
drwxrwxr-x 5 yeyt yeyt 4096 Sep 15 20:51 ./
-rw-rw-r-- 1 yeyt yeyt 1984490 Sep 23 13:50 Trinity.fasta.3.bt2
-rw-rw-r-- 1 yeyt yeyt 55572480 Sep 23 13:50 Trinity.fasta.4.bt2
-rw-rw-r-- 1 yeyt yeyt 55572488 Sep 23 14:03 Trinity.fasta.2.bt2
-rw-rw-r-- 1 yeyt yeyt 55572488 Sep 23 14:16 Trinity.fasta.rev.2.bt2
-rw-rw-r-- 1 yeyt yeyt 103828770 Sep 23 14:03 Trinity.fasta.1.bt2
-rw-rw-r-- 1 yeyt yeyt 103828770 Sep 23 14:16 Trinity.fasta.rev.1.bt2
The 6 files ending in bt2 at the bottom of the listing are the index files.
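Before starting the long alignment runs it can be worth confirming that the index is complete. The snippet below is a minimal Python sketch of such a check. The helper name `missing_bowtie2_index_files` is made up for this post; the six suffixes are simply the ones visible in the listing above (a "small" index; a large index would end in `.bt2l` instead).

```python
from pathlib import Path

# Suffixes of the six files that make up a small Bowtie2 index,
# as seen in the directory listing above.
BT2_SUFFIXES = (".1.bt2", ".2.bt2", ".3.bt2", ".4.bt2", ".rev.1.bt2", ".rev.2.bt2")


def missing_bowtie2_index_files(prefix, directory="."):
    """Return the names of expected index files that are absent from `directory`."""
    directory = Path(directory)
    expected = [f"{prefix}{suffix}" for suffix in BT2_SUFFIXES]
    return [name for name in expected if not (directory / name).is_file()]


if __name__ == "__main__":
    # "Trinity.fasta" is the index prefix passed to bowtie2-build above.
    missing = missing_bowtie2_index_files("Trinity.fasta")
    print("index complete" if not missing else f"missing: {missing}")
```

Run from inside trinity_out_dir, an empty result means all six files are in place.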
Next, align the reads back to the assembly with Bowtie2 and write the results as SAM files.
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2 -x Trinity.fasta -1 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/B251_1.P.fq.gz -2 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/B251_2.P.fq.gz -S B251.sam
# The log output of each run follows:
# Aligning the B251 paired-end reads
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2 -x Trinity.fasta -1 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/B251_1.P.fq.gz -2 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/B251_2.P.fq.gz -S B251.sam
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:en",
LC_ALL = (unset),
LC_PAPER = "zh_CN.UTF-8",
LC_ADDRESS = "zh_CN.UTF-8",
LC_MONETARY = "zh_CN.UTF-8",
LC_NUMERIC = "zh_CN.UTF-8",
LC_TELEPHONE = "zh_CN.UTF-8",
LC_IDENTIFICATION = "zh_CN.UTF-8",
LC_MEASUREMENT = "zh_CN.UTF-8",
LC_TIME = "zh_CN.UTF-8",
LC_NAME = "zh_CN.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
28213701 reads; of these:
28213701 (100.00%) were paired; of these:
3865337 (13.70%) aligned concordantly 0 times
2140365 (7.59%) aligned concordantly exactly 1 time
22207999 (78.71%) aligned concordantly >1 times
----
3865337 pairs aligned concordantly 0 times; of these:
134400 (3.48%) aligned discordantly 1 time
----
3730937 pairs aligned 0 times concordantly or discordantly; of these:
7461874 mates make up the pairs; of these:
2553395 (34.22%) aligned 0 times
273693 (3.67%) aligned exactly 1 time
4634786 (62.11%) aligned >1 times
95.47% overall alignment rate
# Aligning the B252 paired-end reads
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2 -x Trinity.fasta -1 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/B252_1.P.fq.gz -2 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/B252_2.P.fq.gz -S B252.sam
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:en",
LC_ALL = (unset),
LC_PAPER = "zh_CN.UTF-8",
LC_ADDRESS = "zh_CN.UTF-8",
LC_MONETARY = "zh_CN.UTF-8",
LC_NUMERIC = "zh_CN.UTF-8",
LC_TELEPHONE = "zh_CN.UTF-8",
LC_IDENTIFICATION = "zh_CN.UTF-8",
LC_MEASUREMENT = "zh_CN.UTF-8",
LC_TIME = "zh_CN.UTF-8",
LC_NAME = "zh_CN.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
24423445 reads; of these:
24423445 (100.00%) were paired; of these:
2755943 (11.28%) aligned concordantly 0 times
2003579 (8.20%) aligned concordantly exactly 1 time
19663923 (80.51%) aligned concordantly >1 times
----
2755943 pairs aligned concordantly 0 times; of these:
82738 (3.00%) aligned discordantly 1 time
----
2673205 pairs aligned 0 times concordantly or discordantly; of these:
5346410 mates make up the pairs; of these:
1943923 (36.36%) aligned 0 times
258490 (4.83%) aligned exactly 1 time
3143997 (58.81%) aligned >1 times
96.02% overall alignment rate
# Aligning the R251 paired-end reads
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2 -x Trinity.fasta -1 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/R251_1.P.fq.gz -2 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/R251_2.P.fq.gz -S R251sam
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:en",
LC_ALL = (unset),
LC_PAPER = "zh_CN.UTF-8",
LC_ADDRESS = "zh_CN.UTF-8",
LC_MONETARY = "zh_CN.UTF-8",
LC_NUMERIC = "zh_CN.UTF-8",
LC_TELEPHONE = "zh_CN.UTF-8",
LC_IDENTIFICATION = "zh_CN.UTF-8",
LC_MEASUREMENT = "zh_CN.UTF-8",
LC_TIME = "zh_CN.UTF-8",
LC_NAME = "zh_CN.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
24498964 reads; of these:
24498964 (100.00%) were paired; of these:
2605874 (10.64%) aligned concordantly 0 times
2058157 (8.40%) aligned concordantly exactly 1 time
19834933 (80.96%) aligned concordantly >1 times
----
2605874 pairs aligned concordantly 0 times; of these:
68645 (2.63%) aligned discordantly 1 time
----
2537229 pairs aligned 0 times concordantly or discordantly; of these:
5074458 mates make up the pairs; of these:
1920173 (37.84%) aligned 0 times
259673 (5.12%) aligned exactly 1 time
2894612 (57.04%) aligned >1 times
96.08% overall alignment rate
# Aligning the R252 paired-end reads
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2 -x Trinity.fasta -1 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/R252_1.P.fq.gz -2 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/R252_2.P.fq.gz -S R252.sam
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:en",
LC_ALL = (unset),
LC_PAPER = "zh_CN.UTF-8",
LC_ADDRESS = "zh_CN.UTF-8",
LC_MONETARY = "zh_CN.UTF-8",
LC_NUMERIC = "zh_CN.UTF-8",
LC_TELEPHONE = "zh_CN.UTF-8",
LC_IDENTIFICATION = "zh_CN.UTF-8",
LC_MEASUREMENT = "zh_CN.UTF-8",
LC_TIME = "zh_CN.UTF-8",
LC_NAME = "zh_CN.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
23929511 reads; of these:
23929511 (100.00%) were paired; of these:
3455581 (14.44%) aligned concordantly 0 times
1770888 (7.40%) aligned concordantly exactly 1 time
18703042 (78.16%) aligned concordantly >1 times
----
3455581 pairs aligned concordantly 0 times; of these:
132348 (3.83%) aligned discordantly 1 time
----
3323233 pairs aligned 0 times concordantly or discordantly; of these:
6646466 mates make up the pairs; of these:
2061887 (31.02%) aligned 0 times
216206 (3.25%) aligned exactly 1 time
4368373 (65.72%) aligned >1 times
95.69% overall alignment rate
# Aligning the W251 paired-end reads
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2 -x Trinity.fasta -1 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/W251_1.P.fq.gz -2 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/W251_2.P.fq.gz -S W251.sam
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:en",
LC_ALL = (unset),
LC_PAPER = "zh_CN.UTF-8",
LC_ADDRESS = "zh_CN.UTF-8",
LC_MONETARY = "zh_CN.UTF-8",
LC_NUMERIC = "zh_CN.UTF-8",
LC_TELEPHONE = "zh_CN.UTF-8",
LC_IDENTIFICATION = "zh_CN.UTF-8",
LC_MEASUREMENT = "zh_CN.UTF-8",
LC_TIME = "zh_CN.UTF-8",
LC_NAME = "zh_CN.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
25553075 reads; of these:
25553075 (100.00%) were paired; of these:
3705332 (14.50%) aligned concordantly 0 times
2003416 (7.84%) aligned concordantly exactly 1 time
19844327 (77.66%) aligned concordantly >1 times
----
3705332 pairs aligned concordantly 0 times; of these:
163553 (4.41%) aligned discordantly 1 time
----
3541779 pairs aligned 0 times concordantly or discordantly; of these:
7083558 mates make up the pairs; of these:
2021254 (28.53%) aligned 0 times
226959 (3.20%) aligned exactly 1 time
4835345 (68.26%) aligned >1 times
96.04% overall alignment rate
# Aligning the W252 paired-end reads
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bowtie2 -x Trinity.fasta -1 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/W252_1.P.fq.gz -2 /home/yeyt/biodata/NH160034/NH160034/cleandata/assembly/W252_2.P.fq.gz -S W252.sam
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = "en_US:en",
LC_ALL = (unset),
LC_PAPER = "zh_CN.UTF-8",
LC_ADDRESS = "zh_CN.UTF-8",
LC_MONETARY = "zh_CN.UTF-8",
LC_NUMERIC = "zh_CN.UTF-8",
LC_TELEPHONE = "zh_CN.UTF-8",
LC_IDENTIFICATION = "zh_CN.UTF-8",
LC_MEASUREMENT = "zh_CN.UTF-8",
LC_TIME = "zh_CN.UTF-8",
LC_NAME = "zh_CN.UTF-8",
LANG = "en_US.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
24577100 reads; of these:
24577100 (100.00%) were paired; of these:
3173490 (12.91%) aligned concordantly 0 times
1898984 (7.73%) aligned concordantly exactly 1 time
19504626 (79.36%) aligned concordantly >1 times
----
3173490 pairs aligned concordantly 0 times; of these:
112017 (3.53%) aligned discordantly 1 time
----
3061473 pairs aligned 0 times concordantly or discordantly; of these:
6122946 mates make up the pairs; of these:
2060673 (33.65%) aligned 0 times
226885 (3.71%) aligned exactly 1 time
3835388 (62.64%) aligned >1 times
95.81% overall alignment rate
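To see where the "overall alignment rate" in these summaries comes from, here is a small Python sketch that recomputes it for B251 from the numbers in the log above. It assumes every read is paired (as in all six runs here), so the rate is the share of individual mates that aligned at least once. The function name `overall_alignment_rate` is my own, not part of Bowtie2.

```python
import re

# Summary printed by bowtie2 for sample B251 (copied from the log above).
B251_SUMMARY = """\
28213701 reads; of these:
28213701 (100.00%) were paired; of these:
3865337 (13.70%) aligned concordantly 0 times
2140365 (7.59%) aligned concordantly exactly 1 time
22207999 (78.71%) aligned concordantly >1 times
----
3865337 pairs aligned concordantly 0 times; of these:
134400 (3.48%) aligned discordantly 1 time
----
3730937 pairs aligned 0 times concordantly or discordantly; of these:
7461874 mates make up the pairs; of these:
2553395 (34.22%) aligned 0 times
273693 (3.67%) aligned exactly 1 time
4634786 (62.11%) aligned >1 times
95.47% overall alignment rate
"""


def overall_alignment_rate(summary):
    """Recompute the overall alignment rate (in percent) for an all-paired run."""
    total_pairs = int(re.search(r"(\d+) reads; of these:", summary).group(1))
    # Mates that aligned nowhere: the only line of the form "N (x%) aligned 0 times".
    unaligned = int(re.search(r"(\d+) \([\d.]+%\) aligned 0 times", summary).group(1))
    total_mates = 2 * total_pairs
    return 100.0 * (total_mates - unaligned) / total_mates


print(f"{overall_alignment_rate(B251_SUMMARY):.2f}%")
```

For B251 this is (2 × 28213701 − 2553395) / (2 × 28213701), which reproduces the 95.47% that Bowtie2 reports. Note that about 78% of the pairs align concordantly more than once, which is to be expected when the reference is a Trinity assembly with many isoforms per gene.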
This step takes quite a while, so in the meantime let's produce a simple quality-control report.
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ $TRINITY_HOME/util/TrinityStats.pl Trinity.fasta > Trinitystats.log
# The output is written to Trinitystats.log
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ cat Trinitystats.log
################################
## Counts of transcripts, etc.
################################
Total trinity 'genes': 110851
Total trinity transcripts: 220498
Percent GC: 42.98
########################################
Stats based on ALL transcript contigs:
########################################
Contig N10: 4369
Contig N20: 3291
Contig N30: 2640
Contig N40: 2183
Contig N50: 1802
Median contig length: 542
Average contig: 1008.13
Total assembled bases: 222289920
#####################################################
## Stats based on ONLY LONGEST ISOFORM per 'GENE':
#####################################################
Contig N10: 3997
Contig N20: 2867
Contig N30: 2195
Contig N40: 1663
Contig N50: 1157
Median contig length: 364
Average contig: 686.86
Total assembled bases: 76139520
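Let me explain the output above.
It starts with a summary: how many 'genes' and how many transcripts were assembled,
followed by the overall GC content.
Then come two sets of contig statistics:
one based on all transcript contigs,
and one based only on the longest isoform of each 'gene'.
N50 is the contig length such that contigs of that length or longer contain at least half of all assembled bases; here it is 1802 for all transcripts and 1157 for the longest isoforms.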
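The Nx statistics in the report can be sketched in a few lines of Python. This is only an illustration of the definition, not the code TrinityStats.pl actually runs; the function names `nx` and `fasta_lengths` are made up for this post.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx contig length; fraction=0.5 gives N50, 0.1 gives N10."""
    if not lengths:
        raise ValueError("need at least one contig length")
    threshold = fraction * sum(lengths)
    running = 0
    # Walk from the longest contig down until the running total
    # reaches the requested share of all assembled bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


def fasta_lengths(lines):
    """Return the length of every sequence in an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


# Toy example: 20 bases in total, so N50 is reached at 6 + 5 = 11 >= 10.
print(nx([2, 3, 4, 5, 6]))  # prints 5
```

To try it on the real assembly, `nx(fasta_lengths(open("Trinity.fasta")))` should give the "ALL transcript contigs" N50. The averages in the report are plain divisions: 222289920 / 220498 ≈ 1008.13 for all transcripts and 76139520 / 110851 ≈ 686.86 for the longest isoforms.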
Next we convert the SAM files to BAM and sort them, so that they can serve as input for the later analysis.
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ ls *sam | grep '25' |xargs -I [] echo 'samtools view -bS [] | samtools sort -o [].sorted.bam ' > samtoolssort.sh
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ cat samtoolssort.sh
samtools view -bS B251.sam | samtools sort -o B251.sam.sorted.bam
samtools view -bS B252.sam | samtools sort -o B252.sam.sorted.bam
samtools view -bS R251sam | samtools sort -o R251sam.sorted.bam
samtools view -bS R252.sam | samtools sort -o R252.sam.sorted.bam
samtools view -bS W251.sam | samtools sort -o W251.sam.sorted.bam
samtools view -bS W252.sam | samtools sort -o W252.sam.sorted.bam
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ bash samtoolssort.sh
[bam_sort_core] merging from 41 files...
[bam_sort_core] merging from 36 files...
[bam_sort_core] merging from 36 files...
[bam_sort_core] merging from 35 files...
[bam_sort_core] merging from 38 files...
[bam_sort_core] merging from 36 files...
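The generated script inherits whatever the SAM files happen to be called, including the R251sam file that lost its dot in the alignment command. As a variation, here is a minimal Python sketch that builds the same `samtools view | samtools sort` pipelines but derives a clean sample name first. The helper names are hypothetical, and the output naming (`B251.sorted.bam` rather than `B251.sam.sorted.bam`) is my own choice, not what was run above.

```python
from pathlib import Path


def sample_name(sam_name):
    """Strip a trailing '.sam' (or a bare 'sam' when the dot is missing)."""
    name = Path(sam_name).name
    for ending in (".sam", "sam"):
        if name.endswith(ending):
            return name[: -len(ending)]
    return name


def sort_command(sam_name):
    """Build the view | sort pipeline used above for one SAM file."""
    out_name = f"{sample_name(sam_name)}.sorted.bam"
    return f"samtools view -bS {sam_name} | samtools sort -o {out_name}"


if __name__ == "__main__":
    # File names as they appear in the directory listing.
    for sam in ["B251.sam", "B252.sam", "R251sam", "R252.sam", "W251.sam", "W252.sam"]:
        print(sort_command(sam))
```

Redirecting the printed lines into a file gives a script equivalent to samtoolssort.sh, with consistent output names for all six samples.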
yeyt@ubuntu:~/biodata/NH160034/NH160034/cleandata/assembly/trinity_out_dir$ ll | sort -nk 7
total 234179528
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:07 left.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:07 right.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:08 both.fa.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:19 .jellyfish_count.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:24 .jellyfish_dump.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 01:26 .jellyfish_histo.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 .iworm.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 .iworm_renamed.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 03:32 inchworm.K25.L25.DS.fa.finished
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 10:39 partitioned_reads.files.list.ok
-rw-rw-r-- 1 yeyt yeyt 0 Sep 14 10:39 recursive_trinity.cmds.ok
-rw-rw-r-- 1 yeyt yeyt 9 Sep 14 01:08 both.fa.read_count
-rw-rw-r-- 1 yeyt yeyt 10 Sep 14 01:59 inchworm.kmer_count
-rw-rw-r-- 1 yeyt yeyt 2757 Sep 14 10:31 pipeliner.3855.cmds
-rw-rw-r-- 1 yeyt yeyt 22843 Sep 14 01:26 jellyfish.kmers.fa.histo
-rw-rw-r-- 1 yeyt yeyt 13366802 Sep 14 10:39 partitioned_reads.files.list
-rw-rw-r-- 1 yeyt yeyt 47753878 Sep 14 10:39 recursive_trinity.cmds
-rw-rw-r-- 1 yeyt yeyt 493022300 Sep 14 03:27 inchworm.K25.L25.DS.fa
-rw-rw-r-- 1 yeyt yeyt 11602365066 Sep 14 01:08 both.fa
-rw-rw-r-- 1 yeyt yeyt 26501596675 Sep 14 01:24 jellyfish.kmers.fa
-rw-rw-r-- 1 yeyt yeyt 49568938526 Sep 14 06:38 scaffolding_entries.sam
drwxrwxr-x 2 yeyt yeyt 4096 Sep 14 10:33 chrysalis/
drwxrwxr-x 3 yeyt yeyt 4096 Sep 14 01:03 insilico_read_normalization/
drwxrwxr-x 4 yeyt yeyt 4096 Sep 14 10:39 read_partitions/
-rw-rw-r-- 1 yeyt yeyt 0 Sep 15 20:51 align_stats.txt
-rw-rw-r-- 1 yeyt yeyt 62 Sep 15 20:51 bowtie2.bam
-rw-rw-r-- 1 yeyt yeyt 651 Sep 15 07:03 Trinity.timing
-rw-rw-r-- 1 yeyt yeyt 10213332 Sep 15 07:03 Trinity.fasta.gene_trans_map
-rw-rw-r-- 1 yeyt yeyt 47753878 Sep 15 07:03 recursive_trinity.cmds.completed
-rw-rw-r-- 1 yeyt yeyt 244740565 Sep 15 07:03 Trinity.fasta
drwxrwxr-x 3 yeyt yeyt 4096 Sep 15 18:42 ../
-rw-rw-r-- 1 yeyt yeyt 821 Sep 23 15:17 Trinitystats.log
-rw-rw-r-- 1 yeyt yeyt 1984490 Sep 23 13:50 Trinity.fasta.3.bt2
-rw-rw-r-- 1 yeyt yeyt 55572480 Sep 23 13:50 Trinity.fasta.4.bt2
-rw-rw-r-- 1 yeyt yeyt 55572488 Sep 23 14:03 Trinity.fasta.2.bt2
-rw-rw-r-- 1 yeyt yeyt 55572488 Sep 23 14:16 Trinity.fasta.rev.2.bt2
-rw-rw-r-- 1 yeyt yeyt 103828770 Sep 23 14:03 Trinity.fasta.1.bt2
-rw-rw-r-- 1 yeyt yeyt 103828770 Sep 23 14:16 Trinity.fasta.rev.1.bt2
-rw-rw-r-- 1 yeyt yeyt 400 Sep 24 13:22 samtoolssort.sh
-rw-rw-r-- 1 yeyt yeyt 3049959975 Sep 24 15:37 R252.sam.sorted.bam
-rw-rw-r-- 1 yeyt yeyt 3181086895 Sep 24 16:44 W252.sam.sorted.bam
-rw-rw-r-- 1 yeyt yeyt 3192193677 Sep 24 15:06 R251.sam.sorted.bam
-rw-rw-r-- 1 yeyt yeyt 3206939510 Sep 24 14:33 B252.sam.sorted.bam
-rw-rw-r-- 1 yeyt yeyt 3267705730 Sep 24 16:11 W251.sam.sorted.bam
-rw-rw-r-- 1 yeyt yeyt 3655386513 Sep 24 14:01 B251.sam.sorted.bam
-rw-rw-r-- 1 yeyt yeyt 20770276094 Sep 24 01:49 R252.sam
-rw-rw-r-- 1 yeyt yeyt 21235142607 Sep 24 02:03 B252.sam
-rw-rw-r-- 1 yeyt yeyt 21293400430 Sep 24 02:07 R251sam
-rw-rw-r-- 1 yeyt yeyt 21346715631 Sep 24 02:15 W252.sam
-rw-rw-r-- 1 yeyt yeyt 22197735984 Sep 24 02:29 W251.sam
-rw-rw-r-- 1 yeyt yeyt 24496840308 Sep 24 03:04 B251.sam
That gives us 6 sorted BAM files.
The next step uses the following RSeQC tools:
bam_stat.py
clipping_profile.py
inner_distance.py
read_duplication.py
read_GC.py
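As a rough sketch of how these could be driven for one sample, the Python snippet below assembles command lines for three of the tools. It only uses the `-i` (input BAM) and `-o` (output prefix) options; `clipping_profile.py` and `inner_distance.py` need further arguments (the latter, as far as I know, a gene model via `-r`, which a de novo assembly does not come with), so they are left out. The function name and the `B251` output prefix are assumptions for illustration.

```python
import shlex


def rseqc_commands(bam_path, out_prefix):
    """Return argument lists for three RSeQC scripts run on one BAM file."""
    return [
        ["bam_stat.py", "-i", bam_path],
        ["read_GC.py", "-i", bam_path, "-o", out_prefix],
        ["read_duplication.py", "-i", bam_path, "-o", out_prefix],
    ]


if __name__ == "__main__":
    # BAM name taken from the listing above; "B251" is an arbitrary prefix.
    for args in rseqc_commands("B251.sam.sorted.bam", "B251"):
        print(" ".join(shlex.quote(a) for a in args))
```

Each list can be handed to `subprocess.run(args, check=True)` once RSeQC is installed and on the PATH.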