目录
1.Module 1 - Introduction to RNA sequencing
- Installation
- Reference Genomes
- Annotations
- Indexing
- RNA-seq Data
- Pre-Alignment QC
2.Module 2 - RNA-seq Alignment and Visualization
- Adapter Trim
- Alignment
- IGV
- Alignment Visualization
- Alignment QC
3.Module 3 - Expression and Differential Expression
- Expression
- Differential Expression
- DE Visualization
- Kallisto for Reference-Free Abundance Estimation
4.Module 4 - Isoform Discovery and Alternative Expression
- Reference Guided Transcript Assembly
- de novo Transcript Assembly
- Transcript Assembly Merge
- Differential Splicing
- Splicing Visualization
5.Module 5 - De novo transcript reconstruction
- De novo RNA-Seq Assembly and Analysis Using Trinity
6.Module 6 - Functional Annotation of Transcripts
- Functional Annotation of Assembled Transcripts Using Trinotate
2.5 Alignment QC
使用samtools和FastQC来评估比对结果
使用samtools view
查看SAM/BAM比对文件的格式
samtools view -H UHR.bam
samtools view UHR.bam | head
尝试过滤BAM文件以排除某些标记。这可以通过samtools view -f -F
选项实现
-f INT required flag -F INT filtering flag
Try requiring that alignments are 'paired' and 'mapped in a proper pair' (=3). Also filter out alignments that are 'unmapped', the 'mate is unmapped', and 'not primary alignment' (=268)
samtools view -f 3 -F 268 UHR.bam | head
现在要求只对“PCR or optical duplicate”进行比对。有多少reads符合这个标准?为什么?
samtools view -f 1024 UHR.bam | head
使用samtools flagstat
获取比对的基本情况。比对到上reads的百分比是多少?
samtools flagstat UHR.bam
1174953 + 0 in total (QC-passed reads + QC-failed reads)
24539 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
1173741 + 0 mapped (99.90% : N/A)
1150414 + 0 paired in sequencing
575207 + 0 read1
575207 + 0 read2
1143860 + 0 properly paired (99.43% : N/A)
1148598 + 0 with itself and mate mapped
604 + 0 singletons (0.05% : N/A)
6 + 0 with mate mapped to a different chr
6 + 0 with mate mapped to a different chr (mapQ>=5)
samtools flagstat HBR.bam
793963 + 0 in total (QC-passed reads + QC-failed reads)
7597 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
793356 + 0 mapped (99.92% : N/A)
786366 + 0 paired in sequencing
393183 + 0 read1
393183 + 0 read2
783124 + 0 properly paired (99.59% : N/A)
785496 + 0 with itself and mate mapped
263 + 0 singletons (0.03% : N/A)
0 + 0 with mate mapped to a different chr
0 + 0 with mate mapped to a different chr (mapQ>=5)
SAM/BAM格式的详细信息可以在这里找到:http://samtools.sourceforge.net/SAM1.pdf
Using FastQC
你可以使用FastQC来执行基本的BAM文件的QC(Pre-Alignment QC)。输出非常类似于在fastq文件上运行FastQC。
网友评论