[samtools] Summarizing alignment statistics in a BAM file


Author: Silver_42ac | Published 2020-04-06 11:07

Use the samtools stats command to collect this information.

#Usage: samtools stats [OPTIONS] file.bam
#            samtools stats [OPTIONS] file.bam chr:from-to


For details, see the samtools-stats documentation.

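The output actually has two parts. The first consists of the lines that begin with SN (summary numbers), which is where we read off the headline figures. Given the total genome length, we can also use it to compute the average depth of the aligned bases across the genome:
depth = bases mapped / genome length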
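The depth calculation above can be sketched in Python. This is a minimal sketch, assuming the SN lines are tab-separated as `SN`, a key ending in a colon, a numeric value, and an optional `#` comment; the function names, the sample values, and the genome length are made up for illustration.

```python
def parse_summary_numbers(stats_text):
    """Collect the SN lines of `samtools stats` output into a dict of floats."""
    summary = {}
    for line in stats_text.splitlines():
        if not line.startswith("SN\t"):
            continue  # skip comments and the other sections
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # not a key/value line
        key = fields[1].rstrip(":")
        summary[key] = float(fields[2])
    return summary


def mean_depth(summary, genome_length):
    """depth = bases mapped / genome length, as in the formula above."""
    return summary["bases mapped"] / genome_length


# Made-up example text; real input would come from `samtools stats file.bam`.
example = "\n".join([
    "# This file was produced by samtools stats",
    "SN\traw total sequences:\t2000",
    "SN\tbases mapped:\t300000\t# ignores clipping",
    "SN\tbases mapped (cigar):\t290000\t# more accurate",
])

summary = parse_summary_numbers(example)
print(mean_depth(summary, genome_length=100000))
```

Swapping `bases mapped` for `bases mapped (cigar)` in `mean_depth` would give a depth that leaves out clipped bases, which may be the better choice for some analyses.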

The SN section contains a series of counts, percentages, and averages, in a similar style to samtools flagstat, but more comprehensive.

raw total sequences - total number of reads in a file. Same number reported by samtools view -c.

filtered sequences - number of discarded reads when using -f or -F option.

sequences - number of processed reads.

is sorted - flag indicating whether the file is coordinate sorted (1) or not (0).

1st fragments - number of first fragment reads (flags 0x01 not set; or flags 0x01 and 0x40 set, 0x80 not set).

last fragments - number of last fragment reads (flags 0x01 and 0x80 set, 0x40 not set).

reads mapped - number of reads, paired or single, that are mapped (flag 0x4 or 0x8 not set).

reads mapped and paired - number of mapped paired reads (flag 0x1 is set and flags 0x4 and 0x8 are not set).

reads unmapped - number of unmapped reads (flag 0x4 is set).

reads properly paired - number of mapped paired reads with flag 0x2 set.

paired - number of paired reads, mapped or unmapped, that are neither secondary nor supplementary (flag 0x1 is set and flags 0x100 (256) and 0x800 (2048) are not set).

reads duplicated - number of duplicate reads (flag 0x400 (1024) is set).

reads MQ0 - number of mapped reads with mapping quality 0.

reads QC failed - number of reads that failed the quality checks (flag 0x200 (512) is set).

non-primary alignments - number of secondary reads (flag 0x100 (256) set).
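The flag conditions in the definitions above can be checked by hand for a single read. The following is a small sketch that follows the wording of those definitions literally; the function name `classify_flag` and the category labels are illustrative, and this is not how samtools itself is implemented.

```python
# SAM flag bits referred to in the definitions above
PAIRED, PROPER_PAIR, UNMAPPED, MATE_UNMAPPED = 0x1, 0x2, 0x4, 0x8
FIRST, LAST = 0x40, 0x80
SECONDARY, QC_FAIL, DUPLICATE, SUPPLEMENTARY = 0x100, 0x200, 0x400, 0x800


def classify_flag(flag):
    """Return which of the flag-based categories a single SAM flag falls into."""
    def has(bit):
        return bool(flag & bit)

    return {
        # 0x1 not set; or 0x1 and 0x40 set, 0x80 not set
        "1st fragment": not has(PAIRED) or (has(FIRST) and not has(LAST)),
        # 0x1 and 0x80 set, 0x40 not set
        "last fragment": has(PAIRED) and has(LAST) and not has(FIRST),
        "unmapped": has(UNMAPPED),
        # 0x1 set and neither 0x4 nor 0x8 set
        "mapped and paired": has(PAIRED) and not has(UNMAPPED) and not has(MATE_UNMAPPED),
        "properly paired": has(PROPER_PAIR),
        # 0x1 set, neither secondary nor supplementary
        "paired": has(PAIRED) and not has(SECONDARY) and not has(SUPPLEMENTARY),
        "duplicate": has(DUPLICATE),
        "QC failed": has(QC_FAIL),
        "secondary": has(SECONDARY),
    }


for flag in (99, 147, 4):
    hits = [name for name, is_set in classify_flag(flag).items() if is_set]
    print(flag, hits)
```

Flags 99 and 147 are the usual pair of values for a properly paired first and last read, and 4 is an unpaired, unmapped read.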

total length - number of processed bases from reads that are neither secondary nor supplementary (flags 0x100 (256) and 0x800 (2048) are not set).

total first fragment length - number of processed bases that belong to first fragments.

total last fragment length - number of processed bases that belong to last fragments.

bases mapped - number of processed bases that belong to reads mapped.

bases mapped (cigar) - number of mapped bases filtered by the CIGAR string corresponding to the read they belong to. Only alignment matches (M), insertions (I), sequence matches (=) and sequence mismatches (X) are counted.
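To make the CIGAR filtering concrete, here is a short sketch that counts only the M, I, = and X operations of one CIGAR string, as the definition above describes. The helper name `cigar_mapped_bases` and the example strings are illustrative.

```python
import re

# One CIGAR operation: a length followed by an operation letter
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_mapped_bases(cigar):
    """Sum the lengths of M, I, = and X operations in a CIGAR string."""
    return sum(
        int(length)
        for length, op in CIGAR_OP.findall(cigar)
        if op in "MI=X"
    )


# 5 soft-clipped, 90 aligned, 2 inserted, 3 deleted, 5 hard-clipped
print(cigar_mapped_bases("5S90M2I3D5H"))  # 92
```

Soft clips (S), hard clips (H), deletions (D) and skipped regions (N) do not contribute, which is why this count can be lower than `bases mapped`.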

bases trimmed - number of bases trimmed by bwa, that belong to non secondary and non supplementary reads. Enabled by -q option.

bases duplicated - number of bases that belong to reads duplicated.

mismatches - number of mismatched bases, as reported by the NM tag associated with a read, if present.

error rate - ratio between mismatches and bases mapped (cigar).

average length - ratio between total length and sequences.

average first fragment length - ratio between total first fragment length and 1st fragments.

average last fragment length - ratio between total last fragment length and last fragments.
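The ratios defined above can be recomputed from the raw counts as a sanity check. This is a sketch with made-up numbers; the dictionary keys mirror the SN field names, and the helper `ratio` is hypothetical.

```python
def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero (e.g. an empty file)."""
    return numerator / denominator if denominator else 0.0


# Made-up counts, keyed by the SN field names
sn = {
    "sequences": 2000,
    "1st fragments": 1000,
    "last fragments": 1000,
    "total length": 300000,
    "total first fragment length": 151000,
    "total last fragment length": 149000,
    "bases mapped (cigar)": 290000,
    "mismatches": 2900,
}

error_rate = ratio(sn["mismatches"], sn["bases mapped (cigar)"])
average_length = ratio(sn["total length"], sn["sequences"])
average_first = ratio(sn["total first fragment length"], sn["1st fragments"])
average_last = ratio(sn["total last fragment length"], sn["last fragments"])

print(error_rate, average_length, average_first, average_last)
```

The values samtools prints may be rounded differently, so small differences from a manual calculation are to be expected.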

maximum length - length of the longest read (includes hard-clipped bases).

maximum first fragment length - length of the longest first fragment read (includes hard-clipped bases).

maximum last fragment length - length of the longest last fragment read (includes hard-clipped bases).

average quality - ratio between the sum of base qualities and total length.

insert size average - the average absolute template length for paired and mapped reads.

insert size standard deviation - standard deviation for the average template length distribution.

inward oriented pairs - number of paired reads with flag 0x40 (64) set and flag 0x10 (16) not set or with flag 0x80 (128) set and flag 0x10 (16) set.

outward oriented pairs - number of paired reads with flag 0x40 (64) set and flag 0x10 (16) set or with flag 0x80 (128) set and flag 0x10 (16) not set.

pairs with other orientation - number of paired reads that don't fall in any of the above two categories.

pairs on different chromosomes - number of pairs where one read is on one chromosome and its mate is on a different chromosome.

percentage of properly paired reads - percentage of reads properly paired out of sequences.

bases inside the target - number of bases inside the target region(s) (when a target file is specified with -t option).

percentage of target genome with coverage > VAL - percentage of target bases with a coverage larger than VAL. By default, VAL is 0, but a custom value can be supplied by the user with -g option.
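The coverage percentage in the last definition can be illustrated with a toy example. This sketch assumes the per-base depths of the target region are already available as a list; the function name `percent_covered_above` and the depth values are made up, and samtools computes this figure internally rather than from such a list.

```python
def percent_covered_above(depths, val=0):
    """Percentage of positions whose coverage is strictly greater than val."""
    if not depths:
        return 0.0
    covered = sum(1 for depth in depths if depth > val)
    return 100.0 * covered / len(depths)


# Made-up per-base depths for a 10-base target
depths = [0, 3, 5, 0, 12, 7, 1, 0, 9, 4]

print(percent_covered_above(depths))         # default VAL = 0
print(percent_covered_above(depths, val=4))  # custom VAL, like the -g option
```

With the default threshold, 7 of the 10 positions count as covered; raising the threshold to 4 leaves 4 of them.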
